George

Reputation: 189

Using SPSS Reference Variables and an Index to Create a New Variable

I have a log in which each subject has a unique identifier and is tracked across multiple cases. Using the code below, suggested earlier by the great community here, I created an index that numbers the cases within each subject. I've now run into a new challenge that I can't figure out. A sample of the current data set follows the code to provide perspective.

Indexing function

sort cases by Unique_Modifier.
if $casenum=1 or Unique_Modifier<>lag(Unique_Modifier) Index=1.
if Unique_Modifier=lag(Unique_Modifier) Index=lag(Index)+1.
format Index(f2).
execute. 

Unique Identifier   Index   Variable of interest
A                    1          101
A                    2          101
A                    3          607
A                    4          607
A                    5          101
A                    6          101
B                    1          108
B                    2          210
C                    1          610
C                    2          987
C                    3         1100
C                    4          610

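What I'd like to do is create a new variable that holds, for each unique identifier, a running count of the distinct values seen so far in the variable of interest column. A value that recurs later (such as 101 for subject A) should not increase the count. The expected output would be the following: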

Unique Identifier   Index   Variable of interest    Intended Output
A                       1               101               1
A                       2               101               1
A                       3               607               2
A                       4               607               2
A                       5               101               2
A                       6               101               2
B                       1               108               1
B                       2               210               2
C                       1               610               1
C                       2               987               2
C                       3               1100              3
C                       4               610               3

I've tried a few approaches. One was a similar index function, which works when the value simply changes from one line to the next but fails when a value recurs later, for example 5 lines further down. My next idea was the AGGREGATE command, but I looked through the IBM manual and couldn't find an aggregate function that would produce the intended output. Does anyone have any ideas? I think a loop is the best bet, but loops in SPSS are a bit funky and hard to get working.

Upvotes: 1

Views: 986

Answers (1)

eli-k

Reputation: 11310

Try this:

data list list/Unique_Identifier   Index   VOI (3f)  .
begin data.
1                       1               101               
1                       2               101               
1                       3               607               
1                       4               607               
1                       5               101               
1                       6               101               
2                       1               108               
2                       2               210               
3                       1               610               
3                       2               987               
3                       3               1100             
3                       4               610               
end data.

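* voiT accumulates a comma-delimited list of the values seen so far for each identifier.
* The leading and trailing commas keep a value such as 101 from matching inside 1101.
* Cases must already be sorted by Unique_Identifier and Index.
string voiT (a1000).
compute voiT=concat(",",ltrim(string(VOI,f10)),",").
compute Intended_Output=1.
do if index>1.
   do if index(lag(voiT), rtrim(voiT))>0.
      /* Value already in the list, so carry the count and the list forward */.
      compute Intended_Output=lag(Intended_Output).
      compute voiT=lag(voiT).
   else.
      /* New value, so increase the count and append it to the list */.
      compute Intended_Output=lag(Intended_Output)+1.
      compute voiT=concat(rtrim(lag(voiT)), rtrim(voiT)).
   end if.
end if .
exe.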
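
If building a long string feels fragile, the same running count can probably be reached without one. The following is only a sketch, assuming the variable names from the example data above (Unique_Identifier, Index, VOI); first_seen is a hypothetical helper variable that is not part of the original data. The idea is to flag the first occurrence of each value within an identifier, then accumulate those flags in Index order.

```spss
* Sort so that repeats of a value sit together, earliest Index first.
sort cases by Unique_Identifier VOI Index.

* first_seen (hypothetical helper) is 1 on the first occurrence of each value within an identifier.
compute first_seen=0.
if $casenum=1 or Unique_Identifier<>lag(Unique_Identifier) or VOI<>lag(VOI) first_seen=1.

* Restore the original order and accumulate the flags within each identifier.
sort cases by Unique_Identifier Index.
compute Intended_Output=first_seen.
if Index>1 Intended_Output=lag(Intended_Output)+first_seen.
execute.
```

For subject A in the sample data the flags in Index order would be 1, 0, 1, 0, 0, 0, which accumulate to 1, 1, 2, 2, 2, 2. This relies on Index restarting at 1 for every identifier, as it does in the question's indexing code.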

Upvotes: 1
