How to compute the mean over rows till a variable changes and repeat?

Question

Given a very large table in the following format (snippet):

Subject, Condition, VPH, Task, Round, Item, Decision, Self, Other, RT
1, 1, 1, SVO, 0, 0, 4, 2.5, 2.0, 8.598
1, 1, 1, SVO, 1, 5, 3, 4.1, 3.4, 7.785
1, 1, 1, SVO, 2, 4, 3, 3.2, 3.4, 15.713
2, 2, 1, SVO, 0, 0, 4, 2.5, 2.0, 15.439
2, 2, 1, SVO, 1, 2, 7, 4.9, 2.3, 30.777
2, 2, 1, SVO, 2, 3, 8, 4.3, 4.3, 13.549
3, 3, 1, SVO, 0, 0, 5, 2.8, 1.5, 9.066
... (And so on)

Needed: Compute the mean over all rounds for self and others for each subject.

What I have so far: I sorted the roughly 100 MB .txt file with bash sort so that each subject's rounds appear on consecutive rows (as the example shows). After that I imported the .txt file into SPSS 24. Right now I have no idea how to write a function that computes, for each subject, the mean of the variables Self and Other over the three rounds. E.g. (some pseudo-code):

for n = 1 to last_subject do:
    get rows of self where line_subject equals n
    compute mean over this content
    write result as new variable self_mean after variable RT at line n
    increase n by one

As I am totally new to SPSS, I would really appreciate detailed help. I would also be happy with references that deal specifically with computation over rows (I found lots of material on computation over columns).

Thank you very much!

Edit: example output

After computing, the table should look like this:

Subject, Mean_Self, Mean_Others
 1,       3.27,      2.9
 2,       ...,       ...
 3,

 ...

(And so on)

So Mean_Self for Subject 1 in the example at the top is computed as mean(2.5, 4.1, 3.2) = 9.8 / 3 ≈ 3.27, where:

2.5 was used from line 1 of variable Self
4.1 was used from line 2 of variable Self
3.2 was used from line 3 of variable Self

The 2.5 from line 4 of variable Self was not used because variable Subject changed; therefore we want to repeat the process with the new Subject (here 2) until it changes again. The results should form a table like the one above. The same procedure applies to variable Other.

eli-k · Accepted Answer

If I understand correctly, what you need is the AGGREGATE command. AGGREGATE can either create a new dataset/file with your aggregated data, or add the aggregated data to your active dataset, as you described above:

AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=Subject
  /Self_mean=MEAN(Self) 
  /Other_mean=MEAN(Other).

To get the new variables in a new, separate table, look up the other AGGREGATE options. For example, /OUTFILE=* without MODE=ADDVARIABLES replaces the original data in the active window with the aggregated data, while /OUTFILE="path/filename" saves the aggregated data to a file.
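As a rough sketch of the separate-table variant, the syntax below writes one row per subject into a new dataset and leaves the original data untouched. The dataset name subject_means and the output variable names Mean_Self and Mean_Others are assumptions, chosen only to match the example output in the question; rename them as needed.

```spss
* Declare a new, empty dataset to receive the aggregated rows (name is hypothetical).
DATASET DECLARE subject_means.

* One output row per distinct Subject value, with the means of Self and Other.
AGGREGATE
  /OUTFILE=subject_means
  /BREAK=Subject
  /Mean_Self=MEAN(Self)
  /Mean_Others=MEAN(Other).

* Switch to the aggregated dataset to inspect the result.
DATASET ACTIVATE subject_means.
```

Unless the PRESORTED keyword is specified, AGGREGATE groups cases by the BREAK variable regardless of row order, so sorting the file beforehand should not be required for this step.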
