How to split data in SPSS based on percentage

Question

I have a 7 GB file in SPSS format. It contains survey data with comment-level and sentence-level scores. One comment can have multiple sentences, and one survey can have up to 4 comments.

I am trying to draw a random sample in SPSS so that I can work with a smaller file in R, but simple random sampling picks individual rows, so it does not keep all the rows of a survey and its comments together.

What I want is to take a sample from this big file by picking only 5% of the Surv_ID values and keeping every row for each selected survey, so that the rows for a whole survey stay together.

Surv_ID  Sentence_ID Comment_ID Sentence_Score Comment_Score
A001         001       1            3.5             2
A001         002       1            2.8             2
A001         001       2            1.4            -1
A001         002       2           -2.9            -1
A001         003       2           -3.1            -1
A002         001       1            2.3             3
A002         002       1            4.3             3
A002         001       2            1.2             1
A002         002       2            0.85            1
A002         003       2            0.79            1
A002         001       3            3.5             2
A002         002       3           -3.1             2
A002         003       3            2.8             2
A003         001       1             1              1
A003         001       2           -0.9            -3
A003         002       2           -4.3            -3
A003         003       2           -4.0            -3
A003         001       3            3.4             3
A003         002       3            4.4             3
A003         001       4            2.8             2
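
To make the goal concrete, here is a minimal sketch of the sampling I am describing: draw a random subset of the survey IDs, then keep every row whose ID was drawn. It is written in Python rather than SPSS syntax, purely as an illustration. The function name `sample_surveys`, its parameters, and the list-of-dicts row layout are assumptions made for this example, and the demo rows are a few lines copied from the table above.

```python
import random


def sample_surveys(rows, fraction=0.05, id_key="Surv_ID", seed=None):
    """Return all rows belonging to a random `fraction` of the survey IDs.

    Hypothetical helper for illustration; `rows` is assumed to be a list
    of dicts, one per sentence, each carrying the survey ID under `id_key`.
    """
    rng = random.Random(seed)
    # Unique survey IDs, in first-seen order.
    ids = list(dict.fromkeys(row[id_key] for row in rows))
    # Number of surveys to keep; at least one when there is any data.
    n_keep = max(1, round(len(ids) * fraction)) if ids else 0
    chosen = set(rng.sample(ids, n_keep))
    # Keep whole surveys: every row whose ID was selected.
    return [row for row in rows if row[id_key] in chosen]


# A few rows from the table above (scores only, for brevity).
rows = [
    {"Surv_ID": "A001", "Sentence_ID": "001", "Comment_ID": 1, "Sentence_Score": 3.5},
    {"Surv_ID": "A001", "Sentence_ID": "002", "Comment_ID": 1, "Sentence_Score": 2.8},
    {"Surv_ID": "A002", "Sentence_ID": "001", "Comment_ID": 1, "Sentence_Score": 2.3},
    {"Surv_ID": "A002", "Sentence_ID": "002", "Comment_ID": 1, "Sentence_Score": 4.3},
    {"Surv_ID": "A002", "Sentence_ID": "001", "Comment_ID": 2, "Sentence_Score": 1.2},
    {"Surv_ID": "A003", "Sentence_ID": "001", "Comment_ID": 1, "Sentence_Score": 1.0},
]

# With only three surveys, sample one third of them instead of 5%.
sample = sample_surveys(rows, fraction=1 / 3, seed=0)
```

The point is that the random draw happens on the list of distinct survey IDs, not on the rows, so a survey is either kept in full or dropped in full.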


