Reputation: 619
I am supposed to create a summary data set containing the mean, median, and standard deviation broken down by gender and group (using the CLASS statement). Using this summary data set, create four other data sets (in one DATA step) as follows:
(1) grand mean (2) stats broken down by gender (3) stats broken down by group (4) stats broken down by gender and group
The hint given is to use the CHARTYPE option.
My attempted solution is below, but I don't think I did it the way the exercise asks.
DATA CLINICAL;
   *Use LENGTH statement to control the order of
    variables in the data set;
   LENGTH PATIENT VISIT DATE_VISIT 8;
   RETAIN DATE_VISIT WEIGHT;
   DO PATIENT = 1 TO 25;
      IF RANUNI(135) LT .5 THEN GENDER = 'Female';
      ELSE GENDER = 'Male';
      X = RANUNI(135);
      IF X LT .33 THEN GROUP = 'A';
      ELSE IF X LT .66 THEN GROUP = 'B';
      ELSE GROUP = 'C';
      DO VISIT = 1 TO INT(RANUNI(135)*5);
         IF VISIT = 1 THEN DO;
            DATE_VISIT = INT(RANUNI(135)*100) + 15800;
            WEIGHT = INT(RANNOR(135)*10 + 150);
         END;
         ELSE DO;
            DATE_VISIT = DATE_VISIT + VISIT*(10 + INT(RANUNI(135)*50));
            WEIGHT = WEIGHT + INT(RANNOR(135)*10);
         END;
         OUTPUT;
         IF RANUNI(135) LT .2 THEN LEAVE;
      END;
   END;
   DROP X;
   FORMAT DATE_VISIT DATE9.;
RUN;
PROC MEANS DATA=CLINICAL;
   CLASS GENDER GROUP;
   OUTPUT OUT=SUMMARY
          MEAN=
          MEDIAN=
          STDDEV= / AUTONAME;
RUN;
Upvotes: 1
Views: 107
Reputation: 63424
No, what they're asking you to do is:
(1) Use the OUTPUT statement in PROC MEANS to create a summary dataset. Choose the appropriate TYPES and CLASS values in PROC MEANS such that all four sets of data are represented in the output.
(2) In a DATA step, selectively output those rows to the correct dataset. You would use the _TYPE_ variable to determine which dataset a row is output to.
CHARTYPE just means your _TYPE_ variable will look like 1001 instead of 9 (the binary representation, basically, stored as a character string). 1001 indicates which class variables are used (the first and the fourth) to create that breakout. With only two class variables, the possible values are 00, 01, 10, and 11. This is sometimes easier for people who aren't used to thinking in binary: without CHARTYPE these values would be 0, 1, 2, and 3 in decimal, and it might be harder to tell which value corresponds to which variable.
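As a rough sketch of those two steps (not necessarily the exact solution your exercise expects): I'm assuming WEIGHT is the variable you want summarized, and the four output dataset names (GRAND, BY_GENDER, BY_GROUP, BY_GENDER_GROUP) are just names I made up for illustration. With CLASS GENDER GROUP and no TYPES or NWAY restriction, PROC MEANS should already write all four breakouts to SUMMARY, so no TYPES statement is used here. The leftmost character of _TYPE_ corresponds to the first CLASS variable (GENDER) and the rightmost to the last (GROUP).

```sas
* Step 1: one summary data set with every breakout.
  WEIGHT as the analysis variable is an assumption;
PROC MEANS DATA=CLINICAL NOPRINT CHARTYPE;
   CLASS GENDER GROUP;
   VAR WEIGHT;
   OUTPUT OUT=SUMMARY
          MEAN= MEDIAN= STDDEV= / AUTONAME;
RUN;

* Step 2: one DATA step, four output data sets (hypothetical names).
  With CHARTYPE, _TYPE_ is character, so compare against quoted strings;
DATA GRAND BY_GENDER BY_GROUP BY_GENDER_GROUP;
   SET SUMMARY;
   IF      _TYPE_ = '00' THEN OUTPUT GRAND;            * no class variables;
   ELSE IF _TYPE_ = '10' THEN OUTPUT BY_GENDER;        * GENDER only;
   ELSE IF _TYPE_ = '01' THEN OUTPUT BY_GROUP;         * GROUP only;
   ELSE IF _TYPE_ = '11' THEN OUTPUT BY_GENDER_GROUP;  * GENDER and GROUP;
RUN;
```

Without the VAR statement, PROC MEANS would summarize every numeric variable in CLINICAL (PATIENT, VISIT, and DATE_VISIT as well), which is probably not what you want.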
Upvotes: 1