How to align data that are on a diagonal in SAS

Question

I'm not sure of the best way to describe this, and I'll admit that the code I wrote to recreate the problem in a smaller format isn't quite accurate.

I have 7 data sets that have the same number of columns (122) but a different number of rows. The labels for these columns are identical except for an underscore and an integer. Example: first column of each data set is "study_id_1" "study_id_2" ... "study_id_7"

I am trying to stack each of these data sets, in numerical order, on top of each other AND drop the underscore and integer.

However, if I use this code, all of the values are in chunks but along a diagonal.

data all;
set PT_BS1_all PT_BS2_all PT_BS3_all PT_BS4_all PT_BS5_all PT_BS6_all PT_BS7_all;
run;

The following code (written in SAS Studio) pretty much illustrates the problem and the "diagonal." However, in my actual data (working in SAS EG), all of the missing values are periods, regardless of variable type. In the example below, I could only get periods to appear for missing values for the numerical variables.

data have;
input study_id_1 $ variable1_1 $ variable2_1 variable3_1 study_id_2 $ variable1_2 $ variable2_2 variable3_2 study_id_3 $ variable1_3 $ variable2_3 variable3_3;
cards;
A treatment 35 24 . . . . . . . .
B placebo 24 44 . . . . . . . .
C treatment 66 77 . . . . . . . .
D placebo 73 45 . . . . . . . .
. . . . A treatment 23 34 . . . .
. . . . B placebo 43 56 . . . .
. . . . C treatment 34 34 . . . .
. . . . D placebo 54 67 . . . .
. . . . . . . . A treatment 22 66
. . . . . . . . B placebo 33 67
. . . . . . . . C treatment 23 48
. . . . . . . . D placebo 69 70
;
run;

proc print data=have;
run;

data want;
input study_id $ variable1 $ variable2 variable3;
cards;
A treatment 35 24
B placebo 24 44
C treatment 66 77
D placebo 73 45
A treatment 23 34
B placebo 43 56
C treatment 34 34
D placebo 54 67
A treatment 22 66
B placebo 33 67
C treatment 23 48
D placebo 69 70
;
run;

proc print data=want;
run;

I hope I've described the problem sufficiently and thanks for any help.

Richard · Accepted Answer

The first non-missing from a list of values is returned by the COALESCE and COALESCEC functions.

A list of variables is very simple in your data set because alike variables have a common prefix (and 1,2,3 suffixes). The syntax for specifying the alike variables is :

Example:

data want;
  set have;

* coalesce during stacking;
* set PT_BS1_all PT_BS2_all PT_BS3_all PT_BS4_all PT_BS5_all PT_BS6_all PT_BS7_all;

  length study_id $8 variable1 $9;

  study_id  = coalesceC(of study_id_:);
  variable1 = coalesceC(of variable1_:);
  variable2 = coalesce (of variable2_:);
  variable3 = coalesce (of variable3_:);

  drop study_id_: variable1_: variable2_: variable3_:;
run;

How to align data that are on a diagonal in SAS

Answers (2)

Related Questions