M Darblade
M Darblade

Reputation: 343

Counting number of ovservations with proc transpose

In SAS, i have a table like this :

ID   ACTE 
1    A   
2    B  
1    A  
1    A  
2    C  
1    B  

and I would like to have the following table :

ID    A    B    C
 1    3    1    0
 2    0    1    1

The second table is a table with the ID and, for each value possible in the column ACTE, the amount of time it occured for this ID.

I would like to know if it is possible to do it using proc transpose.

I tried this code but it produce errors:

proc transpose data=OriginalTable out=TestTranspose;
    by ID;
    var Acte;
    id Acte;
 run;

Upvotes: 4

Views: 3617

Answers (2)

Longfish
Longfish

Reputation: 7602

A variation of @user102890 answer. I've used Proc Freq to get the counts, then transposed to a wide dataset. The sparse option gives you zeros instead of missing values where the combination of id and acte doesn't exist.

data have;
input ID   ACTE $;
datalines;
1    A 
2    B 
1    A 
1    A 
2    C 
1    B 
;
run;

proc freq data=have noprint;
table ID*ACTE / out=temp sparse;
run;

proc transpose data=temp out=want (drop=_:);
by ID;
id ACTE;
var COUNT;
run;

Upvotes: 3

user1509107
user1509107

Reputation:

You could use an intermediate step which calculates the counts and pass that to proc transpose. Ex:

PROC SQL;
CREATE VIEW OriginalTable_v as 
SELECT
ID
,ACTE 
,COUNT(*) AS FREQ
FROM OriginalTable
GROUP BY ID ,ACTE 
ORDER BY ID ,ACTE ;
QUIT;

proc transpose data=OriginalTable_v out=TestTranspose;
    by ID;
    var FREQ;
    id Acte;
 run;

Note: the above approach, will return a missing value if a ID & ACTE combination does not exist in the data. In other words, the output above will be identical to your desired output except it will have missing values where you have 0s.

Another way you can deal with this problem in a single step- assuming you know the categories in ACTE before hand and they are only a small no. of them would be as follows:

PROC SQL;
CREATE table TestTranspose2 as 
SELECT
ID
,sum(case when acte='A' then 1 else 0 end) as count_A
,sum(case when acte='B' then 1 else 0 end) as count_B
,sum(case when acte='C' then 1 else 0 end) as count_C
,sum(case when acte NOT IN ('A', 'B', 'C') then 1 else 0 end) as count_Other
,COUNT(*) AS FREQ_of_ids
FROM OriginalTable
GROUP BY ID
ORDER BY ID ;
QUIT;

Above, produces identical output to your's with 0s where an ID & ACTE combo does not exist.

Upvotes: 4

Related Questions