Reputation: 3
The data is as follows:
data have;
input id exercise_date:mmddyy10. total_shares_exercised;
format exercise_date date9.;
datalines;
00001 01/05/2001 2350
00001 01/05/2002 2350
00001 01/05/2003 2350
00001 01/05/2004 2350
;
run;
I need to get it into this format:
ID Date1 Date2 Date3... Shares_exercised_date1
Shares_exercised_date2...
00001 1/05/2001 1/05/2002 1/05/2003 2350
2350
I currently have
data want;
set have;
by ID;
array array[40] date_array1-date_array40;
retain counter;
do;
if first.ID then counter = 1;
else counter = counter + 1;
if last.ID;
array[counter]=date;
end;
run;
This gives me the values for the last date but not for the first date or anything in between. Again, dont use proc transpose please!
Upvotes: 0
Views: 122
Reputation: 51566
Why do you have that DO/END block? That is not really doing anything.
Why are you stopping the data step iteration before writing the date into the array? Only the last observation for an ID will execute that assignment statement. Move the subsetting IF statement after the assignment statement.
To do it properly you need to know the maximum number of dates that appear for any given ID. You could just use some arbitrary upper bound, like the 40 in your example code. But then what do you want to do when some ID has more?
Here is one simple method. Place the DO loop around the SET statement.
data want;
do counter=1 to 40 until(last.id);
set have;
by ID;
array x date1-date40 ;
x[counter] = exercise_date ;
end;
format date1-date40 date9.;
drop exercise_date;
run;
Now if some ID has 50 dates instead of 40 or less it will get two observations. One with the first 40 dates and a second with the next 10.
You could also use an extra pass through the data to find the maximum number of dates per id and put it into a macro variable.
proc sql noprint;
select max(count) format=32. into :ndates trimmed
from (select id,count(*) as count from have group by 1)
;
quit;
And then use the data step above replacing the 40 text everywhere it appears with &ndates.
to make the array's upper bound match the input dataset.
Upvotes: 2
Reputation: 12909
Well that's rude of them to say you can't use proc transpose
. Challenge: accepted. You had the right idea, but you did not retain your array. The code below will transpose it.
data want;
set have;
by id exercise_date;
array date[4];
array shares_exercised_date[4];
retain date: shares_exercised:;
if(first.id) then n = 0;
n+1;
date[n] = exercise_date;
shares_exercised_date[n] = total_shares_exercised;
if(last.id);
format date: date9.;
keep id date: shares_exercised:;
run;
FYI your data shows 01/05/2003 twice, but your output shows 01/05/2002. This means you either need to pre-process your data first so that there are no duplicates or there's a typo.
In the future, please post your data in datalines
or another easy-to-read format rather than an image.
Upvotes: 1