Reputation: 57
I'm running into some issues while trying to reshape a data set from long to wide. Here's an example, since I think that explains it best:
Say I wanted to take this long data set...
| study_id | event_date | code |
|----------|------------|------|
| 1 | 09 June 15 | 546 |
| 1 | 09 June 15 | 643 |
| 2 | 23 May 13 | 324 |
| 2 | 12 May 13 | 435 |
And shape it into a wide one like this...
| study_id | event_date_1 | event_date_1_code1 | event_date_1_code2 | event_date_2 | event_date_2_code1 | event_date_2_code2 |
|----------|--------------|--------------------|--------------------|--------------|--------------------|--------------------|
| 1 | 09 June 15 | 546 | 643 | | | |
| 2 | 23 May 13 | 324 | | 12 May 13 | 435 | |
What would be the best method of doing this? I imagine I would have to create some sort of j variable, but I am not certain how to make it so that each event_date can have multiple codes and each study_id can have multiple event_dates.
I already tried making a j variable and reshaping, using the following code:
//Sort by id (just in case)
sort study_id event_date code
//Create j variable
quietly by study_id: gen code_num = cond(_N==1, 1, _n)
//Reshape data
reshape wide event_date code, i(study_id) j(code_num)
This, however, did not account for each event_date having multiple potential codes.
I am attempting to convert the data to wide so that I can merge it with another wide data set and then run analyses over both. An observation in either set is a unique study_id.
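To illustrate the end goal, a one-to-one merge on study_id might look roughly like the sketch below. The second data set and its age variable are invented for the example, the wide column names are only one possible naming pattern, and day-of-month numbers stand in for full dates.

```stata
// Hypothetical second wide data set: one row per study_id
clear
input study_id age
1 34
2 51
end
tempfile other
save `other'

// Wide version of the example data, one row per study_id
// (day-of-month values stand in for full dates; names are illustrative)
clear
input study_id date1 code_1_1 code_2_1 date2 code_1_2 code_2_2
1  9 546 643  .   . .
2 12 435   . 23 324 .
end

// One-to-one merge on the shared key, then inspect the match results
merge 1:1 study_id using `other'
tabulate _merge
```

Because each study_id appears exactly once in both sets, 1:1 is the appropriate merge type; Stata stops with an error if the key turns out not to be unique.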
Upvotes: 1
Views: 983
Let me start by saying that I would not ever choose to organize my data in the requested fashion, so this should not be taken as support for doing so.
Having said that, something like the following seems to do the trick. The data is similar to yours, but I'm too lazy to deal with full dates, so I just read in the day of the month. I'm posting this as a curiosity, because I've never before seen a need to do reshape wide twice in succession.
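clear
input study_id date code
1 09 546
1 09 643
2 23 324
2 12 435
end
list
// Number the codes within each study_id/date pair
bysort study_id date (code): generate codenum = _n
// First reshape: one row per study_id/date, codes spread across columns
reshape wide code, i(study_id date) j(codenum)
rename code* code_*_
list
// Number the events (dates) within each study_id
bysort study_id (date): generate eventnum = _n
// Second reshape: one row per study_id
reshape wide date code_*, i(study_id) j(eventnum)
list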
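For completeness, here is a sketch of the same two-step reshape using the full dates from the question. It assumes the dates arrive as strings like "09 June 15" and that every two-digit year belongs to the 2000s (the "DM20Y" mask). The variable name edate is made up for the example, and the code stubs are listed explicitly for the two codes in this data.

```stata
clear
input study_id str12 event_date code
1 "09 June 15" 546
1 "09 June 15" 643
2 "23 May 13"  324
2 "12 May 13"  435
end

// Convert the strings to Stata daily dates (assumes years are 20xx)
generate edate = date(event_date, "DM20Y")
format edate %td
drop event_date

// Number the codes within each study_id/date pair, then go wide on code
bysort study_id edate (code): generate codenum = _n
reshape wide code, i(study_id edate) j(codenum)
rename code* code_*_

// Number the events chronologically within each study_id, then go wide again
bysort study_id (edate): generate eventnum = _n
reshape wide edate code_1_ code_2_, i(study_id) j(eventnum)
list
```

Note that events are numbered in date order, so for study_id 2 the 12 May 13 event becomes event 1, which differs from the ordering shown in the question's desired output.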
Upvotes: 2