Ashish Bandhu

Reputation: 25

How to reshape from long to wide

I have a single variable in Stata called random_variable containing random numbers, say from 1 to 100, not necessarily in sequence (e.g. 1, 7, 2). I want to create 15 new variables in the data set, each containing the next 7 entries from random_variable (7 is arbitrary; the example is for weekdays), except the 15th, which will contain only 2 entries (14*7 = 98, plus 2 in the last column).

Upvotes: 0

Views: 80

Answers (1)

Matthew G

Reputation: 111

You can accomplish this by creating a unique identifier for each observation within each day. Setting up some random data, we have:

*create data
clear
set obs 100
gen random_variable = _n

*randomly sort
set seed 12345
gen sortorder = runiform()
sort sortorder
drop sortorder

*create groups using the mod function
gen day = mod(_n-1, 7) + 1

Here random_variable holds the numbers 1 to 100 in random order, and day identifies the day, numbered 1 to 7. You can change the number of groups by editing the second argument of the mod function. Next, create an id variable for each observation within each day.

*create a sequential id for each value for each day
*(stable keeps the original order within each day; a plain sort breaks ties arbitrarily)
sort day, stable
gen id = .
replace id = 1 if day != day[_n-1]
replace id = id[_n-1] + 1 if id == .

This sorts the data by day, gives the first observation in each day group an id of 1, and then numbers the remaining observations in that group consecutively. Finally, you can reshape the data using the day and id variables:

reshape wide random_variable, i(day) j(id)

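This yields 7 rows and 16 columns: the 15 columns you requested (random_variable1 through random_variable15) plus 1 column, day, identifying the original day on which each value was observed. The 15th column has only 2 non-missing values, since 100 = 14*7 + 2.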
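
As a possible shortcut, both identifiers can be computed directly from the observation number before any sorting, so the result does not depend on sort order at all. The following is a minimal sketch, assuming the data are already in the order you want to split. The local macro k and the ceil()-based id are illustrative additions of mine, and the final checks assume the 100-observation example.

```stata
* rebuild the example data: 1 to 100 in random order
clear
set obs 100
gen random_variable = _n
set seed 12345
gen sortorder = runiform()
sort sortorder
drop sortorder

* block size held in a local macro (k is an illustrative name)
local k 7

* position within each block of k observations (1 to k)
gen day = mod(_n-1, `k') + 1

* block number: with k = 7, obs 1-7 get 1, obs 8-14 get 2, ..., obs 99-100 get 15
gen id = ceil(_n/`k')

reshape wide random_variable, i(day) j(id)

* sanity checks for this example: 7 rows, and only 2 values in the last column
assert _N == `k'
count if !missing(random_variable15)
assert r(N) == 2
```

Changing k changes how many entries go into each new column. Run the block as a single do-file so the local macro stays defined for every line.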

Upvotes: 1
