Reputation: 11

How to merge and duplicate observations and variables in Stata?

I need to convert my data set to a panel dataset, but I couldn't find out how to do it in either the Stata help or on Google. My data looks like this:

[A] [B] [C] [D] [E]

[1] [1] [0] [10] [12]

[2] [0] [0] [13] [14]

[3] [1] [1] [15] [17]

A is student id and D and E are their test scores in two different years. So, I need the data to look like this:

[A] [B] [C] [(D and E)]

[(D)1] [1] [0] [10]

[(E)1] [1] [0] [12]

[(D)2] [0] [0] [13]

[(E)2] [0] [0] [14]

[(D)3] [1] [1] [15]

[(E)3] [1] [1] [17]

Upvotes: 0

Answers (1)

Nick Cox

Reputation: 37278

It's a good idea to skim the headings of the data management manual, [D] or https://www.stata.com/manuals/d.pdf, to find relevant commands. The immediate small problem here is poorly chosen variable names -- at least in your data example; we can't tell if you're using more sensible names in your real data. Then your new data layout is a simple application of reshape long.

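* Recreate the example data
clear 
input A  B  C  D  E
1  1  0  10 12
2  0  0  13 14
3  1  1  15 17
end 

* Give the identifier and the yearly scores informative names
* (the years 2015 and 2016 are illustrative)
rename A id 
rename (D E) (mark2015 mark2016) 

* One observation per student per year; the numeric suffix goes into year
reshape long mark, i(id) j(year) 

list, sepby(id) 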

     +--------------------------+
     | id   year   B   C   mark |
     |--------------------------|
  1. |  1   2015   1   0     10 |
  2. |  1   2016   1   0     12 |
     |--------------------------|
  3. |  2   2015   0   0     13 |
  4. |  2   2016   0   0     14 |
     |--------------------------|
  5. |  3   2015   1   1     15 |
  6. |  3   2016   1   1     17 |
     +--------------------------+

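Suitable variable names for time-dependent data have a common prefix and a numeric suffix giving the time, here the prefix mark with the suffixes 2015 and 2016. reshape long then uses the prefix as the name of the new variable and stores the suffixes in the variable named in j().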
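If the goal is to use panel commands afterwards, a possible next step is to declare the panel structure with xtset. What follows is only a minimal sketch: it assumes the names id, year and mark chosen above, and it assumes the two scores come from 2015 and 2016, which the question does not state.

```stata
* Sketch only: the years 2015 and 2016 are assumed for illustration
clear
input id B C mark2015 mark2016
1 1 0 10 12
2 0 0 13 14
3 1 1 15 17
end

* Wide to long: one observation per student per year
reshape long mark, i(id) j(year)

* Declare id as the panel identifier and year as the time variable
xtset id year

* Summarize the pattern of observations across panels
xtdescribe
```

Whether xtset is needed depends on which commands you plan to run; reshape long alone already gives the layout asked for.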

Upvotes: 1