A.Gue
A.Gue

Reputation: 23

Reshaping long from long

I am trying to reshape a dataset in Stata to a long format.

However, the original dataset is already in a long format with multiple observations for each given individual. Each row contains all tests taken in a given year by a given student.

What I want is each line to contain data from a specific test in a given year.

For example, from this dataset:

  +---------------------------------------------------------------------------+
  | student_id | year | score_math | date_math | score_english | date_english |
  |------------+------+------------+-----------+---------------+--------------|
  |        111 | 2011 |          . |           |             . |              |
  |        111 | 2013 |        259 |       apr |           250 |          apr |
  |        222 | 2012 |        645 |       mar |           645 |          mar |
  |        222 | 2014 |        640 |       dec |             . |              |
  |        333 | 2016 |          . |           |           358 |          apr |
  |        333 | 2017 |          . |           |           299 |          jan |
  +---------------------------------------------------------------------------+

I would like to get the one below:

  +----------------------------------------------+
  | student_id | year | test_name | score | date |
  |------------+------+-----------+-------+------|
  |        111 | 2011 |   english |     . |      |
  |        111 | 2011 |      math |     . |      |
  |        111 | 2013 |   english |   250 |  apr |
  |        111 | 2013 |      math |   259 |  apr |
  |        222 | 2012 |   english |   645 |  mar |
  |        222 | 2012 |      math |   645 |  mar |
  |        222 | 2014 |   english |     . |      |
  |        222 | 2014 |      math |   640 |  dec |
  |        333 | 2016 |   english |   358 |  apr |
  |        333 | 2016 |      math |     . |      |
  |        333 | 2017 |   english |   299 |  jan |
  |        333 | 2017 |      math |     . |      |
  +----------------------------------------------+

I have tried running the following command but it does not work because the data is long:

reshape long score* date*, i(student_id) j(test_name)

What I get is the following message:

variable test_name contains all missing values

So I tried instead:

reshape wide score* date*, i(student_id) j(year)

Next, to 'pretend' my data was long (so that I could be able to reshape long) I used:

egen new_id = group(student_id year)

After having a unique new_id for each observation I then tried:

reshape long score* date*, i(new_id) j(test_name)

This did not work either.

Any suggestions on how I could get the desired output?

Upvotes: 1

Views: 430

Answers (1)

user8682794
user8682794

Reputation:

The following produces the desired output:

clear

input student_id year score_math str3 date_math score_english str3 date_english
111 2011 . " " . " "
111 2013 259 "apr" 250 "apr"
222 2012 645 "mar" 645 "mar"
222 2014 640 "dec" . " "
333 2016 . " " 358 "apr"
333 2017 . " " 299 "jan"
end

reshape long score_ date_, i(student_id year) j(test_name) string
rename *_ *

list, sepby(student_id) abbreviate(30)

     +----------------------------------------------+
     | student_id   year   test_name   score   date |
     |----------------------------------------------|
  1. |        111   2011     english       .        |
  2. |        111   2011        math       .        |
  3. |        111   2013     english     250    apr |
  4. |        111   2013        math     259    apr |
     |----------------------------------------------|
  5. |        222   2012     english     645    mar |
  6. |        222   2012        math     645    mar |
  7. |        222   2014     english       .        |
  8. |        222   2014        math     640    dec |
     |----------------------------------------------|
  9. |        333   2016     english     358    apr |
 10. |        333   2016        math       .        |
 11. |        333   2017     english     299    jan |
 12. |        333   2017        math       .        |
     +----------------------------------------------+

Upvotes: 1

Related Questions