Ahmad Yacout

Reputation: 35

How can I reshape my data with two j variables one of which is a string?

My data looks a bit like this. The first row shows the variable names and the rest are my values. year is the year of an election, nutsid is a regional identifier, and nutsname is the name of the region. The next three are the ones I want to focus on: spo, ovp, and fpo are the names of parties in the election, and each holds that party's vote count. I want to combine them all under one variable called party, with the counts in a second variable, and keep the long format I have now.

   | year     nutsid    nutsname        spo      ovp      fpo|
1. | 2008     AT11      Burgenland    73565    52531    29812|
2. | 1990     AT11      Burgenland    88547    62675    19508|
3. etc

What I have tried so far is to reshape it all into wide format first, rename the party variables to something like p_spo, p_ovp, p_fpo, and then run

reshape long p_, i(nutsid) j(year party) string

I can't say that this was a smart idea or that it worked, because it just gives me a new id called year with the value "party" written under it over and over again.

But I was wondering if there was another command I should be using to get from what I have to:

   | year     nutsid    nutsname      party  votes|
1. | 2008     AT11      Burgenland    spo    73565|    
2. | 2008     AT11      Burgenland    ovp    52531|    
3. | 2008     AT11      Burgenland    fpo    29812|    
4. | 1990     AT11      Burgenland    spo    88547|
5. | 1990     AT11      Burgenland    ovp    62675|
6. | 1990     AT11      Burgenland    fpo    19508|
7. etc

Upvotes: 0

Views: 411

Answers (1)

Nick Cox

Reputation: 37183

Some minor details here are confused or unclear:

  1. In Stata, variable names are not to be considered or described as the "first row" of the data, although they will appear as headers in (e.g.) the Data Editor. Stata is not a spreadsheet application.

  2. The reshape command you mention requires that spo ovp fpo be renamed to p_spo p_ovp p_fpo before the reshape; the renaming has to come first and cannot follow the reshape.

  3. Exactly what you did is unclear as you give only part of your syntax.
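To make point 2 concrete, here is a minimal sketch of the p_ route attempted in the question, assuming the data are as in the listing there. As documented, j() takes a single variable name optionally followed by a list of values, which is presumably why j(year party) produced a variable year holding the value "party". In this sketch year goes into i() instead, and the final rename to votes is just one possible choice of name.

```stata
clear
input year str4 nutsid str10 nutsname spo ovp fpo
2008 AT11 Burgenland 73565 52531 29812
1990 AT11 Burgenland 88547 62675 19508
end
* Rename first: spo ovp fpo become p_spo p_ovp p_fpo
rename (spo ovp fpo) (p_=)
* year stays an identifier in i(); only party goes in j()
reshape long p_, i(nutsid year) j(party) string
* Optional: give the stacked variable a more descriptive name
rename p_ votes
```
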

That said, what you want is a simple reshape:

clear 
input year str4 nutsid str10  nutsname        spo      ovp      fpo
2008     AT11      Burgenland    73565    52531    29812
1990     AT11      Burgenland    88547    62675    19508
end 
rename (spo ovp fpo) (votes=) 
reshape long votes, i(nutsid year) j(party) string 
list, sepby(nutsid year) 

     +--------------------------------------------+
     | nutsid   year   party     nutsname   votes |
     |--------------------------------------------|
  1. |   AT11   1990     fpo   Burgenland   19508 |
  2. |   AT11   1990     ovp   Burgenland   62675 |
  3. |   AT11   1990     spo   Burgenland   88547 |
     |--------------------------------------------|
  4. |   AT11   2008     fpo   Burgenland   29812 |
  5. |   AT11   2008     ovp   Burgenland   52531 |
  6. |   AT11   2008     spo   Burgenland   73565 |
     +--------------------------------------------+

In this view of the data, you have two so-called i variables and one j variable.

Note the use here of input code to give a data example that others can run directly, without the editing that your listing requires. You can install the command dataex with ssc install dataex to make producing such examples easy on yourself too.
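As a rough sketch of how dataex might be used here, assuming the original wide dataset is in memory with the variable names from the question (recent Stata releases may already include dataex, in which case the install step is unnecessary):

```stata
* Install once from SSC
ssc install dataex
* Write out input code for the first two observations of the listed variables
dataex year nutsid nutsname spo ovp fpo in 1/2
```

The Results window then shows a block of input code that can be pasted straight into a question.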

Upvotes: 1
