CJ12
CJ12

Reputation: 495

Stata - Reshaping data from long to wide for analysis

I have data in the following form

firm    month_year    sales    competitor   competitor_location      competitor_branch_1       competitor_branch_2
  1       1_2014    25          XYZ             US                      EEE                       RRR
  1       2_2014    21          XYZ             US                      FFF
  1       2_2014    21          ABC             UK                      GGG
 ...
  21     1_2009    11          LKS            UK                       AAA
  21     1_2009    11          AIS            UK                       BBB
  21     1_2009    11          AJS            US                       CCC
  21     2_2009    12          LKS            UK                       AAA

I still want an entry for every firm at the month_year level but do not want separate rows for other variables, just columns. I am trying to turn it into this format.

firm    month_year    sales    competitor_1   competitor_2     competitor_3        competitor_1_location     competitor_2_location     competitor_3_location            competitor_1_branch_1        competitor_2_branch_1           competitor_3_branch_1           competitor_1_branch_2        competitor_2_branch_2           competitor_3_branch_2                  competitor_1_branch_3        competitor_2_branch_3           competitor_3_branch_3

I thought reshape wide sales competitor competitor_location competitor_branch_1 competitor_branch_2, i(firm) j(month_year)

Upvotes: 0

Views: 688

Answers (1)

Roberto Ferrer
Roberto Ferrer

Reputation: 11112

Most of the code is just setting up the example data (however inefficient that may be). The encodes I believe are not necessary, but recommended.

The code gives only one observation per firm (as stated in my comment).

clear all
set more off

*----- example data -----

input ///
firm    str7 month_year    sales    str3 competitor   str3 competitor_location  str3 competitor_branch_1       str3 competitor_branch_2
  1       "1_2014"    25          "XYZ"            "US"                     "EEE"                      "RRR"
  1       "2_2014"    21          "XYZ"            "US"                    "FFF"
  1       "2_2014"    21          "ABC"            "UK"                     "GGG"
  21     "1_2009"    11          "LKS"           "UK"                      "AAA"
  21     "1_2009"    11          "AIS"            "UK"                      "BBB"
  21     "1_2009"    11          "AJS"            "US"                      "CCC"
  21     "2_2009"    12          "LKS"            "UK"                      "AAA"
end

encode competitor, gen(comp)
encode competitor_location, gen(comploc)
encode competitor_branch_1, gen(compbr1)
encode competitor_branch_1, gen(compbr2)

gen date = ym( real(substr(month_year,3,.)), real(substr(month_year,1,1)) )
format date %tm

drop competitor* month*

list

*----- what you want ?? -----

bysort firm: gen j = _n // this sorting is not unique

reshape wide date sales comp comploc compbr1 compbr2, i(firm) j(j)

Upvotes: 2

Related Questions