Adi Jahic

Reputation: 15

Reorganizing data so that each customer review becomes its own observation

I am working in Stata with a dataset on electric vehicle charging stations. Variables include

station_name: name of the charging station

review_text: all of the customer reviews for a specific station, delimited by }{

num_reviews: number of customer reviews

I'm trying to make a new file where each observation represents one customer review, stored in a new variable customer_review, with another variable station_id holding the name of the corresponding station. So, if the original dataset had 100 observations (one per station) with 5 reviews each, the new file should have 500 observations.

How can I do this? I would include some code I have tried but I have no idea how to start.

Upvotes: 0

Views: 71

Answers (1)

langtang

Reputation: 24722

If your data look like this:

       station              reviews   n  
  1.         1   {good}{bad}{great}   3  
  2.         2    {poor}{excellent}   2  

Then the following:

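* Split reviews at each "}{" boundary into reviews1, reviews2, ...
split reviews, parse("}{")
* Drop the original string and the count so only the split pieces remain
drop reviews n
* One row per station-review pair; review_num indexes reviews within a station
reshape long reviews, i(station) j(review_num)
* Stations with fewer reviews than the maximum leave empty rows; remove them
drop if reviews==""
* Strip the leftover leading "{" and trailing "}"
replace reviews = subinstr(reviews, "}","",.)
replace reviews = subinstr(reviews, "{","",.)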

will produce:

       station   review~m     reviews  
  1.         1          1        good  
  2.         1          2         bad  
  3.         1          3       great  
  4.         2          1        poor  
  5.         2          2   excellent  
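
With the variable names from the question, the same approach might look like the sketch below. It assumes station_name uniquely identifies each row (reshape requires this of the i() variable) and that the review text itself never contains braces. The name review_num is introduced here only for illustration.

```stata
* Sketch only: assumes one row per station and a unique station_name
* Split review_text at each "}{" into customer_review1, customer_review2, ...
split review_text, parse("}{") generate(customer_review)
drop review_text num_reviews

* Go long: one row per station-review pair (review_num is a hypothetical name)
reshape long customer_review, i(station_name) j(review_num)

* Remove the empty rows left by stations with fewer reviews than the maximum
drop if customer_review == ""

* Strip the remaining outer braces
replace customer_review = subinstr(customer_review, "{", "", .)
replace customer_review = subinstr(customer_review, "}", "", .)

* The question asks for the station name under station_id
rename station_name station_id
```

If station_name is not unique, one option is to create a numeric identifier first (for example with `generate id = _n`) and use that in i() instead.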

Upvotes: 2
