gitcanzo

Reputation: 129

How to rank observations in panel data?

I have a panel dataset in Stata with several countries, each containing several groups. I would like to rank the groups within each country according to the variable var1.

The structure of my dataset is as follows (the rank column is what I would like to achieve). Note that var1 is constant within groups (it is just the within-group average of another variable).

--country--|--groupId--|---time----|---var1----|---rank---
     1     |     1     |     1     |    50     |    3
     1     |     1     |     2     |    50     |    3
     1     |     1     |     3     |    50     |    3
     1     |     2     |     1     |    90     |    1
     1     |     2     |     2     |    90     |    1
     1     |     2     |     3     |    90     |    1
     1     |     3     |     1     |    60     |    2
     1     |     3     |     2     |    60     |    2
     1     |     3     |     3     |    60     |    2
     2     |     4     |     1     |    15     |    2
     2     |     4     |     2     |    15     |    2
     2     |     4     |     3     |    15     |    2
     2     |     5     |     1     |    10     |    3
     2     |     5     |     2     |    10     |    3
     2     |     5     |     3     |    10     |    3
     2     |     6     |     1     |    80     |    1
     2     |     6     |     2     |    80     |    1
     2     |     6     |     3     |    80     |    1

One of the options I have tried is:

sort country groupId
by country (groupId): egen rank = rank(var1)

However, I cannot achieve the desired result.

Upvotes: 0

Views: 774

Answers (1)

Nick Cox

Reputation: 37208

Thanks for the data example. There are two problems with your code. One is that as you want to rank from highest to lowest, you need to negate the argument to rank(). The second is that given the repetitions, you need to rank on one time only and then copy those ranks to other times.

This works with your data example, here edited to be input code. (See also the Stata tag wiki for that principle.)

clear 
input   country    groupId     time       var1       rank   
     1          1          1         50         3
     1          1          2         50         3
     1          1          3         50         3
     1          2          1         90         1
     1          2          2         90         1
     1          2          3         90         1
     1          3          1         60         2
     1          3          2         60         2
     1          3          3         60         2
     2          4          1         15         2
     2          4          2         15         2
     2          4          3         15         2
     2          5          1         10         3
     2          5          2         10         3
     2          5          3         10         3
     2          6          1         80         1
     2          6          2         80         1
     2          6          3         80         1
end 

bysort country : egen wanted = rank(-var1) if time == 1
bysort country groupId (time) : replace wanted = wanted[1]
assert rank == wanted 
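
If some groups have no observation with time == 1, ranking on that time alone would miss them. A minimal sketch of one possible variation, assuming the same variable names as in the question, tags one observation per group with egen's tag() function instead. The small dataset and the names tag and wanted2 are made up for illustration only.

```stata
clear
* Hypothetical data: group 1 has no observation at time 1
input country groupId time var1
1 1 2 50
1 1 3 50
1 2 1 90
1 2 2 90
1 3 1 60
1 3 2 60
2 4 1 15
2 4 2 15
2 5 1 10
2 5 2 10
end

* Flag exactly one observation for each country-group pair
egen tag = tag(country groupId)

* Rank only the tagged observations; negating var1 ranks the highest value first
bysort country : egen wanted2 = rank(-var1) if tag

* Missing values sort last, so the first observation in each group holds the rank
bysort country groupId (wanted2) : replace wanted2 = wanted2[1]

list, sepby(country groupId)
```

Because var1 is constant within groups, it does not matter which observation in each group is tagged.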

Upvotes: 1
