user68
user68

Reputation: 3

Generate a lead variable in a panel

I want to generate a lead income variable V1 by CMA, Province and Year as follows:

CMA  Province  Year  Income  V1
TO      A      1990   $5     $8
TO      A      1991   $8     .
TO      A      1991   $8     .
MT      A      1990   $12    $7
MT      A      1991   $7     .
MT      A      1991   $7     .
AB      B      1990   $5     $15
AB      B      1991   $15    .
AB      B      1991   $15    .

I am using the following code, but I am getting only missing values:

gsort CMA Province -Year Income
gen ld_inc = Income[_n-1] if Year == Year[_n-1] + 1
replace ld_inc = ld_inc[_n-1] if Year == Year[_n-1] & missing(ld_inc)

Is there something wrong with my code and what I am getting wrong?

Upvotes: 0

Views: 920

Answers (1)

user8682794
user8682794

Reputation:

The following works for your example:

clear

input str2 CMA str2 Province Year Income V1
"TO"  "A" 1990  5  8
"."   "." 1991  8  .
"MT"  "A" 1990 12  7
"."   "." 1991  7  .
"AB"  "B" 1990  5 15
"."   "." 1990 15  .
end

generate V2 = Income[_n+1] if CMA != "."

list, separator(0)

     +------------------------------------------+
     | CMA   Province   Year   Income   V1   V2 |
     |------------------------------------------|
  1. |  TO          A   1990        5    8    8 |
  2. |   .          .   1991        8    .    . |
  3. |  MT          A   1990       12    7    7 |
  4. |   .          .   1991        7    .    . |
  5. |  AB          B   1990        5   15   15 |
  6. |   .          .   1991       15    .    . |
     +------------------------------------------+

However, normally it should be done using an identifier as follows:

egen id = seq(), block(2)
bysort id (Year Province): generate V3 = Income[_n+1]

order id
list, separator(0)

     +----------------------------------------------------+
     | id   CMA   Province   Year   Income   V1   V2   V3 |
     |----------------------------------------------------|
  1. |  1    TO          A   1990        5    8    8    8 |
  2. |  1     .          .   1991        8    .    .    . |
  3. |  2    MT          A   1990       12    7    7    7 |
  4. |  2     .          .   1991        7    .    .    . |
  5. |  3    AB          B   1990        5   15   15   15 |
  6. |  3     .          .   1991       15    .    .    . |
     +----------------------------------------------------+

EDIT:

Using your new example, things are even more straightforward:

clear

input str2 CMA  str1 Province  Year  Income  V1
TO      A      1990   5     8
TO      A      1991   8     .
TO      A      1991   8     .
MT      A      1990   12    7
MT      A      1991   7     .
MT      A      1991   7     .
AB      B      1990   5     15
AB      B      1991   15    .
AB      B      1991   15    .
end

bysort CMA (Year Province): generate V2 = Income[_n+1] if _n == 1

list, sepby(CMA)

     +------------------------------------------+
     | CMA   Province   Year   Income   V1   V2 |
     |------------------------------------------|
  1. |  AB          B   1990        5   15   15 |
  2. |  AB          B   1991       15    .    . |
  3. |  AB          B   1991       15    .    . |
     |------------------------------------------|
  4. |  MT          A   1990       12    7    7 |
  5. |  MT          A   1991        7    .    . |
  6. |  MT          A   1991        7    .    . |
     |------------------------------------------|
  7. |  TO          A   1990        5    8    8 |
  8. |  TO          A   1991        8    .    . |
  9. |  TO          A   1991        8    .    . |
     +------------------------------------------+

Upvotes: 1

Related Questions