Reputation: 13
I have panel data in which birthyear is missing for some observations. Since birthyear is constant within each ID across waves, I want to fill in the blanks automatically. I don't know of a command for this, and otherwise I would have to do it manually.
Here is an example:
     +--------------------------------+
     |           ID   wave   birthy~r |
     |--------------------------------|
  1. | 010104101001      1       1965 |
  2. | 010104101001      2       1965 |
  3. | 010104101001      3       1965 |
  4. | 010104101001      4       1965 |
  5. | 010104101002      1          . |
     |--------------------------------|
  6. | 010104101002      2          . |
  7. | 010104101002      3       1963 |
  8. | 010104101002      4       1963 |
  9. | 010104102001      1       1954 |
 10. | 010104102001      2          . |
     +--------------------------------+
In this case I want to automatically replace the missing birthyear values in lines 5 and 6 with the value from line 7 or 8, and copy the birthyear value from line 9 into line 10.
Upvotes: 0
Views: 965
Reputation: 37358
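* Smallest and largest nonmissing birth year within each ID
bysort ID : egen min = min(birthyear)
by ID: egen max = max(birthyear)
* Show any IDs with contradictory birth years
list if min != max
* Missing values sort last, so birthyear[1] is a nonmissing value whenever one exists
bysort ID (birthyear) : replace birthyear = birthyear[1] if max == min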
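Most of this code is just checking that there isn't contradictory information on birth year within an ID, and making sure any such contradictory values are not overwritten. The actual filling-in happens in the last line: sorting on birthyear within ID puts missing values last, so birthyear[1] is a nonmissing value whenever the ID has one.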
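If you would rather have Stata stop with an error than list contradictions, a shorter variant might look like the sketch below. It rebuilds the example data from the question so it can be run on its own. Storing ID as a str12 string is my assumption (the leading zero suggests it); the rest uses the variable names from the question.

```stata
* Rebuild the example data from the question (ID stored as string: an assumption)
clear
input str12 ID byte wave int birthyear
"010104101001" 1 1965
"010104101001" 2 1965
"010104101001" 3 1965
"010104101001" 4 1965
"010104101002" 1 .
"010104101002" 2 .
"010104101002" 3 1963
"010104101002" 4 1963
"010104102001" 1 1954
"010104102001" 2 .
end

* Missing values sort last, so birthyear[1] is the smallest nonmissing value, if any.
* Stop with an error if any ID has two different nonmissing birth years.
bysort ID (birthyear) : assert birthyear == birthyear[1] | missing(birthyear)

* Fill in only the missing values, leaving existing ones untouched
by ID : replace birthyear = birthyear[1] if missing(birthyear)

* Restore panel order and inspect the result
sort ID wave
list ID wave birthyear, sepby(ID)
```

On the example data this should fill 1963 into the two missing waves of 010104101002 and 1954 into wave 2 of 010104102001. An ID with no birth year in any wave stays missing.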
Upvotes: 1