Reputation: 11
I have a panel data ranging from year 1917 to 1922 with various variables (for example Leverage) for 200 firms. It looks something like this:
I want to copy the values for 1917 to all other years (1918, 1919,...) for the variables per FirmID. As in my example, not all years are present (so I cannot say that the value is there every X row). The result must be something like:
Upvotes: 0
Views: 31
Reputation: 37278
Borrowing @Cybernike's helpful data example, here are two ways to do it in one line
clear
input year id leverage
1917 1 0.1
1918 1 0.2
1919 1 0.3
1917 2 0.4
1918 2 0.5
1917 3 0.6
1918 3 0.7
1919 3 0.8
1920 3 0.9
end
egen wanted1 = mean(cond(year == 1917, leverage, .)), by(id)
egen wanted2 = mean(leverage / (year == 1917)), by(id)
list, sepby(id)
+------------------------------------------+
| year id leverage wanted1 wanted2 |
|------------------------------------------|
1. | 1917 1 .1 .1 .1 |
2. | 1918 1 .2 .1 .1 |
3. | 1919 1 .3 .1 .1 |
|------------------------------------------|
4. | 1917 2 .4 .4 .4 |
5. | 1918 2 .5 .4 .4 |
|------------------------------------------|
6. | 1917 3 .6 .6 .6 |
7. | 1918 3 .7 .6 .6 |
8. | 1919 3 .8 .6 .6 |
9. | 1920 3 .9 .6 .6 |
+------------------------------------------+
For detailed discussion of both methods, see Sections 9 and 10 of this paper.
I don't overwrite the original data, contrary to your request. Often you decide later that you need them after all, or someone asks to see them.
This isn't necessarily better than the solution of @Cybernike. The division method behind wanted2
has struck some experienced users as too tricksy, and I tend to recommend the cond()
device behind wanted1
.
Upvotes: 1
Reputation: 1103
The following works for me:
clear
input year id leverage
1917 1 0.1
1918 1 0.2
1919 1 0.3
1917 2 0.4
1918 2 0.5
1917 3 0.6
1918 3 0.7
1919 3 0.8
1920 3 0.9
end
gen leverage1917 = leverage if year == 1917
bysort id: egen min = min(leverage1917)
replace leverage = min
drop min leverage1917
. list, sepby(id)
+----------------------+
| year id leverage |
|----------------------|
1. | 1917 1 .1 |
2. | 1918 1 .1 |
3. | 1919 1 .1 |
|----------------------|
4. | 1917 2 .4 |
5. | 1918 2 .4 |
|----------------------|
6. | 1917 3 .6 |
7. | 1918 3 .6 |
8. | 1919 3 .6 |
9. | 1920 3 .6 |
+----------------------+
EDIT NJC
This could be simplified to
generate leverage1917 = leverage if year == 1917
bysort id (leverage1917) : replace leverage1917 = leverage1917[1]
thus cutting out the egen
call and the generation of another variable you then need to drop
. This works properly even if there is no value for 1917 for some values of id
.
Upvotes: 1