Reputation: 13
I have a Stata dataset of housing transactions, arranged so that each row is one year in the holding period of a transaction. I want to model the probability that a house is sold in a given year with a probit model. The dummy indicates whether the house was sold in that year (1 = sold).
Now I want to add a variable that contains the total holding period of each transaction. This is an example of what I have now:
dummy | year bought | current year |
---|---|---|
0 | 1620 | 1621 |
0 | 1620 | 1622 |
0 | 1620 | 1623 |
1 | 1620 | 1624 |
0 | 1622 | 1623 |
0 | 1622 | 1624 |
0 | 1622 | 1625 |
0 | 1622 | 1626 |
0 | 1622 | 1627 |
1 | 1622 | 1628 |
And this is what I need it to become:
dummy | year bought | current year | holding period |
---|---|---|---|
0 | 1620 | 1621 | 4 |
0 | 1620 | 1622 | 4 |
0 | 1620 | 1623 | 4 |
1 | 1620 | 1624 | 4 |
0 | 1622 | 1623 | 6 |
0 | 1622 | 1624 | 6 |
0 | 1622 | 1625 | 6 |
0 | 1622 | 1626 | 6 |
0 | 1622 | 1627 | 6 |
1 | 1622 | 1628 | 6 |
Upvotes: 0
Views: 41
Reputation: 9858
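Assuming you have some kind of id variable for each house (called house_id here, with the year variables named year_bought and current_year), take the last observed year per house and subtract the purchase year:
egen sold_year = max(current_year), by(house_id)
gen holding_period = sold_year - year_bought
If the same house can appear in more than one transaction, group by the transaction instead, for example by(house_id year_bought).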
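As a rough check, the sketch below rebuilds the sample data from the question and applies the same two steps. The house_id values (1 and 2) are made up for illustration, since the question does not show an identifier, and the variable names with underscores are assumptions as well. It groups by house_id and year_bought so that repeated sales of one house would be treated as separate transactions.

```stata
* Rebuild the question's sample data; house_id is a hypothetical identifier
clear
input dummy year_bought current_year house_id
0 1620 1621 1
0 1620 1622 1
0 1620 1623 1
1 1620 1624 1
0 1622 1623 2
0 1622 1624 2
0 1622 1625 2
0 1622 1626 2
0 1622 1627 2
1 1622 1628 2
end

* Last observed year per transaction, then subtract the purchase year
egen sold_year = max(current_year), by(house_id year_bought)
gen holding_period = sold_year - year_bought

* Variant: use only rows where the sale dummy is 1, so a transaction
* that is never sold in the data gets a missing sale year
egen sale_year = max(cond(dummy == 1, current_year, .)), by(house_id year_bought)

* Check against the table the question asks for
assert holding_period == 4 if year_bought == 1620
assert holding_period == 6 if year_bought == 1622

list, sepby(house_id)
```

In this sample every transaction ends in a sale, so sold_year and sale_year agree. They would differ only for transactions still unsold at the end of the data, where max(current_year) returns the last observed year rather than a sale year.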
Upvotes: 2