user6158130
user6158130

Reputation: 5

(Stata) Find values from earlier years with multiple observations within a year

I'm using Stata. I have a dataset of multiple firms and their banks within a given year for multiple years. Since firms often have more than one bank there's multiple observations for a firm-year. I have a variable "bank_exityear" which contains the last year a bank is in the sample. I would like to create a variable that for each firm within a given year contains the minimum of "bank_exityear" from the previous year (and for the same firm).

An example data-set is attached here:

enter image description here

The variable I'd like to create is the bold "want". The data starts in 2008.

What would be the best way to create this variable?

Upvotes: 0

Views: 324

Answers (3)

Robert Picard
Robert Picard

Reputation: 1051

Here's a solution using rangestat (from SSC). To install it, type in Stata's Command window:

ssc install rangestat

For the problem at hand, this requires finding the minimum bank_exityear across all observations of the same firmid whose year is one less than the year of the current observation:

clear
input year firmid bankid bank_exityear want
2008 1 1 2008 .
2008 1 2 2015 .
2009 1 2 2015 2008
2009 1 3 2015 2008
2010 1 2 2015 2015
2010 1 3 2015 2105
end

rangestat (min) bank_exityear, interval(year -1 -1) by(firmid)
list

and the results:

. list, sepby(firmid)

     +-----------------------------------------------------+
     | year   firmid   bankid   bank_e~r   want   bank_e~n |
     |-----------------------------------------------------|
  1. | 2008        1        1       2008      .          . |
  2. | 2008        1        2       2015      .          . |
  3. | 2009        1        2       2015   2008       2008 |
  4. | 2009        1        3       2015   2008       2008 |
  5. | 2010        1        2       2015   2015       2015 |
  6. | 2010        1        3       2015   2105       2015 |
     +-----------------------------------------------------+

Upvotes: 2

dimitriy
dimitriy

Reputation: 9470

This sort of strategy might do the trick:

clear
input year firmid bankid bank_exityear want
2008 1 1 2008 .
2008 1 2 2015 .
2009 1 2 2015 2008
2009 1 3 2015 2008
2010 1 2 2015 2015
2010 1 3 2015 2105
end

tempfile min_year
preserve
    collapse (min) want2 = bank_exityear, by(firmid year)
    save `min_year' 
restore

replace year = year - 1
merge m:1 firmid year using "`min_year'", nogen keep(master match)
replace year = year + 1

This assumes that there are no gaps in year.

Upvotes: 2

ℕʘʘḆḽḘ
ℕʘʘḆḽḘ

Reputation: 19405

your question is a little bit unclear but I believe some combination of

bysort bank_id (year) : gen lag_exit = bank_exit_year[_n-1]
bysort bank_id : egen min_var = min(lag_exit )

should work

Upvotes: 0

Related Questions