Duna
Duna

Reputation: 735

Computing rate of worker turnover in SAS

Using a SAS dataset similar to the following:

firm  worker  year
------------------
AA    John   2000
AA    Dani   2000
------------------
AA    John   2001
AA    Dani   2001
AA    Robi   2001
AA    Kate   2001
-----------------
AA    Robi   2002
AA    Niki   2002
AA    Theo   2002
AA    Lucy   2002
AA    Kim    2002

how can I compute the rate of worker turnover at the firm level noting that the data is collected on 31 Dec. each year)?

The rate of worker turnover in year t is the ratio of x and y where:

I want to get:

firm   worker year  old new y   x   rate
-------------------------------------------
AA     John   2000  .   .   .   .   .
AA     Dani   2000  .   .   .   .   .
-------------------------------------------
AA     John   2001  2   2   4   0   0
AA     Dani   2001  2   2   4   0   0
AA     Robi   2001  2   2   4   0   0
AA     Kate   2001  2   2   4   0   0
------------------------------------------
AA     Robi   2002  4   4   8   3   0.375
AA     Niki   2002  4   4   8   3   0.375
AA     Theo   2002  4   4   8   3   0.375
AA     Lucy   2002  4   4   8   3   0.375
AA     Kim    2002  4   4   8   3   0.375

Upvotes: 0

Views: 100

Answers (1)

DWal
DWal

Reputation: 2762

If you want to make this table with proc sql, start with a full outer join of one year to the following year matching the names:

proc sql;
create table want as
select
  *
from
  have t1
  full outer join have t2
  on t1.year=t2.year+1
  and t1.worker=t2.worker
;
quit;

Form here, you can pretty much just look at the results, calculate each of your variables based on what you see, and limit to the rows and columns you need.

firm    worker  year    firm    worker  year
AA      Dani    2000                    .
AA      John    2000                    .
AA      Dani    2001    AA      Dani    2000
AA      John    2001    AA      John    2000
AA      Kate    2001                    .
AA      Robi    2001                    .
                .       AA      Dani    2001
                .       AA      John    2001
                .       AA      Kate    2001
AA      Kim     2002                    .
AA      Lucy    2002                    .
AA      Niki    2002                    .
AA      Robi    2002    AA      Robi    2001
AA      Theo    2002                    .
                .       AA      Kim     2002
                .       AA      Lucy    2002
                .       AA      Niki    2002
                .       AA      Robi    2002
                .       AA      Theo    2002

The first two columns are just combined between the two tables. The last five columns use aggregate functions while grouping by the year to calculate the totals for each year. The having clause restricts the rows reported to the rows in the original (t1) table.

proc sql;
select
  coalesce(t1.firm,t2.firm) as firm,
  coalesce(t1.year,t2.year+1) as year,
  t1.worker,
  sum(t2.worker is not null) as old,
  sum(t1.worker is not null and t2.worker is null) as new,
  calculated old + calculated new as y,
  sum(t2.worker is not null and t1.worker is null) as x,
  calculated x/calculated y as rate
from
  have t1
  full outer join have t2
  on t1.year=t2.year+1
  and t1.worker=t2.worker
group by
  calculated year
having
  worker is not null
;
quit;

Upvotes: 1

Related Questions