Jorge Paredes

Reputation: 1078

How to graph a lagged variable in a panel data r

I have a dataset of firms (an unbalanced panel) that looks like this:

id   year   tfp    c_sales    
A    2012    1.52   14.56
A    2013    1.82   15.6
A    2014    1.67   16.3
A    2015    1.72   18.36
...   ...    ...    ...
B    2012    1.58   17.56
B    2013    1.83   12.6
B    2014    1.62   19.3
B    2015    1.96   14.36
...   ...    ...    ... 
C    2012    1.2   13.4
C    2013    1.6   16.3
...   ...    ...    ...

And so on, up to 2019.

How can I plot tfp from 2014 vs c_sales in 2015?

I want a scatter plot that shows the tfp values for 2014 on the horizontal axis and the c_sales values for 2015 on the vertical axis.

Since tfp is a measure of productivity, I'd like the scatter plot to show whether firms that were productive in 2014 had higher or lower sales in 2015.

I was trying to make a plot with ggplot, but I don't have a clear idea of how to do it.

(Additionally, how can I run a regression like that, with a year-fixed independent variable?)

Upvotes: 0

Views: 208

Answers (1)

Marek Fiołka

Reputation: 4949

You can do it like this:

(Although the data would be really useful!)

library(tidyverse)

df=tribble(
~id, ~year, ~tfp, ~c_sales, 
"A", 2012, 1.52, 14.56, 
"A", 2013, 1.82, 15.6, 
"A", 2014, 1.67, 16.3, 
"A", 2015, 1.72, 18.36, 
"B", 2012, 1.58, 17.56, 
"B", 2013, 1.83, 12.6, 
"B", 2014, 1.62, 19.3, 
"B", 2015, 1.96, 14.36, 
"C", 2012, 1.2, 13.4, 
"C", 2013, 1.6, 16.3, 
"C", 2014, 1.7, 17.3, 
"C", 2015, 1.82, 20.33
) 

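# For one firm's rows, pair tfp in xYear with c_sales in yYear.
# group_modify() passes the group key as the second argument; it is unused here.
f = function(data, group, xYear, yYear) {
  tibble(
    xYear = xYear,
    yYear = yYear,
    tfp = data %>% filter(year == xYear) %>% pull(tfp),
    c_sales = data %>% filter(year == yYear) %>% pull(c_sales)
  )
}

# Apply f to each firm separately
df = df %>% 
  group_by(id) %>% 
  group_modify(f, xYear = 2014, yYear = 2015)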

df

Output:

# A tibble: 3 x 5
# Groups:   id [3]
  id    xYear yYear   tfp c_sales
  <chr> <dbl> <dbl> <dbl>   <dbl>
1 A      2014  2015  1.67    18.4
2 B      2014  2015  1.62    14.4
3 C      2014  2015  1.7     20.3
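Since the question mentions an unbalanced panel, here is a rough alternative sketch that builds the lagged pairs with a self-join instead of a per-firm function. The names `panel`, `lagged_tfp`, `pairs` and `tfp_lag` are made up for illustration, and the data is a cut-down version of the example above in which firm C is assumed to have no 2015 row.

```r
library(dplyr)

# Small unbalanced example (assumption: firm C has no 2015 row)
panel <- data.frame(
  id      = c("A", "A", "B", "B", "C"),
  year    = c(2014, 2015, 2014, 2015, 2014),
  tfp     = c(1.67, 1.72, 1.62, 1.96, 1.70),
  c_sales = c(16.3, 18.36, 19.3, 14.36, 17.3)
)

# Shift tfp forward one year so it lines up with the following year's sales
lagged_tfp <- panel %>%
  transmute(id, year = year + 1, tfp_lag = tfp)

# inner_join keeps only firm-years that have both c_sales and last year's tfp
pairs <- panel %>%
  select(id, year, c_sales) %>%
  inner_join(lagged_tfp, by = c("id", "year"))

pairs
```

Joining on `year + 1` rather than using `lag()` on sorted rows avoids pairing non-consecutive years when a firm has gaps. Filtering `pairs` to `year == 2015` should give the same 2014 vs 2015 pairs as the function-based approach, for the firms that have both years.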

Next, plot the result:

df %>% ggplot(aes(tfp, c_sales))+
  geom_point()

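For the regression part of the question, one possible reading is a simple linear regression of 2015 c_sales on 2014 tfp across firms. The sketch below assumes that reading and uses the three firm-level pairs from the example data; `pairs_2014_2015` and `fit` are illustrative names, and three points are far too few for any real inference.

```r
# Firm-level pairs from the example data: tfp in 2014, c_sales in 2015
pairs_2014_2015 <- data.frame(
  id      = c("A", "B", "C"),
  tfp     = c(1.67, 1.62, 1.70),
  c_sales = c(18.36, 14.36, 20.33)
)

# Regress next year's sales on this year's productivity
fit <- lm(c_sales ~ tfp, data = pairs_2014_2015)

# Intercept and slope of the fitted line
coef(fit)
```

With the full dataset, the same `lm()` call on the reshaped data frame should work, and adding `geom_smooth(method = "lm")` to the ggplot call would draw the fitted line over the scatter plot.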

Upvotes: 2
