Reputation: 3
We currently have made a column called Change with 0 at certain rows, indicating the change of a new manager within a firm .
Firm | Manager | Quarter | Change |
---|---|---|---|
1 | 1 | 31.12.2018 | |
1 | 1 | 31.03.2019 | |
1 | 1 | 30.06.2019 | |
1 | 1 | 30.09.2019 | |
1 | 2 | 31.12.2019 | 0 |
1 | 2 | 31.03.2020 | |
1 | 2 | 30.06.2020 | |
1 | 2 | 30.09.2020 |
Is there some type of function in R that can fill out the blank rows, as depicted in the table below? The length of how long the different manager works within a certain firm varies.
Firm | Manager | Quarter | Eventdate |
---|---|---|---|
1 | 1 | 31.12.2018 | -4 |
1 | 1 | 31.03.2019 | -3 |
1 | 1 | 30.06.2019 | -2 |
1 | 1 | 30.09.2019 | -1 |
1 | 2 | 31.12.2019 | 0 |
1 | 2 | 31.03.2020 | 1 |
1 | 2 | 30.06.2020 | 2 |
1 | 2 | 30.09.2020 | 3 |
Apologies for the poor explanation. Any help would be extremely appreciated.
Upvotes: 0
Views: 67
Reputation: 146164
Using Wimpel's nicely copy/pasteable sample data, here are dplyr
and data.table
solutions based on subtracting row numbers by group. It should work when there is a single row where Change
is 0 per Firm
.
DT[, Eventdate := .I - which(Change == 0), by = .(Firm)]
DT
# Firm Manager Quarter Change Eventdate
# 1: 1 1 31.12.2018 NA -4
# 2: 1 1 31.03.2019 NA -3
# 3: 1 1 30.06.2019 NA -2
# 4: 1 1 30.09.2019 NA -1
# 5: 1 2 31.12.2019 0 0
# 6: 1 2 31.03.2020 NA 1
# 7: 1 2 30.06.2020 NA 2
# 8: 1 2 30.09.2020 NA 3
library(dplyr)
DT %>% group_by(Firm) %>%
mutate(Eventdate = row_number() - which(Change == 0))
# # A tibble: 8 x 5
# # Groups: Firm [1]
# Firm Manager Quarter Change Eventdate
# <int> <int> <chr> <int> <int>
# 1 1 1 31.12.2018 NA -4
# 2 1 1 31.03.2019 NA -3
# 3 1 1 30.06.2019 NA -2
# 4 1 1 30.09.2019 NA -1
# 5 1 2 31.12.2019 0 0
# 6 1 2 31.03.2020 NA 1
# 7 1 2 30.06.2020 NA 2
# 8 1 2 30.09.2020 NA 3
Upvotes: 1
Reputation: 27792
Here is a data.table
approach using a rolling join.. It seems to work fine on the sample data provided. change2
is the column you are looking for.
library( data.table )
DT <- fread("Firm Manager Quarter Change
1 1 31.12.2018
1 1 31.03.2019
1 1 30.06.2019
1 1 30.09.2019
1 2 31.12.2019 0
1 2 31.03.2020
1 2 30.06.2020
1 2 30.09.2020 ")
#add id column
DT[, id := .I ]
#set key to join on
setkey( DT, id )
#perform rolling join to nearest Change == 0, get id and subtract
DT[, change2 := id - DT[Change == 0, ][DT, x.id, roll = "nearest" ] ]
# Firm Manager Quarter Change id change2
# 1: 1 1 31.12.2018 NA 1 -4
# 2: 1 1 31.03.2019 NA 2 -3
# 3: 1 1 30.06.2019 NA 3 -2
# 4: 1 1 30.09.2019 NA 4 -1
# 5: 1 2 31.12.2019 0 5 0
# 6: 1 2 31.03.2020 NA 6 1
# 7: 1 2 30.06.2020 NA 7 2
# 8: 1 2 30.09.2020 NA 8 3
I left in the temporary columns, so you can see what is going on here.
Upvotes: 1