katdataecon
katdataecon

Reputation: 185

How to merge by date on R with a period between two years?

So I have this first dataset

Company Director Dir_Date
AB Alexander 2014
AB Justin 2020

We then understand that between 2014 and 2019, the director of AB was Alexander.

My second dataset is like that

Company Results Date
AB Good 2014
AB Good 2015
AB Bad 2016

And so on for each year. I'd like to obtain this output by merging by company and date :

Company Results Date Director
AB Good 2014 Alexander
AB Good 2015 Alexander
AB Bad 2016 Alexander

Upvotes: 0

Views: 101

Answers (1)

AnilGoyal
AnilGoyal

Reputation: 26238

On an extended sample, showing how to hardcode the other possible year values

df_m <- read.table(text = 'Company  Director    Dir_Date
AB  Alexander   2014
AB  Justin  2020', header = T)

df <- read.table(text = ' Company   Results Date
AB  Good    2014
AB  Good    2015
AB  Bad 2016
AB Something  2021', header = T)

library(tidyverse)
df %>% left_join(df_m %>% group_by(Company) %>% complete(Dir_Date = 2000:2021) %>%
                   fill(Director, .direction = 'down'),
                 by = c('Company' = 'Company', 'Date' = 'Dir_Date'))


  Company   Results Date  Director
1      AB      Good 2014 Alexander
2      AB      Good 2015 Alexander
3      AB       Bad 2016 Alexander
4      AB Something 2021    Justin

OR

df_m <- read.table(text = 'Company  Director    Dir_Date
AB  Alexander   2014
AB  Justin  2020', header = T)

df <- read.table(text = ' Company   Results Date
AB  Good    2012
AB  Good    2015
AB  Bad 2016
AB Something  2021', header = T)

library(tidyverse)
df %>% left_join(df_m %>% group_by(Company) %>% complete(Dir_Date = 2000:2021) %>%
                   fill(Director, .direction = 'down'),
                 by = c('Company' = 'Company', 'Date' = 'Dir_Date'))
#>   Company   Results Date  Director
#> 1      AB      Good 2012      <NA>
#> 2      AB      Good 2015 Alexander
#> 3      AB       Bad 2016 Alexander
#> 4      AB Something 2021    Justin

Created on 2021-05-20 by the reprex package (v2.0.0)


df_m <- read.table(text = 'Company  Director    Dir_Date
AB  Alexander   2014
AB  Justin  2020', header = T)

df <- read.table(text = ' Company   Results Date
AB  Good    2014
AB  Good    2015
AB  Bad 2016', header = T)

library(tidyverse)
df %>% left_join(df_m %>% group_by(Company) %>% complete(Dir_Date = seq(min(Dir_Date), max(Dir_Date), 1)) %>%
                   fill(Director, .direction = 'down'),
                 by = c('Company' = 'Company', 'Date' = 'Dir_Date'))
#>   Company Results Date  Director
#> 1      AB    Good 2014 Alexander
#> 2      AB    Good 2015 Alexander
#> 3      AB     Bad 2016 Alexander

Created on 2021-05-20 by the reprex package (v2.0.0)

Upvotes: 2

Related Questions