Reputation: 43
I gotta a somewhat simple problem that I'm not being able to grasp correctly.
I have two data frames, the first one containing just dates (every month for a bunch of years), the second one also with dates and some other data, but just the months for which there have been changes in the second variable. Like bellow:
df1 <- data.frame(Dates.1 = seq.Date(as.Date('1999/1/1'), as.Date('2001/5/1'), 'month'))
Dates.2 <- c(seq.Date(as.Date('1999/1/1'), as.Date('2001/5/1'), by = '5 months'))
Vals <- c(10, 20, 15, 44, 70, 50)
df2 <- data.frame(Dates.2, Vals)
What I need to do is to join df1 and df2, associating the corresponding values in "Vals" for each date in df1 which is less than or eaqual to the dates in df2. The output should be as bellow (I want to find a way to do it in a vectorized fashion):
df3 <- cbind(df1,Vals3. = c(10,10,10,10,10,20,20,20,20,20,15,15,15,15,15,
44,44,44,44,44,70,70,70,70,70,50,50,50,50))
I've tried using the dplyr's joins and the fuzzyjoin package, but I couldn't manage to get it properly (I'm a beginner in R). Of course, if anyone can come up with a solution using these packages I'll be more than glad. Tks!
Upvotes: 2
Views: 138
Reputation: 7592
A combination of dplyr
and tidyr
:
dplyr::left_join(df1,df2,by=c(Dates.1="Dates.2")) %>%
tidyr::fill(Vals,.direction="down")
Result:
Dates.1 Vals
1 1999-01-01 10
2 1999-02-01 10
3 1999-03-01 10
4 1999-04-01 10
5 1999-05-01 10
6 1999-06-01 20
7 1999-07-01 20
8 1999-08-01 20
9 1999-09-01 20
10 1999-10-01 20
(...)
An alternative, by the way, will be to skip creating df1
in the first place by using complete
(from tidyr
):
tidyr::complete(df2,Dates.2=seq.Date(as.Date('1999/1/1'), as.Date('2001/5/1'), by = 'month')) %>%
tidyr::fill(Vals,.direction="down")
This will give the same result.
Upvotes: 3