Reputation: 8557
I would like to merge two dataframes together based on the columns id
and time
. I think an example is more efficient to show what I want to do:
library(dplyr)
df1 <- tibble(id = c("a", "a", "b", "b", "c", "c"),
time = c(1, 2, 1, 2, 1, 2),
val1 = c(2000:2005))
id time val1
<chr> <dbl> <int>
1 a 1 2000
2 a 2 2001
3 b 1 2002
4 b 2 2003
5 c 1 2004
6 c 2 2005
df2 <- tibble(id = c("a", "a", "b", "b", "c", "c"),
time = c(1, 2, 1, 2, 2, 3),
val2 = c(2000:2005))
id time val2
<chr> <dbl> <int>
1 a 1 2000
2 a 2 2001
3 b 1 2002
4 b 2 2003
5 c 2 2004
6 c 3 2005
id time val1 val2
<chr> <dbl> <dbl> <dbl>
1 a 1 2000 2000
2 a 2 2001 2001
3 a 3 NA NA
4 b 1 2002 2002
5 b 2 2003 2003
6 b 3 NA NA
7 c 1 2004 NA
8 c 2 2005 2004
9 c 3 NA 2005
Basically, I want to generate rows with time
equal to 3 for each id
, not just the one who has one period equal to 3. I would like the solution to be as short as possible (and preferably based on dplyr
functions, but the compacity of the code comes first).
Surely this question has already been asked but I can't find a solution.
Any ideas?
Upvotes: 1
Views: 30
Reputation: 886978
After a full_join
, we can use complete
library(dplyr)
library(tidyr)
full_join(df1, df2,, by = c('id', 'time')) %>%
complete(id, time)
# A tibble: 9 x 4
# id time val1 val2
# <chr> <dbl> <int> <int>
#1 a 1 2000 2000
#2 a 2 2001 2001
#3 a 3 NA NA
#4 b 1 2002 2002
#5 b 2 2003 2003
#6 b 3 NA NA
#7 c 1 2004 NA
#8 c 2 2005 2004
#9 c 3 NA 2005
Upvotes: 2