bretauv
bretauv

Reputation: 8557

Generate all id-time pairs when merging two dataframes

I would like to merge two dataframes together based on the columns id and time. I think an example is more efficient to show what I want to do:

library(dplyr)

df1 <- tibble(id = c("a", "a", "b", "b", "c", "c"),
              time = c(1, 2, 1, 2, 1, 2),
              val1 = c(2000:2005))

  id     time  val1
  <chr> <dbl> <int>
1 a         1  2000
2 a         2  2001
3 b         1  2002
4 b         2  2003
5 c         1  2004
6 c         2  2005

df2 <- tibble(id = c("a", "a", "b", "b", "c", "c"),
              time = c(1, 2, 1, 2, 2, 3),
              val2 = c(2000:2005))

  id     time  val2
  <chr> <dbl> <int>
1 a         1  2000
2 a         2  2001
3 b         1  2002
4 b         2  2003
5 c         2  2004
6 c         3  2005
  id     time  val1  val2
  <chr> <dbl> <dbl> <dbl>
1 a         1  2000  2000
2 a         2  2001  2001
3 a         3    NA    NA
4 b         1  2002  2002
5 b         2  2003  2003
6 b         3    NA    NA
7 c         1  2004    NA
8 c         2  2005  2004
9 c         3    NA  2005

Basically, I want to generate rows with time equal to 3 for each id, not just the one who has one period equal to 3. I would like the solution to be as short as possible (and preferably based on dplyr functions, but the compacity of the code comes first).

Surely this question has already been asked but I can't find a solution.

Any ideas?

Upvotes: 1

Views: 30

Answers (1)

akrun
akrun

Reputation: 886978

After a full_join, we can use complete

library(dplyr)
library(tidyr)
full_join(df1, df2,, by = c('id', 'time')) %>% 
       complete(id, time)
# A tibble: 9 x 4
#  id     time  val1  val2
#  <chr> <dbl> <int> <int>
#1 a         1  2000  2000
#2 a         2  2001  2001
#3 a         3    NA    NA
#4 b         1  2002  2002
#5 b         2  2003  2003
#6 b         3    NA    NA
#7 c         1  2004    NA
#8 c         2  2005  2004
#9 c         3    NA  2005

Upvotes: 2

Related Questions