Reputation: 107
I have two different dataframes:
df1 <- tibble(group = c(rep(1, 3), rep(2, 4), rep(1, 3)),
id = paste0("minutesPrompt", c(1, 2, 3, 1, 2, 3, 4, 1, 2, 3)),
number = c(rep("a", 3), rep("b", 4), rep("c", 3)),
minutesPrompt = c(1, 2, 4, 9, 18, 27, 36, 2, 3, 5),
timestamp = rep("xxxxxx", 10),
text1 = c("String", rep(NA_character_, 6), rep("String", 3)),
text2 = c(NA_character_, "String", rep(NA_character_, 5), "String", rep(NA_character_, 2)),
text3 = c(rep(NA_character_, 2), "String", rep(NA_character_, 7)))
df2 <- tibble(group = rep(2, 7),
id = paste0("minutesPrompt", c(1, 2, 3, 4, 1, 2, 3)),
number = c(rep("b", 4), rep("x", 3)),
minutesPrompt = NA,
timestamp = rep("xxxxxx", 7),
text1 = c("String", rep(NA_character_, 6)),
text2 = c(rep(NA_character_, 2), "String", rep(NA_character_, 4)),
text3 = c(NA_character_, "String", rep(NA_character_, 5)))
df1
(first picture) which is really big: This dataframe consists of a lot of variables, and includes the values of 3 different groups. It further has 7 rows for each participant expressed by id.df2
(second picture) contrarily consists only of the variables shown for one group only. The difference between the datasets is also that df1 has some missing values (yellow). The strings that should be transferred into those empty cells are included in df2 (orange).My plan is to conduct a full join so that I can replace the missing information in df1 on "timestamp", "text1", "text2", until "text7" by the provided values of df2. I have tried this:
full_join(df1, df2) %>%
group_by("id", "number")
However this does not replace my missing cells (highlighted in yellow) by the strings in df2.
Upvotes: 1
Views: 472
Reputation: 29109
We can use data.table
package:
library(data.table)
setDT(df1)[setDT(df2), `:=` ( timestamp = i.timestamp,
text1 = i.text1,
text2 = i.text2,
text3 = i.text3 ),
on = .(id, number)][] ## may wanna add `group` column to `on` arguments
#> group id number minutesPrompt timestamp text1 text2 text3
#> 1: 1 minutesPrompt1 a 1 xxxxxx String <NA> <NA>
#> 2: 1 minutesPrompt2 a 2 xxxxxx <NA> String <NA>
#> 3: 1 minutesPrompt3 a 4 xxxxxx <NA> <NA> String
#> 4: 2 minutesPrompt1 b 9 xxxxxx String <NA> <NA>
#> 5: 2 minutesPrompt2 b 18 xxxxxx <NA> <NA> String
#> 6: 2 minutesPrompt3 b 27 xxxxxx <NA> String <NA>
#> 7: 2 minutesPrompt4 b 36 xxxxxx <NA> <NA> <NA>
#> 8: 1 minutesPrompt1 c 2 xxxxxx String String <NA>
#> 9: 1 minutesPrompt2 c 3 xxxxxx String <NA> <NA>
#> 10: 1 minutesPrompt3 c 5 xxxxxx String <NA> <NA>
Upvotes: 1