Reputation: 384
I have an example dataframe that has a column which stores 3 letters of the alphabet per row. The dataframe also has 2 additional columns: date and colour:
Alphabet Date Colour
ABC 2018-09-10 green
DEF 2017-06-11 red
GHI 2016-05-12 blue
JKL NA yellow
MNO NA orange
PQR Unknown brown
Some of the dates are missing / unknown in this dataframe. I have another dataframe which also has a alphabet and a date column. This second dataframe includes the dates for the missing dates in the first dataframe:
Alphabet Date
JKL 2017-06-07
MNO 2018-08-03
PQR 2019-10-07
STU 2019-11-08
VWX 2019-12-08
I want to fill in the missing dates from the first dataframe by matching the Alphabet records from both dataframes and then insert the dates from the second dataframe into the first dataframe.
Desired output:
Alphabet Date Colour
ABC 2018-09-10 green
DEF 2017-06-11 red
GHI 2016-05-12 blue
JKL 2017-06-07 yellow
MNO 2018-08-03 orange
PQR 2019-10-07 brown
Any help is appreciated.
Upvotes: 2
Views: 230
Reputation: 389235
Using dplyr
we can left_join
df1
and df2
and then use coalesce
to fill in missing values.
library(dplyr)
left_join(df1, df2, by = "Alphabet") %>%
mutate(Date = coalesce(Date.y, Date.x)) %>%
select(-Date.x, -Date.y)
# Alphabet Colour Date
#1 ABC green 2018-09-10
#2 DEF red 2017-06-11
#3 GHI blue 2016-05-12
#4 JKL yellow 2017-06-07
#5 MNO orange 2018-08-03
#6 PQR brown 2019-10-07
Upvotes: 1
Reputation: 887851
One option is a join with data.table
library(data.table)
setDT(df1)[df2, Date := i.Date, on = .(Alphabet)]
df1
# Alphabet Date Colour
#1: ABC 2018-09-10 green
#2: DEF 2017-06-11 red
#3: GHI 2016-05-12 blue
#4: JKL 2017-06-07 yellow
#5: MNO 2018-08-03 orange
#6: PQR 2019-10-07 brown
Using the new 'df2n' dataset
i1 <- is.na(df1$Date)|df1$Date %in% "Unknown"
setDT(df1)[df2n[df2n$Alphabet %in% df1$Alphabet[i1],],
Date := i.Date, on = .(Alphabet)]
df1
# Alphabet Date Colour
#1: ABC 2018-09-10 green
#2: DEF 2017-06-11 red
#3: GHI 2016-05-12 blue
#4: JKL 2017-06-07 yellow
#5: MNO 2018-08-03 orange
#6: PQR 2019-10-07 brown
Or using match
from base R
i1 <- match(df2$Alphabet, df1$Alphabet)
df1$Date[i1] <- df2$Date
df1 <- structure(list(Alphabet = c("ABC", "DEF", "GHI", "JKL", "MNO",
"PQR"), Date = c("2018-09-10", "2017-06-11", "2016-05-12", NA,
NA, "Unknown"), Colour = c("green", "red", "blue", "yellow",
"orange", "brown")), class = "data.frame", row.names = c(NA,
-6L))
df2 <- structure(list(Alphabet = c("JKL", "MNO", "PQR"), Date = c("2017-06-07",
"2018-08-03", "2019-10-07")), class = "data.frame", row.names = c(NA,
-3L))
df2a <- structure(list(Alphabet = c("JKL", "MNO", "PQR", "STU", "VWX"
), Date = c("2017-06-07", "2018-08-03", "2019-10-07", "2019-11-08",
"2019-12-08")), class = "data.frame", row.names = c(NA, -5L))
Upvotes: 1