Reputation: 323
I need to replace the values in one data frame with values from another data frame. I am trying to find the simplest way to do so but I may be over thinking it.
Here is a sample of the data from DF1:
season article 1st_booking 2nd_booking
SS20 EF0647 2019-06-25 2019-07-09
SS20 059611 2019-07-30 2019-08-13
SS20 EG3208 2019-10-29 <NA>
SS20 EF9348 2019-10-29 2019-11-12
SS20 EE4609 2019-08-27 2019-10-29
SS20 EF7610 2019-09-24 2019-10-29
SS20 EH1307 2019-09-24 2019-10-29
SS20 EH1308 2019-09-24 2019-10-29
SS20 EH1309 2019-09-24 2019-10-29
SS20 EH1310 2019-09-24 2019-10-29
And from DF2:
season article order_cutoff booking_deadline
SS20 EF0647 2019-06-25 2019-06-07
SS20 EF0647 2019-07-09 2019-06-07
SS20 EF0647 2019-12-10 2019-11-08
SS20 059611 2019-07-30 2019-07-12
SS20 059611 2019-08-13 2019-07-12
SS20 059611 2019-10-08 2019-09-06
SS20 EG3208 2019-10-29 2019-10-11
SS20 EF9348 2019-10-29 2019-10-11
SS20 EF9348 2019-11-12 2019-10-11
SS20 EF9348 2019-11-26 2019-11-08
Note that 1st_booking & 2nd_booking from DF1 are called 'order_cutoff' in DF2. What I would like to do is in DF1, replace the values in columns 1st_booking & 2nd_booking with the related booking_deadline from DF2. Tried to do a merge but I don't want to create a new column - just replace the values in DF1 with the values in DF2
Upvotes: 0
Views: 99
Reputation: 389135
I am not exactly sure about the expected output. If you want to match 1st_booking
and 2nd_booking
to order_cutoff
for each article
and season
, we can get the data in long format do a left_join
matching the corresponding columns and get the data in wide format again.
library(dplyr)
library(tidyr)
df1 %>%
pivot_longer(cols = ends_with("booking")) %>%
left_join(df2, by = c('season' = 'season', 'article' = 'article',
'value' = 'order_cutoff')) %>%
select(-value) %>%
pivot_wider(names_from = name, values_from = booking_deadline)
# A tibble: 10 x 4
# season article `1st_booking` `2nd_booking`
# <fct> <chr> <fct> <fct>
# 1 SS20 EF0647 2019-06-07 2019-06-07
# 2 SS20 059611 2019-07-12 2019-07-12
# 3 SS20 EG3208 2019-10-11 NA
# 4 SS20 EF9348 2019-10-11 2019-10-11
# 5 SS20 EE4609 NA NA
# 6 SS20 EF7610 NA NA
# 7 SS20 EH1307 NA NA
# 8 SS20 EH1308 NA NA
# 9 SS20 EH1309 NA NA
#10 SS20 EH1310 NA NA
If you want to only combine by dates and not season
and article
you can use match
transform(df1,
`1st_booking` = df2$booking_deadline[match(`1st_booking`, df2$order_cutoff)],
`2nd_booking` = df2$booking_deadline[match(`2nd_booking`, df2$order_cutoff)])
data
df1 <- structure(list(season = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L), .Label = "SS20", class = "factor"), article = structure(c(3L,
1L, 6L, 5L, 2L, 4L, 7L, 8L, 9L, 10L), .Label = c("059611", "EE4609",
"EF0647", "EF7610", "EF9348", "EG3208", "EH1307", "EH1308", "EH1309",
"EH1310"), class = "factor"), `1st_booking` = structure(c(1L,
2L, 5L, 5L, 3L, 4L, 4L, 4L, 4L, 4L), .Label = c("2019-06-25",
"2019-07-30", "2019-08-27", "2019-09-24", "2019-10-29"), class = "factor"),
`2nd_booking` = structure(c(2L, 3L, 1L, 5L, 4L, 4L, 4L, 4L,
4L, 4L), .Label = c("<NA>", "2019-07-09", "2019-08-13", "2019-10-29",
"2019-11-12"), class = "factor")), class = "data.frame", row.names = c(NA, -10L))
df2 <- structure(list(season = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L), .Label = "SS20", class = "factor"), article = structure(c(2L,
2L, 2L, 1L, 1L, 1L, 4L, 3L, 3L, 3L), .Label = c("059611", "EF0647",
"EF9348", "EG3208"), class = "factor"), order_cutoff = structure(c(1L,
2L, 9L, 3L, 4L, 5L, 6L, 6L, 7L, 8L), .Label = c("2019-06-25",
"2019-07-09", "2019-07-30", "2019-08-13", "2019-10-08", "2019-10-29",
"2019-11-12", "2019-11-26", "2019-12-10"), class = "factor"),
booking_deadline = structure(c(1L, 1L, 5L, 2L, 2L, 3L, 4L,
4L, 4L, 5L), .Label = c("2019-06-07", "2019-07-12", "2019-09-06",
"2019-10-11", "2019-11-08"), class = "factor")), class = "data.frame",
row.names = c(NA, -10L))
Upvotes: 1