Reputation: 1
I have just been given 2 sales data frames; the main data frame houses all the information while the second has the latest sales for the month.
I need to merge the information from one data frame into the other based on a client's unique ID. Specifically, I would like to update information in the main data frame based on the new df.
I have tried using the merge function in R but have errors in replacing the information. I have done this merge in excel (but it takes a while since I have a lot of information).
The data frames looks a little like this...
MainDF
UID Status Gender SaleType
136273 inactive M Repair
182732 inactive F Product
298372 quit F Repair
283715 inactive M Product
NewDFtoAdd
UID Gender
136273 M
293827 F
283715 F
167643 M
I want to take the existing IDs in the main df and rewrite their status as 'active' if they appear in the NewDFtoAdd.
MainDF
UID Status Gender SaleType
136273 **active** M Repair
182732 inactive F Product
298372 quit F Repair
283715 **active** F Product
I really want to code this in R; so as new data is given I can just update accordingly and my analysis code can continue.
Upvotes: 0
Views: 1046
Reputation: 2332
If you love dplyr
and want to go that route, you can do this:
library(dplyr)
MainDF%>%
mutate_at(vars(Status),
list(~case_when(
UID %in% NewDFtoAdd$UID ~ "active",
TRUE ~ Status
)))
# UID Status Gender SaleType
#1 136273 active M Repair
#2 182732 inactive F Product
#3 298372 quit F Repair
#4 283715 active M Product
Upvotes: 0
Reputation: 886938
We can use a data.table
join to replace the 'Status' to 'active' by joining on
the 'UID'
library(data.table)
setDT(MainDF)[NewDFtoAdd, Status := "active", on = .(UID)]
MainDF
# UID Status Gender SaleType
#1: 136273 active M Repair
#2: 182732 inactive F Product
#3: 298372 quit F Repair
#4: 283715 active M Product
Or using base R
i1 <- MainDF$UID %in% NewDFtoAdd$UID
MainDF$UID[i1] <- "active"
NOTE: This gives the expected output.
MainDF <- structure(list(UID = c(136273L, 182732L, 298372L, 283715L), Status = c("inactive",
"inactive", "quit", "inactive"), Gender = c("M", "F", "F", "M"
), SaleType = c("Repair", "Product", "Repair", "Product")), class = "data.frame", row.names = c(NA,
-4L))
NewDFtoAdd <- structure(list(UID = c(136273L, 293827L, 283715L, 167643L), Gender = c("M",
"F", "F", "M")), class = "data.frame", row.names = c(NA, -4L))
Upvotes: 1