Reputation: 61
I want to update values from table df1 with values from df2, only updating null values or zeros. I can do it with data.table or dplyr, but I can´t automate to all columns.
#data.table
df1 <- data.frame(x1=1:4, x2=c('a','b', NA, 'd'), x3=c(0,0,2,2), stringsAsFactors=FALSE)
df2 <- data.frame(x1=2:3, x2=c("zz", "qq"),x3=6:7, stringsAsFactors=FALSE)
require(data.table)
setDT(df1); setDT(df2)
df1[df2, on = .(x1), x2 := ifelse(is.na(x2) | x2 == 0 ,i.x2,x2)]
#dplyr
require(dplyr)
require(dplyr)
inner_join(df1,df2,by = c("x1" = "x1")) %>%
transmute(x1 = x1,
x2 =ifelse(is.na(x2.x) | x2.x == 0,x2.y,x2.x),
x3 =ifelse(is.na(x3.x) | x3.x == 0,x3.y,x3.x))
With dplyr at least I can manually adding columns getting the expected output, the problem is real dataframe has so much columns. Therefore I want to iterate across columns to achieved the task.
What I´ve tried:
# dplyr + apply
inner_join(df1,df2,by = c("x1" = "x1")) %>%
cbind(.$x1,
apply(.[-1],2, function(cname) ifelse(is.na(cname) | cname == 'b',paste(cname, ".x", collapse = ""),paste(cname, ".y", collapse = "")))
)
# data.table with for
for (cname in names(df1)[!names(df1) %in% c("x1")]) {
df1[i = df2, on = .(x1), j = cname := {function (x) ifelse(is.na(x) | x == 'b',i.x,x)} (cname)
, with = FALSE]
}
# data.table + lapply
df1[i = df2, on = .(x1) ,names(df1)[!names(df1) %in% c("x1")] := lapply(df1[,names(df1)[!names(df1) %in% c("x1")],with=FALSE],
function(x) ifelse(is.na(x) | x == 0,df2.x,df1.x))]
Upvotes: 2
Views: 823
Reputation: 27792
Here is a pure data.table
approach...
The melting process takes care of all columns you wish to 'fill, putting them all in one single set of columns (variable and value). Then fill in all the 0/NA values using an update join (=fast!) Finally, recast everything back to it's original shape.
library(data.table)
#set to data.table
setDT(df1)
setDT(df2)
#melt to long
melt1 <- melt(df1, id.vars = "x1" )
melt2 <- melt(df2, id.vars = "x1" )
#join all values with value NA or 0
melt1[ is.na(value) | value == 0,
value := melt1[ is.na( value) | value == 0,][ melt2, value := i.value, on = .(x1, variable) ]$value][]
#cast to original wide format
dcast( melt1, x1 ~ variable )
output
# x1 x2 x3
# 1: 1 a 0
# 2: 2 b 6
# 3: 3 qq 2
# 4: 4 d 2
Upvotes: 0
Reputation: 25223
For data.table
, you can use:
for (x in setdiff(names(df1), "x1")) {
df1[is.na(get(x)) ! get(x)==0, (x) := df2[.SD, on=.(x1), get(x)]]
}
Upvotes: 1
Reputation: 389325
Using base R, you can create a function to replace NA
and 0 with corresponding values from another column
replace_na_0 <- function(x) {
ifelse(is.na(x[[1]]) | x[[1]] == 0,x[[2]],x[[1]])
}
Do merge and pass group of columns by removing their postfix (.x
, .y)
to replace_na_0
function
temp_df <- merge(df1, df2, by = "x1")
cbind(temp_df[1], sapply(split.default(temp_df[-1],
sub("\\..*", "", names(temp_df)[-1])), replace_na_0))
# x1 x2 x3
#1 2 b 6
#2 3 qq 2
Upvotes: 2