Reputation: 65
I have the following data frames:
> df1 = data.frame(ind = 1:4, x=c('a', 'b', NA, 'd'))
> df2 = data.frame(ind = 1:4, x=c(NA, NA, 'c', NA))
> df1
ind x
1 1 a
2 2 b
3 3 <NA>
4 4 d
> df2
ind x
1 1 <NA>
2 2 <NA>
3 3 c
4 4 <NA>
I want to combine them, filling the missing values in df1 with the non-missing values from df2. How can I do that? I cannot do it with either the merge or the join command:
> merge(df1, df2, by='ind', all=T)
ind x.x x.y
1 1 a <NA>
2 2 b <NA>
3 3 <NA> c
4 4 d <NA>
Upvotes: 0
Views: 364
Reputation: 263421
The way you constructed the test case creates factors (the default in R before 4.0, where stringsAsFactors = TRUE), and that gets in the way of compact solutions because the two columns do not share the same levels. You can either create the factors with levels set to the union of their unique values or, preferably, use character vectors:
df1 = data.frame(ind = 1:4, x=c('a', 'b', NA, 'd'), stringsAsFactors=FALSE)
df2 = data.frame(ind = 1:4, x=c(NA, NA, 'c', NA), stringsAsFactors=FALSE)
df1[is.na(df1)] <- df2[is.na(df1)] # the key is using the same index on both sides
df1
#---------
ind x
1 1 a
2 2 b
3 3 c
4 4 d
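The arguably less preferred method (though one that might suit a pair of existing datasets you do not want to rebuild) is to give both factors the same set of levels first:
# Compute the shared levels once, then apply them to both columns
all_levels <- union(levels(df1$x), levels(df2$x))
df1$x <- factor(df1$x, levels = all_levels)
df2$x <- factor(df2$x, levels = all_levels)
# With matching levels, the same assignment as above works on factors
df1[is.na(df1)] <- df2[is.na(df1)]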
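The positional assignment relies on both data frames having their rows in the same order. If that cannot be guaranteed, the rows could be matched on the ind column instead. The following is only a minimal sketch under that assumption; the helper name fill_from and the shuffled df2_shuffled are hypothetical and introduced purely for illustration.

```r
# Hypothetical helper: fill NAs in target[[col]] from source[[col]],
# matching rows on a key column rather than on row position.
fill_from <- function(target, source, key = "ind", col = "x") {
  # Work on character vectors so differing factor levels do not matter
  tv <- as.character(target[[col]])
  sv <- as.character(source[[col]])
  # For each row of target, locate the row of source with the same key
  idx <- match(target[[key]], source[[key]])
  is_missing <- is.na(tv)
  # Keys without a match give NA in idx, so those values simply stay NA
  tv[is_missing] <- sv[idx[is_missing]]
  target[[col]] <- tv
  target
}

df1 <- data.frame(ind = 1:4, x = c("a", "b", NA, "d"), stringsAsFactors = FALSE)
# Same content as df2 in the question, but with the rows in reverse order
df2_shuffled <- data.frame(ind = 4:1, x = c(NA, "c", NA, NA), stringsAsFactors = FALSE)

filled <- fill_from(df1, df2_shuffled)
filled$x
```

Note that this sketch returns x as a character column even if the inputs were factors.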
Upvotes: 3
Reputation: 69201
What do you do if x is NA in both datasets? Does this do what you want?
x <- merge(df1, df2, all = TRUE, by = "ind")
x <- transform(x, newcol = ifelse(is.na(x.x), as.character(x.y), as.character(x.x)))
> x
ind x.x x.y newcol
1 1 a <NA> a
2 2 b <NA> b
3 3 <NA> c c
4 4 d <NA> d
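To make the "NA in both datasets" case concrete, here is a small sketch using a modified df1 (an assumption for illustration, not the asker's data) in which row 2 is missing in both inputs. With the ifelse approach the combined value for that row simply stays NA.

```r
# Modified test data: x is NA in both data frames for ind == 2
df1 <- data.frame(ind = 1:4, x = c("a", NA, NA, "d"), stringsAsFactors = FALSE)
df2 <- data.frame(ind = 1:4, x = c(NA, NA, "c", NA), stringsAsFactors = FALSE)

m <- merge(df1, df2, all = TRUE, by = "ind")
# Take x.y where x.x is missing, otherwise keep x.x
m$newcol <- ifelse(is.na(m$x.x), as.character(m$x.y), as.character(m$x.x))
m$newcol
# "a" NA  "c" "d"
```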
Upvotes: 1
Reputation: 173647
How about this:
rbind(df1[complete.cases(df1),],df2[complete.cases(df2),])
ind x
1 1 a
2 2 b
4 4 d
3 3 c
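Because rbind stacks the complete rows of df1 ahead of those of df2, the result is not necessarily ordered by ind. A minimal sketch of restoring that order follows; the name combined is just an illustrative choice. Two caveats of this approach: a row that is NA in both data frames is dropped entirely, and a row that is complete in both appears twice.

```r
df1 <- data.frame(ind = 1:4, x = c("a", "b", NA, "d"), stringsAsFactors = FALSE)
df2 <- data.frame(ind = 1:4, x = c(NA, NA, "c", NA), stringsAsFactors = FALSE)

# Stack the complete rows of both data frames
combined <- rbind(df1[complete.cases(df1), ], df2[complete.cases(df2), ])
# Sort by the key column and reset the row names
combined <- combined[order(combined$ind), ]
rownames(combined) <- NULL
combined
```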
Upvotes: 1