Reputation: 65

Augment data frame missed values by another data frame

I have the following data frames:

> df1 = data.frame(ind = 1:4, x=c('a', 'b', NA, 'd'))
> df2 = data.frame(ind = 1:4, x=c(NA, NA, 'c', NA))
> df1
  ind    x
1   1    a
2   2    b
3   3 <NA>
4   4    d
> df2
  ind    x
1   1 <NA>
2   2 <NA>
3   3    c
4   4 <NA>

I want to combine them, filling the missing values in df1 with the corresponding values from df2. How can I do that? I can't do it with either merge or join:

> merge(df1, df2, by='ind', all=T)
  ind  x.x  x.y
1   1    a <NA>
2   2    b <NA>
3   3 <NA>    c
4   4    d <NA>

Upvotes: 0

Answers (3)

IRTFM

Reputation: 263421

The way you constructed the test case creates factors, and that gets in the way of compact solutions because the two factors do not share the same levels. You can either create the factors with levels set to the union of their unique values or, preferably, use character vectors:

df1 = data.frame(ind = 1:4, x=c('a', 'b', NA, 'd'), stringsAsFactors=FALSE)
df2 = data.frame(ind = 1:4, x=c(NA, NA, 'c', NA), stringsAsFactors=FALSE)
df1[is.na(df1)] <- df2[is.na(df1)] # the key is same index on both sides
 df1
#---------
  ind x
1   1 a
2   2 b
3   3 c
4   4 d

The arguably less preferred method (but one that might be better for a pair of existing datasets that you do not want to rebuild) would be:

 df1$x <- factor(df1$x, levels=union(levels(df1$x), levels(df2$x) ) )
 df2$x <- factor(df2$x, levels=union(levels(df1$x), levels(df2$x) ) )
 df1[is.na(df1)] <- df2[is.na(df1)]
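
To make the barrier mentioned above concrete, here is a minimal sketch working on the x column only. The names f1, f2, miss, naive, lv and fixed are made up for illustration, and stringsAsFactors=TRUE is passed explicitly because data.frame() only builds factors by default in R versions before 4.0.0.

```r
# Force factor columns so the sketch behaves the same on any R version
f1 <- data.frame(ind = 1:4, x = c('a', 'b', NA, 'd'), stringsAsFactors = TRUE)
f2 <- data.frame(ind = 1:4, x = c(NA, NA, 'c', NA), stringsAsFactors = TRUE)
miss <- is.na(f1$x)

# Naive fill: 'c' is not a level of f1$x, so R stores NA and warns
naive <- f1$x
suppressWarnings(naive[miss] <- f2$x[miss])

# Fill after giving both factors the same (union) set of levels
lv <- union(levels(f1$x), levels(f2$x))
fixed <- factor(f1$x, levels = lv)
fixed[miss] <- factor(f2$x, levels = lv)[miss]
```

Here naive still has NA in row 3, while fixed holds the filled value, which is why the character-vector route is usually less trouble.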

Upvotes: 3

Chase

Reputation: 69201

What do you do if x is NA in both datasets? Does this do what you want?

x <- merge(df1, df2, all = TRUE, by = "ind")
x <- transform(x, newcol = ifelse(is.na(x.x), as.character(x.y), as.character(x.x)))

> x
  ind  x.x  x.y newcol
1   1    a <NA>      a
2   2    b <NA>      b
3   3 <NA>    c      c
4   4    d <NA>      d
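
As for the both-NA case, a small sketch on hypothetical data (a and b are invented here, with row 4 missing in both frames) suggests that the ifelse approach simply leaves NA in place:

```r
# Hypothetical variant of the question's data: row 4 is NA in both frames
a <- data.frame(ind = 1:4, x = c('a', 'b', NA, NA), stringsAsFactors = FALSE)
b <- data.frame(ind = 1:4, x = c(NA, NA, 'c', NA), stringsAsFactors = FALSE)

m <- merge(a, b, by = "ind", all = TRUE)

# Take x.y where x.x is missing; if both are missing the result stays NA
m$newcol <- ifelse(is.na(m$x.x), m$x.y, m$x.x)
```

Whether a leftover NA is acceptable depends on what you want downstream, so treat this as a starting point.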

Upvotes: 1

joran

Reputation: 173647

How about this:

rbind(df1[complete.cases(df1),],df2[complete.cases(df2),])
  ind x
1   1 a
2   2 b
4   4 d
3   3 c
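
Because rbind appends rows in the order they are supplied, the stacked result is not guaranteed to be sorted by ind. A possible follow-up, sketched with made-up names (d1, d2, stacked) and character columns, is to reorder afterwards:

```r
d1 <- data.frame(ind = 1:4, x = c('a', 'b', NA, 'd'), stringsAsFactors = FALSE)
d2 <- data.frame(ind = 1:4, x = c(NA, NA, 'c', NA), stringsAsFactors = FALSE)

# Keep only the complete rows from each frame and stack them
stacked <- rbind(d1[complete.cases(d1), ], d2[complete.cases(d2), ])

# Restore the original ordering by ind and reset the row names
stacked <- stacked[order(stacked$ind), ]
rownames(stacked) <- NULL
```

Note that a row that is NA in both frames is dropped entirely by this approach, and an ind that is complete in both frames would appear twice.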

Upvotes: 1
