R combine two data frames by NA

Question

I have a data frame (DF) and would like to combine two of its columns by replacing the NAs in the first column with the values from the second. Here is an example DF:

X <- structure(list(A = structure(c(3L, 5L, 4L, 2L, 1L, NA, NA, NA, 
NA, NA), .Label = c("five", "four", "one", "three", "two"), class = "factor"), 
B = structure(c(4L, NA, NA, 2L, NA, 6L, 5L, 1L, 3L, 7L), .Label = c("eight", 
"four", "nine", "one", "seven", "six", "ten"), class = "factor")), .Names = c("A", 
"B"), row.names = c(NA, -10L), class = "data.frame")

As you can see, the DF contains the numbers from one to ten, spread across two columns.

I want the NAs in column A to be replaced by the values in column B. But only the NAs of A!

I tried:

X$A[is.na(X$A)] <- X$B[is.na(X$A)]

But this gives me an "invalid factor level" warning.

The solutions I found mostly deal with merge() or paste(), but I don't think those will help here. Your suggestions are welcome, as always :)

Thanks a lot!

Mikko · Accepted Answer

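The problem is that your columns are factors. A factor can only hold values that are among its levels, and values such as "six" in B are not levels of A. Converting both columns to character first should work: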

X$A <- as.character(X$A)
X$B <- as.character(X$B)
X$A[is.na(X$A)] <- X$B[is.na(X$A)]
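
For reference, here is a minimal self-contained sketch of the same steps. It rebuilds the example data with factor() rather than the dput() output from the question, and the helper name missing_A is only for illustration:

```r
# Rebuild the example data with factor columns, as in the question.
X <- data.frame(
  A = factor(c("one", "two", "three", "four", "five", NA, NA, NA, NA, NA)),
  B = factor(c("one", NA, NA, "four", NA, "six", "seven", "eight", "nine", "ten"))
)

# Convert both columns to character so that any string can be assigned.
X$A <- as.character(X$A)
X$B <- as.character(X$B)

# Find the NA positions in A once, then fill exactly those from B.
missing_A <- is.na(X$A)
X$A[missing_A] <- X$B[missing_A]

X$A
```

After this, X$A holds the values "one" through "ten" as character strings. If you need a factor again afterwards, you can wrap the result in factor().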

If you want to prevent the data.frame() function from converting strings to factors before you have modified your data, use the stringsAsFactors = FALSE option. For example: data.frame(apply(X, 2, as.character), stringsAsFactors = FALSE).
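
As a rough sketch of both approaches: the data frames Y and Z below are made-up toy data, not from the question, and the lapply() idiom is an alternative to the apply() call above rather than the same thing. Note that since R 4.0.0, stringsAsFactors already defaults to FALSE.

```r
# Option 1: build the data frame with character columns from the start.
Y <- data.frame(
  A = c("one", NA, "three"),
  B = c("one", "two", NA),
  stringsAsFactors = FALSE
)
# No factor levels are involved, so the replacement works directly.
Y$A[is.na(Y$A)] <- Y$B[is.na(Y$A)]

# Option 2: convert every column of an existing data frame to character.
# Z[] keeps the data.frame structure while replacing its columns.
Z <- data.frame(A = factor(c("one", NA)), B = factor(c(NA, "two")))
Z[] <- lapply(Z, as.character)
```

The lapply() version works column by column and avoids the detour through a matrix that apply() takes, which can matter if the data frame also has non-character columns you want to handle separately.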
