Reputation: 781
My goal is merging 2 data sets together but I need to keep track "NA", I need to "NA" when I merge df1 and df2, and "NA" of df1 assign something like 9999. The problem is my data have so many variables, it includes day, numeric, character..., so when I try to set "NA" of df1 by df1[is.na(df1)] <- 9999
, it works only numeric, there is any way I can approach my solution to have different NA seperately.
df1 <- data.frame(ID= c(1:10),
Value=c(3,NA,7,2:8),
Group = c("A",NA,"C","D",NA,"B",NA,"C","D",NA))
df2 <- data.frame(ID = c(5:14),Count =c(1:9,NA),
School = c("A",NA,"C","D",NA,"B","NA","C","D",NA))
df1[is.na(df1)] <- 9999
data <- merge(df1,df2,all = TRUE,by= "ID")
ID Value Group Count School
1 1 3 A NA <NA>
2 2 9999 <NA>* NA <NA>
3 3 7 C NA <NA>
4 4 2 D NA <NA>
5 5 3 <NA>* 1 A
6 6 4 B 2 <NA>
7 7 5 <NA>* 3 C
8 8 6 C 4 D
9 9 7 D 5 <NA>
10 10 8 <NA>* 6 B
11 11 NA <NA> 7 NA
12 12 NA <NA> 8 C
13 13 NA <NA> 9 D
14 14 NA <NA> NA <NA>
The ones *
suppose 9999
Upvotes: 1
Views: 274
Reputation: 23574
I'd like to contribute a bit more to this question. If you have, say, 100 columns in various classes and try to replace all NAs, you could try the following. The idea is that you convert all columns to character and replace all NAs with 9999. Then, you want to convert the classes of the columns back to the original classes. Finally, you merge df1 and df2.
library(dplyr)
# Save original classes.
original <- unlist(lapply(df1, class))
# Convert all columns to character and replace NAs with 9999
mutate_all(df1, as.character) %>%
mutate_each(funs(recode(., .missing = "9999"))) -> df1
# http://stackoverflow.com/questions/7680959/convert-type-of-multiple-columns-of-a-dataframe-at-once
# Credit to joran for this function.
convert.magic <- function(obj,types){
for (i in 1:length(obj)){
FUN <- switch(types[i],character = as.character,
numeric = as.numeric,
factor = as.factor,
integer = as.integer,
logical = as.logical)
obj[,i] <- FUN(obj[,i])
}
obj
}
out <- convert.magic(df1, original) %>%
full_join(df2, by = "ID")
out
# ID Value Group Count School
#1 1 3 A NA <NA>
#2 2 9999 9999 NA <NA>
#3 3 7 C NA <NA>
#4 4 2 D NA <NA>
#5 5 3 9999 1 A
#6 6 4 B 2 <NA>
#7 7 5 9999 3 C
#8 8 6 C 4 D
#9 9 7 D 5 <NA>
#10 10 8 9999 6 B
#11 11 NA <NA> 7 NA
#12 12 NA <NA> 8 C
#13 13 NA <NA> 9 D
#14 14 NA <NA> NA <NA>
Upvotes: 1
Reputation: 520918
You can try replacing the NA
values in df1$Group
before doing the merge:
df1$Group <- as.character(df1$Group)
df1$Group[is.na(df1$Group)] <- 9999
But I have the feeling that you know this already, but were being thrown off because df1$Group
is a factor, meaning the above code won't work as expected unless you first cast using as.character
. You could also do this substitution after the merge.
Upvotes: 1