Reputation: 671
For a list of data frames, I would like to check whether a column is present in each one and, if it is not, add that column filled with NA. Most importantly, I want to overwrite the old data frames.
Datasets:
df1 <- data.frame(a=c(1,2), b=c(3,NA))
df2 <- data.frame(b=c(1,2), c=c(3,NA))
df_list=list(df1, df2)
name <- "a"
My attempt:
df_list <- lapply(df_list, function(x) x[name[!(name %in% colnames(x))]] = NA)
I am looking for this result:
> df_list
[[1]]
  a  b
1 1  3
2 2 NA

[[2]]
  b  c  a
1 1  3 NA
2 2 NA NA
Upvotes: 1
Views: 58
Reputation: 39717
Modifying your code: what was missing was returning the updated x. Alternatively, use setdiff to pick out the missing names.
#lapply(df_list, function(x) x[name[!(name %in% colnames(x))]] = NA) #Your original code
lapply(df_list, function(x) {x[name[!(name %in% colnames(x))]] = NA; x}) #Modified
lapply(df_list, function(x) {x[,setdiff(name, names(x))] <- NA; x}) #Alternative
#[[1]]
#  a  b
#1 1  3
#2 2 NA
#
#[[2]]
#  b  c  a
#1 1  3 NA
#2 2 NA NA
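To see why the original attempt does not work, here is a small sketch that reuses the question's data (the names `broken` and `fixed` are mine, for illustration only). In R, the value of an assignment is its right-hand side, so a function whose last expression is `x[...] <- NA` returns NA instead of the data frame:

```r
df1 <- data.frame(a = c(1, 2), b = c(3, NA))
df2 <- data.frame(b = c(1, 2), c = c(3, NA))
df_list <- list(df1, df2)
name <- "a"

# The last expression is an assignment, so each call returns its right-hand side (NA)
broken <- lapply(df_list, function(x) {
  x[name[!(name %in% colnames(x))]] <- NA
})

# Ending the function with x returns the (possibly modified) data frame
fixed <- lapply(df_list, function(x) {
  x[name[!(name %in% colnames(x))]] <- NA
  x
})
```

Printing `broken` should show a list of NA values, while `fixed` holds the data frames with the missing column added.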
Upvotes: 1
Reputation: 39613
I would suggest an approach similar to @GregorThomas's, but first recording in a vector which data frames do not contain the variable, and then using lapply() to create the desired variable in only those data frames:
#Data
df1 <- data.frame(a=c(1,2), b=c(3,NA))
df2 <- data.frame(b=c(1,2), c=c(3,NA))
df_list=list(df1, df2)
name <- "a"
#Check
x <- sapply(df_list,function(x) length(which(names(x)==name)))
y <- which(x==0)
#Format new list
df_list[y] <- lapply(df_list[y],function(x) {x[[name]]<-NA;return(x)})
Output:
df_list
[[1]]
  a  b
1 1  3
2 2 NA

[[2]]
  b  c  a
1 1  3 NA
2 2 NA NA
Upvotes: 1
Reputation: 146040
I would use a for loop to modify the data frames in place:
for(i in seq_along(df_list)) {
  if(!name %in% names(df_list[[i]])) {
    df_list[[i]][[name]] = NA
  }
}
You could take a similar approach with lapply, but in this case I find the for loop easier to understand. We need to make sure the function passed to lapply returns the data frame, either modified or as-is (this is the main difference from your attempt).
df_list = lapply(df_list, function(x) {
  if(! name %in% names(x)) {
    x[[name]] = NA
  }
  return(x)
})
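If `name` might hold more than one column name, the `if` test above no longer fits, since it expects a single value. A minimal sketch of one way to handle that, assuming a hypothetical helper `add_missing_cols` (my own name, not part of base R or of the answers here) and an example vector `c("a", "c")` chosen only for illustration:

```r
# Hypothetical helper: add every column in names_wanted that x lacks, filled with NA
add_missing_cols <- function(x, names_wanted) {
  for (nm in setdiff(names_wanted, names(x))) {
    x[[nm]] <- NA
  }
  x  # always return the data frame, modified or as-is
}

df1 <- data.frame(a = c(1, 2), b = c(3, NA))
df2 <- data.frame(b = c(1, 2), c = c(3, NA))
df_list <- list(df1, df2)

# Overwrite the list with the updated data frames
df_list <- lapply(df_list, add_missing_cols, names_wanted = c("a", "c"))
```

New columns are appended at the end, so the column order can differ between data frames, just as in the single-name case.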
Upvotes: 1