Reputation: 13
Let's say I have three data sets:
df1 <- data.frame(var1 = c(1,2,3), var2 = c(1,2,3))
df2 <- data.frame(var1 = c(1,2,3), var2 = c(1,2,3))
df3 <- data.frame(var1 = c(1,2,3), var2 = c(1,2,3), var3 = c(1,2,3))
I would like to check to see if a variable "var3" exists within each dataset. If it doesn't, I would like to generate an empty variable called "var3". Here is what I am trying:
dframes <- list(df1,df2,df3)
lapply(dframes, function(df) {
ifelse("var3" %in% colnames(df), print("var3 exists"), df$var3 <- NA)
})
The output comes out as:
[[1]]
[1] NA
[[2]]
[1] NA
[[3]]
[1] "var3 exists"
And the desired "var3" variable isn't generated for the first two data sets - they still only contain "var1" and "var2".
Your help is appreciated.
Upvotes: 1
Views: 1430
Reputation: 6913
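Just putting what everyone has said into a full answer: use a plain if rather than ifelse(), and return the data frame from the function. In your version the function returns the value of ifelse() (NA, or the printed string) rather than the data frame, and df$var3 <- NA only modifies the function's local copy of df.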
df1 <- data.frame(var1 = c(1,2,3), var2 = c(1,2,3))
df2 <- data.frame(var1 = c(1,2,3), var2 = c(1,2,3))
df3 <- data.frame(var1 = c(1,2,3), var2 = c(1,2,3), var3 = c(1,2,3))
dframes <- list(df1,df2,df3)
dfframes_fmt <- lapply(dframes, function(df) {
  # add var3 only when it is missing
  if (!"var3" %in% colnames(df)) {
    df$var3 <- NA
  }
  # return the (possibly modified) data frame
  df
})
> dfframes_fmt
[[1]]
  var1 var2 var3
1    1    1   NA
2    2    2   NA
3    3    3   NA

[[2]]
  var1 var2 var3
1    1    1   NA
2    2    2   NA
3    3    3   NA

[[3]]
  var1 var2 var3
1    1    1    1
2    2    2    2
3    3    3    3
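If you ever need to check for more than one column, the same idea generalizes. Here is a minimal sketch, assuming a hypothetical helper called add_missing_cols (my name, not a base R function) that fills in any missing columns with NA:

```r
# Hypothetical helper: add every column named in `cols` that `df` lacks,
# filled with NA. Columns that already exist are left untouched.
add_missing_cols <- function(df, cols) {
  missing <- setdiff(cols, colnames(df))
  if (length(missing) > 0) {
    df[missing] <- NA
  }
  df
}

dframes <- list(
  data.frame(var1 = c(1, 2, 3), var2 = c(1, 2, 3)),
  data.frame(var1 = c(1, 2, 3), var2 = c(1, 2, 3), var3 = c(1, 2, 3))
)

# Extra arguments to lapply() are passed on to the helper
dframes_fmt <- lapply(dframes, add_missing_cols, cols = c("var3"))
```

The function still returns df as its last expression, which is the part the original attempt was missing.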
To write the results back to the original variable names, you can do this:
dfnames <- c("df1", "df2", "df3")
# assemble the list of data frames
dframes <- eval(parse(text = paste0("list(", paste0(dfnames, collapse = ","), ")")))
for (k in seq_along(dframes)) {
  set <- dframes[[k]]
  if (!"var3" %in% colnames(set)) {
    set$var3 <- NA
  }
  # assign the df back to the original name
  eval(parse(text = paste0(dfnames[k], " = set")))
}
> df1
  var1 var2 var3
1    1    1   NA
2    2    2   NA
3    3    3   NA
> df2
  var1 var2 var3
1    1    1   NA
2    2    2   NA
3    3    3   NA
> df3
  var1 var2 var3
1    1    1    1
2    2    2    2
3    3    3    3
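As an alternative to eval(parse(...)), base R's mget() and list2env() can do the lookup and the write-back by name. This is only a sketch of a different technique, assuming the data frames live in the environment the code runs in (the global environment at the top level):

```r
df1 <- data.frame(var1 = c(1, 2, 3), var2 = c(1, 2, 3))
df2 <- data.frame(var1 = c(1, 2, 3), var2 = c(1, 2, 3))
df3 <- data.frame(var1 = c(1, 2, 3), var2 = c(1, 2, 3), var3 = c(1, 2, 3))

dfnames <- c("df1", "df2", "df3")

# mget() looks the objects up by name and returns a named list
dframes <- mget(dfnames, envir = environment())

dframes <- lapply(dframes, function(df) {
  if (!"var3" %in% colnames(df)) {
    df$var3 <- NA
  }
  df
})

# list2env() assigns each named element back to a variable of that name;
# invisible() just suppresses printing of the returned environment
invisible(list2env(dframes, envir = environment()))
```

Because the list keeps the names from dfnames, there is no need to build code strings by hand. That said, keeping the data frames in a list and working on the list is usually simpler than overwriting the individual variables.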
Upvotes: 1