Reputation: 207
I have three data frames, whose names could be stored like this:
dfs <- list("ibu_819", "ibu_1121", "ibu_1022")
and a list of variables for which I need to complete a very simple operation: changing all the 2s to 0s (an incorrectly coded dummy variable)
vars <- list("bene_lastyear", "bene_nextyear", "child_death","citychild")
I have done so successfully using this clunky code:
ibu_819 <- ibu_819 %>%
  mutate(bene_lastyear = if_else(bene_lastyear == 2, 0, 1),
         bene_nextyear = if_else(bene_nextyear == 2, 0, 1),
         child_death = if_else(child_death == 2, 0, 1),
         citychild = if_else(citychild == 2, 0, 1))
ibu_1121 <- ibu_1121 %>%
  mutate(bene_lastyear = if_else(bene_lastyear == 2, 0, 1),
         bene_nextyear = if_else(bene_nextyear == 2, 0, 1),
         child_death = if_else(child_death == 2, 0, 1),
         citychild = if_else(citychild == 2, 0, 1))
ibu_1022 <- ibu_1022 %>%
  mutate(bene_lastyear = if_else(bene_lastyear == 2, 0, 1),
         bene_nextyear = if_else(bene_nextyear == 2, 0, 1),
         child_death = if_else(child_death == 2, 0, 1),
         citychild = if_else(citychild == 2, 0, 1))
I have always performed my data cleaning in Stata, where I would certainly take care of this task in one tidy loop, but I can't figure out how to do so in R. I'd love it if someone could show me how to do exactly what I have done by looping over the two lists provided above, writing the actual mutate call only once.
(I'm also open to suggestions for a prettier solution than my if_else strategy. I'm sure there's a more fluid way to change my 2s to 0s, but I just did what I did because I knew how.)
ALSO, I should note that I do not want to append my data frames just yet, so please don't solve this by combining the data frames and then looping through the variables.
Upvotes: 0
Views: 431
Reputation: 5336
Keeping the data frame names as a list of strings is a bit odd; a list of the data frames themselves would be better. That is:
dfs <- list(ibu_819, ibu_1121, ibu_1022)
Then you could use:
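# Index into the list: assigning to a loop variable such as `d` in
# `for (d in dfs)` would only change a temporary copy, not the list element.
for (i in seq_along(dfs)) {
  for (v in vars) dfs[[i]][[v]][dfs[[i]][[v]] == 2] <- 0
}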
Note that this only updates the copies stored in the list, not the original objects. To copy them back into the global environment, you'd need to use a named list and then the list2env function. The whole thing would be:
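dfs <- list("ibu_819"=ibu_819, "ibu_1121"=ibu_1121, "ibu_1022"=ibu_1022)
# Modify the list elements by name; a loop variable would only be a copy.
for (nm in names(dfs)) {
  for (v in vars) dfs[[nm]][[v]][dfs[[nm]][[v]] == 2] <- 0
}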
list2env(dfs, globalenv())
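To make the named-list approach easy to try, here is a minimal sketch on toy data. The names toy_a, toy_b, toy_dfs and toy_vars and their values are made up for illustration and stand in for the real data frames and variable list.

```r
# Toy stand-ins for the real data frames (hypothetical names and values).
toy_a <- data.frame(bene_lastyear = c(1, 2, 2), citychild = c(2, 1, NA))
toy_b <- data.frame(bene_lastyear = c(2, 1, 1), citychild = c(1, 1, 2))
toy_vars <- c("bene_lastyear", "citychild")

# A named list, so list2env() knows which object each element should overwrite.
toy_dfs <- list(toy_a = toy_a, toy_b = toy_b)

for (nm in names(toy_dfs)) {
  for (v in toy_vars) {
    col <- toy_dfs[[nm]][[v]]
    col[col == 2] <- 0          # NA entries are left as NA
    toy_dfs[[nm]][[v]] <- col   # write the column back into the list element
  }
}

# Copy the modified data frames back over the objects in the global environment.
invisible(list2env(toy_dfs, envir = globalenv()))
```

After this runs, toy_a and toy_b in the global environment hold the recoded columns. Until list2env is called, only the copies inside toy_dfs have changed.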
If you want to do it using the list of data frame names (i.e. dfs is the list of strings you currently have), then I think you have to make a copy of the data frame inside the loop and assign it back when you're done. This isn't good practice, though.
for (d in dfs) {
  df <- get(d)      # look up the data frame by its name
  for (v in vars) df[[v]][df[[v]] == 2] <- 0
  assign(d, df)     # write the modified copy back under the same name
}
Finally, the pattern
x[x==2] <- 0
is how I would replace all the 2s with 0s in a vector. It does the same as replace x=0 if x==2 in Stata.
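As a small illustration of that pattern, the sketch below wraps it in a helper and applies it to several columns of one data frame at once. The helper name recode_2_to_0 and the toy data are assumptions made for the example. Unlike the if_else(x == 2, 0, 1) version in the question, this leaves every value other than 2 unchanged.

```r
# Hypothetical helper: turn every 2 into 0, leaving other values and NAs alone.
recode_2_to_0 <- function(x) {
  x[x == 2] <- 0
  x
}

# Toy data frame; `id` is deliberately not in the variable list.
toy <- data.frame(child_death = c(1, 2, 2),
                  citychild   = c(2, 1, 1),
                  id          = c(2, 2, 2))
toy_vars <- list("child_death", "citychild")  # same shape as `vars` in the question

# unlist() turns the list of names into a character vector usable for indexing.
cols <- unlist(toy_vars)
toy[cols] <- lapply(toy[cols], recode_2_to_0)
```

Only the columns named in toy_vars are touched, so the 2s in id survive.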
Upvotes: 0
Reputation: 2584
Another option is to use Map:
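# create dummy data (use `=` inside list() to name the elements;
# `<-` there would also create df1, df2, df3 in the workspace)
l <- list(df1 = data.frame(a=1:10),
          df2 = data.frame(b=1:10),
          df3 = data.frame(c=1:10)
)
var <- c("a","b","c")
# function to replace the old value (2) with the new one (0) in one column
myfun <- function(df,var){
  df[df[[var]]==2,var] <- 0
  return(df)
}
# Map calls myfun(l[[1]], var[1]), myfun(l[[2]], var[2]), and so on
res <- Map(myfun,l,var)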
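Here the original list of data frames is preserved, and all values equal to 2 are updated to 0 in the new list of data frames, called res. Note that Map pairs the i-th data frame with the i-th variable, so each data frame has only one column recoded here.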
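If every data frame should have the same set of variables recoded, as in the question, one possible variation is lapply with a helper that loops over the variables. This is a sketch on toy data; fix_vars, toy_list and toy_vars are hypothetical names introduced for the example.

```r
# Hypothetical helper: recode 2 -> 0 in each of the named columns of one data frame.
fix_vars <- function(df, vars) {
  for (v in vars) {
    df[[v]][df[[v]] == 2] <- 0   # vector-style replacement; NAs stay NA
  }
  df
}

# Toy list of data frames sharing the same column names.
toy_list <- list(
  first  = data.frame(a = c(1, 2), b = c(2, 2)),
  second = data.frame(a = c(2, 2), b = c(1, 1))
)
toy_vars <- c("a", "b")

# lapply passes `vars` on to fix_vars for every element and keeps the list names.
res <- lapply(toy_list, fix_vars, vars = toy_vars)
```

As with Map, toy_list itself is left unchanged and the recoded data frames live in res.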
Upvotes: 1