orville jackson

Reputation: 1878

How to use lapply to transform specific values in a list of dataframes

I'm looking for help to transform a for loop into an lapply or similar function.

I have a list of similar data.frames, each containing an indicator column a and a value column b.

I want to invert the values in column b (that is, replace b with 1 - b) for each data frame, but only for rows with specific indicators. For example, invert all values in 'b' that have an indicator of 2 in column a.

Here are some sample data:

x = data.frame(a = c(1, 2, 3, 2),  b = (seq(from = .1, to = 1, by = .25)))
y = data.frame(a = c(1, 2, 3, 2),  b = (seq(from = 1, to = .1, by = -.25)))
my_list <- list(x = x, y = y)

my_list
$x
  a    b
1 1 0.10
2 2 0.35
3 3 0.60
4 2 0.85

$y
  a    b
1 1 1.00
2 2 0.75
3 3 0.50
4 2 0.25

My desired output looks like this:

my_list
$x
  a    b
1 1 0.10
2 2 0.65
3 3 0.60
4 2 0.15

$y
  a    b
1 1 1.00
2 2 0.25
3 3 0.50
4 2 0.75

I can achieve the desired output with the following for loop.

for(i in 1:length(my_list)){
    my_list[[i]][my_list[[i]]['a'] == 2, 'b'] <-
        1 - my_list[[i]][my_list[[i]]['a'] == 2, 'b']
}

BUT. When I try to roll this into lapply form like so:

invertfun <- function(inputDF){
    inputDF[inputDF['a'] == 2, 'b'] <- 1 - inputDF[inputDF['a'] == 2, 'b']
}
resultList <- lapply(X = my_list, FUN = invertfun)

I get a list with only the inverted values:

resultList
$x
[1] 0.65 0.15

$y
[1] 0.25 0.75

What am I missing here? I've tried to apply (pun intended) the insights from:

how to use lapply instead of a for loop, to perform a calculation on a list of dataframes in R

I'd appreciate any insights or alternative solutions. I'm trying to take my R skills to the next level and apply and similar functions seem to be the key.

Upvotes: 1

Views: 606

Answers (3)

dardisco

Reputation: 5274

lapply is typically not the best way to iteratively modify a list. lapply runs a loop internally in any case, so it is usually easier to read if you write the loop explicitly:

for (i in seq_along(my_list)) {
    my_list[[i]] <- within(my_list[[i]], {
        b[a==2] <- 1 - b[a==2]
    })
}

If we replace within() with with() in the example above, we get the output from your initial solution, i.e. lapply(X = my_list, FUN = invertfun).

That is, instead of modifying each data frame in the list, the latter replaces each list element with a new vector holding only the inverted values.
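
To make the with()/within() difference concrete, here is a small sketch on the question's data frame x. The names res_with and res_within are introduced only for illustration. with() evaluates the expression and hands back its value (here the value of the assignment, i.e. the inverted numbers), while within() hands back a modified copy of the data frame.

```r
x <- data.frame(a = c(1, 2, 3, 2), b = seq(from = 0.1, to = 1, by = 0.25))

# with() returns the value of the last expression: only the inverted values.
res_with <- with(x, {
    b[a == 2] <- 1 - b[a == 2]
})

# within() returns a copy of the data frame with column b modified.
res_within <- within(x, {
    b[a == 2] <- 1 - b[a == 2]
})

is.data.frame(res_with)    # FALSE: a plain numeric vector
is.data.frame(res_within)  # TRUE: all rows and columns are kept
```

This is also why the function passed to lapply needs to end with the data frame itself rather than with the assignment.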

Upvotes: 0

orville jackson

Reputation: 1878

See Ronak's answer above for a fairly elegant solution using transform() or map(), but for those who are following in my footsteps, my original solution would work if I added a line in the custom function to return the full data frame like so:

invertfun <- function(inputDF){
    inputDF[inputDF['a'] == 2, 'b'] <- 1 - inputDF[inputDF['a'] == 2, 'b']
    return(inputDF)
}

resultList <- lapply(X = my_list, FUN = invertfun)

UPDATE: On further testing, this solution throws "Error in x[[jj]][iseq] <- vjj : replacement has length zero" when the desired 'a' value doesn't exist in one of the data frames. So it's best not to go down this road; use the accepted answer above instead.
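
For anyone who still prefers a named function, here is a minimal sketch of one way to cope with a data frame that has no matching rows, assuming a plain index vector is acceptable. The helper name invert_rows, its target argument, and the extra data frame z are hypothetical and not part of the original post. which() returns integer(0) when nothing matches, so the assignment simply changes nothing.

```r
# Hypothetical helper (not from the original post): invert b where a == target.
invert_rows <- function(inputDF, target = 2) {
    idx <- which(inputDF$a == target)      # integer(0) when no row matches
    inputDF$b[idx] <- 1 - inputDF$b[idx]   # zero-length assignment leaves b as is
    inputDF                                # return the whole data frame
}

x <- data.frame(a = c(1, 2, 3, 2), b = seq(from = 0.1, to = 1, by = 0.25))
z <- data.frame(a = c(1, 3, 3, 1), b = c(0.2, 0.4, 0.6, 0.8))  # no rows with a == 2

resultList <- lapply(list(x = x, z = z), invert_rows)
```

Here resultList$x has the two a == 2 rows inverted, and resultList$z comes back unchanged.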

Upvotes: 0

Ronak Shah

Reputation: 389235

We could use lapply to loop over each data frame in the list and change the b column based on the value in the a column.

my_list[] <- lapply(my_list, function(x) transform(x, b = ifelse(a==2, 1-b, b)))

my_list
#$x
#  a    b
#1 1 0.10
#2 2 0.65
#3 3 0.60
#4 2 0.15

#$y
#  a    b
#1 1 1.00
#2 2 0.25
#3 3 0.50
#4 2 0.75

The same could be done using map() from purrr:

library(purrr)
map(my_list, function(x) transform(x, b = ifelse(a==2, 1-b, b)))
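
Since the question mentions inverting for specific indicators, the same transform()/ifelse() pattern could be extended to several indicator values with %in%. This is only a sketch: the wrapper name invert_for and its indicators argument are hypothetical, and the choice of c(2, 3) is just an example.

```r
x <- data.frame(a = c(1, 2, 3, 2), b = seq(from = 0.1, to = 1, by = 0.25))
y <- data.frame(a = c(1, 2, 3, 2), b = seq(from = 1, to = 0.1, by = -0.25))
my_list <- list(x = x, y = y)

# Hypothetical wrapper: invert b wherever a is one of the given indicators.
invert_for <- function(df, indicators) {
    transform(df, b = ifelse(a %in% indicators, 1 - b, b))
}

# Extra arguments after the function are passed on to it by lapply.
inverted <- lapply(my_list, invert_for, indicators = c(2, 3))
```

Because ifelse() is evaluated for every row, a data frame with no matching indicator is returned unchanged rather than raising an error.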

Upvotes: 1
