Gabriel Hernandez

Reputation: 85

Running a conditional through several dataframes stored in a list in R

I have a list of dataframes with the following format that I want to run a conditional through:

IDn = c("ChrM", "ChrM" ,"ChrM" ,"ChrM" ,"ChrM")   
posn = c(2,5,7,8,9)
met = c(2,0,4,1,0)
nmet = c(2,1,0,2,0)
bd = c(3,3,0,8,10)
dfp = data.frame(IDn,posn,met,nmet,bd)

   IDn posn met nmet bd
1 ChrM    2   2    2  3
2 ChrM    5   0    1  3
3 ChrM    7   4    0  0
4 ChrM    8   1    2  8
5 ChrM    9   0    0 10

For a single dataframe, the conditional would be:

dfp$crit <- (dfp$met + dfp$nmet >= 4) & (dfp$met >= dfp$bd)

The problem is that every dataframe in the list has a different name, stored in names2, so I need something along the lines of:

names2[crit] <- as.numeric((names2[met]+names2[nmet]>=4) & (names2[met]>=names2[bd]))

Here crit is a new column that stores a 0 or 1 value. I tried to run this with lapply, but have had no luck so far. Any advice?

Upvotes: 2

Views: 49

Answers (2)

akrun

Reputation: 886938

We can pass transform directly to lapply, without any anonymous function:

lapply(dflist, transform, crit = (met + nmet) >= 4 & (met >= bd))
#$d1
#   IDn posn met nmet bd  crit
#1 ChrM    2   2    2  3 FALSE
#2 ChrM    5   0    1  3 FALSE
#3 ChrM    7   4    0  0  TRUE
#4 ChrM    8   1    2  8 FALSE
#5 ChrM    9   0    0 10 FALSE

#$d2
#   IDn posn met nmet bd  crit
#1 ChrM    2   2    2  3 FALSE
#2 ChrM    5   0    1  3 FALSE
#3 ChrM    7   4    0  0  TRUE
#4 ChrM    8   1    2  8 FALSE
#5 ChrM    9   0    0 10 FALSE

Another option, using dplyr/purrr, would be:

library(dplyr)
library(purrr)
dflist %>%
  map(~ mutate(., crit = (met + nmet) >= 4 & (met >= bd)))
#$d1
#   IDn posn met nmet bd  crit
#1 ChrM    2   2    2  3 FALSE
#2 ChrM    5   0    1  3 FALSE
#3 ChrM    7   4    0  0  TRUE
#4 ChrM    8   1    2  8 FALSE
#5 ChrM    9   0    0 10 FALSE

#$d2
#   IDn posn met nmet bd  crit
#1 ChrM    2   2    2  3 FALSE
#2 ChrM    5   0    1  3 FALSE
#3 ChrM    7   4    0  0  TRUE
#4 ChrM    8   1    2  8 FALSE
#5 ChrM    9   0    0 10 FALSE

data

dflist <- list(d1=dfp, d2=dfp)
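
The question asks for a 0 or 1 value rather than TRUE/FALSE. A minimal sketch of one way to get that with the transform approach, assuming the same example data as in the question, is to wrap the condition in as.integer (the name dflist01 is only illustrative):

```r
# Rebuild the example data from the question
dfp <- data.frame(
  IDn  = c("ChrM", "ChrM", "ChrM", "ChrM", "ChrM"),
  posn = c(2, 5, 7, 8, 9),
  met  = c(2, 0, 4, 1, 0),
  nmet = c(2, 1, 0, 2, 0),
  bd   = c(3, 3, 0, 8, 10)
)
dflist <- list(d1 = dfp, d2 = dfp)

# as.integer() converts the logical result: TRUE becomes 1, FALSE becomes 0
dflist01 <- lapply(dflist, transform,
                   crit = as.integer((met + nmet) >= 4 & met >= bd))

dflist01$d1$crit
# 0 0 1 0 0
```

Keeping the column logical is usually fine for filtering; the conversion only matters if later code expects a numeric flag.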

Upvotes: 0

Jaap

Reputation: 83215

Not sure what is going wrong with your lapply code (it is always good to include the code you tried in your question), but the following should work:

# creating a list
dflist <- list(d1=dfp, d2=dfp)

# updating the dataframes in your list
dflist <- lapply(dflist, function(x) {x$crit <- (x$met + x$nmet >= 4) & (x$met>=x$bd); x})

# or:
dflist <- lapply(dflist, function(x) {cbind(x, crit = (x$met + x$nmet >= 4) & (x$met>=x$bd))})

which results in the following list:

> dflist
$d1
   IDn posn met nmet bd  crit
1 ChrM    2   2    2  3 FALSE
2 ChrM    5   0    1  3 FALSE
3 ChrM    7   4    0  0  TRUE
4 ChrM    8   1    2  8 FALSE
5 ChrM    9   0    0 10 FALSE

$d2
   IDn posn met nmet bd  crit
1 ChrM    2   2    2  3 FALSE
2 ChrM    5   0    1  3 FALSE
3 ChrM    7   4    0  0  TRUE
4 ChrM    8   1    2  8 FALSE
5 ChrM    9   0    0 10 FALSE
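
If the dataframes actually exist as separate objects and names2 is a character vector holding their names (an assumption, since the question does not show how names2 is built), a rough sketch of collecting them into a named list first could look like this. The objects dfA and dfB and the helper add_crit are hypothetical and only serve the illustration:

```r
# Hypothetical setup: two separate data frames, with their names kept in names2
dfA <- data.frame(met = c(2, 4), nmet = c(2, 0), bd = c(3, 0))
dfB <- data.frame(met = c(0, 5), nmet = c(1, 1), bd = c(3, 2))
names2 <- c("dfA", "dfB")

# mget() looks up each name and returns a list named after the objects
dflist <- mget(names2, envir = environment())

# Hypothetical helper: add crit as a 0/1 column, as requested in the question
add_crit <- function(x) {
  x$crit <- as.integer((x$met + x$nmet >= 4) & (x$met >= x$bd))
  x
}

dflist <- lapply(dflist, add_crit)
names(dflist)
# "dfA" "dfB"
```

The list keeps the original names, so each updated dataframe can still be reached by name, for example dflist[["dfA"]] or dflist[[names2[1]]].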

In response to your comment:

As you are working with data.table, you could also use:

dflist <- lapply(dflist, function(x) x[, crit := (met + nmet >= 4) & (met>=bd)])

Upvotes: 3
