Reputation: 661

apply() function to only certain columns

I have a data frame that looks like the following (with reproducible code):

# create the table
name <- c("Mary", "John", "Peter")
id1 <- c(50, 30, 25)
id2 <- c(8, 12, 90)
id3 <- c(14, 17, 34)
id4 <- c(9, 67, 89)
id5 <- c(20, 21, 22)
beep <- c(15, 20, 23)

# combine the df
df <- data.frame(name, id1, id2, id3, id4, id5, beep)

# show df
df
   name id1 id2 id3 id4 id5 beep
1  Mary  50   8  14   9  20   15
2  John  30  12  17  67  21   20
3 Peter  25  90  34  89  22   23

I want to recode each cell in the "id#" columns to 1 if it is less than the "beep" value in the same row, and to 0 otherwise. I've tried the following:

apply(df, 2, function(x) {
 ifelse(x < df$beep, 1, 0)})

This produces the following matrix:

     name id1 id2 id3 id4 id5 beep
[1,]    0   0   1   1   1   0    0
[2,]    0   0   1   1   0   0    0
[3,]    0   0   0   0   0   1    0

The issue with the above output is that I don't want the "name" or "beep" columns to change. Any suggestions?

Upvotes: 1

Answers (3)

G. Grothendieck

Reputation: 270195

1) mutate/across: With dplyr one can use mutate/across. The first argument of across defines which columns to use and the second is the function to apply to each such column. The right-hand side of the formula is the body of the function and the dot is its argument. We use unary + to convert the logical result to numeric.

library(dplyr)

df %>% mutate(across(starts_with("id"), ~ +(. < beep)))
##    name id1 id2 id3 id4 id5 beep
## 1  Mary   0   1   1   1   0   15
## 2  John   0   1   1   0   0   20
## 3 Peter   0   0   0   0   1   23

2) modify_if: The purrr package has a function, modify_if, which modifies only the columns satisfying the condition given by the second argument. It supports the same shorthand for functions as in (1).

library(purrr)

modify_if(df, startsWith(names(df), "id"), ~ +(. < df$beep))

##    name id1 id2 id3 id4 id5 beep
## 1  Mary   0   1   1   1   0   15
## 2  John   0   1   1   0   0   20
## 3 Peter   0   0   0   0   1   23

3) replace: This is basically the same as another answer but uses grep and replace instead. No packages are used.

ix <- grep("^id", names(df))
replace(df, ix, +(df[ix] < df$beep))
##    name id1 id2 id3 id4 id5 beep
## 1  Mary   0   1   1   1   0   15
## 2  John   0   1   1   0   0   20
## 3 Peter   0   0   0   0   1   23

4) modifyList: modifyList replaces the columns in the first argument with the columns in the second argument, matching them by name. Both arguments must be lists or data frames (not matrices).

ix <- grep("^id", names(df))
modifyList(df, +as.data.frame(df[ix] < df$beep))
##    name id1 id2 id3 id4 id5 beep
## 1  Mary   0   1   1   1   0   15
## 2  John   0   1   1   0   0   20
## 3 Peter   0   0   0   0   1   23

(This used to be in the lattice package but now it is in utils which is part of base R.)
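
To make the name matching in (4) concrete, here is a minimal sketch on a plain list. The names base_list and updated are hypothetical and only serve the illustration; the call itself is the ordinary utils::modifyList.

```r
# Hypothetical toy list, not part of the question's data
base_list <- list(a = 1, b = 2, c = 3)

# Only the element whose name appears in the second argument is replaced;
# the other elements are carried over unchanged
updated <- utils::modifyList(base_list, list(b = 20))
```

The same mechanism is what lets the data frame version above swap in only the id columns while leaving name and beep as they were.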

Upvotes: 2

Ronak Shah

Reputation: 389235

You might have NAs in your data, in which case a comparison with < returns NA. You can add a check with is.na so that those cells are recoded to 0 instead.

cols <- grep('^id', names(df))  # anchored so only names starting with "id" match
df[cols] <- +(df[cols] < df$beep & !is.na(df[cols]))
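
As a rough illustration of the difference, here is a sketch on a small made-up data frame with one missing value. df_na, plain and guarded are hypothetical names, and the data are a cut-down variant of the question's table, not the original.

```r
# Hypothetical variant of the question's data with one NA in id2
df_na <- data.frame(
  name = c("Mary", "John", "Peter"),
  id1  = c(50, 30, 25),
  id2  = c(8, NA, 90),
  beep = c(15, 20, 23)
)

cols <- grep("^id", names(df_na))

# Without the is.na() check the missing cell stays NA
plain <- +(df_na[cols] < df_na$beep)

# With the check, NA & FALSE evaluates to FALSE, so the cell becomes 0
guarded <- +(df_na[cols] < df_na$beep & !is.na(df_na[cols]))
```

Whether a missing value should count as 0 or stay NA depends on your analysis; the check simply makes that choice explicit.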

Upvotes: 1

ThomasIsCoding

Reputation: 102625

You don't need apply; you can try the code below:

df[startsWith(names(df), "id")] <- +(df[startsWith(names(df), "id")] < df$beep)

which gives

> df
   name id1 id2 id3 id4 id5 beep
1  Mary   0   1   1   1   0   15
2  John   0   1   1   0   0   20
3 Peter   0   0   0   0   1   23

If you really want to use apply, below is one option

idx <- grep("^id", names(df))
df[idx] <- apply(df[idx], 2, function(x) ifelse(x < df$beep, 1, 0))
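
As a side note on why subsetting first matters: apply converts its input with as.matrix, and a data frame that contains a character column becomes a character matrix, so the comparison is then made on strings rather than numbers. A minimal sketch, assuming a cut-down copy of the question's data (df_demo, m_all and m_ids are hypothetical names):

```r
# Hypothetical cut-down copy of the question's data
df_demo <- data.frame(
  name = c("Mary", "John", "Peter"),
  id1  = c(50, 30, 25),
  id2  = c(8, 12, 90),
  beep = c(15, 20, 23)
)

# The whole data frame includes a character column,
# so the resulting matrix is character throughout
m_all <- as.matrix(df_demo)
typeof(m_all)

# Only the numeric id columns: the matrix stays numeric
m_ids <- as.matrix(df_demo[grep("^id", names(df_demo))])
typeof(m_ids)
```

Passing only the id columns to apply therefore keeps the comparison numeric and leaves name and beep untouched.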

Upvotes: 2
