Omry Atia

Reputation: 2443

mutate_at only vars which satisfy a certain condition

I have the following data frame:

db <- structure(list(x = c(0, 1, 2, 4, 0, 3, 5, 8), y = c(0, 0, 3, 
4, 8, 9, 1, 5), z = c(3, 2, 0, 1, 4, 6, 9, 8)), row.names = c(NA, 
-8L), class = c("tbl_df", "tbl", "data.frame"))

I would like to create a column containing, for each row, the mean of the other columns' values that are greater than 0 (a different set of columns for each row).

I have tried the following:

db %>% mutate_at(vars(.)>0, rowMeans(.))

What am I doing wrong?

The output in the last column should be 3, 1.5, 2.5, etc.

Upvotes: 2

Views: 143

Answers (1)

Dan Chaltiel

Reputation: 8484

I couldn't find a good option with dplyr, other than this trick: replace every zero or negative value with NA, so that those values are excluded from the mean calculation:

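library(dplyr)

db %>% 
  # replace every value that is not strictly positive with NA
  mutate_all(~ifelse(. > 0, ., NA_real_)) %>%
  mutate(
    # row-wise mean of the remaining values, ignoring the NAs
    positivemean = rowMeans(., na.rm=TRUE)
  )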

Note that this trick is destructive: the original zero and negative values are overwritten with NA in the result.
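One way to keep the original values might be to apply the same NA masking to a temporary matrix rather than to the columns themselves. The following is only a sketch: `positive_row_mean` is a hypothetical helper name (it is not a dplyr function), and `db` is rebuilt here as a plain data frame with the values from the question so the snippet runs on its own.

```r
library(dplyr)

# Same values as the data frame in the question
db <- data.frame(
  x = c(0, 1, 2, 4, 0, 3, 5, 8),
  y = c(0, 0, 3, 4, 8, 9, 1, 5),
  z = c(3, 2, 0, 1, 4, 6, 9, 8)
)

# Hypothetical helper: row-wise mean of the strictly positive entries of a matrix
positive_row_mean <- function(m) {
  m[m <= 0] <- NA            # mask only the local copy, not the caller's data
  rowMeans(m, na.rm = TRUE)  # NAs are skipped when averaging
}

result <- db %>%
  mutate(positivemean = positive_row_mean(cbind(x, y, z)))
```

Because the masking happens on the matrix built by `cbind()`, the columns `x`, `y` and `z` in `result` still hold their original values.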

Alternatively, you can loop over the rows with base R's apply, which leaves the original columns intact and gives the expected output:

db$positivemean = db %>% select(x,y,z) %>% apply(1, function(line){
  mean(line[line>0])
})
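If the per-row function call becomes slow on a large table, the same quantity can probably be computed without an explicit loop by dividing the sum of the positive values by their count. This is a sketch using base R only; the names `cols` and `pos` are illustrative, and `db` is again rebuilt from the question's values.

```r
# Same values as the data frame in the question
db <- data.frame(
  x = c(0, 1, 2, 4, 0, 3, 5, 8),
  y = c(0, 0, 3, 4, 8, 9, 1, 5),
  z = c(3, 2, 0, 1, 4, 6, 9, 8)
)

cols <- as.matrix(db[, c("x", "y", "z")])  # numeric matrix of the three columns
pos  <- cols > 0                            # TRUE where a value is strictly positive

# Sum of the positive values divided by how many there are in each row
db$positivemean <- rowSums(cols * pos) / rowSums(pos)
```

As with the apply version, a row that contains no positive value yields NaN (here from 0 / 0), so such rows may need separate handling.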

Upvotes: 1
