Reputation: 2443
I have the following data frame:
db <- structure(list(x = c(0, 1, 2, 4, 0, 3, 5, 8), y = c(0, 0, 3,
4, 8, 9, 1, 5), z = c(3, 2, 0, 1, 4, 6, 9, 8)), row.names = c(NA,
-8L), class = c("tbl_df", "tbl", "data.frame"))
I would like to create a column with the mean of the values in the other columns that are greater than 0 (a different set of columns for each row).
I have tried the following:
db %>% mutate_at(vars(.)>0, rowMeans(.))
What am I doing wrong?
The output in the last column should be 3, 1.5, 2.5, etc.
Upvotes: 2
Views: 143
Reputation: 8484
I couldn't find any good option with dplyr, except this trick, which replaces all negative or zero values with NA so that they are excluded from the mean calculation:
db %>%
  mutate_all(~ifelse(. > 0, ., NA_real_)) %>%  # non-positive values become NA
  mutate(
    positivemean = rowMeans(., na.rm = TRUE)   # NA entries are skipped
  )
Note that this trick is destructive: the original zero and negative values in the columns are overwritten with NA.
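If the original values need to be kept, the same NA masking can be done on a temporary copy of the columns inside a single mutate call. The following is only a minimal sketch, assuming dplyr 1.0.0 or later (for across()); the name result is just illustrative, and the data frame is rebuilt here as a plain data.frame so the snippet runs on its own:

```r
library(dplyr)

# Same data as in the question, rebuilt as a plain data.frame
db <- data.frame(
  x = c(0, 1, 2, 4, 0, 3, 5, 8),
  y = c(0, 0, 3, 4, 8, 9, 1, 5),
  z = c(3, 2, 0, 1, 4, 6, 9, 8)
)

# across() returns a masked copy of x, y, z (non-positive values -> NA);
# rowMeans() then averages what is left in each row.
# The original columns are not modified.
result <- db %>%
  mutate(
    positivemean = rowMeans(
      across(c(x, y, z), ~ ifelse(.x > 0, .x, NA_real_)),
      na.rm = TRUE
    )
  )

print(result)
```

The first three values of positivemean are 3, 1.5 and 2.5, as requested in the question, and x, y and z are left untouched.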
Alternatively, you can leave the computation to base R and use an apply loop over the rows to get the expected output:
db$positivemean = db %>% select(x,y,z) %>% apply(1, function(line){
  mean(line[line>0])  # mean of the strictly positive values in this row
})
Upvotes: 1