Reputation: 1587
Sample data:
df <- tibble(x = c(0.1, 0.2, 0.3, 0.4),
             y = c(0.1, 0.1, 0.2, 0.3),
             z = c(0.1, 0.2, 0.2, 0.2))
df
# A tibble: 4 x 3
x y z
<dbl> <dbl> <dbl>
1 0.1 0.1 0.1
2 0.2 0.1 0.2
3 0.3 0.2 0.2
4 0.4 0.3 0.2
I want to sum across rows, adding up only the "cells" that meet a certain logical condition. In this example, I want to add up, rowwise, only the cells that contain a value equal to or greater than a specified threshold.
Desired Output
threshold <- 0.15
# A tibble: 4 x 4
x y z cond_sum
<dbl> <dbl> <dbl> <dbl>
1 0.1 0.1 0.1 0
2 0.2 0.1 0.2 0.4
3 0.3 0.2 0.2 0.7
4 0.4 0.3 0.2 0.9
Pseudo-code
This is the wrangling idea I have in mind.
df %>%
rowwise() %>%
mutate(cond_sum = sum(c_across(where(~ "cell" >= threshold))))
Tidy solutions appreciated!
Upvotes: 2
Views: 338
Reputation: 886938
An efficient option is to replace the values that are below the threshold with NA and make use of na.rm in rowSums instead of rowwise/c_across:
library(dplyr)
df %>%
mutate(cond_sum = rowSums(replace(., . < threshold, NA), na.rm = TRUE))
Output:
# A tibble: 4 x 4
# x y z cond_sum
# <dbl> <dbl> <dbl> <dbl>
#1 0.1 0.1 0.1 0
#2 0.2 0.1 0.2 0.4
#3 0.3 0.2 0.2 0.7
#4 0.4 0.3 0.2 0.9
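To see why this works, it can help to look at the intermediate step on its own. The following is a minimal sketch, assuming the same df and threshold as in the question; the names masked and cond_sum are only for illustration.

```r
library(tibble)

df <- tibble(x = c(0.1, 0.2, 0.3, 0.4),
             y = c(0.1, 0.1, 0.2, 0.3),
             z = c(0.1, 0.2, 0.2, 0.2))
threshold <- 0.15

# df < threshold is a logical matrix; replace() sets the matching cells to NA
masked <- replace(df, df < threshold, NA)

# rowSums() skips the NA cells when na.rm = TRUE,
# so a row with no qualifying cells sums to 0
cond_sum <- rowSums(masked, na.rm = TRUE)
```

Printing masked shows which cells were dropped before the sum. One thing to keep in mind: any NA already present in the data is skipped in the same way.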
Or with c_across
df %>%
  rowwise() %>%
  mutate(cond_sum = {
    val <- c_across(everything())
    sum(val[val >= threshold])
  }) %>%
  ungroup()
Or base R
df$cond_sum <- rowSums(replace(df, df < threshold, NA), na.rm = TRUE)
Upvotes: 4
Reputation: 39858
An option with dplyr and purrr could be:
library(dplyr)
library(purrr)
df %>%
  mutate(cond_sum = pmap_dbl(across(x:z), ~ sum(c(...)[c(...) >= threshold])))
x y z cond_sum
<dbl> <dbl> <dbl> <dbl>
1 0.1 0.1 0.1 0
2 0.2 0.1 0.2 0.4
3 0.3 0.2 0.2 0.7
4 0.4 0.3 0.2 0.9
Or just using dplyr:
df %>%
  mutate(cond_sum = Reduce(`+`, across(x:z) * (across(x:z) >= threshold)))
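The expression above relies on multiplying each cell by a logical mask. Here is a rough sketch of the same idea in isolation, assuming the question's data and using >= to match "equal to or greater than"; the names m, keep and cond_sum are hypothetical.

```r
library(tibble)

df <- tibble(x = c(0.1, 0.2, 0.3, 0.4),
             y = c(0.1, 0.1, 0.2, 0.3),
             z = c(0.1, 0.2, 0.2, 0.2))
threshold <- 0.15

# Work on a numeric matrix so the arithmetic is plain element-wise
m <- as.matrix(df)

# Logical mask: TRUE where the cell meets the threshold
keep <- m >= threshold

# TRUE/FALSE are coerced to 1/0, so cells below the threshold contribute 0
cond_sum <- rowSums(m * keep)
```

Unlike the replace/na.rm approach, this one propagates missing values: a row containing NA gets an NA sum, because NA * 0 is NA in R.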
Upvotes: 2