Reputation: 1907
 id | 85 | 291 | 5680 | 41
----+----+-----+------+----
597 |  1 |   1 |    1 |  1
672 |  1 |   0 |    0 |  0
680 |  1 |   1 |    1 |  0
683 |  1 |   1 |    1 |  1
I have a table that looks something like the one above. I want to flag each row where the 1 values account for at least 90% of the row (not including the id column). So for this example, only rows 1 and 4 would be flagged.
intended output:
 id | 85 | 291 | 5680 | 41 | flag
----+----+-----+------+----+------
597 |  1 |   1 |    1 |  1 | yes
672 |  1 |   0 |    0 |  0 | no
680 |  1 |   1 |    1 |  0 | no
683 |  1 |   1 |    1 |  1 | yes
How can I do this in R using tidyverse syntax? I tried a few things with rowSums(), but I can't come up with a solution.
Upvotes: 0
Views: 73
Reputation: 887991
One option in tidyverse is to reshape to 'long' format, compute the mean of the values for each id, and join the result back to the original dataset:
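library(dplyr)
library(tidyr)
df1 %>%
  pivot_longer(cols = -id) %>%                # one row per (id, column) pair
  group_by(id) %>%
  summarise(flag = mean(value) > 0.9) %>%     # share of 1s per id
  right_join(df1, by = "id") %>%              # bring back the original columns
  select(all_of(names(df1)), everything())    # original column order, flag last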
# A tibble: 4 x 6
#      id  `85` `291` `5680`  `41` flag
#   <dbl> <dbl> <dbl>  <dbl> <dbl> <lgl>
# 1   597     1     1      1     1 TRUE
# 2   672     1     0      0     0 FALSE
# 3   680     1     1      1     0 FALSE
# 4   683     1     1      1     1 TRUE
df1 <- structure(list(id = c(597, 672, 680, 683), `85` = c(1, 1, 1,
1), `291` = c(1, 0, 1, 1), `5680` = c(1, 0, 1, 1), `41` = c(1,
0, 0, 1)), class = "data.frame", row.names = c(NA, -4L))
Upvotes: 0
Reputation: 30559
Perhaps try using rowMeans:
df$flag = rowMeans(df[-1]) >= .9
This assumes you have only 1 and 0 for values here.
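Since the question asks for tidyverse syntax and a yes/no flag, the same rowMeans idea could be wrapped in mutate. This is only a sketch: it assumes dplyr 1.1.0 or later for pick() (on older versions, across(-id) should play the same role), and the sample data frame is rebuilt from the table in the question.

```r
library(dplyr)

# Sample data rebuilt from the question; check.names = FALSE keeps the
# numeric column names ("85", "291", ...) exactly as given.
df <- data.frame(
  id     = c(597, 672, 680, 683),
  `85`   = c(1, 1, 1, 1),
  `291`  = c(1, 0, 1, 1),
  `5680` = c(1, 0, 1, 1),
  `41`   = c(1, 0, 0, 1),
  check.names = FALSE
)

# pick(-id) selects every column except id; rowMeans() then gives the
# share of 1s per row, which is compared against the 90% threshold.
flagged <- df %>%
  mutate(flag = if_else(rowMeans(pick(-id)) >= 0.9, "yes", "no"))

print(flagged)
```

Selecting with -id rather than df[-1] means the result does not depend on id being the first column.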
Upvotes: 1
Reputation: 174641
If your "table" is actually a data frame with all columns except the first being columns of 1s and 0s, you could do:
df %>% mutate(flag = apply(df[-1], 1, function(x) sum(x) / length(x) > 0.9))
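Because this relies on every non-id column holding only 1s and 0s, it may be worth checking that assumption before computing the flag. The following base R sketch does so and uses rowSums, which the question mentions trying; the sample data is rebuilt from the question, and the >= comparison is an assumption about how a row at exactly 90% should be treated.

```r
# Sample data rebuilt from the question's table.
df <- data.frame(
  id     = c(597, 672, 680, 683),
  `85`   = c(1, 1, 1, 1),
  `291`  = c(1, 0, 1, 1),
  `5680` = c(1, 0, 1, 1),
  `41`   = c(1, 0, 0, 1),
  check.names = FALSE
)

# Everything except the first (id) column, as a numeric matrix.
values <- as.matrix(df[-1])

# Stop early if any cell is something other than 0 or 1 (including NA).
stopifnot(all(values %in% c(0, 1)))

# Count the 1s in each row and divide by the number of value columns.
share_of_ones <- rowSums(values == 1) / ncol(values)

# Flag rows where 1s make up at least 90% of the values.
df$flag <- share_of_ones >= 0.9

print(df)
```

With > 0.9, a row sitting at exactly 90% would not be flagged; with only four value columns, as here, the two comparisons give the same result.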
Upvotes: 0