Eisen

Reputation: 1907

Threshold Flag creation in R

id | 85| 291| 5680| 41
---+---+----+-----+----
597| 1 |  1 |  1  |  1
672| 1 |  0 |  0  |  0
680| 1 |  1 |  1  |  0
683| 1 |  1 |  1  |  1

I have a table that looks something like the one above. I want to flag each row where the 1 values account for at least 90% of the row (not including the id column). So for this example, only rows 1 and 4 would be flagged.

Intended output:

id | 85| 291| 5680| 41 | flag |
---+---+----+-----+----+------+
597| 1 |  1 |  1  |  1 |  yes |
672| 1 |  0 |  0  |  0 |  no  |
680| 1 |  1 |  1  |  0 |  no  |
683| 1 |  1 |  1  |  1 |  yes |

How can I do this in R using tidyverse syntax? I tried a few things with rowSums(), but I couldn't come up with a solution.

Upvotes: 0

Views: 73

Answers (3)

akrun

Reputation: 887991

One tidyverse option is to reshape the data to 'long' format, compute the mean of the values for each id, and join the result back to the original dataset:

library(dplyr)
library(tidyr)
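df1 %>% 
  pivot_longer(cols = -id) %>%              # one row per id/column pair
  group_by(id) %>% 
  summarise(flag = mean(value) >= 0.9) %>%  # share of 1s per id
  right_join(df1, by = "id") %>%            # explicit key avoids the join message
  select(all_of(names(df1)), everything())  # original columns first, flag last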
# A tibble: 4 x 6
#     id  `85` `291` `5680`  `41` flag 
#  <dbl> <dbl> <dbl>  <dbl> <dbl> <lgl>
#1   597     1     1      1     1 TRUE 
#2   672     1     0      0     0 FALSE
#3   680     1     1      1     0 FALSE
#4   683     1     1      1     1 TRUE 

data

df1 <- structure(list(id = c(597, 672, 680, 683), `85` = c(1, 1, 1, 
1), `291` = c(1, 0, 1, 1), `5680` = c(1, 0, 1, 1), `41` = c(1, 
0, 0, 1)), class = "data.frame", row.names = c(NA, -4L))

Upvotes: 0

Ben

Reputation: 30559

Perhaps try using rowMeans:

df$flag = rowMeans(df[-1]) >= .9

This assumes you have only 1 and 0 for values here.
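
Since the question asks for tidyverse syntax and a yes/no flag, here is a minimal sketch of the same rowMeans idea inside mutate(). It assumes dplyr is installed and reuses the sample data from the question; the names df1 and flagged are only illustrative, and the comparison with 1 is an added safeguard in case a column holds something other than 0 and 1.

```r
library(dplyr)

# Sample data from the question; check.names = FALSE keeps the numeric column names
df1 <- data.frame(
  id     = c(597, 672, 680, 683),
  `85`   = c(1, 1, 1, 1),
  `291`  = c(1, 0, 1, 1),
  `5680` = c(1, 0, 1, 1),
  `41`   = c(1, 0, 0, 1),
  check.names = FALSE
)

# Share of cells equal to 1 in each row, ignoring the id column
share_of_ones <- rowMeans(select(df1, -id) == 1)

# Label rows whose share of 1s is at least 90%
flagged <- df1 %>%
  mutate(flag = if_else(share_of_ones >= 0.9, "yes", "no"))

print(flagged)
```

Computing the share outside the pipe keeps the mutate() call short; the threshold 0.9 can be changed in one place.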

Upvotes: 1

Allan Cameron

Reputation: 174641

If your "table" is actually a data frame with all columns except the first being columns of 1s and 0s, you could do:

df %>% mutate(flag = apply(df[-1], 1, function(x) sum(x) / length(x) >= 0.9))

Upvotes: 0
