Lennu

Reputation: 1

Is there a function to count the number of rows that meet a condition?

Example of my dataframe:

ID <- c("a","b","c","c","d","d","e","f","g","h")
status <- c(0,1,0,1,0,1,0,0,0,1)
DF <- data.frame(ID, status)
> DF
   ID status
1   a      0
2   b      1
3   c      0
4   c      1
5   d      0
6   d      1
7   e      0
8   f      0
9   g      0
10  h      1

I need to know how many IDs have status 1 and only status 1. With the example DF, the right answer would be 2 (IDs b and h).

Unfortunately, I haven't found any function for this. Should the answer be calculated with a for loop, or is there a simpler solution?

Thanks!

Upvotes: 0

Views: 46

Answers (2)

ThomasIsCoding

Reputation: 101149

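You can use aggregate + mean + subset. Since status is either 0 or 1, the mean per ID equals 1 only when every row for that ID has status 1.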

> subset(aggregate(. ~ ID, DF, mean), status == 1)$ID
[1] "b" "h"

or a more descriptive version (thanks to @Konrad Rudolph for the comments)

> subset(aggregate(. ~ ID, DF, function(x) all(x == 1)), status)$ID
[1] "b" "h"
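
Both calls above return the matching IDs rather than the count the question asks for. A minimal base R sketch of the count follows, assuming status only ever holds 0 or 1 as in the example; the names `all_one` and `n_ids` are introduced here for illustration and are not part of the original answer.

```r
ID <- c("a","b","c","c","d","d","e","f","g","h")
status <- c(0,1,0,1,0,1,0,0,0,1)
DF <- data.frame(ID, status)

# One logical per ID: TRUE when every row for that ID has status 1
all_one <- tapply(DF$status == 1, DF$ID, all)

# Summing the logicals counts the IDs that qualify
n_ids <- sum(all_one)
n_ids
#> [1] 2
```

Because the test is done per ID before counting, an ID that appears several times with status 1 is still counted once.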

Upvotes: 1

Yuriy Saraykin

Reputation: 8880

library(tidyverse)
ID <- c("a","b","c","c","d","d","e","f","g","h")
status <- c(0,1,0,1,0,1,0,0,0,1)
DF <- data.frame(ID, status)
DF
#>    ID status
#> 1   a      0
#> 2   b      1
#> 3   c      0
#> 4   c      1
#> 5   d      0
#> 6   d      1
#> 7   e      0
#> 8   f      0
#> 9   g      0
#> 10  h      1

# status_1
status_1 <- DF %>%
  group_by(ID) %>%
  filter(all(status == 1)) %>%
  ungroup()

status_1
#> # A tibble: 2 x 2
#>   ID    status
#>   <chr>  <dbl>
#> 1 b          1
#> 2 h          1

# result: count distinct IDs (sum(status) would overcount an ID with several status-1 rows)
n_distinct(status_1$ID)
#> [1] 2

Created on 2021-06-18 by the reprex package (v2.0.0)

data.table

library(data.table)
setDT(DF)[, .SD[all(status == 1)], by = "ID"]
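
The data.table call above returns the matching rows, not the count. One possible sketch of the count collapses each ID to a single logical and sums it. The names `DT`, `ok` and `n_ids` are chosen here for illustration, and `as.data.table` is used instead of `setDT` so that `DF` is left unmodified.

```r
library(data.table)

ID <- c("a","b","c","c","d","d","e","f","g","h")
status <- c(0,1,0,1,0,1,0,0,0,1)
DF <- data.frame(ID, status)

# Copy into a data.table so the original data.frame is untouched
DT <- as.data.table(DF)

# One row per ID with a logical flag, then count the TRUE flags
n_ids <- DT[, .(ok = all(status == 1)), by = ID][, sum(ok)]
n_ids
#> [1] 2
```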

Upvotes: 2
