Jason Xu

Reputation: 41

Count group IDs where one variable has the same value across the whole group

I have a data frame in R like the following:

  Group.ID status
1        1   open
2        1   open
3        2   open
4        2 closed
5        2 closed
6        3   open

I want to count the number of group IDs for which every status is "open". For example, Group ID 1 has two observations and both have status "open", so it adds one to my count. Group ID 2 does not count, because not all of its statuses are "open".

I can count rows or group IDs under a condition. However, I don't know how to apply the "all status values equal one value within a group" logic.

DATA.

df1 <-
structure(list(Group.ID = c(1, 1, 2, 2, 2, 3), status = structure(c(2L, 
2L, 2L, 1L, 1L, 2L), .Label = c("closed", "open"), class = "factor")), .Names = c("Group.ID", 
"status"), row.names = c(NA, -6L), class = "data.frame")

Upvotes: 4

Views: 67

Answers (2)

J_F

Reputation: 10352

A dplyr solution:

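library(dplyr)
df1 %>% 
  group_by(Group.ID) %>% 
  # one row per group: TRUE only if every status in the group is "open"
  summarise(all_open = all(status == "open")) %>%
  # keep the qualifying groups and count them
  filter(all_open) %>%
  nrow()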

Upvotes: 0

Rui Barradas

Reputation: 76402

Here are two solutions, both using base R: a more complicated one with aggregate and a simpler one with tapply. If you just want the total count of Group.ID values matching your request, I suggest the second solution.

agg <- aggregate(status ~ Group.ID, df1, function(x) as.integer(all(x == "open")))
sum(agg$status)
#[1] 2

sum(tapply(df1$status, df1$Group.ID, FUN = function(x) all(x == "open")))
#[1] 2
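
If it also helps to see which groups qualify, rather than only how many, the same tapply idea can be sketched as below. This is a minimal illustration, not part of the original answers: the names all_open and open_ids are made up for the example, and status is rebuilt as a plain character column instead of a factor.

```r
# Same sample data as in the question (status as character, an assumption
# made here for brevity; the comparison works the same way on a factor).
df1 <- data.frame(
  Group.ID = c(1, 1, 2, 2, 2, 3),
  status = c("open", "open", "open", "closed", "closed", "open")
)

# One logical per group: TRUE only if every row in that group is "open".
all_open <- tapply(df1$status == "open", df1$Group.ID, all)

# IDs of the qualifying groups (as character labels), and their count.
open_ids <- names(all_open)[all_open]
open_ids
#[1] "1" "3"
length(open_ids)
#[1] 2
```

Keeping the per-group logical vector around makes it easy to check why a group was excluded before collapsing everything into a single count.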

Upvotes: 1
