lukeg

Reputation: 1357

group by and filter data management using dplyr

Take a simple dataset

a <- c(1,2,3,4,5,6,7,8)
b <- c(1,2,2,1,2,2,2,2)
c <- c(1,1,1,2,2,2,3,3)
d <- data.frame(a,b,c)

Now I want to filter my data: group by c, then remove every group in which b == 1 never occurs.

Thus the result (e) should look like d but without the bottom two rows.

I have tried using

e <- d %>%
  group_by(c) %>%
  filter(n(b)>1)

The output should keep the rows for groups c = 1 and c = 2 (shown in green in the original image) and drop the rows for c = 3 (shown in red).

Upvotes: 26

Views: 53609

Answers (2)

Pat W.

Reputation: 1831

You can try

df <- d %>% mutate(test = ifelse(b != 1, 0, 1)) %>% group_by(c) %>%
            mutate(test = sum(test)) %>% filter(test != 0) %>% select(-test)

which yields

#  a b c
#1 1 1 1
#2 2 2 1
#3 3 2 1
#4 4 1 2
#5 5 2 2
#6 6 2 2
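
The helper column can also be folded into the filter itself. The following is only a minimal sketch, assuming the same data as in the question and that dplyr is installed; the name `e` is just illustrative, and the final `ungroup()` is an optional addition.

```r
library(dplyr)

# Same data as in the question
d <- data.frame(
  a = c(1, 2, 3, 4, 5, 6, 7, 8),
  b = c(1, 2, 2, 1, 2, 2, 2, 2),
  c = c(1, 1, 1, 2, 2, 2, 3, 3)
)

# Within each group of c, count the rows where b == 1 and
# keep the whole group only if that count is above zero
e <- d %>%
  group_by(c) %>%
  filter(sum(b == 1) > 0) %>%
  ungroup()  # drop the grouping so later steps see a plain table
```

Because the condition is a single TRUE or FALSE per group, it applies to every row of that group, so no temporary `test` column is needed.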

Upvotes: 0

Steven Beaupré

Reputation: 21641

Try

d %>% 
  group_by(c) %>% 
  filter(any(b == 1))

Which gives:

#Source: local data frame [6 x 3]
#Groups: c
#
#  a b c
#1 1 1 1
#2 2 2 1
#3 3 2 1
#4 4 1 2
#5 5 2 2
#6 6 2 2
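
To see why this works, it may help to look at what `any(b == 1)` returns for each group before filtering. The sketch below assumes the same data as in the question; the names `flags` and `e_base` are hypothetical, and the base R line is an alternative for comparison rather than part of the dplyr approach.

```r
library(dplyr)

# Same data as in the question
d <- data.frame(
  a = c(1, 2, 3, 4, 5, 6, 7, 8),
  b = c(1, 2, 2, 1, 2, 2, 2, 2),
  c = c(1, 1, 1, 2, 2, 2, 3, 3)
)

# any(b == 1) is evaluated once per group and yields one TRUE/FALSE;
# filter() then keeps or drops all rows of that group accordingly
flags <- d %>%
  group_by(c) %>%
  summarise(has_one = any(b == 1))

# Base R alternative: keep rows whose c value appears
# among the c values of rows where b == 1
e_base <- d[d$c %in% d$c[d$b == 1], ]
```

Note that the dplyr result stays grouped by c; adding `ungroup()` at the end of the pipeline removes the grouping if it is not needed afterwards.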

Upvotes: 48
