Ironclad

Reputation: 37

Filtering using dplyr package

My dataset is set up as follows:

User   Day
 10      2
 1       3
 15      1
 3       1
 1       2
 15      3
 1       1

I'm trying to find the users that are present on all three days. I'm using the code below with the dplyr package:

MAU%>%
  group_by(User)%>%
  filter(c(1,2,3) %in% Day)   

  # but get this error message: 
  # Error in filter_impl(.data, quo) : Result must have length 12, not 3

Any idea how to fix this?

Upvotes: 2

Views: 88

Answers (2)

akrun

Reputation: 887851

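filter() expects, for each group, a logical vector of length 1 or of the same length as the group, whereas 1:3 %in% Day always has length 3. We can wrap it in all() to get a single TRUE/FALSE per group: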

library(dplyr)
MAU %>% 
    group_by(User)%>%
    filter(all(1:3 %in% Day))
# A tibble: 3 x 2
# Groups:   User [1]
#   User   Day
#  <int> <int>
#1     1     3
#2     1     2
#3     1     1
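
To see why all() is needed, here is a minimal base R sketch of what the condition evaluates to inside two of the groups. The vectors day_user1 and day_user15 are illustrative names, not part of the original code; they simply hold the Day values of Users 1 and 15 from the question's data.

```r
# Day values for two users, copied by hand from the question's data
# (day_user1 and day_user15 are hypothetical helper names)
day_user1  <- c(3L, 2L, 1L)  # User 1 appears on days 3, 2 and 1
day_user15 <- c(1L, 3L)      # User 15 appears on days 1 and 3

# %in% returns one logical per element of its left-hand side,
# so the result has length 3 regardless of the group size
in_user15 <- 1:3 %in% day_user15
print(in_user15)                # TRUE FALSE TRUE

# all() collapses that vector to the single value filter() can use
print(all(1:3 %in% day_user1))  # TRUE: User 1 is kept
print(all(in_user15))           # FALSE: User 15 is dropped
```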

data

MAU <- structure(list(User = c(10L, 1L, 15L, 3L, 1L, 15L, 1L), Day = c(2L, 
 3L, 1L, 1L, 2L, 3L, 1L)), class = "data.frame", row.names = c(NA, 
 -7L))

Upvotes: 2

G. Grothendieck

Reputation: 270195

Using the input shown reproducibly in the Note at the end, remove duplicate rows, count the rows for each User, and keep the Users that have 3 of them (this assumes Day only takes the values 1, 2 and 3):

library(dplyr)

DF %>%
  distinct %>%
  count(User) %>%
  filter(n == 3) %>%
  select(User)

giving:

# A tibble: 1 x 1
   User
  <int>
1     1
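
As a possible variant (a sketch, not part of the original answer), the distinct and count steps can be folded into n_distinct() inside summarise(). It makes the same assumption that Day only takes the values 1, 2 and 3. The names n_days and users_all_days are illustrative, and the data frame is rebuilt inline so the block runs on its own.

```r
library(dplyr)

# Same data as in the question, rebuilt here so the example is self-contained
DF <- data.frame(
  User = c(10L, 1L, 15L, 3L, 1L, 15L, 1L),
  Day  = c(2L, 3L, 1L, 1L, 2L, 3L, 1L)
)

# Count the distinct days per user and keep the users seen on 3 of them
# (n_days and users_all_days are hypothetical names)
users_all_days <- DF %>%
  group_by(User) %>%
  summarise(n_days = n_distinct(Day)) %>%
  filter(n_days == 3) %>%
  pull(User)

print(users_all_days)  # 1
```

Because n_distinct() ignores repeated Day values within a user, no separate distinct step is needed before counting.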

Note

Lines <- "
User   Day
 10      2
 1       3
 15      1
 3       1
 1       2
 15      3
 1       1"
DF <- read.table(text = Lines, header = TRUE)

Upvotes: 3
