user42485

Reputation: 811

Filtering out nested data frames by number of observations

Following from: Use filter() (and other dplyr functions) inside nested data frames with map()

I want to nest on multiple columns, and then filter out rows by the number of items that were nested into that row. For example,

library(tidyverse)

df <- tibble(
  a = sample(c(rep(c('x', 'y'), 4), 'w', 'z')),  # 'x' and 'y' four times each, 'w' and 'z' once
  b = sample(1:10),
  c = sample(91:100)
)

I want to nest on column a, as in:

df_nest <- df %>%
  nest(data = -a)   # written as nest(-a) in tidyr < 1.0.0

Then I want to filter out the rows that have only one observation in the data column (here, the rows where a is w or z). How can I do that?

Upvotes: 1

Views: 1028

Answers (1)

akuiper

Reputation: 214977

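You can use map_int (or map) on the data column to get the number of rows in each nested tibble, and build the filter condition from that. The condition below selects the single-observation rows; change == 1 to > 1 to drop them instead:

df %>% 
    nest(data = -a) %>%                # nest(-a) in tidyr < 1.0.0
    filter(map_int(data, nrow) == 1)   # map_int returns one integer row count per nested tibble
#   filter(map(data, nrow) == 1)        works as well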

# A tibble: 2 x 2
#      a             data
#  <chr>           <list>
#1     w <tibble [1 x 2]>
#2     z <tibble [1 x 2]>
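
Since the question asks to remove the single-observation groups rather than select them, here is a minimal sketch of the inverted condition. It assumes tidyr 1.0.0 or later for the nest(data = -a) syntax, and it uses a fixed, unshuffled copy of the question's data so the result is reproducible; the name df_multi is only illustrative.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(tibble)

# Same groups as in the question, without sample(), so the result is deterministic:
# 'x' and 'y' occur four times each, 'w' and 'z' once each.
df <- tibble(
  a = c(rep(c("x", "y"), 4), "w", "z"),
  b = 1:10,
  c = 91:100
)

df_multi <- df %>%
  nest(data = -a) %>%                # one row per value of a, remaining columns in `data`
  filter(map_int(data, nrow) > 1)    # keep only groups with more than one observation

df_multi$a
# "x" "y"
```

Any other threshold works the same way, for example map_int(data, nrow) >= 3 to keep only groups with at least three observations.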

Upvotes: 2
