Simon

Reputation: 1111

dplyr group_by() and slice() within group

Suppose you have a tibble:

library(tidyverse)

x <- tibble(
  name   = c("alice", "bob", "mary", "mary", "alice", "alice"),
  colour = c(NA, "green", "orange", "orange", NA, NA)
) %>%
  group_by(name)



# A tibble: 6 x 2
# Groups:   name [3]
  name  colour
  <chr> <chr> 
1 alice NA    
2 bob   green 
3 mary  orange
4 mary  orange
5 alice NA    
6 alice NA  

How can you group by name and return only one row for a name when all colours within that group are NA?

Expected output:

# A tibble: 4 x 2
  name  colour
  <chr> <chr> 
1 alice NA    
2 bob   green 
3 mary  orange
4 mary  orange

Upvotes: 4

Views: 1808

Answers (3)

akrun

Reputation: 887108

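Use slice() after group_by(): keep every row of a group unless all of its colours are NA, in which case keep only the first row.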

library(dplyr)
x %>%
  group_by(name) %>%
  slice(if (!all(is.na(colour))) row_number() else 1) %>%
  ungroup()

Output:

# A tibble: 4 x 2
#  name  colour
#  <chr> <chr> 
#1 alice <NA>  
#2 bob   green 
#3 mary  orange
#4 mary  orange
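
To illustrate how the if/else inside slice() behaves per group, here is a minimal sketch on made-up data (x2 and res_slice are hypothetical names, not part of the question). A group that mixes NA with a real colour keeps all of its rows, while an all-NA group is reduced to its first row.

```r
library(dplyr)

# Hypothetical data: "alice" mixes NA with a real colour, "carol" is all NA
x2 <- tibble(
  name   = c("alice", "alice", "carol", "carol"),
  colour = c(NA, "red", NA, NA)
)

res_slice <- x2 %>%
  group_by(name) %>%
  # row_number() selects every row of the group; 1 selects only the first
  slice(if (!all(is.na(colour))) row_number() else 1) %>%
  ungroup()

res_slice
```

Here alice keeps both rows and carol keeps one, so the result has three rows.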

Upvotes: 2

Nicolás Velasquez

Reputation: 5898

You could filter the data into two sets (rows with a colour and rows without one), deduplicate the NA rows, and bind the two sets back together. Note that this will not work properly in a pipe that starts with x %>% rbind, so Cettt's answer might be preferable.

rbind(
  filter(.data = x, !is.na(colour)),
  unique(filter(.data = x, is.na(colour)))
)
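
A hedged sketch of the same split-and-bind idea, swapping in dplyr's bind_rows() and distinct() for rbind() and unique() (a substitution, not this answer's code) and working on an ungrouped copy of the data (x_flat and res_bind are hypothetical names). Two caveats follow from the approach itself: the deduplicated NA rows are appended last, so the row order differs from the expected output, and NA rows are deduplicated per row rather than per group, so a group that mixes NA with a real colour would also lose its repeated NA rows.

```r
library(dplyr)

# Ungrouped copy of the question's data
x_flat <- tibble(
  name   = c("alice", "bob", "mary", "mary", "alice", "alice"),
  colour = c(NA, "green", "orange", "orange", NA, NA)
)

res_bind <- bind_rows(
  filter(x_flat, !is.na(colour)),           # rows that have a colour, kept as-is
  distinct(filter(x_flat, is.na(colour)))   # NA rows, duplicates removed
)

res_bind
```

The result has four rows, with the single alice row at the bottom instead of the top.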

Upvotes: 1

Cettt

Reputation: 11981

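You can use all() inside filter() to check whether all colours in a group are NA, and then drop every row of such a group except the first. This relies on x still being grouped by name, as in the question: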

x %>%
   filter(!(all(is.na(colour)) & 1:n() != 1))
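
Because the filter above depends on the grouping set up in the question, a self-contained sketch with an explicit group_by() may be easier to reuse. It uses row_number() in place of 1:n(), which is equivalent here; x_ungrouped and res_filter are hypothetical names.

```r
library(dplyr)

# The question's data, without the grouping applied yet
x_ungrouped <- tibble(
  name   = c("alice", "bob", "mary", "mary", "alice", "alice"),
  colour = c(NA, "green", "orange", "orange", NA, NA)
)

res_filter <- x_ungrouped %>%
  group_by(name) %>%
  # drop a row only if its group is entirely NA and it is not the group's first row
  filter(!(all(is.na(colour)) & row_number() != 1)) %>%
  ungroup()

res_filter
```

Since filter() preserves the original row order, the rows come back as alice, bob, mary, mary, matching the expected output.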

Upvotes: 3
