Reputation: 411
Let's say I have this data frame:
df <- data.frame(
a = c(NA,6,6,8),
x= c(1,2,2,4),
y = c(NA,2,NA,NA),
z = c("apple", 2, "2", NA),
d = c(NA, 5, 5, 5),stringsAsFactors = FALSE)
Rows 2 and 3 are duplicates except that row 3 has NA in column y. I want to delete the duplicate row with the NA value so that the result looks like this:
df <- data.frame(
a = c(NA,6,8),
x= c(1,2,4),
y = c(NA,2,NA),
z = c("apple", 2, NA),
d = c(NA, 5, 5),stringsAsFactors = FALSE)
I tried this but it doesn't work:
df2 <- df %>% group_by (a,x,z,d) %>% filter(y == max(y))
Any suggestions?
Upvotes: 0
Views: 533
Reputation: 389235
Fill NA values with the previous non-NA value and select unique rows with distinct().
library(dplyr)
library(tidyr)
df %>% fill(everything()) %>% distinct()
# a x y z d
#1 NA 1 NA apple NA
#2 6 2 2 2 5
#3 8 4 2 2 5
# Note: fill() also overwrites the NA values of y and z in the last row.
Upvotes: 0
Reputation: 79338
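library(dplyr)
library(tidyr)
# fill() is used only to detect duplicates, so the rows that are kept retain their original NA values.
df %>%
  arrange_all() %>%
  filter(!duplicated(fill(., everything())))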
a x y z d
1 6 2 2 2 5
2 8 4 NA <NA> 5
3 NA 1 NA apple NA
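To make the role of fill() explicit, the pipeline above can be split into separate steps. This is only a sketch: the names sorted, filled, keep and res are introduced here for illustration and are not part of the answer.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  a = c(NA, 6, 6, 8),
  x = c(1, 2, 2, 4),
  y = c(NA, 2, NA, NA),
  z = c("apple", 2, "2", NA),
  d = c(NA, 5, 5, 5),
  stringsAsFactors = FALSE
)

# Step 1: sort so that rows differing only by an NA end up next to each other.
sorted <- df %>% arrange_all()

# Step 2: build a filled copy that is used purely for comparison.
filled <- sorted %>% fill(everything())

# Step 3: flag rows that repeat an earlier row of the filled copy.
keep <- !duplicated(filled)

# Step 4: subset the sorted (unfilled) data, so original NA values survive.
res <- sorted[keep, ]
```

Because the subset is taken from sorted rather than filled, none of the values in the kept rows are changed by fill().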
Upvotes: 1
Reputation: 99
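# Sorting by y last puts the non-NA y first within each group (arrange() sorts NA last), so distinct() keeps that row.
df %>% arrange(a, x, z, d, y) %>% distinct(a, x, z, d, .keep_all = TRUE)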
a x y z d
1 6 2 2 2 5
2 8 4 NA <NA> 5
3 NA 1 NA apple NA
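A grouped variant, closer to the attempt in the question, might look like the sketch below. The original filter(y == max(y)) returns no rows because max(y) is NA whenever a group contains an NA, and filter() drops rows whose condition evaluates to NA. The name res is made up for illustration, and the sketch assumes that y is the only column in which duplicates can differ.

```r
library(dplyr)

df <- data.frame(
  a = c(NA, 6, 6, 8),
  x = c(1, 2, 2, 4),
  y = c(NA, 2, NA, NA),
  z = c("apple", 2, "2", NA),
  d = c(NA, 5, 5, 5),
  stringsAsFactors = FALSE
)

res <- df %>%
  group_by(a, x, z, d) %>%
  # Keep rows with a non-NA y, plus rows from groups where y is NA throughout.
  filter(!is.na(y) | all(is.na(y))) %>%
  ungroup() %>%
  # Collapse any remaining exact duplicates.
  distinct()
```

Unlike the arrange() approach, this keeps the rows in their original order, and it returns a tibble rather than a plain data frame.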
Upvotes: 0