Reputation: 363
My data set (df) looks like this:
ID Name Rating Score Ranking
1 abc 3 NA NA
1 abc 3 12 13
2 bcd 4 NA NA
2 bcd 4 19 20
I'm trying to remove duplicates using
df <- df[!duplicated(df[1:2]),]
which gives,
ID Name Rating Score Ranking
1 abc 3 NA NA
2 bcd 4 NA NA
but I'm trying to get,
ID Name Rating Score Ranking
1 abc 3 12 13
2 bcd 4 19 20
How do I drop the rows containing NAs while removing duplicates at the same time? Some help would be great, thanks.
Upvotes: 0
Views: 1311
Reputation: 696
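Using dplyr, keep the last row of each ID/Name pair (this works when the NA row comes first within each pair, as in your example):
df <- df %>% filter(!duplicated(.[, 1:2], fromLast = TRUE))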
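If the NA rows are not guaranteed to come first, one possible variant is to sort the complete rows to the front of each group before dropping duplicates. This is only a sketch: it rebuilds the question's data by hand and assumes Score and Ranking are the only columns that can be NA.

```r
library(dplyr)

# Sample data rebuilt from the question
df <- data.frame(
  ID      = c(1, 1, 2, 2),
  Name    = c("abc", "abc", "bcd", "bcd"),
  Rating  = c(3, 3, 4, 4),
  Score   = c(NA, 12, NA, 19),
  Ranking = c(NA, 13, NA, 20)
)

result <- df %>%
  # is.na() is FALSE for complete values, and FALSE sorts before TRUE,
  # so rows with a Score/Ranking move ahead of the NA rows in each group
  arrange(ID, Name, is.na(Score), is.na(Ranking)) %>%
  # keep the first row per ID/Name pair, along with all other columns
  distinct(ID, Name, .keep_all = TRUE)

print(result)
```

Sorting first means the result does not depend on the original row order.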
Upvotes: 1
Reputation: 2640
You could just filter out the observations you don't want with which() and then use the unique() function:
a <- unique(c(which(!is.na(df[, 'Score'])), which(!is.na(df[, 'Ranking']))))
df2 <- unique(df[a, ])
> df2
ID Name Rating Score Ranking
2 1 abc 3 12 13
4 2 bcd 4 19 20
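Note that unique() only drops rows that are identical in every column. A hedged alternative, assuming you want rows where both Score and Ranking are present and at most one row per ID/Name pair, could use complete.cases() together with duplicated(). The data frame below is rebuilt from the question for illustration.

```r
# Sample data rebuilt from the question
df <- data.frame(
  ID      = c(1, 1, 2, 2),
  Name    = c("abc", "abc", "bcd", "bcd"),
  Rating  = c(3, 3, 4, 4),
  Score   = c(NA, 12, NA, 19),
  Ranking = c(NA, 13, NA, 20)
)

# Keep only rows where both Score and Ranking are non-NA
complete <- df[complete.cases(df[, c("Score", "Ranking")]), ]

# Then keep the first remaining row for each ID/Name pair
df2 <- complete[!duplicated(complete[c("ID", "Name")]), ]

print(df2)
```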
Upvotes: 0
Reputation: 755
First, push the NAs to the end of each group by passing na.last = TRUE to order() (this is also its default):
df <- df[with(df, order(ID, Name, Score, Ranking, na.last = TRUE)), ]
Then remove the duplicates with fromLast = FALSE (the default), which keeps the first row of each ID/Name pair:
df <- df[!duplicated(df[1:2], fromLast = FALSE), ]
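For reference, the two steps can be run end to end on the question's data. This is a minimal sketch with the data frame typed in by hand; the names sorted and deduped are just illustrative.

```r
# Sample data rebuilt from the question
df <- data.frame(
  ID      = c(1, 1, 2, 2),
  Name    = c("abc", "abc", "bcd", "bcd"),
  Rating  = c(3, 3, 4, 4),
  Score   = c(NA, 12, NA, 19),
  Ranking = c(NA, 13, NA, 20)
)

# Step 1: sort so that, within each ID/Name pair, NA values come last
sorted <- df[with(df, order(ID, Name, Score, Ranking, na.last = TRUE)), ]

# Step 2: keep the first row of each ID/Name pair (the non-NA one after sorting)
deduped <- sorted[!duplicated(sorted[1:2], fromLast = FALSE), ]

print(deduped)
```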
Upvotes: 1