Knight

Reputation: 363

Remove duplicates while keeping the rows without NA values in R

My data set (df) looks like this:

   ID    Name    Rating    Score  Ranking
   1     abc       3        NA      NA
   1     abc       3        12      13
   2     bcd       4        NA      NA
   2     bcd       4        19      20

I'm trying to remove duplicates using

   df <- df[!duplicated(df[1:2]),]

which gives,

   ID    Name    Rating    Score  Ranking
   1     abc       3        NA      NA
   2     bcd       4        NA      NA

but I'm trying to get,

   ID    Name    Rating    Score  Ranking
   1     abc       3        12      13
   2     bcd       4        19      20

How do I remove duplicates while keeping the rows that do not contain NAs? Some help would be great, thanks.

Upvotes: 0

Views: 1311

Answers (3)

Azam Yahya

Reputation: 696

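Using dplyr, keep the last row of each ID/Name pair. This relies on the complete row coming after the NA row, as it does in your data:

library(dplyr)

df <- df %>% filter(!duplicated(.[, 1:2], fromLast = TRUE))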
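If the row order is not guaranteed, a possible variant is to sort the NA rows to the end of each group first and then keep the first row per group. The following is only a sketch: the sample data is rebuilt from the question, the name `result` is made up for illustration, and it assumes that a missing Score is enough to identify the unwanted row.

```r
library(dplyr)

# Sample data rebuilt from the question
df <- data.frame(
  ID      = c(1, 1, 2, 2),
  Name    = c("abc", "abc", "bcd", "bcd"),
  Rating  = c(3, 3, 4, 4),
  Score   = c(NA, 12, NA, 19),
  Ranking = c(NA, 13, NA, 20)
)

result <- df %>%
  # is.na(Score) is FALSE for complete rows, and FALSE sorts before TRUE,
  # so rows with a Score come first within each ID/Name pair
  arrange(ID, Name, is.na(Score)) %>%
  # keep the first row of each ID/Name pair along with all other columns
  distinct(ID, Name, .keep_all = TRUE)

print(result)
```

An ID/Name pair that only has NA rows would still be kept by this variant, since one row per pair always survives.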

Upvotes: 1

Andrew Haynes

Reputation: 2640

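You could filter out the observations you don't want with which() and is.na(), and then use the unique() function. Test for missing values with is.na() rather than by comparing against the string "NA":

# indices of rows where Score or Ranking is present
a <- unique(c(which(!is.na(df[, 'Score'])), which(!is.na(df[, 'Ranking']))))

# drop any fully identical rows that remain
df2 <- unique(df[a, ])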

> df2
  ID Name Rating Score Ranking
2  1  abc      3    12      13
4  2  bcd      4    19      20
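A related base R sketch uses complete.cases() for the filtering step and duplicated() on the key columns for the de-duplication step. Note two assumptions that differ from the code above: a row is kept only if both Score and Ranking are present, and duplicates are judged on ID and Name only rather than on the whole row. The sample data is rebuilt from the question and the name `complete_rows` is made up for illustration.

```r
# Sample data rebuilt from the question
df <- data.frame(
  ID      = c(1, 1, 2, 2),
  Name    = c("abc", "abc", "bcd", "bcd"),
  Rating  = c(3, 3, 4, 4),
  Score   = c(NA, 12, NA, 19),
  Ranking = c(NA, 13, NA, 20)
)

# keep only rows where both Score and Ranking are non-missing
complete_rows <- df[complete.cases(df[, c("Score", "Ranking")]), ]

# then keep the first remaining row for each ID/Name pair
df2 <- complete_rows[!duplicated(complete_rows[, c("ID", "Name")]), ]

print(df2)
```

Unlike sorting-based approaches, this drops an ID/Name pair entirely if all of its rows contain NAs.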

Upvotes: 0

submartingale

Reputation: 755

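First, sort the data so that the NAs come last within each ID/Name group. na.last = TRUE is an argument of order() (and its default), so it belongs inside the order() call:

df <- df[with(df, order(ID, Name, Score, Ranking, na.last = TRUE)), ]

Then remove the duplicates with fromLast = FALSE (the default), which keeps the first row of each group:

df <- df[!duplicated(df[1:2], fromLast = FALSE), ]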
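Here is a self-contained sketch of the two steps. The sample data is rebuilt from the question, but with the rows for ID 1 deliberately swapped (an assumption made for illustration) to show that the result does not depend on the input order; `sorted` and `deduped` are illustrative names.

```r
# Sample data based on the question, with the ID 1 rows swapped
df <- data.frame(
  ID      = c(1, 1, 2, 2),
  Name    = c("abc", "abc", "bcd", "bcd"),
  Rating  = c(3, 3, 4, 4),
  Score   = c(12, NA, NA, 19),
  Ranking = c(13, NA, NA, 20)
)

# Step 1: sort by the keys, then by Score and Ranking with NAs placed last
sorted <- df[order(df$ID, df$Name, df$Score, df$Ranking, na.last = TRUE), ]

# Step 2: keep the first row of each ID/Name pair
deduped <- sorted[!duplicated(sorted[, c("ID", "Name")]), ]

print(deduped)
```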

Upvotes: 1
