Ignore case while using duplicated

Question

I am using the duplicated function in R to remove the duplicate rows in my data frame.

 df:

 Name Rank
  A    1
  a    1
  B    2


df[!duplicated(df),]

 Name Rank
  A    1
  a    1
  B    2

The second row is the same as the first, but it doesn't get deleted because duplicated takes the case of "A" and "a" into consideration. What is the workaround for this? Thanks.

Hack-R · Accepted Answer

# If it's okay to change the case
df.lower      <- df
df.lower$Name <- tolower(df$Name)

df.lower[!duplicated(df.lower$Name),]

# If you don't want to change the case
df[!duplicated(df.lower$Name),]

or simply

df[!duplicated(tolower(df$Name)),]

  Name Rank
1    A    1
3    B    2
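
To try this end to end, here is a minimal reproducible sketch. It assumes the data frame from the question has a character Name column and a numeric Rank column; the names keep and result are only for illustration.

```r
# Rebuild the example data frame from the question.
df <- data.frame(Name = c("A", "a", "B"),
                 Rank = c(1, 1, 2),
                 stringsAsFactors = FALSE)

# duplicated() compares strings exactly, so "A" and "a" count as different.
duplicated(df$Name)

# Lower-casing the key first makes the comparison case-insensitive,
# while the rows that are kept retain their original case.
keep   <- !duplicated(tolower(df$Name))
result <- df[keep, ]
result
```

duplicated() flags only the second and later occurrences, so the first spelling that appears ("A" here) is the one that survives.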

That's for deduping based on Name. For the entire row you could do:

df.lower[!duplicated(df.lower),] # changes the case

or

df[!duplicated(cbind(tolower(df$Name),df$Rank)),] # does not change case
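
If there are several text columns, the same idea can be wrapped in a small helper. This is only a sketch: dedupe_ignore_case is a hypothetical name, not a base R function, and it assumes that every character or factor column should be compared case-insensitively. Unlike cbind, which coerces everything to a character matrix, it builds the comparison key as a data frame so the other columns keep their types.

```r
# Hypothetical helper: drop duplicate rows, ignoring case in text columns.
dedupe_ignore_case <- function(data) {
  key <- data
  # Find the character and factor columns.
  is_text <- vapply(key,
                    function(col) is.character(col) || is.factor(col),
                    logical(1))
  # Lower-case those columns in the key only; 'data' itself is untouched.
  for (nm in names(key)[is_text]) {
    key[[nm]] <- tolower(as.character(key[[nm]]))
  }
  data[!duplicated(key), , drop = FALSE]
}

# Example: "B"/2 and "b"/3 differ in Rank, so both rows are kept.
df2 <- data.frame(Name = c("A", "a", "B", "b"),
                  Rank = c(1, 1, 2, 3),
                  stringsAsFactors = FALSE)
deduped <- dedupe_ignore_case(df2)
deduped
```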
