Reputation: 999
I am using the duplicated function in R to remove the duplicate rows in my data frame.
df:
Name Rank
A 1
a 1
B 2
df[!duplicated(df),]
Name Rank
A 1
a 1
B 2
The second row is the same as the first, but it doesn't get deleted because duplicated() is case-sensitive and treats "A" and "a" as different values. What is the workaround for this? Thanks.
Upvotes: 3
Views: 2142
Reputation: 23214
# If it's okay to change the case
df.lower <- df
df.lower$Name <- tolower(df$Name)
df.lower[!duplicated(df.lower$Name),]
# If you don't want to change the case
df[!duplicated(df.lower$Name),]
or simply
df[!duplicated(tolower(df$Name)),]
  Name Rank
1    A    1
3    B    2
That's for deduping based on Name. For the entire row you could do:
df.lower[!duplicated(df.lower),] # changes the case
or
df[!duplicated(cbind(tolower(df$Name),df$Rank)),] # does not change case
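To make the difference between the Name-only and whole-row variants concrete, here is a minimal self-contained sketch. It rebuilds the data frame from the question and adds one hypothetical extra row ("a", 3), which is not in the original data, so that the two approaches give different results. The names df2, key, by_name and by_row are made up for illustration.

```r
# Example data from the question, plus one hypothetical extra row ("a", 3)
# added only so the two approaches differ.
df2 <- data.frame(Name = c("A", "a", "B", "a"),
                  Rank = c(1, 1, 2, 3),
                  stringsAsFactors = FALSE)

# Lower-cased copy used only as a comparison key; df2 itself is untouched
key <- df2
key$Name <- tolower(key$Name)

# Dedupe on Name only, ignoring case: keeps the first row for each name
by_name <- df2[!duplicated(key$Name), ]

# Dedupe on the whole row, ignoring the case of Name:
# ("a", 3) survives because its Rank differs from ("A", 1)
by_row <- df2[!duplicated(key), ]

print(by_name)
print(by_row)
```

Because the lower-cased copy is used only to compute the logical index, the rows that are kept retain their original case.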
Upvotes: 5