Anup

Reputation: 1

R programming: How to remove duplicates in a column based on the values of another column

A   B
15  O
20  O
12  C
15  C
50  C
25  O
50  O
19  O
50  M

I have data in the above format. I want to select unique rows based on the unique elements in column A. If a value in column A is duplicated, I need to refer to column B and keep the row that has code 'C'.

Expected Output:

A   B
20  O
12  C
15  C
50  C
25  O
19  O

Can anyone help?

Upvotes: 0

Views: 34

Answers (1)

akrun

Reputation: 887851

We can use data.table. Convert the 'data.frame' to a 'data.table' with setDT(df1), order the rows on the logical condition B == "O" so that rows with 'O' sort last, then group by 'A' and take the first row of each group with head.

library(data.table)
setDT(df1)[order(B=="O"), head(.SD, 1), A]
#    A B
#1: 12 C
#2: 15 C
#3: 50 C
#4: 20 O
#5: 25 O
#6: 19 O

Or this can be done in base R by ordering the rows and then keeping the first row for each value of 'A' with duplicated.

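# Sort by A, with "O" rows last within each value of A
df2 <- df1[order(df1$A, df1$B == "O"), ]
# Keep only the first row for each value of A
df2[!duplicated(df2$A), ]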
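One caveat with the B == "O" key: it only separates 'O' from everything else, so for A = 50 the 'C' row beats the 'M' row only because it comes earlier in the data. A minimal sketch of a variant that prefers 'C' explicitly, assuming the sample data from the question is rebuilt as df1 (the name res is only illustrative):

```r
# Rebuild the sample data from the question; df1 is the name the answer assumes
df1 <- data.frame(
  A = c(15, 20, 12, 15, 50, 25, 50, 19, 50),
  B = c("O", "O", "C", "C", "C", "O", "O", "O", "M"),
  stringsAsFactors = FALSE
)

# Sort by A; within each A, rows with B == "C" come first
# (B != "C" is FALSE for "C" rows, and FALSE sorts before TRUE)
df2 <- df1[order(df1$A, df1$B != "C"), ]

# Keep the first row for each value of A
res <- df2[!duplicated(df2$A), ]
res
```

This keeps one row per value of A and picks 'C' whenever it is present, regardless of the original row order. The result is sorted by A rather than in the order shown in the expected output.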

Upvotes: 1
