JoeBass

Reputation: 559

R !Duplicate by Factor Level

There is a way to remove duplicated rows...

http://www.cookbook-r.com/Manipulating_data/Finding_and_removing_duplicate_records/

I am interested in doing the same thing, but by factor levels in my data frame.

test <- data.frame(fact = c('a','a','a','b','b','b','b','c','c'), id = c('1','1','2','1','2','2','3','1','2'), value = c(1:9))

I would like to whittle down my test data frame to the following:

  fact id value
1    a  1     1
3    a  2     3
4    b  1     4
5    b  2     5
7    b  3     7
8    c  1     8
9    c  2     9

That is, only the first row for each id should be kept. The wrinkle is that an id should count as a duplicate only within the same level of fact, not across the whole data frame.
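To make the wrinkle concrete, here is a minimal sketch of what happens when duplicated() is applied to the id column alone, ignoring fact. The name `kept` is only illustrative and is not part of the original question.

```r
# Rebuild the example data frame from the question
test <- data.frame(fact = c('a','a','a','b','b','b','b','c','c'),
                   id = c('1','1','2','1','2','2','3','1','2'),
                   value = c(1:9))

# duplicated() on id alone flags repeats across ALL factor levels,
# so ids that reappear under 'b' and 'c' are dropped as well
kept <- test[!duplicated(test$id), ]
print(kept)
```

Only rows 1, 3 and 7 survive here, which is fewer than the seven rows wanted above, because ids 1 and 2 under levels b and c are treated as repeats of the ones under a.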

Upvotes: 1

Views: 953

Answers (2)

user227710

Reputation: 3194

library(data.table)
setDT(test)[, .SD[!duplicated(id)], by = fact]

   fact id value
1:    a  1     1
2:    a  2     3
3:    b  1     4
4:    b  2     5
5:    b  3     7
6:    c  1     8
7:    c  2     9
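Dropping repeated ids within each level of fact is the same as dropping repeated (fact, id) pairs, so the result can also be sketched in base R without extra packages. This is an illustrative alternative, not the answerer's code, and the name `result` is hypothetical.

```r
# Rebuild the example data frame from the question
test <- data.frame(fact = c('a','a','a','b','b','b','b','c','c'),
                   id = c('1','1','2','1','2','2','3','1','2'),
                   value = c(1:9))

# duplicated() on a data frame compares whole rows, so passing only
# the fact and id columns flags repeated (fact, id) combinations
result <- test[!duplicated(test[c("fact", "id")]), ]
print(result)
```

Unlike the data.table version, this keeps the original row names (1, 3, 4, 5, 7, 8, 9) and does not modify test.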

Upvotes: 2

Shenglin Chen

Reputation: 4554

library(dplyr)
# Keep the first row per (fact, id) pair; .keep_all = TRUE retains the value column
test %>% distinct(fact, id, .keep_all = TRUE)

Upvotes: 1
