Reputation: 559
There is a well-documented way to remove duplicated rows from a data frame:
http://www.cookbook-r.com/Manipulating_data/Finding_and_removing_duplicate_records/
I am interested in doing the same thing, but by factor levels in my data frame.
test <- data.frame(fact = c('a','a','a','b','b','b','b','c','c'), id = c('1','1','2','1','2','2','3','1','2'), value = c(1:9))
I would like to whittle down my test data frame to include the following....
  fact id value
1    a  1     1
3    a  2     3
4    b  1     4
5    b  2     5
7    b  3     7
8    c  1     8
9    c  2     9
That is, only the first row for each id should be kept. The wrinkle is that an id should count as a duplicate only within the same level of fact, so id 1 in level 'a' and id 1 in level 'b' are both retained.
Upvotes: 1
Views: 953
Reputation: 3194
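library(data.table)
# setDT() converts test to a data.table by reference; then, within each
# level of fact, keep only the rows whose id has not appeared before
setDT(test)[, .SD[!duplicated(id)], by = fact]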
   fact id value
1:    a  1     1
2:    a  2     3
3:    b  1     4
4:    b  2     5
5:    b  3     7
6:    c  1     8
7:    c  2     9
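If you would rather not load a package, the same result can probably be had in base R by checking for duplicates on the fact and id columns together. This is only a sketch of an alternative, not the approach above; the name result is just a placeholder for illustration.

```r
# Sample data from the question
test <- data.frame(fact = c('a','a','a','b','b','b','b','c','c'),
                   id = c('1','1','2','1','2','2','3','1','2'),
                   value = c(1:9))

# duplicated() on a two-column data frame flags a row only when the
# (fact, id) pair has already appeared, so ids are compared within each fact level
# (result is a hypothetical name chosen for this example)
result <- test[!duplicated(test[c("fact", "id")]), ]
print(result)
```

Because this subsets the original data frame, the original row names (1, 3, 4, 5, 7, 8, 9) are kept, which matches the output asked for in the question.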
Upvotes: 2