user2909729

Reputation: 31

Select only rows whose ID appears more than once in R

ID Julian Month Year Location Distance
 2  40749  July 2011     8300    39625
 2  41425   May 2013 Hatchery    31325
 3  40749  July 2011     6950    38625
 3  41057   May 2012 Hatchery    31325
 6  40735  July 2011     8300    39650
12  40743  July 2011    11025    42350

Above is the head() of the data frame I'm working with. It contains over 7,000 rows and 3,000 unique ID values. I want to delete all the rows whose ID appears only once. Is this possible? Maybe the solution is to keep only the rows where the ID is repeated?

Upvotes: 3

Views: 5114

Answers (2)

Blue Magister

Reputation: 13363

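If d is your data frame, I'd use duplicated to find the rows that have recurring IDs. On its own, duplicated flags only the second and later occurrences of a value, so combining the fromLast = FALSE and fromLast = TRUE calls with | also catches the first occurrence and keeps every row whose ID appears more than once.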

d[(duplicated(d$ID, fromLast = FALSE) | duplicated(d$ID, fromLast = TRUE)),]
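To see why both directions are needed, here is a small sketch that rebuilds the six rows from the question's head() output and applies the same expression. The intermediate names later, earlier and repeated are only illustrative and are not part of the original answer.

```r
# Sample rows copied from the question's head() output
d <- data.frame(
  ID       = c(2, 2, 3, 3, 6, 12),
  Julian   = c(40749, 41425, 40749, 41057, 40735, 40743),
  Month    = c("July", "May", "July", "May", "July", "July"),
  Year     = c(2011, 2013, 2011, 2012, 2011, 2011),
  Location = c("8300", "Hatchery", "6950", "Hatchery", "8300", "11025"),
  Distance = c(39625, 31325, 38625, 31325, 39650, 42350)
)

# Flags the second and later occurrence of each ID
later <- duplicated(d$ID)
# Scanning from the end flags every occurrence except the last one
earlier <- duplicated(d$ID, fromLast = TRUE)

# A row is kept if either scan flags it, i.e. its ID occurs more than once
repeated <- d[later | earlier, ]
repeated$ID  # 2 2 3 3
```

The rows with ID 6 and 12 are dropped because neither scan flags them.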

This double-duplicated method has a variety of uses:

Finding ALL duplicate rows, including "elements with smaller subscripts"

How to get a subset of a dataframe which only has elements which appear in the set more than once in R

How to identify "similar" rows in R?

Upvotes: 5

Stedy

Reputation: 7469

Here is how I would do it:

new.dataframe <- NULL
ids <- unique(dataframe$ID)
for (id in ids) {
  # all rows belonging to the current ID
  temp <- dataframe[dataframe$ID == id, ]
  # keep the ID only if it occurs on more than one row
  if (nrow(temp) > 1) {
    new.dataframe <- rbind(new.dataframe, temp)
  }
}

This removes every ID that has only one row.
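With about 3,000 unique IDs, growing a data frame with rbind inside a loop may be slow. As a possible alternative (not the method described in this answer), the same filter can be sketched without a loop by counting rows per ID with ave. The sample data and the name n.per.id are assumptions made for illustration.

```r
# Small stand-in for the question's data frame (values from its head() output)
dataframe <- data.frame(
  ID       = c(2, 2, 3, 3, 6, 12),
  Distance = c(39625, 31325, 38625, 31325, 39650, 42350)
)

# For every row, ave() returns the number of rows that share that row's ID
n.per.id <- ave(seq_along(dataframe$ID), dataframe$ID, FUN = length)

# Keep only the rows whose ID occurs more than once; row order is preserved
new.dataframe <- dataframe[n.per.id > 1, ]
new.dataframe$ID  # 2 2 3 3
```

Unlike the loop, this keeps the rows in their original order instead of grouping them by ID.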

Upvotes: 1
