Sarah

Reputation: 3

R - Looping through Values in Variable and Dropping Duplicates Based on Condition

I would like to loop through each unique value in the variable "Item" (i.e. A, B, C), and only keep the rows with the lowest ID number for each item, deleting the other rows for each corresponding item.

So, I have a data frame that looks like this right now:

Item    Cost    ID

A       4       1
A       NA      3
B       39      10
B       NA      18
B       NA      21
C       290     15
C       NA      NA

And I want something that looks like this:

Item    Cost    ID

A       4       1
B       39      10
C       290     15

How do I do this?

(Thanks in advance - I'm new to R!)

Upvotes: 0

Views: 54

Answers (1)

SabDeM

Reputation: 7190

Your task is quite easy with dplyr, though there are a variety of other approaches.

library(dplyr)
# Within each Item, keep the row whose ID equals the group minimum (rows with NA IDs are dropped)
df %>% group_by(Item) %>% filter(ID == min(ID, na.rm = TRUE))

Source: local data frame [3 x 3]
Groups: Item [3]

    Item  Cost    ID
  <fctr> <dbl> <dbl>
1      A     4     1
2      B    39    10
3      C   290    15

Data used:

df <- structure(list(Item = structure(c(1L, 1L, 2L, 2L, 2L, 3L, 3L), .Label = c("A", 
"B", "C"), class = "factor"), Cost = c(4, NA, 39, NA, NA, 290, 
NA), ID = c(1, 3, 10, 18, 21, 15, NA)), .Names = c("Item", "Cost", 
"ID"), row.names = c(NA, -7L), class = "data.frame")
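
As one of the other approaches, here is a rough base R sketch that needs no packages. It assumes the same example data as above (rebuilt here with data.frame() so the snippet runs on its own); the names sorted and result are just placeholders I made up for illustration. The idea is to sort by ID and then keep the first row seen for each Item.

```r
# Rebuild the example data (same values as in the question)
df <- data.frame(
  Item = factor(c("A", "A", "B", "B", "B", "C", "C")),
  Cost = c(4, NA, 39, NA, NA, 290, NA),
  ID   = c(1, 3, 10, 18, 21, 15, NA)
)

# Sort rows by ID; order() places NA values last by default
sorted <- df[order(df$ID), ]

# Keep only the first row for each Item, i.e. the one with the lowest ID
result <- sorted[!duplicated(sorted$Item), ]

# Restore Item order and reset the row names
result <- result[order(result$Item), ]
rownames(result) <- NULL

print(result)
```

One difference to be aware of: if an Item had only NA IDs, this sketch would still keep one row for it, whereas the dplyr filter drops such rows.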

Upvotes: 1
