R - Looping through Values in Variable and Dropping Duplicates Based on Condition

Question

I would like to loop through each unique value in the variable "Item" (i.e. A, B, C), and only keep the rows with the lowest ID number for each item, deleting the other rows for each corresponding item.

So, I have a data frame that looks like this right now:

Item    Cost    ID
A          4     1
A         NA     3
B         39    10
B         NA    18
B         NA    21
C        290    15
C         NA    NA

And I want something that looks like this:

Item    Cost    ID
A          4     1
B         39    10
C        290    15

How do I do this?

(Thanks in advance - I'm new to R!)

SabDeM · Accepted Answer

Your task is quite easy with dplyr, though there are a variety of other approaches as well.

library(dplyr)
# Within each Item group, keep the row(s) whose ID equals the group minimum (NAs ignored)
df %>% group_by(Item) %>% filter(ID == min(ID, na.rm = TRUE))

Source: local data frame [3 x 3]
Groups: Item [3]

    Item  Cost    ID
1      A     4     1
2      B    39    10
3      C   290    15
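
Since other approaches exist, here is a minimal base R sketch of the same idea, with no packages needed. It rebuilds the example data with data.frame() (blank cells from the question are assumed to be NA), sorts by Item and ID, and keeps the first row per Item. The names sorted and result are just illustrative.

```r
# Example data from the question; blank cells are assumed to be NA
df <- data.frame(
  Item = c("A", "A", "B", "B", "B", "C", "C"),
  Cost = c(4, NA, 39, NA, NA, 290, NA),
  ID   = c(1, 3, 10, 18, 21, 15, NA)
)

# Sort by Item, then by ID; order() puts NA values last by default
sorted <- df[order(df$Item, df$ID), ]

# Keep only the first row of each Item, i.e. the one with the lowest ID
result <- sorted[!duplicated(sorted$Item), ]
rownames(result) <- NULL
result
```

One difference from the filter() version: this always keeps exactly one row per Item, even when several rows share the lowest ID or when every ID in a group is NA.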

Data used:

df <- structure(list(Item = structure(c(1L, 1L, 2L, 2L, 2L, 3L, 3L), .Label = c("A", 
"B", "C"), class = "factor"), Cost = c(4, NA, 39, NA, NA, 290, 
NA), ID = c(1, 3, 10, 18, 21, 15, NA)), .Names = c("Item", "Cost", 
"ID"), row.names = c(NA, -7L), class = "data.frame")
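
One caveat with filter(ID == min(ID)): if two rows of the same Item share the lowest ID, both are kept. If exactly one row per Item is wanted, slice_min() may be a better fit. The sketch below assumes dplyr 1.0.0 or later (where slice_min() was introduced) and uses the same example data; result is just an illustrative name.

```r
library(dplyr)

# Same example data as above; blank cells are assumed to be NA
df <- data.frame(
  Item = c("A", "A", "B", "B", "B", "C", "C"),
  Cost = c(4, NA, 39, NA, NA, 290, NA),
  ID   = c(1, 3, 10, 18, 21, 15, NA)
)

# For each Item, take the single row with the smallest ID;
# with_ties = FALSE guarantees at most one row per group
result <- df %>%
  group_by(Item) %>%
  slice_min(ID, n = 1, with_ties = FALSE) %>%
  ungroup()

result
```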
