Reputation: 1706
I will try to explain my problem with one example.
df <- data.frame(VIN = paste("vin", c(1:6, 2), sep = ""),
                 KM = c(15, 48, 545, 544, 874, 6523, 1422))
I want to clean my data.frame and keep only unique elements in the VIN column. In my example "vin2" is duplicated, so to choose between the two rows I want to keep the one with the smaller KM. Here that is the second row.
How can I do this?
Upvotes: 0
Views: 1054
Reputation: 193517
Here are two options to consider.
The first uses `rank`:
df[with(df, ave(KM, VIN, FUN = rank)) == 1, ]
# VIN KM
# 1 vin1 15
# 2 vin2 48
# 3 vin3 545
# 4 vin4 544
# 5 vin5 874
# 6 vin6 6523
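One thing to watch with the `rank` approach: `ave` computes the rank of `KM` within each `VIN`, and `rank` defaults to `ties.method = "average"`. If two rows of the same VIN share the smallest KM, both get rank 1.5 and neither is kept. The sketch below uses a small made-up data frame (`df_tie` is hypothetical, not from the question) to show this, with `ties.method = "first"` as one possible way around it.

```r
# Hypothetical data with a tie: "vin2" has its smallest KM (48) twice.
df_tie <- data.frame(VIN = c("vin1", "vin2", "vin2", "vin2"),
                     KM  = c(15, 48, 48, 1422))

# Default ties.method = "average": the tied rows both get rank 1.5,
# so no "vin2" row equals 1 and the whole group is dropped.
r_default <- with(df_tie, ave(KM, VIN, FUN = rank))
res_default <- df_tie[r_default == 1, ]

# ties.method = "first" breaks ties by position,
# so exactly one row per VIN ends up with rank 1.
r_first <- with(df_tie,
                ave(KM, VIN, FUN = function(x) rank(x, ties.method = "first")))
res_first <- df_tie[r_first == 1, ]
```

If your real data can contain repeated KM values for the same VIN, it is probably worth checking which behaviour you want before relying on `== 1`.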
The second depends on `order` and `duplicated` (and seems more intuitive, in a sense, but it requires you to sort your data before proceeding).
X <- df[with(df, order(VIN, KM)), ]
X[!duplicated(X$VIN), ]
# VIN KM
# 1 vin1 15
# 2 vin2 48
# 3 vin3 545
# 4 vin4 544
# 5 vin5 874
# 6 vin6 6523
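To see why the second option works, it may help to look at the intermediate steps: after sorting by VIN and then KM, the first row of each VIN is the one with the smallest KM, and `duplicated` flags every later occurrence. The sketch below rebuilds the question's data and spells the steps out. The final `aggregate` call is not part of the approach above; it is a different base R function, added here only as a cross-check, and it only suits the case where KM is the sole column you need to keep.

```r
# Rebuild the example data from the question.
df <- data.frame(VIN = paste("vin", c(1:6, 2), sep = ""),
                 KM  = c(15, 48, 545, 544, 874, 6523, 1422))

# Step 1: sort by VIN, then by KM within each VIN,
# so the smallest KM comes first for every VIN.
X <- df[order(df$VIN, df$KM), ]

# Step 2: duplicated() is TRUE for the second and later occurrences of a VIN.
dup <- duplicated(X$VIN)

# Step 3: keep only the first occurrence of each VIN.
result <- X[!dup, ]

# Cross-check (a different technique): minimum KM per VIN via aggregate().
agg <- aggregate(KM ~ VIN, data = df, FUN = min)
```

Unlike the `rank` version, this keeps exactly one row per VIN even when the smallest KM is repeated, since `duplicated` only looks at the VIN column.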
Upvotes: 2