Reputation: 281
I have the following data frame, call it df, consisting of three vectors: "Name," "Age," and "ZipCode."
df=
Name Age ZipCode
1 Joe 16 60559
2 Jim 20 60637
3 Bob 64 94127
4 Joe 23 94122
5 Bob 45 25462
I want to delete the entire row of df if the Name in it appears fewer than 2 times in the data frame as a whole (and flexibly 3, 4, or x times). Basically keep Bob and Joe in the data frame, but delete Jim. How can I do this?
I tried to turn it into a table:
> table(df$Name)
Bob Jim Joe
2 1 2
But I don't know where to go from there.
Upvotes: 7
Views: 2431
Reputation: 193517
You can use ave like this:
df[as.numeric(ave(df$Name, df$Name, FUN=length)) >= 2, ]
# Name Age ZipCode
# 1 Joe 16 60559
# 3 Bob 64 94127
# 4 Joe 23 94122
# 5 Bob 45 25462
This answer assumes that df$Name is a character vector, not a factor vector.
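If df$Name turns out to be a factor, one possible workaround is to convert it to character before handing it to ave. The sketch below rebuilds the example data with stringsAsFactors = TRUE (an assumption made here only to reproduce the factor case) and stores the result in hypothetical objects named counts and res:

```r
# Rebuild the example data. stringsAsFactors = TRUE is an assumption used
# here to force Name to be a factor.
df <- data.frame(
  Name = c("Joe", "Jim", "Bob", "Joe", "Bob"),
  Age = c(16, 20, 64, 23, 45),
  ZipCode = c(60559, 60637, 94127, 94122, 25462),
  stringsAsFactors = TRUE
)

# ave() writes the group sizes back into its first argument, so that argument
# is passed as character; as.numeric() then converts the counts to numbers.
counts <- as.numeric(ave(as.character(df$Name), df$Name, FUN = length))

# Keep only the rows whose Name occurs at least twice.
res <- df[counts >= 2, ]
```

The same conversion is harmless when df$Name is already a character vector, so it can be left in either way.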
You can also continue with table as follows:
x <- table(df$Name)
df[df$Name %in% names(x[x >= 2]), ]
# Name Age ZipCode
# 1 Joe 16 60559
# 3 Bob 64 94127
# 4 Joe 23 94122
# 5 Bob 45 25462
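Since the question asks for a flexible threshold (2, 3, 4, or x), the table approach could be wrapped in a small function. This is only a sketch: keep_frequent, col, and min_n are hypothetical names introduced here, not part of base R or the original answer.

```r
# Hypothetical helper: keep rows whose value in column `col` occurs at least
# `min_n` times in `data`.
keep_frequent <- function(data, col, min_n = 2) {
  counts <- table(data[[col]])
  keep <- names(counts[counts >= min_n])
  data[data[[col]] %in% keep, , drop = FALSE]
}

# Example data from the question.
df <- data.frame(
  Name = c("Joe", "Jim", "Bob", "Joe", "Bob"),
  Age = c(16, 20, 64, 23, 45),
  ZipCode = c(60559, 60637, 94127, 94122, 25462),
  stringsAsFactors = FALSE
)

res2 <- keep_frequent(df, "Name", min_n = 2)  # drops Jim
res3 <- keep_frequent(df, "Name", min_n = 3)  # no Name occurs 3 times
```

With the example data, a threshold of 3 leaves no rows, because no name appears more than twice.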
Upvotes: 8