Reputation: 171
I have a dataframe like the following one:
ID STATUS
1638483 Very bad
1407499 Very good
1383920 Good
1407499 Bad
The first column contains ID; some values are unique, but others are not.
The second column contains STATUS, which can be "Very good", "Good", "Bad", or "Very bad".
I'd like to:
keep every row with a unique ID (STATUS does not matter here): rows with ID 1638483 or 1383920, for example,
keep only the row with the best STATUS for each duplicated ID: rows with ID 1407499, for example.
The desired output would be:
ID STATUS
1638483 Very bad
1407499 Very good
1383920 Good
I tried to use the dplyr package.
I managed to group the data by ID, but then I got stuck.
Upvotes: 2
Views: 3764
Reputation: 96
One possible solution using dplyr:
library(dplyr)

# create tibble
df <- tibble(
  id = c("1638483", "1407499", "1383920", "1407499"),
  status = c("Very bad", "Very good", "Good", "Bad")
)

# solution: order the status levels from worst to best, sort best-first,
# then keep the top status within each id
df %>%
  mutate_at("status", factor,
            levels = c("Very bad", "Bad", "Good", "Very good")) %>%
  arrange(desc(status)) %>%
  group_by(id) %>%
  filter(status == status[1]) %>%
  ungroup()
Result:
# A tibble: 3 x 2
id status
<chr> <fctr>
1 1383920 Good
2 1407499 Very good
3 1638483 Very bad
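For comparison, the same sort-then-keep-first idea can be sketched in base R. This is only an illustration under assumptions: the data frame mirrors the question's example, and the names lvls, status_rank, sorted and best are made up here. One difference worth noting: duplicated() keeps exactly one row per id, whereas filter(status == status[1]) keeps every row tied for the top status.

```r
# Example data mirroring the question (assumed, not the asker's real data)
df <- data.frame(
  id = c("1638483", "1407499", "1383920", "1407499"),
  status = c("Very bad", "Very good", "Good", "Bad"),
  stringsAsFactors = FALSE
)

# Status labels ordered from worst to best
lvls <- c("Very bad", "Bad", "Good", "Very good")

# Position of each status in lvls: a higher number means a better status
status_rank <- match(df$status, lvls)

# Sort by id, and within each id put the best status first
sorted <- df[order(df$id, -status_rank), ]

# Keep only the first (best) row of each id
best <- sorted[!duplicated(sorted$id), ]
rownames(best) <- NULL

print(best)
```

If several rows of one id share the best status and all of them should be kept, the filter() version above is the closer match.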
Upvotes: 3
Reputation: 32548
Convert STATUS to a factor with the desired level order and use ave:
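# Encode STATUS as a factor whose levels run from worst to best
df$STATUS = factor(df$STATUS, levels = c("Very bad", "Bad", "Good", "Very good"))
# Keep the rows whose STATUS code equals the maximum code within their ID
df[ave(as.numeric(df$STATUS), df$ID, FUN = function(x) x == max(x)) == 1,]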
# ID STATUS
#1 1638483 Very bad
#2 1407499 Very good
#3 1383920 Good
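To make the one-liner easier to follow, here is a sketch that breaks it into intermediate steps. It rebuilds the example data so it runs on its own; the names codes, group_max, keep and result are introduced only for this illustration, and the logic is a slight restatement (comparing each code with its group maximum) rather than the exact expression above.

```r
# Example data, rebuilt so the snippet is self-contained
df <- data.frame(
  ID = c(1638483L, 1407499L, 1383920L, 1407499L),
  STATUS = c("Very bad", "Very good", "Good", "Bad"),
  stringsAsFactors = FALSE
)
df$STATUS <- factor(df$STATUS, levels = c("Very bad", "Bad", "Good", "Very good"))

# Each level becomes its position in the levels vector (1 = worst, 4 = best)
codes <- as.numeric(df$STATUS)

# ave() applies FUN within each ID and returns a vector as long as the input,
# so every row receives the maximum code of its own ID group
group_max <- ave(codes, df$ID, FUN = max)

# A row is kept when its own code equals its group's maximum
keep <- codes == group_max
result <- df[keep, ]

print(result)
```

As with the original expression, rows tied for the best STATUS within an ID are all kept.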
DATA
df = structure(list(ID = c(1638483L, 1407499L, 1383920L, 1407499L),
STATUS = c("Very bad", "Very good", "Good", "Bad")), .Names = c("ID",
"STATUS"), class = "data.frame", row.names = c(NA, -4L))
Upvotes: 1