Reputation: 1706
ID <- c("A", "A", "A", "B", "B", "c")
Value <- c("blue", "blue", "green", "red", "orange", NA)
df <- tibble(ID, Value)
I have to group by the ID, keep the value of Value which is the most repeated
If the values are equal (ID == "B")
, I choose the first value
The value variable should look like :
Value_output <- c("blue", "blue", "blue", "red", "red", NA)
Upvotes: 1
Views: 62
Reputation: 886938
We can get the Mode
by group
library(dplyr)
df %>%
group_by(ID) %>%
arrange(ID, is.na(Value)) %>% # in case to keep non- NA elements for a tie
mutate(Value_output = Mode(Value))
where
Mode <- function(x) {
ux <- unique(x)
ux[which.max(tabulate(match(x, ux)))]
}
ID <- c("A", "A", "A", "B", "B", "c", "c")
Value <- c("blue", "blue", "green", "red", "orange", NA, "yellow")
df <- tibble(ID, Value)
Upvotes: 2
Reputation: 33498
Using base R:
lot <- aggregate(
Value ~ ID,
df,
function(x) names(sort(table(x), decreasing=TRUE))[1]
)
df$Value <- lot[match(df$ID, lot$ID), "Value"]
df
ID Value
<chr> <chr>
1 A blue
2 A blue
3 A blue
4 B orange
5 B orange
6 c NA
Upvotes: 2
Reputation: 28309
Solution with data.table
package (count Value
by ID
).
ID <- c("A", "A", "A", "B", "B", "c")
Value <- c("blue", "blue", "green", "red", "orange", NA)
library(data.table)
foo <- data.table(ID, Value)
setkey(foo, ID)
foo[foo[, .N, .(ID, Value)][order(N, decreasing = TRUE)][, .(Value = Value[1]), ID]]$i.Value
[1] "blue" "blue" "blue" "red" "red" NA
Upvotes: 2