Reputation: 1894
Im trying to Create a column with the string that appears the most in the row and create another column with the number of times this most prevalent string appeared.
To facilitate my question this is what im trying to achieve:
My actual DF
What im trying to obtain: most prevalente category and count
example df: d
f <- data.frame(ID = 1:4,
V1 = c("A","B","C","D"),
V2 = c("A", "B","D","B"),
V3 = c("A","C","D","B"))
Upvotes: 3
Views: 50
Reputation: 4425
dplyr
rowwise
function to apply table
to each row from V1:V3
library(dplyr)
df |> rowwise() |>
mutate(category = names(table(c_across(V1:V3)))[which.max(table(c_across(V1:V3)))] ,
count = max(table(c_across(V1:V3))))
# A tibble: 4 × 6
# Rowwise:
ID V1 V2 V3 category count
<int> <chr> <chr> <chr> <chr> <int>
1 1 A A A A 3
2 2 B B C B 2
3 3 C D D D 2
4 4 D B B B 2
Upvotes: 1
Reputation: 11076
Here is another way:
count <- sapply(apply(f[, -1], 1, table), max)
count
# [1] 3 2 2 2
category <- names(sapply(apply(f[, -1], 1, table), which.max))
category
# [1] "A" "B" "D" "B"
f2 <- data.frame(f, category, count)
f2
# ID V1 V2 V3 category count
# 1 1 A A A A 3
# 2 2 B B C B 2
# 3 3 C D D D 2
# 4 4 D B B B 2
Upvotes: 2
Reputation: 942
df <- data.frame(ID = 1:4,
V1 = c("A","B","C","D"),
V2 = c("A", "B","D","B"),
V3 = c("A","C","D","B"))
library(data.table)
setDT(df)
other <- melt(df, id.vars = "ID", measure.vars = c("V1", "V2", "V3"))
other <- other[, .N, by = .(ID, value)]
colnames(other) <- c("ID", "category", "count")
other <- other[, .SD[which.max(count)], by = .(ID)]
res <- merge(df, other, by = c("ID"))
res
Upvotes: 1