Reputation: 1894

New column containing string that appears the most in the row

Im trying to Create a column with the string that appears the most in the row and create another column with the number of times this most prevalent string appeared.

To facilitate my question this is what im trying to achieve:

My actual DF

What im trying to obtain: most prevalente category and count

example df: d

f <- data.frame(ID = 1:4,
           V1 = c("A","B","C","D"),
           V2 = c("A", "B","D","B"),
           V3 = c("A","C","D","B"))

Upvotes: 3

Answers (3)

Mohamed Desouky

Reputation: 4425

We can use dplyr rowwise function to apply table to each row from V1:V3

library(dplyr)

df |> rowwise() |> 
      mutate(category = names(table(c_across(V1:V3)))[which.max(table(c_across(V1:V3)))] ,
      count = max(table(c_across(V1:V3))))

Output

# A tibble: 4 × 6
# Rowwise: 
     ID V1    V2    V3    category count
  <int> <chr> <chr> <chr> <chr>    <int>
1     1 A     A     A     A            3
2     2 B     B     C     B            2
3     3 C     D     D     D            2
4     4 D     B     B     B            2

Upvotes: 1

dcarlson

Reputation: 11076

Here is another way:

count <- sapply(apply(f[, -1], 1, table), max)
count
# [1] 3 2 2 2
category <- names(sapply(apply(f[, -1], 1, table), which.max))
category
# [1] "A" "B" "D" "B"
f2 <- data.frame(f, category, count)
f2
#   ID V1 V2 V3 category count
# 1  1  A  A  A        A     3
# 2  2  B  B  C        B     2
# 3  3  C  D  D        D     2
# 4  4  D  B  B        B     2

Upvotes: 2

Will

Reputation: 942

df <- data.frame(ID = 1:4,
                V1 = c("A","B","C","D"),
                V2 = c("A", "B","D","B"),
                V3 = c("A","C","D","B"))


library(data.table)
setDT(df)
other <- melt(df, id.vars = "ID", measure.vars = c("V1", "V2", "V3"))
other <- other[, .N, by = .(ID, value)]
colnames(other) <- c("ID", "category", "count")
other <- other[, .SD[which.max(count)], by = .(ID)]

res <- merge(df, other, by = c("ID"))
res

Upvotes: 1

New column containing string that appears the most in the row

Answers (3)

Related Questions