For each row, assign character to first highest value in that row

Question

In my data, people (id) rate (1-3) topics (A, B, C, D, E). I would like to allocate ids to their highest-rated topics. I calculated the "popularity" of each topic as the number of maximum (3-star) ratings it received, e.g. topic B has only one 3-star rating whereas topic A has three.

Now, I am looking for a loop that solves the following (assume data is already arranged by popularity):

For each row, find the first occurrence of the maximum value in that row (there will be ties). Do not include the popularity column.
Save the resulting row-column combination, e.g. replace that first maximum with the topic character and set all other values in that row to NA.

    library(tidyverse)
    data <- data.frame(topic = c("A", "B", "C", "D", "E"), 
               id1 = c(1,2,3,1,2), 
               id2 = c(3,3,2,1,3),
               id3 = c(1,1,3,3,2),
               id4 = c(3,1,2,2,1),
               id5 = c(2,2,1,3,1),
               id6 = c(3,1,1,1,3)) %>% 
      mutate(popularity = rowSums(. == 3)) %>%
      arrange(popularity)
    
    # Initial Data
      topic id1 id2 id3 id4 id5 id6 popularity
    1     B   2   3   1   1   2   1         1
    2     C   3   2   3   2   1   1         2
    3     D   1   1   3   2   3   1         2
    4     E   2   3   2   1   1   3         2
    5     A   1   3   1   3   2   3         3
    
    # After one step of the loop
      topic id1 id2 id3 id4 id5 id6 popularity
    1     B  NA   B  NA  NA  NA  NA         1
    2     C   3   2   3   2   1   1         2
    3     D   1   1   3   2   3   1         2
    4     E   2   3   2   1   1   3         2
    5     A   1   3   1   3   2   3         3
    
    # After second step of the loop
      topic id1 id2 id3 id4 id5 id6 popularity
    1     B  NA   B  NA  NA  NA  NA         1
    2     C   C  NA  NA  NA  NA  NA         2
    3     D   1   1   3   2   3   1         2
    4     E   2   3   2   1   1   3         2
    5     A   1   3   1   3   2   3         3

akrun · Accepted Answer

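We can do this without a loop by using the vectorized max.col to find, for each row, the column index of the maximum value among the 'id' columns (ties.method 'first' picks the first maximum when there are ties). Then create a temporary matrix of NAs with one column per 'id' column, cbind the sequence of rows with those column indices to form a row/column index, and assign the values of the 'topic' column at those positions. Finally, assign that template matrix back to the 'id' columns of 'data'.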

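    # Logical index of the rating ('id') columns
    i1 <- startsWith(names(data), "id")
    # Template matrix of NAs, one column per 'id' column
    m1 <- matrix(NA, nrow(data), sum(i1))
    # Put each topic at (row, column of the first maximum in that row)
    m1[cbind(seq_len(nrow(m1)), max.col(data[i1], 'first'))] <- data$topic
    data[i1] <- m1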

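Output:

    data
    #  topic  id1  id2  id3  id4  id5  id6 popularity
    #1     B <NA>    B <NA> <NA> <NA> <NA>          1
    #2     C    C <NA> <NA> <NA> <NA> <NA>          2
    #3     D <NA> <NA>    D <NA> <NA> <NA>          2
    #4     E <NA>    E <NA> <NA> <NA> <NA>          2
    #5     A <NA>    A <NA> <NA> <NA> <NA>          3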
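
Since the question asked for a loop, here is a minimal sketch of an explicit row-by-row version next to the max.col approach, so the two can be compared. It runs on a small toy data frame rather than the question's data; the names `toy`, `id_cols`, `looped`, `m` and `same_cells` are illustrative assumptions and do not come from the original post.

```r
# Toy data (hypothetical values, not the question's data)
toy <- data.frame(topic = c("X", "Y", "Z"),
                  id1 = c(2, 3, 1),
                  id2 = c(3, 2, 1),
                  id3 = c(1, 3, 1),
                  stringsAsFactors = FALSE)

id_cols <- startsWith(names(toy), "id")

# Loop version: handle one row at a time
looped <- toy
looped[id_cols] <- NA_character_          # blank out every rating first
for (r in seq_len(nrow(toy))) {
  # which.max() returns the position of the first maximum
  first_max <- which.max(unlist(toy[r, id_cols]))
  looped[r, which(id_cols)[first_max]] <- toy$topic[r]
}

# Vectorized version: (row, column) index built with max.col()
m <- matrix(NA_character_, nrow(toy), sum(id_cols))
m[cbind(seq_len(nrow(toy)),
        max.col(toy[id_cols], ties.method = "first"))] <- toy$topic

# Both approaches should fill the same cells:
# X lands in id2, Y in id1 (first of two 3s), Z in id1 (all tied)
same_cells <- identical(unname(as.matrix(looped[id_cols])), m)
```

The loop is easier to extend if later steps need to depend on earlier rows, while the max.col version avoids per-row overhead on larger data.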
