Reputation: 2817
In my data, people (id) rate (1-3) topics (A, B, C, D, E). I would like to allocate ids to their highest-rated topics. I calculated the "popularity" of each topic as the number of maximum-value (3-star) ratings it received, e.g. topic B has only one 3-star rating whereas topic A has three.
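Now, I am looking for a loop that solves the following, one topic (row) per step, assuming the data is already arranged by popularity: the id with the highest rating is assigned the topic, and the remaining entries in that row become NA.
library(tidyverse)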
data <- data.frame(topic = c("A", "B", "C", "D", "E"),
                   id1 = c(1, 2, 3, 1, 2),
                   id2 = c(3, 3, 2, 1, 3),
                   id3 = c(1, 1, 3, 3, 2),
                   id4 = c(3, 1, 2, 2, 1),
                   id5 = c(2, 2, 1, 3, 1),
                   id6 = c(3, 1, 1, 1, 3)) %>%
  mutate(popularity = rowSums(. == 3)) %>%
  arrange(popularity)
# Initial Data
  topic id1 id2 id3 id4 id5 id6 popularity
1     B   2   3   1   1   2   1          1
2     C   3   2   3   2   1   1          2
3     D   1   1   3   2   3   1          2
4     E   2   3   2   1   1   3          2
5     A   1   3   1   3   2   3          3
# After one step of the loop
  topic id1 id2 id3 id4 id5 id6 popularity
1     B  NA   B  NA  NA  NA  NA          1
2     C   3   2   3   2   1   1          2
3     D   1   1   3   2   3   1          2
4     E   2   3   2   1   1   3          2
5     A   1   3   1   3   2   3          3
# After second step of the loop
  topic id1 id2 id3 id4 id5 id6 popularity
1     B  NA   B  NA  NA  NA  NA          1
2     C   C  NA  NA  NA  NA  NA          2
3     D   1   1   3   2   3   1          2
4     E   2   3   2   1   1   3          2
5     A   1   3   1   3   2   3          3
Upvotes: 1
Views: 41
Reputation: 887501
We can do this without a loop by using the vectorized max.col to find, for each row, the column index of the maximum value among the 'id' columns (with 'first', ties go to the leftmost column). Then, cbind the sequence of row numbers with those column indices to index into a temporary matrix of NAs, and assign the values of the 'topic' column at those positions. Finally, assign that matrix back to the 'id' columns of 'data'.
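# logical index of the 'id' columns
i1 <- startsWith(names(data), "id")
# temporary matrix of NAs, one column per 'id' column
m1 <- matrix(NA, nrow(data), sum(i1))
# per row, write the topic into the column holding the first maximum rating
m1[cbind(seq_len(nrow(m1)), max.col(data[i1], 'first'))] <- data$topic
data[i1] <- m1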
Output:
data
# topic id1 id2 id3 id4 id5 id6 popularity
#1 B <NA> B <NA> <NA> <NA> <NA> 1
#2 C C <NA> <NA> <NA> <NA> <NA> 2
#3 D <NA> <NA> D <NA> <NA> <NA> 2
#4 E <NA> E <NA> <NA> <NA> <NA> 2
#5 A <NA> A <NA> <NA> <NA> <NA> 3
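Since the question asks for a loop, here is a minimal sketch of an explicit loop version for comparison. It is only an illustration under some assumptions: it rebuilds the example data in base R, it processes one row per step, and it resolves ties by taking the leftmost id (which.max returns the first maximum), matching the 'first' option used with max.col. The names rating_cols, id_cols, assigned and row_ratings are hypothetical helpers introduced here, not part of the original code.

```r
# Rebuild the example data in base R (same values as in the question).
data <- data.frame(topic = c("A", "B", "C", "D", "E"),
                   id1 = c(1, 2, 3, 1, 2),
                   id2 = c(3, 3, 2, 1, 3),
                   id3 = c(1, 1, 3, 3, 2),
                   id4 = c(3, 1, 2, 2, 1),
                   id5 = c(2, 2, 1, 3, 1),
                   id6 = c(3, 1, 1, 1, 3),
                   stringsAsFactors = FALSE)

# Popularity = number of 3-star ratings per topic; sort ascending.
rating_cols <- data[startsWith(names(data), "id")]
data$popularity <- rowSums(rating_cols == 3)
data <- data[order(data$popularity), ]
rownames(data) <- NULL

# Logical index of the 'id' columns, computed on the final column layout.
id_cols <- startsWith(names(data), "id")

# Character matrix of NAs that will replace the rating columns.
assigned <- matrix(NA_character_, nrow(data), sum(id_cols))

# One topic (row) per step: which.max() returns the position of the
# first maximum, so ties go to the leftmost id.
for (i in seq_len(nrow(data))) {
  row_ratings <- unlist(data[i, id_cols])
  assigned[i, which.max(row_ratings)] <- data$topic[i]
}

# Replace the ratings with the assignments.
data[id_cols] <- assigned
print(data)
```

Under these assumptions the loop should give the same table as the max.col version above. The vectorized form is usually preferable; the loop is mainly useful if each step needs extra logic, such as skipping ids that were already assigned in an earlier step.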
Upvotes: 1