Reputation: 135
I have a character matrix with 12 columns, each holding a top 5 of participants. It looks like this:
> top_5
4 5 8 9 11 12 15 16 19 20 22 23
[1,] "Nia" "Hung" "Hanaaa" "Ramziyya" "Marissa" "Jaelyn" "Shyanne" "Jaabir" "Dionicio" "Nia" "Shyanne" "Roger"
[2,] "Razeena" "Husni" "Bradly" "Marissa" "Bradly" "Muhsin" "Razeena" "Dionicio" "Magnus" "Kelsey" "Nia" "Schyler"
[3,] "Shyanne" "Schyler" "Necko" "Johannah" "Tatiana" "Glenn" "Nia" "Jaelyn" "Shyanne" "Hanaaa" "Mildred" "German"
[4,] "Schyler" "German" "Hung" "Lubaaba" "Johannah" "Magnus" "Dionicio" "German" "German" "Razeena" "Dionicio" "Jaabir"
[5,] "Husni" "Necko" "Razeena" "Afeefa" "Schyler" "Dionicio" "Jaabir" "Roger" "Johannah" "Remy" "Jaabir" "Jaelyn"
It can be recreated with:
top_5 <- structure(c("Nia", "Razeena", "Shyanne", "Schyler", "Husni",
"Hung", "Husni", "Schyler", "German", "Necko", "Hanaaa", "Bradly",
"Necko", "Hung", "Razeena", "Ramziyya", "Marissa", "Johannah",
"Lubaaba", "Afeefa", "Marissa", "Bradly", "Tatiana", "Johannah",
"Schyler", "Jaelyn", "Muhsin", "Glenn", "Magnus", "Dionicio",
"Shyanne", "Razeena", "Nia", "Dionicio", "Jaabir", "Jaabir",
"Dionicio", "Jaelyn", "German", "Roger", "Dionicio", "Magnus",
"Shyanne", "German", "Johannah", "Nia", "Kelsey", "Hanaaa", "Razeena",
"Remy", "Shyanne", "Nia", "Mildred", "Dionicio", "Jaabir", "Roger",
"Schyler", "German", "Jaabir", "Jaelyn"), .Dim = c(5L, 12L), .Dimnames = list(
NULL, c("4", "5", "8", "9", "11", "12", "15", "16", "19",
"20", "22", "23")))
If a participant is in the top row, they are in first place in that column (so in the first column "Nia" is first, "Razeena" is second, etc.). First place is worth 5 points, second place 4 points, and so on down to 1 point for fifth place. I want to calculate the total points for each participant in the matrix.
My goal is to make an overall top 5. How would I go about this?
Upvotes: 1
Views: 60
Reputation: 28675
Here is a "convert to long, then summarise by group" method similar to M--'s answer, but with data.table:
library(data.table)
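# Add a points column: row 1 gets .N (here 5) points, the last row gets 1
df <- as.data.table(top_5)[, points := .N:1]
# Reshape to long format, then sum the points per participant name
total_points <- melt(df, id.vars = 'points')[, .(points = sum(points)), by = value]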
setorder(total_points, -points)
head(total_points, 5)
# value points
# 1: Nia 17
# 2: Shyanne 16
# 3: Dionicio 14
# 4: Razeena 11
# 5: Schyler 10
Or an idea very similar to akrun's, just using tapply in place of sapply + split:
out <- sort(tapply(c(6 - row(top_5)), c(top_5), sum), decreasing = TRUE)
head(out, 5)
# Nia Shyanne Dionicio Razeena Schyler
# 17 16 14 11 10
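To see why 6 - row(top_5) yields the points, here is a minimal sketch on a small made-up matrix. The matrix m and the names in it are hypothetical and not part of the question's data; the constant 6 is generalised to nrow(m) + 1 so the same idea works for any number of ranks.

```r
# Hypothetical 3 x 2 ranking matrix (not the asker's data); row 1 is first place.
m <- matrix(c("Ann", "Bob", "Cy",
              "Bob", "Cy",  "Ann"),
            nrow = 3,
            dimnames = list(NULL, c("a", "b")))

# row(m) holds each cell's row index, so (nrow(m) + 1) - row(m) turns
# row 1 into 3 points, row 2 into 2 points and row 3 into 1 point.
pts <- (nrow(m) + 1) - row(m)

# c() flattens both matrices column by column, so names and points stay aligned.
totals <- sort(tapply(c(pts), c(m), sum), decreasing = TRUE)
print(totals)
```

With top_5, nrow(m) + 1 is 6, which is where the 6 in the one-liner comes from.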
Upvotes: 3
Reputation: 28825
Using tidyverse functions:
library(tidyr)
library(dplyr)
top_5 %>%
  as.data.frame %>%
  head(., 5) %>%
  mutate(rank = nrow(.):1) %>%
  pivot_longer(., -c(rank), values_to = "name", names_to = "col") %>%
  group_by(name) %>%
  summarise_at(vars(rank), list(points = sum))
#> # A tibble: 26 x 2
#> name points
#> <fct> <int>
#> 1 Husni 5
#> 2 Nia 17
#> 3 Razeena 11
#> 4 Schyler 10
#> 5 Shyanne 16
#> 6 German 9
#> 7 Hung 7
#> 8 Necko 4
#> 9 Bradly 8
#> 10 Hanaaa 8
#> # ... with 16 more rows
Upvotes: 3
Reputation: 887028
An option is to split the reversed row index by the matrix values into a list, then get the sum of each list element by looping over the list (sapply):
out <- sapply(split(row(top_5)[nrow(top_5):1, ], top_5), sum)
out
#Afeefa Bradly Dionicio German Glenn Hanaaa Hung Husni Jaabir Jaelyn Johannah Kelsey Lubaaba Magnus Marissa Mildred Muhsin
# 1 8 14 9 3 8 7 5 9 9 6 4 2 6 9 3 4
# Necko Nia Ramziyya Razeena Remy Roger Schyler Shyanne Tatiana
# 4 17 5 11 1 6 10 16 3
head(out[order(-out)], 5)
# Nia Shyanne Dionicio Razeena Schyler
# 17 16 14 11 10
Another option is rowsum:
rowsum(c(row(top_5)[nrow(top_5):1, ]), group = c(top_5))
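rowsum returns a one-column matrix rather than a named vector, so getting the overall top 5 takes one more step. A minimal sketch, assuming the matrix from the question; the names pts, totals and overall are only illustrative:

```r
# Recreate the asker's matrix (copied from the question).
top_5 <- structure(c("Nia", "Razeena", "Shyanne", "Schyler", "Husni",
"Hung", "Husni", "Schyler", "German", "Necko", "Hanaaa", "Bradly",
"Necko", "Hung", "Razeena", "Ramziyya", "Marissa", "Johannah",
"Lubaaba", "Afeefa", "Marissa", "Bradly", "Tatiana", "Johannah",
"Schyler", "Jaelyn", "Muhsin", "Glenn", "Magnus", "Dionicio",
"Shyanne", "Razeena", "Nia", "Dionicio", "Jaabir", "Jaabir",
"Dionicio", "Jaelyn", "German", "Roger", "Dionicio", "Magnus",
"Shyanne", "German", "Johannah", "Nia", "Kelsey", "Hanaaa", "Razeena",
"Remy", "Shyanne", "Nia", "Mildred", "Dionicio", "Jaabir", "Roger",
"Schyler", "German", "Jaabir", "Jaelyn"), .Dim = c(5L, 12L), .Dimnames = list(
NULL, c("4", "5", "8", "9", "11", "12", "15", "16", "19",
"20", "22", "23")))

# Points per cell: row 1 -> 5, ..., row 5 -> 1, flattened column by column.
pts <- c(row(top_5)[nrow(top_5):1, ])

# rowsum() gives a one-column matrix with one row per participant.
totals <- rowsum(pts, group = c(top_5))

# Order the rows by descending points and keep the first five.
overall <- head(totals[order(-totals[, 1]), , drop = FALSE], 5)
print(overall)
```

drop = FALSE keeps the result a matrix with the participant names as row names; without it the subset collapses to a named vector, which would also be fine here.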
Upvotes: 3