Reputation: 470
I have the results of a ranked choice voting poll that provides the data in this format:
choice_1 choice_2 choice_3 choice_4
<chr> <chr> <chr> <chr>
1 Tuesday E… Thursday … Wednesday… Monday E…
2 Saturday … Saturday … NA NA
3 Saturday … Tuesday E… Monday Ev… Wednesda…
4 Monday Ev… Tuesday E… Saturday … Sunday M…
I am trying to figure out a way to transform the data so it is structured like a ranked choice ballot actually would be, where the candidates are columns and the rankings are the values like so (not actual data):
| monday_evening | tuesday_evening | wednesday_evening |
| -------------- | --------------- | ----------------- |
| 1 | 3 | 2 |
| 2 | 1 | NA |
| 1 | 2 | 3 |
I know that pivot_longer
and pivot_wider
can be used to turn columns into values and turn values into columns respectively, but here I essentially need to do both at the same time and I'm totally lost as to if I can even make that happen.
Solution
I ended up using this to get the data formatted like a ballot would be:
x <- data %>%
mutate(rn = row_number()) %>%
pivot_longer(cols = contains("Choice"),
values_drop_na = TRUE) %>%
mutate(name = str_remove_all(name, "Choice ")) %>%
pivot_wider(names_from = value, values_from = name) %>%
select(-rn) %>%
mutate_all(as.integer)
Upvotes: 1
Views: 339
Reputation: 887221
if we need the suffix of column names a ranks, pivot to long format with pivot_longer
and then reshape to wide after splitting the 'name' column
library(dplyr)
library(tidyr)
library(data.table)
library(DescTools)
df1 %>%
pivot_longer(cols = everything(), values_drop_na = TRUE) %>%
separate(name, into = c('name1', 'name2'), convert = TRUE) %>%
select(-name1) %>%
arrange(match(word(value, 1), day.name)) %>%
mutate(rn = rowid(value)) %>%
pivot_wider(names_from = value, values_from = name2) %>%
select(-rn)
-output
# A tibble: 4 x 6
# `Monday Evening` `Tuesday Evening` `Wednesday Evening` `Thursday Evening` `Saturday Evening` `Sunday Morning`
# <int> <int> <int> <int> <int> <int>
#1 4 1 3 2 1 4
#2 3 2 4 NA 2 NA
#3 1 2 NA NA 1 NA
#4 NA NA NA NA 3 NA
df1 <- structure(list(choice_1 = c("Tuesday Evening", "Saturday Evening",
"Saturday Evening", "Monday Evening"), choice_2 = c("Thursday Evening",
"Saturday Evening", "Tuesday Evening", "Tuesday Evening"),
choice_3 = c("Wednesday Evening",
NA, "Monday Evening", "Saturday Evening"), choice_4 = c("Monday Evening",
NA, "Wednesday Evening", "Sunday Morning")), class = "data.frame", row.names = c("1",
"2", "3", "4"))
Upvotes: 3