Reputation: 1572
I have a df with this structure:
q1_1 q1_2 q1_3 q2_1 q2_2 q2_3
1 2 3 1 3 4
1 3 1 2 3
1 4 2
And would like to transform the data to this structure
Brand_1 Brand_2 Brand_3 Brand_4
q1 3 1 2 1
q2 2 2 2 1
So basically I'm reshaping the data BUT creating columns based on the counts of the values, so in this example I need a column named Brand_1 and that counts occurrences of 1 for q1 and q2. I've been able to reshape the data but cannot figure how to deal with the column names based on values.
Upvotes: 0
Views: 56
Reputation: 27732
a data.table
solution
library( data.table )
DT <- fread("q1_1 q1_2 q1_3 q2_1 q2_2 q2_3
1 2 3 1 3 4
1 3 NA 1 2 3
1 4 NA 2 NA NA")
#melt to long format
ans <- melt( DT, measure.vars = names(DT), na.rm = TRUE )
#summarise while casting to wide
dcast( ans, gsub("(^q[0-9]).*", "\\1", variable) ~
paste0("Brand_", value), fun.aggregate = length )
# variable Brand_1 Brand_2 Brand_3 Brand_4
# 1: q1 3 1 2 1
# 2: q2 2 2 2 1
Upvotes: 1
Reputation: 26218
Do this
library(tidyverse)
df %>% pivot_longer(everything(), names_to = "Brand", values_to = "Brand_no") %>%
filter(!is.na(Brand_no)) %>%
mutate(Brand = str_extract(Brand, ".*(?=_)"),
Brand_no = paste("Brand", Brand_no, sep = "_")) %>%
count(Brand, Brand_no) %>%
pivot_wider(id_cols = Brand, names_from = Brand_no, values_from = n, values_fill = 0)
# A tibble: 2 x 5
Brand Brand_1 Brand_2 Brand_3 Brand_4
<chr> <int> <int> <int> <int>
1 q1 3 1 2 1
2 q2 2 2 2 1
dput used
df <- structure(list(q1_1 = c(1L, 1L, 1L), q1_2 = 2:4, q1_3 = c(3L,
NA, NA), q2_1 = c(1L, 1L, 2L), q2_2 = c(3L, 2L, NA), q2_3 = c(4L,
3L, NA)), class = "data.frame", row.names = c(NA, -3L))
> df
q1_1 q1_2 q1_3 q2_1 q2_2 q2_3
1 1 2 3 1 3 4
2 1 3 NA 1 2 3
3 1 4 NA 2 NA NA
Upvotes: 0
Reputation: 11584
Does this work:
library(dplyr)
library(tidyr)
library(stringr)
df %>%
mutate(across(everything(), ~ recode(., `1` = 'Brand_1', `2` = 'Brand_2', `3` = 'Brand_3', `4` = 'Brand_4'))) %>%
pivot_longer(cols = everything()) %>% mutate(name = str_extract(name, 'q\\d')) %>% count(name,value) %>% na.omit() %>%
pivot_wider(id_cols = name, names_from = value, values_from = n)
# A tibble: 2 x 5
name Brand_1 Brand_2 Brand_3 Brand_4
<chr> <int> <int> <int> <int>
1 q1 3 1 2 1
2 q2 2 2 2 1
Upvotes: 1