Reputation: 162
I would like to sum together different columns using a mapping from an another dataframe.
This is a simplified version of my datasets:
df_raw = data.frame(grp1_1 = c(1,2,1,2,1,2,1,2), grp1_2 = c(1,1,1,5,3,1,1,6),
grp2_1 = c(1,1,2,3,5,6,2,1), grp2_2 = c(1,1,2,3,4,5,6,7),
grp3_1 = c(1,2,0,9,8,0,1,2))
df_groups = data.frame(cols = c("grp1_1","grp1_2","grp2_1","grp2_2","grp3_1"),
group = c("group1","group1","group2","group2","group3"))
The end result should be:
# A tibble: 8 x 3
# Rowwise:
group1 group2 group3
<dbl> <dbl> <dbl>
1 2 2 1
2 3 2 2
3 2 4 0
4 7 6 9
5 4 9 8
6 3 11 0
7 2 8 1
8 8 8 2
I already understand I would have to use transmute
,rowwise
and probably c_across
in order to reach the desired output, something like this:
df_raw %>%
rowwise() %>%
transmute(group1 = sum(c_across(df_groups$cols[df_groups$group == "group1"])) ,
group2 = sum(c_across(df_groups$cols[df_groups$group == "group2"])),
group3 = sum(c_across(df_groups$cols[df_groups$group == "group3"])))
There are two main problems here:
df_groups
as a new variable name in the processed df.Thank you.
Upvotes: 2
Views: 38
Reputation: 887691
We may split the df_groups and loop over the list
library(purrr)
library(dplyr)
imap(split(df_groups$cols, df_groups$group), ~
df_raw %>%
transmute(!! .y := rowSums(across(all_of(.x))))) %>%
list_cbind
-output
group1 group2 group3
1 2 2 1
2 3 2 2
3 2 4 0
4 7 6 9
5 4 9 8
6 3 11 0
7 2 8 1
8 8 8 2
Or another option is to reshape to 'long', join with the df_groups and reshape back to wide
library(tidyr)
df_raw %>%
mutate(rn = row_number()) %>%
pivot_longer(cols = -rn, names_to = 'cols') %>%
left_join(df_groups) %>%
dplyr::select(-cols) %>%
pivot_wider(names_from = group, values_from = value, values_fn = sum) %>%
dplyr::select(-rn)
-output
# A tibble: 8 × 3
group1 group2 group3
<dbl> <dbl> <dbl>
1 2 2 1
2 3 2 2
3 2 4 0
4 7 6 9
5 4 9 8
6 3 11 0
7 2 8 1
8 8 8 2
Upvotes: 2