Reputation: 1817
In R, I would like to manipulate (say multiply) data.frame columns with appropriately named values stored in a vector (or data.frame, if that's easier).
Let's say, I want to first summarise
the variables disp
, hp
, and wt
from the mtcars
dataset.
vars <- c("disp", "hp", "wt")
mtcars %>%
summarise_at(vars, funs(sum(.))
(throw a group_by(cyl)
into the mix, or use mutate_at
if you'd like to have more rows)
Now I'd like to multiply each of the resulting columns with a particular value, given by
multiplier <- c("disp" = 2, "hp" = 3, "wt" = 4)
Is it possible to refer to these within the summarise_at
function?
The result should look like this (and I don't want to have to refer to the variable names directly while getting there):
disp hp wt
14766.2 14082 411.808
UPDATE:
Maybe my MWE was too minimal. Let's say I want to do the same operation with a data.frame grouped by cyl
mtcars %>%
group_by(cyl) %>%
summarise_at(vars, sum)
The result should thus be:
cyl disp hp wt
1 4 2313.0 2727 100.572
2 6 2566.4 2568 87.280
3 8 9886.8 8787 223.956
UPDATE 2:
Maybe I was not explicit enough here either, but the columns in the data.frame should be multiplied by the respective values in the vector (and only those columns mentioned in the vector), so e.g. disp
should be multiplied by 2, hp
by 3 and wt
by 4, all other variables (e.g. cyl
) should remain untouched by the multiplication.
Upvotes: 1
Views: 318
Reputation: 886938
We could also do this with map
function from purrr
library(purrr)
mtcars %>%
summarise_at(vars, sum) %>%
map2_df(multiplier, `*`)
# disp hp wt
# <dbl> <dbl> <dbl>
# 1 14766.2 14082 411.808
For the updated question
d1 <- mtcars %>%
group_by(cyl) %>%
summarise_at(vars, sum)
d1 %>%
select(one_of(vars)) %>%
map2_df(multiplier[vars], ~ .x * .y) %>%
bind_cols(d1 %>% select(-one_of(vars)), .)
# cyl disp hp wt
# <dbl> <dbl> <dbl> <dbl>
#1 4 2313.0 2727 100.572
#2 6 2566.4 2568 87.280
#3 8 9886.8 8787 223.956
Or we can use gather/spread
library(tidyr)
mtcars %>%
group_by(cyl) %>%
summarise_at(vars, sum) %>%
gather(var, val, -cyl) %>%
mutate(val = val*multiplier[match(var, names(multiplier))]) %>%
spread(var, val)
# cyl disp hp wt
# <dbl> <dbl> <dbl> <dbl>
#1 4 2313.0 2727 100.572
#2 6 2566.4 2568 87.280
#3 8 9886.8 8787 223.956
Upvotes: 1
Reputation: 1287
I am not sure if you can do this in the summarise_at
function, but this is close alternative...
library(dplyr)
library(purrr)
vars <- c("disp", "hp", "wt")
multiplier <- c("disp" = 2, "hp" = 3, "wt" = 4)
mtcars %>%
summarise_at(vars, sum) %>%
do(. * multiplier)
disp hp wt
1 14766.2 14082 411.808
****REDUX****
Include the grouping var cyl
in the multiplier
and set it equal to 1. @akrun's
map2_df
does the real work here.
vars <- c("disp", "hp", "wt")
multiplier <- c("cyl" = 1, "disp" = 2, "hp" = 3, "wt" = 4)
mtcars %>%
group_by(cyl) %>%
summarise_at(vars, sum) %>%
map2_df(multiplier, ~ .x * .y)
cyl disp hp wt
<dbl> <dbl> <dbl> <dbl>
1 4 2313.0 2727 100.572
2 6 2566.4 2568 87.280
3 8 9886.8 8787 223.956
Upvotes: 0