ccamara

Reputation: 1225

Product of several columns on a data frame by a vector using dplyr

I would like to multiply several columns of a data frame by the values of a vector (every value within a column should be multiplied by the same value, which differs from column to column), while keeping the other columns as they are.

Since I'm using dplyr extensively, I thought it might be useful to use the mutate_each function so that I can modify all columns at the same time, but I am completely lost on the syntax of the funs() part.

On the other hand, I've read this solution, which is simple and works fine, but it only works for all columns rather than selected ones.

This is what I've done so far:

Imagine that I want to multiply all columns in df except letters by the weight_df vector, as follows:

df = data.frame(
  letters = c("A", "B", "C", "D"),
  col1 = c(3, 3, 2, 3),
  col2 = c(2, 2, 3, 1),
  col3 = c(4, 1, 1, 3)
)
> df
  letters col1 col2 col3
1       A    3    2    4
2       B    3    2    1
3       C    2    3    1
4       D    3    1    3
weight_df = c(1:3)

If I use select before applying mutate_each, I lose the letters column (as expected), which is not what I want (apart from the fact that the vector is applied row-wise rather than column-wise, and I want the opposite):

df = df %>% 
  select(-letters) %>% 
  mutate_each(funs(. * weight_df))
> df
  col1 col2 col3
1    3    2    4
2    6    4    2
3    6    9    3
4    3    1    3

But if I don't select any particular columns, all values within letters are removed (which makes a lot of sense, by the way), and that's not what I want either (again, the vector is applied row-wise rather than column-wise):

df = df %>% 
  mutate_each(funs(. * weight_df))
> df
  letters col1 col2 col3
1      NA    3    2    4
2      NA    6    4    2
3      NA    6    9    3
4      NA    3    1    3

(Please note that this is a very simple data frame; the original has many more rows and columns, which unfortunately are not labeled in such a simple way and follow no pattern.)
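
To make the row-wise behaviour described in the question concrete, here is a minimal base R sketch of what happens to a single column. It only reuses col1 and weight_df from the example; the names recycled and wanted are introduced purely for illustration.

```r
col1 <- c(3, 3, 2, 3)
weight_df <- 1:3

# Inside mutate_each(), `.` stands for one whole column, so the product
# pairs element i of the column with element i of weight_df and recycles
# the shorter vector. R warns because 4 is not a multiple of 3, so the
# warning is suppressed here.
recycled <- suppressWarnings(col1 * weight_df)
recycled  # 3 6 6 3, the same values as col1 in the output above

# What is actually wanted: a single scalar weight for the whole column.
wanted <- col1 * weight_df[1]
```

The answers below all amount to different ways of pairing each column with its own scalar weight.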

Upvotes: 1

Views: 4019

Answers (3)

manotheshark

Reputation: 4357

Try this:

library(plyr)
library(dplyr)

df %>% select_if(is.numeric) %>% adply(., 1, function(x) x * weight_df)
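
Note that select_if(is.numeric) drops the letters column from the result. As a possible alternative that keeps it, here is a sketch assuming dplyr 1.0 or later, where across() and cur_column() are available. The named vector weights is hypothetical and not part of the question; it maps each column name to its scalar weight.

```r
library(dplyr)

df <- data.frame(
  letters = c("A", "B", "C", "D"),
  col1 = c(3, 3, 2, 3),
  col2 = c(2, 2, 3, 1),
  col3 = c(4, 1, 1, 3)
)

# Hypothetical named weights: one scalar per column to rescale.
weights <- c(col1 = 1, col2 = 2, col3 = 3)

# across() visits each selected column in turn; cur_column() returns the
# name of the column being processed, which is used to look up its weight.
result <- df %>%
  mutate(across(all_of(names(weights)), ~ .x * weights[[cur_column()]]))
```

Columns that are not named in weights, such as letters, pass through unchanged, so no separate select step is needed.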

Upvotes: 2

David Arenburg

Reputation: 92292

The problem here is that you are basically trying to operate over rows rather than columns, hence methods such as mutate_* won't work. If you are not satisfied with the many vectorized approaches proposed in the linked question, then, using the tidyverse (and assuming that letters is a unique identifier), one way to achieve this is to convert to long form first, multiply a single column by group, and then convert back to wide (I don't think this will be overly efficient, though):

library(tidyr)
library(dplyr)

df %>% 
  gather(variable, value, -letters) %>%
  group_by(letters) %>%
  mutate(value = value * weight_df) %>%
  spread(variable, value)

#Source: local data frame [4 x 4]
#Groups: letters [4]

#     letters  col1  col2  col3
# *    <fctr> <dbl> <dbl> <dbl>
#   1       A     3     4    12
#   2       B     3     4     3
#   3       C     2     6     3
#   4       D     3     2     9 
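
The same long-then-wide idea can also be sketched with pivot_longer() and pivot_wider(), which tidyr 1.0 introduced as successors to gather() and spread(). This is a variant under assumptions, not the code from this answer: the named vector weights is hypothetical, and looking weights up by column name replaces the group_by() step, so the result no longer depends on row order within each group. It still assumes that letters uniquely identifies each row.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  letters = c("A", "B", "C", "D"),
  col1 = c(3, 3, 2, 3),
  col2 = c(2, 2, 3, 1),
  col3 = c(4, 1, 1, 3)
)

# Hypothetical named weights: one scalar per column to rescale.
weights <- c(col1 = 1, col2 = 2, col3 = 3)

result <- df %>%
  # One row per (letters, variable) pair.
  pivot_longer(-letters, names_to = "variable", values_to = "value") %>%
  # Look up each row's weight by the name of the column it came from.
  mutate(value = value * unname(weights[variable])) %>%
  # Back to one column per variable.
  pivot_wider(names_from = variable, values_from = value)
```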

Upvotes: 6

mabdrabo

Reputation: 1060

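Using the pipe from dplyr together with base R's sweep(). This picks out the numeric columns only, gives flexibility for choosing columns, and returns the new values along with all the other (non-numeric) columns.

library(dplyr)  # only needed for the %>% pipe

# Positions of the numeric columns
index <- which(sapply(df, is.numeric))
# Multiply column j of the selection by weight_df[j]
df[,index] <- df[,index] %>% sweep(2, weight_df, FUN="*")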

> df
  letters col1 col2 col3
1       A    3    4   12
2       B    3    4    3
3       C    2    6    3
4       D    3    2    9
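
Since the question asks about selected columns rather than every numeric one, the same sweep() call can be restricted by column name. The following is a base R sketch under that assumption; the named vector weights and the choice of col1 and col3 are hypothetical and only serve as an illustration.

```r
df <- data.frame(
  letters = c("A", "B", "C", "D"),
  col1 = c(3, 3, 2, 3),
  col2 = c(2, 2, 3, 1),
  col3 = c(4, 1, 1, 3)
)

# Hypothetical: rescale only col1 and col3, leaving col2 untouched.
weights <- c(col1 = 1, col3 = 3)
cols <- names(weights)

# MARGIN = 2 applies weights[j] to every value in the j-th selected column.
df[cols] <- sweep(df[cols], 2, weights, FUN = "*")
```

Selecting by name rather than by position keeps each weight tied to its column even if the column order of the data frame changes.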

Upvotes: 2
