woshishui

Reputation: 2064

r - use basic math operators between columns in dplyr

I have a data frame with exchange rates. I want to divide any column that starts with "rates." by the "rates.AUD" column.

df <- structure(list(timestamp = c(1490659199L, 1490745599L, 1490831999L, 
1490918399L, 1491004766L, 1491091173L, 1491177598L, 1491263999L, 
1491350399L, 1491436799L), rates.USD = c(1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L), rates.AUD = c(1.311019, 1.306745, 1.303737, 
1.306658, 1.31053, 1.31053, 1.310702, 1.314962, 1.321414, 1.321726
), rates.EUR = c(0.920726, 0.924523, 0.929473, 0.935651, 0.937734, 
0.937734, 0.937251, 0.937221, 0.936495, 0.937035)), .Names = c("timestamp", 
"rates.USD", "rates.AUD", "rates.EUR"), row.names = c(NA, 10L
), class = "data.frame")

I've tried the following:

library(tidyverse)
result <- df %>% mutate_at(vars(starts_with("rates.")), funs(./rates.AUD))

But it didn't apply the function to all columns that start with "rates.". rates.USD and rates.AUD changed, but rates.EUR stayed the same.

I'm a bit confused; any help is appreciated.

Upvotes: 2

Views: 2758

Answers (2)

Rafael Zayas

Reputation: 2451

I ran into the same issue and couldn't figure it out, so I posted it as an issue on the dplyr GitHub repo. The response was very helpful and applies to your case too. In short, mutate_at works through your data.frame column by column, so rates.AUD gets divided by itself, and that column (now consisting entirely of 1s) is then used in the calculations for the columns that come after it.
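To make the mechanism concrete, here is a minimal base R sketch that mimics a column-by-column update with a plain loop. It only illustrates the ordering effect and is not a description of how dplyr is implemented internally; the name `sequential` and the three-row subset of the question's data are my own choices for the example.

```r
# First three rows of the question's data, enough to show the effect.
df <- data.frame(
  timestamp = c(1490659199L, 1490745599L, 1490831999L),
  rates.USD = c(1, 1, 1),
  rates.AUD = c(1.311019, 1.306745, 1.303737),
  rates.EUR = c(0.920726, 0.924523, 0.929473)
)

# Update the columns one at a time, always reading the *current* rates.AUD.
sequential <- df
for (col in c("rates.USD", "rates.AUD", "rates.EUR")) {
  sequential[[col]] <- sequential[[col]] / sequential$rates.AUD
}

# rates.USD was divided by the original rates.AUD,
# rates.AUD was divided by itself and is now all 1s,
# rates.EUR was divided by those 1s and is unchanged.
print(sequential)
```

This reproduces what the question describes: rates.USD and rates.AUD change, while rates.EUR keeps its original values.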

Two approaches were suggested by Lionel Henry, which I'll update for this example.

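# Approach 1: read the denominator from the original df, which the pipeline never modifies
result2 <- df %>%
  mutate_at(vars(starts_with("rates.")), function(x) x / df$rates.AUD)

# Approach 2: pass the denominator as an extra argument; `.` is the data frame piped in by %>%
result3 <- df %>%
  mutate_at(vars(starts_with("rates.")), `/`, y = .$rates.AUD)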

Both return this:

    timestamp rates.USD rates.AUD rates.EUR
1  1490659199 0.7627655         1 0.7022980
2  1490745599 0.7652602         1 0.7075007
3  1490831999 0.7670259         1 0.7129298
4  1490918399 0.7653112         1 0.7160642
5  1491004766 0.7630501         1 0.7155380
6  1491091173 0.7630501         1 0.7155380
7  1491177598 0.7629499         1 0.7150756
8  1491263999 0.7604782         1 0.7127362
9  1491350399 0.7567651         1 0.7087067
10 1491436799 0.7565865         1 0.7089480
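The same idea, fixing the denominator before any column is touched, can probably also be written with across(). This is a sketch that assumes dplyr 1.0.0 or later (where across() is available); the names `aud` and `result4` are made up for this example, and the data is just the first three rows from the question.

```r
library(dplyr)

# First three rows of the question's data.
df <- data.frame(
  timestamp = c(1490659199L, 1490745599L, 1490831999L),
  rates.USD = c(1, 1, 1),
  rates.AUD = c(1.311019, 1.306745, 1.303737),
  rates.EUR = c(0.920726, 0.924523, 0.929473)
)

# Copy the denominator into a plain vector first, so later changes
# to the rates.AUD column cannot affect it.
aud <- df$rates.AUD  # hypothetical helper name

result4 <- df %>%
  mutate(across(starts_with("rates."), ~ .x / aud))

print(result4)
```

Because `aud` is an ordinary vector outside the data frame, the order in which the columns are processed no longer matters.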

Upvotes: 6

Lamia

Reputation: 3875

When you divide the three rates columns by rates.AUD, they are processed in order. rates.AUD is divided by itself and becomes all 1s before it is used to divide the rates.EUR column, so rates.EUR is divided by 1 and stays unchanged. A workaround is to reorder the columns so that rates.AUD comes last, e.g. df = df[,c(1,2,4,3)], before doing your calculations.
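A rough sketch of the reordering idea, moving rates.AUD to the end by name rather than by position so it does not depend on the exact column indices. A plain loop stands in here for the column-by-column update; the names `aud_last`, `reordered` and `result` are only for illustration, and the data is the first three rows from the question.

```r
# First three rows of the question's data.
df <- data.frame(
  timestamp = c(1490659199L, 1490745599L, 1490831999L),
  rates.USD = c(1, 1, 1),
  rates.AUD = c(1.311019, 1.306745, 1.303737),
  rates.EUR = c(0.920726, 0.924523, 0.929473)
)

# Put rates.AUD last so it is the final column to be overwritten.
aud_last <- c(setdiff(names(df), "rates.AUD"), "rates.AUD")
reordered <- df[, aud_last]

# Column-by-column division; rates.AUD is still intact when
# rates.USD and rates.EUR are processed.
for (col in grep("^rates\\.", names(reordered), value = TRUE)) {
  reordered[[col]] <- reordered[[col]] / reordered$rates.AUD
}

# Restore the original column order.
result <- reordered[, names(df)]
print(result)
```

The drawback of this workaround is that it silently depends on column order, so one of the approaches that fixes the denominator up front is likely more robust.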

Upvotes: 2
