Reputation: 262
R - Create multiple new columns using conditional statements
I am wondering if there is a way to create multiple columns based on a condition(s).
For example below, I have a dataframe with data and I want to create two columns based on the based on ccy. One column is a gbp conversion rate of the ccy and the other is a cad conversion.
If I pipe the mutates I can get it to work but there is repetition (and in my real problem, I have a complex list of ifelse so repeating the code for every column would create a lot of repetition).
df <- structure(list(product = c('option', 'forward', 'forward', 'option'),
ccy = c('usd', 'usd', 'eur', 'usd'),
amount = c(1000, 2000, 1000, 5000)),
.Names = c('product', 'ccy', 'amount'),
row.names = c(NA, 4L),
class = "data.frame")
df
product ccy amount
1 option usd 1000
2 forward usd 2000
3 forward eur 1000
4 option usd 5000
df %>% mutate(gbp_amount =
ifelse(ccy == 'usd', round(amount / 1.8, 2),
ifelse(ccy == 'eur', round(amount / 1.3, 2),
'not_converted'))) %>%
mutate(cad_amount =
ifelse(ccy == 'usd', round(amount / 0.85, 2),
ifelse(ccy == 'eur', round(amount / .7, 2),
'not_converted')))
product ccy amount gbp_amount cad_amount
1 option usd 1000 555.56 1176.47
2 forward usd 2000 1111.11 2352.94
3 forward eur 1000 769.23 1428.57
4 option usd 5000 2777.78 5882.35
Is there a way to create multiple columns based on a single if condition?
E.g., something like this pseudo code...
df %>% ifelse(df$ccy == 'usd',
(mutate(gbp_amount = round(amount / 1.8, 2)),
mutate(cad_amount = round(amount / 0.85, 2))),
ifelse(df$ccy == 'eur',
(mutate(gbp_amount = round(amount / 1.3, 2)),
mutate(cad_amount = round(amount / 0.7, 2))),
'not_converted'))
Upvotes: 1
Views: 1384
Reputation: 8760
You can use SQL-alike joins if you have many "equals" conditions.
I am using the data.table
syntax but you could do this dplyr
too:
library(data.table)
setDT(df)
# add a row which cannot be found ("joined") to demonstrate missing rates
df <- rbind(df, data.table(product = "option", ccy = "aud", amount = 3000))
df
lookup <- data.table(ccy = c("usd", "eur"),
gbp_rate = c( 1.8, 1.3),
cad_rate = c( 0.85, 0.7))
lookup
# ccy gbp_rate cad_rate
# 1: usd 1.8 0.85
# 2: eur 1.3 0.70
df[lookup, `:=`(gbp_amount = round(amount / gbp_rate, 2),
cad_amount = round(amount / cad_rate, 2)),
on = "ccy"]
df
# product ccy amount gbp_amount cad_amount
# 1: option usd 1000 555.56 1176.47
# 2: forward usd 2000 1111.11 2352.94
# 3: forward eur 1000 769.23 1428.57
# 4: option usd 5000 2777.78 5882.35
# 5: option aud 3000 NA NA
You have to sort the result as you like and mark the lookup errors (missing conversion rates) with another value than NA
if you want (but not with the string "not_converted"
like in your question since this would mix up the data type of the column - double vs character).
Upvotes: 2
Reputation: 107587
Consider building a rates data set and merge with your original, avoiding nested ifelse
:
rates_df <- data.frame(ccy = c('usd', 'eur'),
type = c('gbp', 'gbp', 'cad', 'cad'),
rate = c(1.8, 1.3, 0.85, 0.7),
stringsAsFactors = FALSE)
rates_df
df %>%
inner_join(rates_df, by="ccy") %>%
mutate(gbp_amount = ifelse(type=="gbp", round(amount / rate, 2), 0),
cad_amount = ifelse(type=="cad", round(amount / rate, 2), 0)) %>%
select(product, ccy, matches("amount")) %>%
group_by(product, ccy, amount) %>%
summarise_all(sum)
# # A tibble: 4 x 5
# # Groups: product, ccy [?]
# product ccy amount gbp_amount cad_amount
# <chr> <chr> <dbl> <dbl> <dbl>
# 1 forward eur 1000 769.23 1428.57
# 2 forward usd 2000 1111.11 2352.94
# 3 option usd 1000 555.56 1176.47
# 4 option usd 5000 2777.78 5882.35
Upvotes: 3
Reputation: 2091
You'll have to use a for-loop
if you want to do multiple actions. @R Yoda's solution is probably better though. Like he said, I'd use NA rather than a character string so you aren't mixing data types in a vector, otherwise it'll default to character.
for (i in 1:nrow(df)) {
if(df$ccy[i] == "usd") {
df$gbp_amount[i] <- round(df$amount[i] / 1.8, 2);
df$cad_amount[i] <- round(df$amount[i] / 0.85, 2);
} else {
NA
}
if(df$ccy[i] == "eur") {
df$gbp_amount[i] <- round(df$amount[i] / 1.3, 2);
df$cad_amount[i] <- round(df$amount[i] / 0.7, 2);
} else {
NA
}
}
Upvotes: 0