antimuon
antimuon

Reputation: 262

R - Create multiple new columns using conditional statements

R - Create multiple new columns using conditional statements

I am wondering if there is a way to create multiple columns based on a condition(s).

For example below, I have a dataframe with data and I want to create two columns based on the based on ccy. One column is a gbp conversion rate of the ccy and the other is a cad conversion.

If I pipe the mutates I can get it to work but there is repetition (and in my real problem, I have a complex list of ifelse so repeating the code for every column would create a lot of repetition).

df <- structure(list(product = c('option', 'forward', 'forward', 'option'),
                 ccy = c('usd', 'usd', 'eur', 'usd'),
                 amount = c(1000, 2000, 1000, 5000)),
            .Names = c('product', 'ccy', 'amount'),
            row.names = c(NA, 4L),
            class = "data.frame")
df
  product ccy amount
1  option usd   1000
2 forward usd   2000
3 forward eur   1000
4  option usd   5000

df %>% mutate(gbp_amount = 
                  ifelse(ccy == 'usd', round(amount / 1.8, 2),
                         ifelse(ccy == 'eur', round(amount / 1.3, 2),
                                'not_converted'))) %>% 
    mutate(cad_amount = 
               ifelse(ccy == 'usd', round(amount / 0.85, 2),
                      ifelse(ccy == 'eur', round(amount / .7, 2),
                             'not_converted')))

  product ccy amount gbp_amount cad_amount
1  option usd   1000     555.56    1176.47
2 forward usd   2000    1111.11    2352.94
3 forward eur   1000     769.23    1428.57
4  option usd   5000    2777.78    5882.35

Is there a way to create multiple columns based on a single if condition?

E.g., something like this pseudo code...

df %>% ifelse(df$ccy == 'usd',
        (mutate(gbp_amount = round(amount / 1.8, 2)),
        mutate(cad_amount = round(amount / 0.85, 2))),
    ifelse(df$ccy == 'eur',
        (mutate(gbp_amount = round(amount / 1.3, 2)),
        mutate(cad_amount = round(amount / 0.7, 2))),
        'not_converted'))

Upvotes: 1

Views: 1384

Answers (3)

R Yoda
R Yoda

Reputation: 8760

You can use SQL-alike joins if you have many "equals" conditions.

I am using the data.table syntax but you could do this dplyr too:

library(data.table)

setDT(df)

# add a row which cannot be found ("joined") to demonstrate missing rates
df <- rbind(df, data.table(product = "option", ccy = "aud", amount = 3000))
df

lookup <- data.table(ccy      = c("usd", "eur"),
                     gbp_rate = c( 1.8,   1.3),
                     cad_rate = c( 0.85,  0.7))
lookup
#    ccy gbp_rate cad_rate
# 1: usd      1.8     0.85
# 2: eur      1.3     0.70

df[lookup, `:=`(gbp_amount = round(amount / gbp_rate, 2),
                cad_amount = round(amount / cad_rate, 2)),
                on = "ccy"]
df
#    product ccy amount gbp_amount cad_amount
# 1:  option usd   1000     555.56    1176.47
# 2: forward usd   2000    1111.11    2352.94
# 3: forward eur   1000     769.23    1428.57
# 4:  option usd   5000    2777.78    5882.35
# 5:  option aud   3000         NA         NA

You have to sort the result as you like and mark the lookup errors (missing conversion rates) with another value than NA if you want (but not with the string "not_converted" like in your question since this would mix up the data type of the column - double vs character).

Upvotes: 2

Parfait
Parfait

Reputation: 107587

Consider building a rates data set and merge with your original, avoiding nested ifelse:

rates_df <- data.frame(ccy = c('usd', 'eur'),
                       type = c('gbp', 'gbp', 'cad', 'cad'),
                       rate = c(1.8, 1.3, 0.85, 0.7),
                       stringsAsFactors = FALSE)    
rates_df

df %>% 
  inner_join(rates_df, by="ccy") %>%
  mutate(gbp_amount = ifelse(type=="gbp", round(amount / rate, 2), 0),
         cad_amount = ifelse(type=="cad", round(amount / rate, 2), 0)) %>%
  select(product, ccy, matches("amount")) %>%
  group_by(product, ccy, amount) %>%
  summarise_all(sum)

# # A tibble: 4 x 5
# # Groups:   product, ccy [?]
#   product   ccy amount gbp_amount cad_amount
#     <chr> <chr>  <dbl>      <dbl>      <dbl>
# 1 forward   eur   1000     769.23    1428.57
# 2 forward   usd   2000    1111.11    2352.94
# 3  option   usd   1000     555.56    1176.47
# 4  option   usd   5000    2777.78    5882.35

Upvotes: 3

Anonymous coward
Anonymous coward

Reputation: 2091

You'll have to use a for-loop if you want to do multiple actions. @R Yoda's solution is probably better though. Like he said, I'd use NA rather than a character string so you aren't mixing data types in a vector, otherwise it'll default to character.

for (i in 1:nrow(df)) {
  if(df$ccy[i] == "usd") {
    df$gbp_amount[i] <- round(df$amount[i] / 1.8, 2);
    df$cad_amount[i] <- round(df$amount[i] / 0.85, 2);
  } else {
    NA
 }
  if(df$ccy[i] == "eur") {
    df$gbp_amount[i] <- round(df$amount[i] / 1.3, 2);
    df$cad_amount[i] <- round(df$amount[i] / 0.7, 2);
  } else {
    NA
  }
}

Upvotes: 0

Related Questions