Sklenicka
Sklenicka

Reputation: 655

R - mutate if + dynamically create column name

My df looks like this:

df <- read.table(text="
   expenses     month     paid_gas   paid_fees  paid_hotel   name
   100          2019-01   20         70         10           Jack Carver
   200          2019-02   40         140        20           Jack Carver
", header=TRUE)

I´d like to calculate how much % of expenses column is created by each column with prefix paid. In other words i´d like to create something like this:

result <- 
  mutate(
    prc_gas = paid_gas/expenses
)

But I don´t want to do it manually for each column since my df have dozens of paid_ columns + name of the newly created column should be always text after prefix. So result should be

 result  <- read.table(text="
       expenses     month     paid_gas   paid_fees  paid_hotel   name           prc_gas    prc_fees   prc_hote
       100          2019-01   20         70         10           Jack Carver    20         70         10     
       200          2019-02   40         140        20           Jack Carver    20         70         10     
    ", header=TRUE) 

Upvotes: 2

Views: 239

Answers (3)

Maurits Evers
Maurits Evers

Reputation: 50738

We can use mutate_at with a named list function to automatically create new columns

library (dplyr) # for mutate_at()

df %>% mutate_at(vars(starts_with("paid")), list(prc = ~. / expenses))
#  expenses   month paid_gas paid_fees paid_hotel        name paid_gas_prc
#1      100 2019-01       20        70         10 Jack Carver          0.2
#2      200 2019-02       40       140         20 Jack Carver          0.2
#  paid_fees_prc paid_hotel_prc
#1           0.7            0.1
#2           0.7            0.1

Note that you are missing some single ticks in your sample data df.


Sample data

df <- read.table(text="expenses     month     paid_gas   paid_fees  paid_hotel   name
  100          2019-01   20         70         10           'Jack Carver'
  200          2019-02   40         140        20           'Jack Carver'", header=TRUE)

Upvotes: 3

akrun
akrun

Reputation: 887951

Here is an option with data.table

library(data.table)
nm1 <- startsWith(names(df), "paid")
setDT(df)[, paste0("perc_", names(df)[nm1]) :=
             lapply(.SD, `/`, expenses), .SDcols = nm1]
df
#   expenses   month paid_gas paid_fees paid_hotel        name perc_paid_gas perc_paid_fees perc_paid_hotel
#1:      100 2019-01       20        70         10 Jack Carver           0.2            0.7             0.1
#2:      200 2019-02       40       140         20 Jack Carver           0.2            0.7             0.1

Upvotes: 0

Ronak Shah
Ronak Shah

Reputation: 389325

We could also use base R lapply to calculate for multiple columns

inds <- grep("^paid", names(df), value = TRUE)
df[paste0("perc_", inds)] <- lapply(df[inds], function(x) x/df$expenses)

#  expenses   month paid_gas paid_fees paid_hotel      name        
#1      100 2019-01       20        70         10 Jack Carver           
#2      200 2019-02       40       140         20 Jack Carver                      

#  perc_paid_gas perc_paid_fees perc_paid_hotel
#       0.2            0.7             0.1
#       0.2            0.7            0.1

Or with mapply

df[paste0("perc_", inds)] <- mapply(`/`, df[inds], df["expenses"])

Upvotes: 2

Related Questions