user7662688
user7662688

Reputation:

Subtract multiple columns based on condition in R

I have a dataset which looks like this. In my new dataset, I want to subtract the amount column(s) with remainder(s) column. For instance, if there is 5 amount column and 3 remainder column then the first amount column must be subtracted from the first remainder column, 2nd with 2nd and 3rd with 3rd. The last remaining 2 amount columns must be as it is.

amount1  amount2   amount3   amount4  amount5     remainder1  remainder2    remainder3  
 100      250       150        250       100         80         100          100 
 200      200       350        450       100        120         100          50
 300      150       450        200       100        150         100          100
 250      550       550        100       200         50         500          200
 550      200       650        250       200        500         100          500

My new dataset must look like this. Please note am stands for amount and rem stands for remainder.

 newamount1       newamount2         newamount3      newamount4     newamount5     
 20(am1-rem1)        150(am2-rem2)      50(am3-rem3)   amount4        amount5
 80                  100                300            amount4        amount5
 150                 50                 350            amount4        amount5
 200                 50                 350            amount4        amount5
 50                  100                100            amount4        amount4

Upvotes: 0

Views: 286

Answers (2)

Ronak Shah
Ronak Shah

Reputation: 389215

In base R, you can use :

am_cols <- grep('Principal_', names(summary))

rm_cols <- grep('PrincipalPaid_', names(summary))

summary[is.na(summary)] <- 0

result <- cbind(summary[head(am_cols, length(rm_cols))] - summary[rm_cols], 
      summary[tail(am_cols, -length(rm_cols))])
names(result) <- paste0('new_principal', seq_along(result))
cbind(summary[1], result)

#  LoanId new_principal1 new_principal2 new_principal3 new_principal4
#1 825334           1965           2030           2051           2095
#2 825337           1965           2030           2051           2095
#3 838276           1961           2026           2047           2091
#4 850614              0           2275           2296           2346
#5 851791              0           2037           2059           2103

Upvotes: 0

Ben
Ben

Reputation: 30494

Another approach, using tidyverse could be the following.

Add row numbers to your data. Use pivot_longer to put your data into long format, and subtract remainder from amount for each month/row. Then put back into wide format.

Note this assumes instead of numbers at end of each column name, it ends with an underscore _ and month abbreviation, as per comment.

library(tidyr)
library(dplyr)

df %>%
  mutate(rn = row_number()) %>%
  pivot_longer(cols = -rn, names_to = c(".value", "month"), names_sep = "_") %>%
  mutate(newamount = amount - remainder) %>%
  pivot_wider(id_cols = rn, names_from = month, values_from = newamount, names_prefix = "newamount_") %>%
  dplyr::select(-rn)

Output

  newamount_Jan newamount_Feb newamount_Mar newamount_Apr newamount_May
          <dbl>         <dbl>         <dbl>         <dbl>         <dbl>
1            20           150            50            NA            NA
2            80           100           300            NA            NA
3           150            50           350            NA            NA
4           200            50           350            NA            NA
5            50           100           150            NA            NA

Upvotes: 0

Related Questions