Gainz
Gainz

Reputation: 1771

Create multiple percentage columns based on existing columns in R

I want to create multiple columns that will show the percentage of each element of col2, col3 and Total. The code I came up with only paste the percentage in those columns instead of pasting it in new columns. I have searched on stack and google but I have not found the answer I was looking for.

Sample data :

data <- data.table(col1= c("A", "B", "C"),
                   col2= c(43,23,19),
                   col3= c(102,230,149))
data <- data[, Total := col2 + col3]
data <- janitor::adorn_title(data)

Output :

  col1 col2 col3 Total
    A   43  102   145
    B   23  230   253
    C   19  149   168
Total   85  481   566

My percentage function :

add_percent <- function(dt, col_no_percent, col_percent){
  dt <- dt[
    , c(.SD[, col_no_percent, with=FALSE],
        lapply(.SD[, col_percent, with=FALSE], function(x){
          paste0(x, format(round(x / sum(x) * 100 * 2, 1), nsmall = 1, decimal.mark = "."))

        }))
    ]
}

Data output with my function:

data <- add_percent(data, "col1", c("col2", "col3", "Total"))
    col1    col2     col3    Total
       A 43 50.6 102 21.2 145 25.6
       B 23 27.1 230 47.8 253 44.7
       C 19 22.4 149 31.0 168 29.7
   Total 85 100.0 481 100.0 566 100.0

Data output I want :

  col1 col2 col3 Total col2.x col3.x Total.x
    A   43  102   145   50.6   21.2   25.6
    B   23  230   253   27.1   47.8   44.7
    C   19  149   168   22.4   31.0   29.7
Total   85  481   566  100.0  100.0  100.0

It is possible that my data will contain way more columns, so all the new columns will have to be created "automatically". So I would like to know how to generate those columns based on my percent function or even a more efficient way if possible.

Thank you.

Upvotes: 0

Views: 372

Answers (2)

TimTeaFan
TimTeaFan

Reputation: 18581

I know it is a data.table question, but dplyr has a really nice way of doing this. So just to add it as one possible answer.

library(dplyr)

# this is your function (slightly changed)
as_perc <- function(x) {

  paste0(format(100 * (round(x/ sum(x),  2)), nsmall = 1, decimal.mark = "."), "%")

}

data %>%
   mutate_if(is.numeric, .funs = list(perc = ~ as_perc(.)))

  col1 col2 col3 Total col2_perc col3_perc Total_perc
1    A   43  102   145     51.0%     21.0%      26.0%
2    B   23  230   253     27.0%     48.0%      45.0%
3    C   19  149   168     22.0%     31.0%      30.0%

Upvotes: 1

IceCreamToucan
IceCreamToucan

Reputation: 28705

Initial Data. Note I removed the janitor step. Will do that part at the end.

data <- data.table(col1= c("A", "B", "C"),
                   col2= c(43,23,19),
                   col3= c(102,230,149))
data <- data[, Total := col2 + col3]

Add percent columns for all numeric columns and add "Total" row

cols <- names(data)[sapply(data, is.numeric)]

data[, paste0(cols, '_pct') := lapply(.SD, function(x) 100*x/sum(x))
   , .SDcols = cols]


adorn_totals(data)
 #  col1 col2 col3 Total  col2_pct  col3_pct Total_pct
 #     A   43  102   145  50.58824  21.20582  25.61837
 #     B   23  230   253  27.05882  47.81705  44.69965
 #     C   19  149   168  22.35294  30.97713  29.68198
 # Total   85  481   566 100.00000 100.00000 100.00000

Upvotes: 5

Related Questions