Reputation: 1771
I want to create multiple columns that will show the percentage of each element of col2
, col3
and Total
. The code I came up with only paste
the percentage in those columns instead of pasting it in new columns.
I have searched on stack and google but I have not found the answer I was looking for.
Sample data :
data <- data.table(col1= c("A", "B", "C"),
col2= c(43,23,19),
col3= c(102,230,149))
data <- data[, Total := col2 + col3]
data <- janitor::adorn_title(data)
Output :
col1 col2 col3 Total
A 43 102 145
B 23 230 253
C 19 149 168
Total 85 481 566
My percentage function :
add_percent <- function(dt, col_no_percent, col_percent){
dt <- dt[
, c(.SD[, col_no_percent, with=FALSE],
lapply(.SD[, col_percent, with=FALSE], function(x){
paste0(x, format(round(x / sum(x) * 100 * 2, 1), nsmall = 1, decimal.mark = "."))
}))
]
}
Data output with my function:
data <- add_percent(data, "col1", c("col2", "col3", "Total"))
col1 col2 col3 Total
A 43 50.6 102 21.2 145 25.6
B 23 27.1 230 47.8 253 44.7
C 19 22.4 149 31.0 168 29.7
Total 85 100.0 481 100.0 566 100.0
Data output I want :
col1 col2 col3 Total col2.x col3.x Total.x
A 43 102 145 50.6 21.2 25.6
B 23 230 253 27.1 47.8 44.7
C 19 149 168 22.4 31.0 29.7
Total 85 481 566 100.0 100.0 100.0
It is possible that my data will contain way more columns, so all the new columns will have to be created "automatically". So I would like to know how to generate those columns based on my percent function or even a more efficient way if possible.
Thank you.
Upvotes: 0
Views: 372
Reputation: 18581
I know it is a data.table question, but dplyr has a really nice way of doing this. So just to add it as one possible answer.
library(dplyr)
# this is your function (slightly changed)
as_perc <- function(x) {
paste0(format(100 * (round(x/ sum(x), 2)), nsmall = 1, decimal.mark = "."), "%")
}
data %>%
mutate_if(is.numeric, .funs = list(perc = ~ as_perc(.)))
col1 col2 col3 Total col2_perc col3_perc Total_perc
1 A 43 102 145 51.0% 21.0% 26.0%
2 B 23 230 253 27.0% 48.0% 45.0%
3 C 19 149 168 22.0% 31.0% 30.0%
Upvotes: 1
Reputation: 28705
Initial Data. Note I removed the janitor step. Will do that part at the end.
data <- data.table(col1= c("A", "B", "C"),
col2= c(43,23,19),
col3= c(102,230,149))
data <- data[, Total := col2 + col3]
Add percent columns for all numeric columns and add "Total" row
cols <- names(data)[sapply(data, is.numeric)]
data[, paste0(cols, '_pct') := lapply(.SD, function(x) 100*x/sum(x))
, .SDcols = cols]
adorn_totals(data)
# col1 col2 col3 Total col2_pct col3_pct Total_pct
# A 43 102 145 50.58824 21.20582 25.61837
# B 23 230 253 27.05882 47.81705 44.69965
# C 19 149 168 22.35294 30.97713 29.68198
# Total 85 481 566 100.00000 100.00000 100.00000
Upvotes: 5