Nick Knauer
Nick Knauer

Reputation: 4243

Get Percentage of All Columns by Column Total

I have a dataframe that looks like this:

Label       Jan       Feb       Mar
    A         1         2         3
    B         1         4         9
    C         1         4         3

The number of columns after "Label" will continue to increase month over month.

I want to create an output that looks like this:

Label       Jan       Feb       Mar
    A      0.33      0.20      0.20
    B      0.33      0.40      0.40
    C      0.33      0.40      0.20

Below works but removes the label. I can cbind the label back but wondering if there is an easier way

library(dplyr)
df[,-1] %>%
  mutate_all(funs(./sum(.))))

Upvotes: 1

Views: 1566

Answers (1)

akrun
akrun

Reputation: 887911

We can use mutate_at/mutate_if or mutate with across (from dplyr version >= 1.00)

library(dplyr)
df1 <- df %>%
     mutate(across(where(is.numeric), ~ ./sum(.)))

if we use an older dplyr version < 1.00

df %>%
    mutate_if(is.numeric, ~ ./sum(.))

If we need percentage, we can use formattable::percent which would enable it to remain as a numeric column (in case we want to do color scaling on those columns)

df %>%
    mutate(across(where(is.numeric), ~ 
           formattable::percent(./sum(.))))

-output

# Label    Jan    Feb    Mar
#1     A 33.33% 20.00% 20.00%
#2     B 33.33% 40.00% 60.00%
#3     C 33.33% 40.00% 20.00%

data

df - structure(list(Label = c("A", "B", "C"), Jan = c(1L, 1L, 1L), 
    Feb = c(2L, 4L, 4L), Mar = c(3L, 9L, 3L)),
    class = "data.frame", row.names = c(NA, 
-3L))

Upvotes: 3

Related Questions