Reputation: 1209
I have a dataset with values in the first row and totals in the third row. I want to add a fourth row holding the percentage of the first row relative to the total, i.e. the first row divided by the third row.
Below is the structure of the data frame:
ds = structure(list(t1 = structure(c("1", "2", "Total"), label = "currently smoke any tobacco product", labels = c(no = 0,
yes = 1), class = "haven_labelled"), c1Female = c(679357.516868591,
8394232.81394577, 9073590.33081436), c1Male = c(2254232.8617363,
5802560.20343018, 8056793.06516647), se.c1Female = c(63743.4459540534,
421866.610586848, 485610.056540901), se.c1Male = c(185544.754820322,
386138.725133411, 571683.479953732), Total_1 = c(`1` = 2933590.37860489,
`2` = 14196793.0173759, `3` = 17130383.3959808), per = c(`1` = 0.171250713471665,
`2` = 0.828749286528335, `3` = 1)), class = "data.frame", row.names = c(NA,
-3L))
My attempt (what is wrong with it?):
ds %>% mutate(percentage = .[1,]/.[3,])
Desired output: below is the dput of the data frame I want:
structure(list(t1 = structure(c(1L, 2L, 4L, 3L), .Label = c("1",
"2", "Percentage", "Total"), class = "factor"), c1Female = c(679357.517,
8394232.814, 9073590.331, 0.074871963), c1Male = c(2254232.86,
5802560.2, 8056793.07, 0.279792821), se.c1Female = c(63743.446,
421866.611, 485610.057, 0.131264674), se.c1Male = c(185544.755,
386138.725, 571683.48, 0.324558539), Total_1 = c(2933590.38,
14196793.02, 17130383.4, 0.171250714), per = c(0.171250713, 0.828749287,
1, 0.171250713)), class = "data.frame", row.names = c(NA, -4L
))
Please share a tidyverse way to do this. Also, what is wrong with the line of code below?
ds %>% mutate(percentage = .[1,]/.[3,])
Upvotes: 2
Views: 137
Reputation: 887108
We can use summarise_at to divide the first value by the third in each column, which returns a single row, and then bind that row to the original dataset:
library(dplyr)
ds %>%
  summarise_at(-1, ~ .[1]/.[3]) %>%
  mutate(t1 = 'Percentage') %>%
  bind_rows(ds, .)
# t1 c1Female c1Male se.c1Female se.c1Male Total_1 per
#1 1 6.793575e+05 2.254233e+06 6.374345e+04 1.855448e+05 2.933590e+06 0.1712507
#2 2 8.394233e+06 5.802560e+06 4.218666e+05 3.861387e+05 1.419679e+07 0.8287493
#3 Total 9.073590e+06 8.056793e+06 4.856101e+05 5.716835e+05 1.713038e+07 1.0000000
#4 Percentage 7.487196e-02 2.797928e-01 1.312647e-01 3.245585e-01 1.712507e-01 0.1712507
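In dplyr 1.0.0 and later, summarise_at is superseded by across(). A minimal sketch of the same idea, assuming dplyr >= 1.0.0; toy is a small made-up data frame standing in for ds so the snippet runs on its own:

```r
library(dplyr)

# Hypothetical stand-in for ds: one label column plus numeric columns
toy <- data.frame(t1 = c("1", "2", "Total"),
                  a  = c(1, 3, 4),
                  b  = c(2, 6, 8))

res <- toy %>%
  # For every column except t1, divide the first value by the third
  summarise(across(-t1, ~ .x[1] / .x[3])) %>%
  # Label the new single row
  mutate(t1 = "Percentage") %>%
  # Append it below the original rows (columns are matched by name)
  bind_rows(toy, .)

res
```

Selecting columns with -t1 rather than -1 keeps the code working if the column order changes.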
Another option is add_row (replace_na comes from tidyr):
library(tidyr)  # provides replace_na()
ds %>%
  add_row(t1 = 'Percentage') %>%
  mutate_at(-1, ~ replace_na(., .[1]/.[3]))
Or do this within the add_row step itself:
ds %>%
  add_row(t1 = 'Percentage', !!!as.list(.[-1][1,]/.[-1][3,]))
# t1 c1Female c1Male se.c1Female se.c1Male Total_1 per
#1 1 6.793575e+05 2.254233e+06 6.374345e+04 1.855448e+05 2.933590e+06 0.1712507
#2 2 8.394233e+06 5.802560e+06 4.218666e+05 3.861387e+05 1.419679e+07 0.8287493
#3 Total 9.073590e+06 8.056793e+06 4.856101e+05 5.716835e+05 1.713038e+07 1.0000000
#4 Percentage 7.487196e-02 2.797928e-01 1.312647e-01 3.245585e-01 1.712507e-01 0.1712507
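As for the attempt in the question: mutate adds or changes columns and never adds rows, and .[1,]/.[3,] divides two one-row data frames that still include the non-numeric t1 column. A small base R sketch on a made-up data frame (toy is hypothetical, not the asker's data) shows what the row division returns once the label column is dropped:

```r
# Hypothetical stand-in for ds: one label column plus numeric columns
toy <- data.frame(t1 = c("1", "2", "Total"),
                  a  = c(1, 3, 4),
                  b  = c(2, 6, 8))

# Dividing one row by another (numeric columns only) returns a
# one-row data frame, not a vector that could become a new column
ratio <- toy[1, -1] / toy[3, -1]

class(ratio)
dim(ratio)
```

So the ratio is a row to append (with bind_rows or add_row), not something to create with mutate.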
Upvotes: 2