Catalyst
Catalyst

Reputation: 426

How to match two columns then sum the other column variable and add new row with different name for it?

I would like to take the sum of a column variable when the other two columns match and then adding a new row with different name for this sum result. I'm able to get the first part but unsure how to add the new row for the result obtained...

Here's my dataframe:

df <- structure(list(measure_name = c("Prevalence", "Prevalence", "Incidence", 
"Incidence", "Deaths", "Deaths", "YLLs (Years of Life Lost)", 
"YLLs (Years of Life Lost)", "YLDs (Years Lived with Disability)", 
"YLDs (Years Lived with Disability)", "DALYs (Disability-Adjusted Life Years)", 
"DALYs (Disability-Adjusted Life Years)", "Prevalence", "Prevalence", 
"Incidence", "Incidence", "YLDs (Years Lived with Disability)", 
"YLDs (Years Lived with Disability)", "DALYs (Disability-Adjusted Life Years)", 
"DALYs (Disability-Adjusted Life Years)"), age_name = c("1-4 years", 
"5-9 years", "1-4 years", "5-9 years", "1-4 years", "5-9 years", 
"1-4 years", "5-9 years", "1-4 years", "5-9 years", "1-4 years", 
"5-9 years", "1-4 years", "5-9 years", "1-4 years", "5-9 years", 
"1-4 years", "5-9 years", "1-4 years", "5-9 years"), cause_name = c("Asthma", 
"Asthma", "Asthma", "Asthma", "Asthma", "Asthma", "Asthma", "Asthma", 
"Asthma", "Asthma", "Asthma", "Asthma", "Attention-deficit/hyperactivity disorder", 
"Attention-deficit/hyperactivity disorder", "Attention-deficit/hyperactivity disorder", 
"Attention-deficit/hyperactivity disorder", "Attention-deficit/hyperactivity disorder", 
"Attention-deficit/hyperactivity disorder", "Attention-deficit/hyperactivity disorder", 
"Attention-deficit/hyperactivity disorder"), val = c(21809765.44, 
33602368.48, 10004723.65, 6417738.685, 6101.934992, 1699.9247, 
524901.7761, 138969.73, 879880.8571, 1355302.883, 1404782.633, 
1494272.613, 1367581.312, 14033704.42, 1314270.786, 2654899.128, 
16774.31306, 171847.3209, 16774.31306, 171847.3209)), row.names = c(NA, 
-20L), class = c("tbl_df", "tbl", "data.frame"))

> df
# A tibble: 20 × 4
   measure_name                           age_name  cause_name                                     val
   <chr>                                  <chr>     <chr>                                        <dbl>
 1 Prevalence                             1-4 years Asthma                                   21809765.
 2 Prevalence                             5-9 years Asthma                                   33602368.
 3 Incidence                              1-4 years Asthma                                   10004724.
 4 Incidence                              5-9 years Asthma                                    6417739.
 5 Deaths                                 1-4 years Asthma                                       6102.
 6 Deaths                                 5-9 years Asthma                                       1700.
 7 YLLs (Years of Life Lost)              1-4 years Asthma                                     524902.
 8 YLLs (Years of Life Lost)              5-9 years Asthma                                     138970.
 9 YLDs (Years Lived with Disability)     1-4 years Asthma                                     879881.
10 YLDs (Years Lived with Disability)     5-9 years Asthma                                    1355303.
11 DALYs (Disability-Adjusted Life Years) 1-4 years Asthma                                    1404783.
12 DALYs (Disability-Adjusted Life Years) 5-9 years Asthma                                    1494273.
13 Prevalence                             1-4 years Attention-deficit/hyperactivity disorder  1367581.
14 Prevalence                             5-9 years Attention-deficit/hyperactivity disorder 14033704.
15 Incidence                              1-4 years Attention-deficit/hyperactivity disorder  1314271.
16 Incidence                              5-9 years Attention-deficit/hyperactivity disorder  2654899.
17 YLDs (Years Lived with Disability)     1-4 years Attention-deficit/hyperactivity disorder    16774.
18 YLDs (Years Lived with Disability)     5-9 years Attention-deficit/hyperactivity disorder   171847.
19 DALYs (Disability-Adjusted Life Years) 1-4 years Attention-deficit/hyperactivity disorder    16774.
20 DALYs (Disability-Adjusted Life Years) 5-9 years Attention-deficit/hyperactivity disorder   171847.

Desired output using prevalence of asthma as an example:

 measure_name                           age_name  cause_name                                     val
   <chr>                                  <chr>     <chr>                                        <dbl>
 1 Prevalence                             1-4 years Asthma                                   21809765.
 2 Prevalence                             5-9 years Asthma                                   33602368.
 3 Prevalence                             1–9 years Asthma                                   55412133.

What I have tried:

Total <- df %>%
          group_by(measure_name, cause_name) %>%
          summarise(val = sum(val, na.rm = TRUE)) %>% arrange(cause_name, factor(measure_name, levels = c("Prevalence",
                                                                                                      "Incidence",
                                                                                                      "Deaths",
                                                                                                      "YLLs (Years of Life Lost)",
                                                                                                      "YLDs (Years Lived with Disability)",
                                                                                                      "DALYs (Disability-Adjusted Life Years)")))

# A tibble: 10 × 3
# Groups:   measure_name [6]
   measure_name                           cause_name                                     val
   <chr>                                  <chr>                                        <dbl>
 1 Prevalence                             Asthma                                   55412134.
 2 Incidence                              Asthma                                   16422462.
 3 Deaths                                 Asthma                                       7802.
 4 YLLs (Years of Life Lost)              Asthma                                     663872.
 5 YLDs (Years Lived with Disability)     Asthma                                    2235184.
 6 DALYs (Disability-Adjusted Life Years) Asthma                                    2899055.
 7 Prevalence                             Attention-deficit/hyperactivity disorder 15401286.
 8 Incidence                              Attention-deficit/hyperactivity disorder  3969170.
 9 YLDs (Years Lived with Disability)     Attention-deficit/hyperactivity disorder   188622.
10 DALYs (Disability-Adjusted Life Years) Attention-deficit/hyperactivity disorder   188622.

I get the total but would like to create new row with new age_name that is 1–9 years (pls see above desired output using prevalence of asthma as an example). Can someone help pls? Thanks.

Upvotes: 0

Views: 60

Answers (1)

martin
martin

Reputation: 26

If I understand your question correctly, you can just sum val, change the value of age_name, and then use bind_rows to tie it up with the original data set. After that it is just a matter of organizing the rows to appear in the way you want it. Here is such a solution:

df %>% 
  group_by(measure_name, cause_name) %>% 
  summarise(val = sum(val, na.rm = TRUE), age_name = "1-9 years", .groups = "drop") %>% 
  bind_rows(df) %>% 
  arrange(cause_name, factor(measure_name, levels = c("Prevalence",
                                                      "Incidence",
                                                      "Deaths",
                                                      "YLLs (Years of Life Lost)",
                                                      "YLDs (Years Lived with Disability)",
                                                      "DALYs (Disability-Adjusted Life Years)")), 
          age_name = factor(age_name, levels = c("1-4 years", "5-9 years", "1-9 years")))

Upvotes: 1

Related Questions