Reputation: 310
My data frame includes region
, date
(by day), and general response data for a survey. I have grouped and summarized by region
and date
, so that I have a daily count of completed surveys (looking at a boolean Finished
variable). If I use n()
, I get a total for a region
on that date
(since I grouped the data frame by both of these).
I would like to add the total based on only region
, across all dates. Is this possible using the dplyr methods? I have currently calculated the region
totals, and after summarizing my table, have manually added a column using case_when
to add the region
totals. The goal is to make cumulative completion percentages per region in ggplot
.
Is my approach the only way, or is it possible to use dplyr methods to remove a grouping level when calculating n()
?
Example below. DailyCompeted
is the sum of boolean
Finished
, and total
is giving the total count for that day in that region. My goal is for total
to be the sum of that column (92
), per region
in every row.
x %>% group_by(Region, EndDate) %>% summarize(DailyCompleted=sum(Finished), total=n())
Region EndDate DailyCompleted total
<fct> <date> <int> <int>
1 AMER 2019-04-29 0 1
2 AMER 2019-04-30 0 1
3 AMER 2019-05-02 1 1
4 AMER 2019-05-03 1 1
5 AMER 2019-05-06 40 54
6 AMER 2019-05-07 12 17
7 AMER 2019-05-08 4 7
8 AMER 2019-05-09 3 5
9 AMER 2019-05-10 3 4
10 AMER 2019-05-12 1 1
Upvotes: 0
Views: 2797
Reputation: 10845
We can reaggregate the summarized data by Region
, and then join it bak to the data frame that has been aggregated by EndDate
.
textFile <- "Region EndDate DailyCompleted total
AMER 2019-04-29 0 1
AMER 2019-04-30 0 1
AMER 2019-05-02 1 1
AMER 2019-05-03 1 1
AMER 2019-05-06 40 54
AMER 2019-05-07 12 17
AMER 2019-05-08 4 7
AMER 2019-05-09 3 5
AMER 2019-05-10 3 4
AMER 2019-05-12 1 1"
regionDay <- read.table(text = textFile,header = TRUE)
library(dplyr)
regionDay %>% group_by(Region) %>%
summarise(regionDay,region_total = sum(total)) %>%
inner_join(regionDay,.)
region_tot
...and the output is:
Region EndDate DailyCompleted total region_total
1 AMER 2019-04-29 0 1 92
2 AMER 2019-04-30 0 1 92
3 AMER 2019-05-02 1 1 92
4 AMER 2019-05-03 1 1 92
5 AMER 2019-05-06 40 54 92
6 AMER 2019-05-07 12 17 92
7 AMER 2019-05-08 4 7 92
8 AMER 2019-05-09 3 5 92
9 AMER 2019-05-10 3 4 92
10 AMER 2019-05-12 1 1 92
>
Upvotes: 1