Reputation: 724
I have a dataframe that consists of sales and population data. Reference below:
Location Sales Population Month
A 10 480 Jan
B 12 480 Jan
C 14 480 Jan
A 13 480 Jan
B 11 480 Jan
C 16 480 Jan
A 12 480 Jan
B 10 480 Jan
C 14 480 Jan
What I would like to do is use dplyr to group by month (only January is shown here, but the data runs through December) and get the sum of sales and that month's population.
I get the sales with this line of code, but my population comes out as NA:
test2 <- df_2019 %>% group_by(Month) %>% summarize(SumSales = sum(Total_Sales, na.rm = TRUE), Pop_Sum = sum(Population, na.rm = TRUE))
Month SumSales Pop_Sum
1 Apr 285591.9 134786490
2 Aug 384246.5 131901771
3 Dec 254748.9 89512147
4 Feb 251463.7 135634878
5 Jan 243624.6 135901304
6 Jul 286468.8 134335668
7 Jun 283395.2 134335668
8 Mar 289453.8 135658132
9 May 365272.2 134768586
10 Nov 291248.8 89576444
11 Oct 375402.2 89589288
12 Sep 290888.5 132878020
DESIRED OUTPUT would look like this:
Month SumSales Pop_Sum
1 Apr 285591.9 437
2 Aug 384246.5 440
3 Dec 254748.9 443
4 Feb 251463.7 435
5 Jan 243624.6 480
6 Jul 286468.8 455
7 Jun 283395.2 465
8 Mar 289453.8 460
9 May 365272.2 479
10 Nov 291248.8 435
11 Oct 375402.2 444
12 Sep 290888.5 451
Each month's Population appears on multiple rows with the same value, while the sales values are unique per row. Any help would be appreciated!
Upvotes: 0
Views: 59
Reputation: 389235
Since the Population values are already calculated, we can take any Population value for each month. For example, taking the first value of Population, we can do:
library(dplyr)

df_2019 %>%
  group_by(Month) %>%
  summarize(SumSales = sum(Total_Sales, na.rm = TRUE),
            Pop_Sum = first(Population))
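To try this without the real data, here is a minimal sketch on a small made-up data frame. The values and the Feb rows are assumptions modeled on the question's table, not the actual df_2019, and the n_pop_values column is an extra check added for illustration that each month really carries a single Population value.

```r
library(dplyr)

# Hypothetical sample data modeled on the question's table (not the real df_2019)
df_2019 <- data.frame(
  Location    = c("A", "B", "C", "A", "B", "C"),
  Total_Sales = c(10, 12, 14, 13, 11, 16),
  Population  = c(480, 480, 480, 435, 435, 435),
  Month       = c("Jan", "Jan", "Jan", "Feb", "Feb", "Feb")
)

result <- df_2019 %>%
  group_by(Month) %>%
  summarize(
    SumSales     = sum(Total_Sales, na.rm = TRUE),  # sales differ per row, so add them up
    Pop_Sum      = first(Population),               # population repeats, so take one value
    n_pop_values = n_distinct(Population)           # illustrative check: expect 1 per month
  )

print(result)
```

If n_pop_values is ever greater than 1, the month has conflicting Population values and first() would silently pick one of them. If the first row of a month might be NA, something like max(Population, na.rm = TRUE) is one possible alternative.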
Upvotes: 1