Reputation: 2077
I have the following hypothetical dataframe
Region <- c("District A", "District B","District A","District A","District B")
Gender <- c("Male","Male","Female", "Male","Female")
Age <- c(20, 21, 23, 34, 22)
AmountSold <- c(50, 10, 20, 4, 12)
RegionSales <- data.frame(Region, Gender, Age, AmountSold)
I'd like to create a pivot table or a table that shows both mean of Amount sold per gender and Region as well as the mean of Age per gender and region. How do I do that in R?
Upvotes: 2
Views: 295
Reputation: 2636
This would be my approach with the dplyr
package:
library(dplyr)
RegionSales %>%
group_by(Region, Gender) %>%
summarize(mean_age = mean(Age), mean_amount = mean(AmountSold))
Output:
# A tibble: 4 x 4
# Groups: Region [2]
Region Gender mean_age mean_amount
<chr> <chr> <dbl> <dbl>
1 District A Female 23 20
2 District A Male 27 27
3 District B Female 22 12
4 District B Male 21 10
An option that ignores NA
values:
RegionSales %>%
group_by(Region, Gender) %>%
summarize(mean_age = mean(Age, na.rm = T),
mean_amount = mean(AmountSold, na.rm = T))
Upvotes: 3
Reputation: 102469
A base option using aggregate
could help
> aggregate(. ~ Region + Gender, RegionSales, mean)
Region Gender Age AmountSold
1 District A Female 23 20
2 District B Female 22 12
3 District A Male 27 27
4 District B Male 21 10
Upvotes: 2
Reputation: 887691
With dplyr
, another option is to specify the variables in across
library(dplyr)
RegionSales %>%
group_by(Region, Gender) %>%
summarise(across(c(Age, AmountSold),
~ mean(., na.rm = TRUE), .names = "mean_{.col}"))
Upvotes: 3