Reputation: 179
Apologies for the newbie question, I'm still getting the hang of dplyr (and R in general).
I have the following dataset, and I am trying to work out the average rating for each area that has more than one entry.
Area Rating
UK 5.256
Ireland 6.1465
Canada 6.1452
USA 6.14
Ireland 4.258
USA 3.154
My expected output would be:
Area Count Average_Rating
Ireland 2 5.20225
USA 2 4.647
I have tried the following code, but it falls down when I try to add the 'count' column (it just returns the mean value for all areas):
df %>%
group_by (Area) %>%
mutate (count =n()) %>%
summarise (mean = mean(Average_Rating)) %>%
arrange(desc(mean))
I have tried playing about with the order of the verbs with no success. Any help greatly appreciated!
Upvotes: 1
Views: 670
Reputation: 887118
We need a filter() after the group_by() step, so that only the groups with more than one row are kept before summarising:
library(dplyr)
df1 %>%
  group_by(Area) %>%
  filter(n() > 1) %>%
  summarise(Count = n(), Average_Rating = mean(Rating)) %>%
  arrange(desc(Average_Rating))
# A tibble: 2 x 3
# Area Count Average_Rating
# <chr> <int> <dbl>
#1 Ireland 2 5.20
#2 USA 2 4.65
# Data
df1 <- structure(list(
  Area = c("UK", "Ireland", "Canada", "USA", "Ireland", "USA"),
  Rating = c(5.256, 6.1465, 6.1452, 6.14, 4.258, 3.154)),
  class = "data.frame", row.names = c(NA, -6L))
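As for why the original attempt lost the count: summarise() collapses each group to a single row and keeps only the grouping column plus the columns it creates itself, so a count added earlier with mutate() is dropped. An alternative sketch, assuming the same sample data as in the question, is to compute the count inside summarise() and filter on it afterwards. The name `res` is just an illustrative choice.

```r
library(dplyr)

# Same sample data as in the question
df1 <- data.frame(
  Area = c("UK", "Ireland", "Canada", "USA", "Ireland", "USA"),
  Rating = c(5.256, 6.1465, 6.1452, 6.14, 4.258, 3.154)
)

# Summarise first, then keep only the areas whose Count exceeds 1
res <- df1 %>%
  group_by(Area) %>%
  summarise(Count = n(), Average_Rating = mean(Rating)) %>%
  filter(Count > 1) %>%
  arrange(desc(Average_Rating))

res
```

Both orderings should give the same two rows here. Filtering before summarise() avoids computing means for groups that get discarded anyway, while filtering afterwards keeps the condition on a named column, which some people find easier to read.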
Upvotes: 3