Reputation: 21
I am trying to find the country with the highest average age, but I also need to filter out countries with fewer than 5 entries in the data frame. I tried the following, but it does not work:
bil %>%
group_by(citizenship,age) %>%
mutate(n=count(citizenship), theMean=mean(age,na.rm=T)) %>%
filter(n>=5) %>%
arrange(desc(theMean))
bil is the dataset. I am trying to count how many entries I have for each country, filter out countries with fewer than 5 entries, find the average age for each country, and then find the country with the highest average. I am confused about how to do both things at the same time. If I do one summarize at a time, I lose the rest of my data.
Upvotes: 0
Views: 339
Reputation: 887831
Perhaps this could help. Note that the parameter 'x' in count is a tbl/data.frame, so it does not work on a single column inside mutate. So, instead of count, we group by 'citizenship' only and get the frequency of values with n(), get the mean of 'age' (not sure why 'age' was used as a grouping variable) and do the filter:
bil %>%
group_by(citizenship) %>%
mutate(n = n()) %>%
mutate(theMean = mean(age, na.rm=TRUE)) %>%
filter(n>=5) %>%
arrange(desc(theMean))
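If one row per country is enough, a variation on the above is to compute both values in a single summarise call, so neither is lost. This is only a sketch: the toy data frame below is made up to stand in for bil, and it assumes the columns are named citizenship and age as in the question.

```r
library(dplyr)

# Hypothetical toy data standing in for `bil` (not the asker's real data);
# column names `citizenship` and `age` are assumed from the question.
bil <- tibble(
  citizenship = c(rep("A", 5), rep("B", 6), rep("C", 2)),
  age = c(30, 40, 50, 60, NA, 20, 25, 30, 35, 40, 45, 90, 95)
)

country_means <- bil %>%
  group_by(citizenship) %>%
  # n() counts rows per group (including rows where age is NA);
  # the mean ignores missing ages
  summarise(n = n(), theMean = mean(age, na.rm = TRUE)) %>%
  filter(n >= 5) %>%          # drop countries with fewer than 5 entries
  arrange(desc(theMean))      # highest average age first

# The first row is the qualifying country with the highest average age
top_country <- slice(country_means, 1)
top_country
```

Unlike the mutate version, this collapses the data to one row per country. If the original rows are still needed, the mutate approach keeps them.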
Upvotes: 2