I want to build a new data frame (age_summary) with the total number of people in each age group. I would like to use cut, and my code is:
set.seed(12345)
#create a numeric variable Age
AGE <- sample(0:110, 100, replace = TRUE)
# Create data frame
Sample.data <- data.frame(AGE)
age_summary <- Sample.data %>% summarize(group_by(Sample.data,
cut(
AGE,
breaks=c(0,0.001, 0.083, 2, 13, 65,1000),
right=TRUE,
labels = c("Foetus(0 yr)","Neonate (0.001 - 0.082 yr)","Infant(0.083-1.999 yrs)","Child(2-12.999 yrs)", "Adolescent(13-17.999 yrs)","Adult(18-64.999 yrs.)","Elderly(65-199 yrs)")
),"Total people" = n())
)
However, my code does not work. I am not sure what went wrong. Any suggestions on how to solve this?
Add: I was able to get results that look like this:
Is it possible for me to achieve something that looks like this:
Here is what I get with adorn_totals(.) on a new data set. The total people looks OK, but the ave-age looks strange.
Any idea?
Reputation: 887223
If we remove the summarise wrapping around the group_by, the issue is easier to find. Here, the cut labels and breaks have the same length (7 each), but cut needs one more break than there are labels, because n breaks define n - 1 intervals. Adding -Inf or Inf to breaks fixes the lengths:
library(dplyr)
Sample.data %>%
  group_by(grp = cut(AGE,
                     breaks = c(-Inf, 0, 0.001, 0.083, 2, 13, 65, 1000),
                     right = TRUE,
                     labels = c("Foetus(0 yr)",
                                "Neonate (0.001 - 0.082 yr)",
                                "Infant(0.083-1.999 yrs)",
                                "Child(2-12.999 yrs)",
                                "Adolescent(13-17.999 yrs)",
                                "Adult(18-64.999 yrs.)",
                                "Elderly(65-199 yrs)"))) %>%
  summarise(TotalPeople = n())
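Note that adding -Inf makes the lengths match, but the intervals then no longer line up with the label text: for example, the "Adolescent(13-17.999 yrs)" label ends up on the interval (13, 65]. Here is a minimal sketch of one alternative, assuming the labels describe the groups that are actually wanted. The extra break at 18, the Inf upper bound, right = FALSE, and the names age_breaks and age_labels are my assumptions, not part of the original code.

```r
library(dplyr)

set.seed(12345)
Sample.data <- data.frame(AGE = sample(0:110, 100, replace = TRUE))

# Assumed breaks: 8 break points give 7 intervals, one per label.
age_breaks <- c(0, 0.001, 0.083, 2, 13, 18, 65, Inf)
age_labels <- c("Foetus(0 yr)", "Neonate (0.001 - 0.082 yr)",
                "Infant(0.083-1.999 yrs)", "Child(2-12.999 yrs)",
                "Adolescent(13-17.999 yrs)", "Adult(18-64.999 yrs.)",
                "Elderly(65-199 yrs)")

age_summary <- Sample.data %>%
  # right = FALSE makes each interval [lower, upper),
  # so AGE 0 falls in the first group instead of becoming NA
  group_by(grp = cut(AGE, breaks = age_breaks, right = FALSE,
                     labels = age_labels)) %>%
  summarise(TotalPeople = n())

age_summary
```

With left-closed intervals, an age of exactly 13 is counted as "Adolescent" and exactly 18 as "Adult", which is what the label ranges suggest.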
If we need to add a row in which different functions are applied to different columns, use add_row:
library(tibble)
library(tidyr)
Sample.data %>%
  group_by(grp = cut(AGE,
                     breaks = c(-Inf, 0, 0.001, 0.083, 2, 13, 65, 1000),
                     right = TRUE,
                     labels = c("Foetus(0 yr)",
                                "Neonate (0.001 - 0.082 yr)",
                                "Infant(0.083-1.999 yrs)",
                                "Child(2-12.999 yrs)",
                                "Adolescent(13-17.999 yrs)",
                                "Adult(18-64.999 yrs.)",
                                "Elderly(65-199 yrs)"))) %>%
  summarise(TotalPeople = n(), Ave_age = mean(AGE)) %>%
  complete(grp = levels(grp), fill = list(TotalPeople = 0)) %>%
  add_row(grp = "Total",
          TotalPeople = sum(.$TotalPeople),
          Ave_age = mean(.$Ave_age, na.rm = TRUE))
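On the strange-looking average in the total row: adorn_totals() sums numeric columns, and mean(.$Ave_age) averages the group means with equal weight, so neither gives the mean age across all people. A minimal sketch, assuming the overall mean is what is wanted, weights each group mean by its group size. The data frame by_group and its numbers are made up for illustration only.

```r
library(dplyr)
library(tibble)

# Hypothetical per-group summary (illustrative numbers, not real output)
by_group <- tibble(grp = c("Child", "Adult"),
                   TotalPeople = c(1, 3),
                   Ave_age = c(10, 50))

with_total <- by_group %>%
  add_row(grp = "Total",
          TotalPeople = sum(.$TotalPeople),
          # weight each group mean by its count to recover the overall mean
          Ave_age = weighted.mean(.$Ave_age, .$TotalPeople))

with_total
```

In this toy case the unweighted mean of the group means would be 30, while the weighted mean is 40, which matches the mean of the four underlying ages. If empty groups with an NA average are present, passing na.rm = TRUE to weighted.mean drops them together with their weights.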