Finding percentage of male and females from dataframe

Question

I have a dataframe which has the columns

   year sex   name          n   prop
1  1880 F     Mary       7065 0.0724
2  1880 F     Anna       2604 0.0267
3  1880 F     Emma       2003 0.0205
4  1880 F     Elizabeth  1939 0.0199
5  1880 F     Minnie     1746 0.0179
6  1880 F     Margaret   1578 0.0162

from the babynames library, and I want to find the percentage a certain name has in each gender. For example, if the name is Anna (a traditionally female name), I want to find out, of all babies named Anna, what percentage are male and what percentage are female.

I know that I have to filter by name, but past that I'm unsure how to get the percentage. I tried group_by(year) and group_by(gender) with summarize(), but I am not getting what I need, and I am unsure whether that is even the correct approach.

Edit: I would like to see it by year (say, in 1880 x% were F and the rest were M, and in 1882 y% were F). Thank you.

Ronak Shah · Accepted Answer

You could filter for the name "Anna", sum the counts by sex, and calculate each sex's share of the total as a percentage.

library(babynames)
library(dplyr)

babynames %>%
  filter(name == "Anna") %>%
  group_by(sex) %>%
  summarise(n = sum(n)) %>%
  mutate(n = n/sum(n) * 100)

#   sex        n
# 1 F     99.7
# 2 M      0.307
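
The edit to the question asks for the split by year, which the snippet above does not show because it pools all years. A minimal sketch of one way to get it, assuming the same babynames data and dplyr: group by year instead of by sex, so that sum(n) is the yearly total for the name. The names anna_by_year and pct are made up for illustration.

```r
library(babynames)
library(dplyr)

# Percentage of babies named "Anna" of each sex, within each year.
# anna_by_year and pct are illustrative names, not part of the original answer.
anna_by_year <- babynames %>%
  filter(name == "Anna") %>%
  group_by(year) %>%                  # sum(n) below is now the per-year total
  mutate(pct = n / sum(n) * 100) %>%
  ungroup() %>%
  select(year, sex, n, pct)

# Inspect a single year, e.g. 1880
anna_by_year %>% filter(year == 1880)
```

Each year's pct values add up to 100. Note that babynames only includes a name for a given year and sex when it was used at least 5 times, so a year with no M row would show F at 100% rather than an explicit 0% row for M.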
