temo

Reputation: 69

Finding year with equal occurrences

I'm using the babynames package to find out when a certain name (like Alex) was closest to having an equal number of male and female babies with that name.

I currently have the code below, but I'm not sure what math is needed to find the year when this name was most unisex, since it probably was never a perfect 50/50 split.

Alex <- babynames %>%
  filter(name == "Alex", year >=1920) %>%
  group_by(year, sex) %>%
  summarise(n = sum(n)) %>%
  mutate(n = n/sum(n) * 100)

Thank you.

Upvotes: 0

Views: 51

Answers (1)

dario

Reputation: 6483

Graphically:

library(babynames)
library(dplyr)
library(ggplot2)
babynames %>%
  filter(name == "Alex", year >=1920) %>%
  ggplot(aes(year, n, color=sex)) +
  geom_line()

Numerically:

library(tidyr)
babynames %>%
  filter(name == "Alex", year >= 1920) %>%
  group_by(year) %>%
  mutate(pct = n / sum(n, na.rm = TRUE)) %>%
  ungroup() %>%
  select(year, name, pct, sex) %>%
  pivot_wider(names_from = sex, values_from = pct) %>%
  mutate(diff = abs(F - M)) %>%
  arrange(diff)
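
If only the single closest year is wanted rather than the full sorted table, the same pipeline can end in `slice_min()` instead of `arrange()`. This is a minimal sketch, assuming dplyr 1.0.0 or later (where `slice_min()` is available); the object name `closest` is just an illustrative choice.

```r
library(babynames)
library(dplyr)
library(tidyr)

closest <- babynames %>%
  filter(name == "Alex", year >= 1920) %>%
  group_by(year) %>%
  # share of each sex among all babies named Alex in that year
  mutate(pct = n / sum(n, na.rm = TRUE)) %>%
  ungroup() %>%
  select(year, name, pct, sex) %>%
  # one row per year, with the shares in columns F and M
  pivot_wider(names_from = sex, values_from = pct) %>%
  # absolute gap between the two shares; 0 would be a perfect 50/50 split
  mutate(diff = abs(F - M)) %>%
  # keep only the year with the smallest gap (exactly one row, even with ties)
  slice_min(diff, n = 1, with_ties = FALSE)

closest
```

With `with_ties = FALSE` exactly one row comes back; dropping that argument would return every year that shares the minimum gap.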

For all names:

babynames %>%
  filter(year >=1920) %>%
  group_by(name, year) %>% 
  mutate(pct = n / sum(n, na.rm = TRUE),
         total = sum(n)) %>% 
  ungroup() %>% 
  select(year, name, total, pct, sex) %>% 
  pivot_wider(names_from = sex, values_from = pct) %>% 
  mutate(diff = abs(F - M)) %>% 
  arrange(diff)

Not sure about this data set though ;)

babynames %>%
  filter(name == "Othello", year ==1920)
   year sex   name        n       prop
  <dbl> <chr> <chr>   <int>      <dbl>
1  1920 F     Othello     8 0.00000643
2  1920 M     Othello     8 0.00000727
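
Rows like these, with only a handful of babies per sex, will sit at the top of the "all names" ranking. One possible way to keep them out is to require a minimum number of babies per name and year before ranking. This is only a sketch: the cutoff `min_total` and its value of 1000 are arbitrary assumptions, not something the data set prescribes, and `unisex` is an illustrative object name.

```r
library(babynames)
library(dplyr)
library(tidyr)

min_total <- 1000  # hypothetical cutoff; adjust to taste

unisex <- babynames %>%
  filter(year >= 1920) %>%
  group_by(name, year) %>%
  mutate(pct = n / sum(n, na.rm = TRUE),
         total = sum(n)) %>%
  ungroup() %>%
  # drop name/year combinations with too few babies to be meaningful
  filter(total >= min_total) %>%
  select(year, name, total, pct, sex) %>%
  pivot_wider(names_from = sex, values_from = pct) %>%
  mutate(diff = abs(F - M)) %>%
  # smallest gap first; single-sex rows have diff = NA and sort last
  arrange(diff)

unisex
```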

Upvotes: 3
