user14415653

Reputation: 9

How can I find the proportion of something occurring in a data frame in R

Say I have a large data frame with a Level column (values 1-4) and a State column in which states are repeated. I need to find the proportion of level 3 occurrences for each state, and then which state has the highest proportion of level 3s. For example, in the data frame below NY is listed 3 times and one of those rows is level 3, so its proportion of level 3s is 1/3, or about 0.33.

DF1

 State  City              Level
  NY    Brooklyn          2  
  TX    Dallas            3
  UT    Salt Lake City    4
  WI    Milwaukee         1  
  CA    Fresno            3
  NY    New York          2
  UT    Ogden             1
  NY    Buffalo           3

Upvotes: 0

Views: 100

Answers (1)

AndrewGB

Reputation: 16856

It's not clear what your expected output is, since any state whose rows are all level 3 (for example, a state listed only once, with level 3) will have a proportion of 1.

library(tidyverse)

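results <- DF1 %>% 
  group_by(State) %>% 
  count(Level) %>%                        # rows per State/Level pair
  mutate(prop_occ = proportions(n)) %>%   # share of each Level within its State; proportions() needs R >= 4.0.0
  ungroup() %>% 
  filter(Level == 3) %>%
  slice_max(prop_occ)                     # keeps all tied rows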

Output

  State Level     n prop_occ
  <chr> <int> <int>    <dbl>
1 CA        3     1        1
2 TX        3     1        1

If you want just the state names, you can add pull() at the end.

results %>% 
  pull(State)

# [1] "CA" "TX"
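
For comparison, here is a rough base R sketch of the same idea, assuming the data looks like the example in the question. DF1 is rebuilt by hand here, and prop_level3 and top_states are just names I made up for illustration. It relies on the fact that the mean of a logical vector is the share of TRUE values, so states with no level 3 rows show up with a proportion of 0 instead of being dropped.

```r
# Rebuild the example data from the question (typed in by hand)
DF1 <- data.frame(
  State = c("NY", "TX", "UT", "WI", "CA", "NY", "UT", "NY"),
  City  = c("Brooklyn", "Dallas", "Salt Lake City", "Milwaukee",
            "Fresno", "New York", "Ogden", "Buffalo"),
  Level = c(2L, 3L, 4L, 1L, 3L, 2L, 1L, 3L)
)

# Share of rows with Level == 3 within each state
# (mean of a logical vector = proportion of TRUE values)
prop_level3 <- tapply(DF1$Level == 3, DF1$State, mean)

# State(s) with the highest share; all ties are kept
top_states <- names(prop_level3)[prop_level3 == max(prop_level3)]

print(prop_level3)
print(top_states)
```

With the example data this should agree with the tidyverse version above on which states come out on top, and NY ends up at 1/3.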

Upvotes: 1
