GageEleven
GageEleven

Reputation: 49

Counting the number of occurrences of a value per column in r?

Say I have a dataset like so:

          state                   address
Business - AK - Fairbanks    1234 wherever street
Business - AK - Juneau       2234 wherever street
Business - AK - Anchorage    3234 wherever street
Business - AL - Montgomery   4234 wherever street
Business - AL - Birmingham   5234 wherever street
Business - AL - Birmingham   1234 wherever street
Business - AL - Birmingham   7234 wherever street
Business - AL - Greenville   6234 wherever street
Business - AZ - Parker       7234 wherever street
Business - AZ - Parker       5234 wherever street
Business - AZ - Phoenix      8234 wherever street
Business - AZ - Tucson       9234 wherever street

And so on and so forth, with variable state entries and etc, you get the idea. I want it to count the number of times each state appears in the 'state' column, regardless of city. Ideally, it ends up looking like so:

state   total
AK       3 
AL       5
AZ       4

So on and so forth, with the count being different for each state based on the entries, obviously. I'm having difficulty figuring out how, exactly, to make it so that it counts each state entry by its 2-letter code, though. I'm using dplyr and tidyverse, if it matters.

Upvotes: 0

Views: 58

Answers (2)

Chris Ruehlemann
Chris Ruehlemann

Reputation: 21440

table(gsub(".*([A-Z]{2}).*", "\\1", df$state))

or, based on stringrs function str_extract and positive lookbehind and lookahead:

table(str_extract(df$state, "(?<=- )[A-Z]{2}(?= -)"))

With dplyr:

df %>% mutate(state = str_extract(state, "(?<=- )[A-Z]{2}(?= -)")) %>% count(state)

Upvotes: 1

Ronak Shah
Ronak Shah

Reputation: 389255

You can get the state and use table to count the frequency.

Use stack if you want output as dataframe.

stack(table(sub('-.*', '', df$state)))[2:1]

#  ind values
#1 AK       3
#2 AL       5
#3 AZ       4

Using dplyr :

library(dplyr)
df %>% mutate(state = sub('-.*', '', state)) %>% count(state)

Upvotes: 0

Related Questions