Jessie Buckner

Reputation: 1

How do I use R to count multiple variables?

I have a large dataset of bird observations. I would like to count observations by group, i.e. the species observed within various categories: year, season, and grid.

For example, how many American Crows (AMCR) were observed in 2017? Or how many American Robins were observed in 2017 in the breeding season (BB in the Season column)?

Here's an example of my headers and first line of data:

Year  Season  Date       Grid  Species  Count  Behavior
2015  BB      22-Jul-15  FF    AMCR     1      C

I tried to use dplyr's count_ and group_by, but I think I'm doing it wrong. Please help!

Upvotes: 0

Views: 2257

Answers (2)

Here is another solution using dplyr. It is similar to the one previously suggested; however, I think it might be closer to what you want to do.
To count the number of species observed by year, season and grid:

library(dplyr)

# Count the number of distinct species per group (df is your data frame)
df %>%
  # Grouping variables
  group_by(Year, Season, Grid) %>%
  # Keep one row per species within each group
  distinct(Species) %>%
  # Count the remaining rows, i.e. the number of species per group
  count(name = "SpCount")
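
For what it's worth, the same per-group species count can probably also be written with n_distinct() inside summarize(). This is only a minimal sketch: the obs data frame and its rows are invented for illustration (using the column names from the question), and AMRO is assumed here to be the code for American Robin.

```r
library(dplyr)

# Hypothetical toy data using the question's column names (values are made up)
obs <- tibble(
  Year    = c(2015, 2015, 2015, 2017),
  Season  = c("BB", "BB", "BB", "BB"),
  Grid    = c("FF", "FF", "FF", "FF"),
  Species = c("AMCR", "AMCR", "AMRO", "AMCR"),
  Count   = c(1, 2, 3, 4)
)

# One row per Year/Season/Grid with the number of distinct species seen
species_per_group <- obs %>%
  group_by(Year, Season, Grid) %>%
  summarize(SpCount = n_distinct(Species), .groups = "drop")
```

Here .groups = "drop" just returns an ungrouped result, which tends to be easier to work with afterwards.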

To count the number of observed birds by species, year, season and grid:

#Count number of birds per species
df %>%
  #Grouping variables
  group_by(Species, Year, Season, Grid) %>%
  #Count number of birds per species
  summarize(BirdCount = sum(Count))
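
To answer the two specific questions in the post, one option is to filter() before summing. The sketch below is an illustration under assumptions, not a definitive recipe: the obs rows are made up, the season code "WW" is hypothetical, and AMRO is assumed to be the code for American Robin.

```r
library(dplyr)

# Hypothetical toy data using the question's column names (values are made up)
obs <- tibble(
  Year    = c(2017, 2017, 2017, 2015),
  Season  = c("BB", "BB", "WW", "BB"),   # "WW" is an invented non-breeding code
  Species = c("AMCR", "AMRO", "AMCR", "AMCR"),
  Count   = c(2, 5, 3, 1)
)

# Total American Crows (AMCR) counted in 2017, across all seasons
amcr_2017 <- obs %>%
  filter(Species == "AMCR", Year == 2017) %>%
  summarize(BirdCount = sum(Count))

# Total American Robins (assumed code AMRO) in 2017 during the breeding season (BB)
amro_2017_bb <- obs %>%
  filter(Species == "AMRO", Year == 2017, Season == "BB") %>%
  summarize(BirdCount = sum(Count))
```

If the Count column can contain missing values, sum(Count, na.rm = TRUE) would be the safer choice.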

Upvotes: 0

NickCHK

Reputation: 1233

It sounds like you're trying to count the number of observations within each group. This is what count in dplyr is designed for. The trick is that you don't need a group_by before it.

Here is some example code:

library(dplyr)
data("storms")

count_by_group <- storms %>%
  # The variables you want to count observations within
  count(year, month, status)

Alternatively, if you have a variable called "Count" in your raw data and you want to sum it up within each group, you should instead use summarize with group_by:

sum_by_group <- storms %>%
  group_by(year, month, status) %>%
  # Summing pressure doesn't make much sense; substitute whatever variable you're trying to sum up
  summarize(Count = sum(pressure))
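
To make the difference between the two approaches concrete, here is a small sketch on data shaped like the question's. The obs rows are invented, and AMRO is an assumed species code. It also uses the wt argument of count(), which sums a column within each group and so may serve as a shortcut for the group_by plus summarize pattern.

```r
library(dplyr)

# Hypothetical toy data (values are made up)
obs <- tibble(
  Year    = c(2017, 2017, 2017),
  Species = c("AMCR", "AMCR", "AMRO"),
  Count   = c(2, 3, 5)
)

# Number of rows (records) per group
records <- obs %>%
  count(Year, Species)

# Number of birds per group: wt = Count sums the Count column instead of counting rows
birds <- obs %>%
  count(Year, Species, wt = Count, name = "BirdCount")
```

With this toy data, AMCR has two records but five birds, which is exactly the distinction between counting rows and summing a Count column.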

Upvotes: 1
