alec22

Reputation: 762

Count rows in a dataframe by year and by condition

I've got a large dataframe of the following structure:

Date           Weight
2018-01-03     0.05000000
2018-01-09     0.42000000
2019-01-10     0.27500000
2019-01-11     0.55000000
2020-01-04     0.25025991
2020-01-07     0.27000012

First, I'd like to count the number of data points in each year so that I can plot the counts as a bar chart. I assumed I needed a Year column for this:

df$Year <- format(df$Date, format="%Y")

I would also like to count, for each year, how many data points exceed particular bounds, e.g. 0.1, 0.2, 0.5, etc.

Does anyone know how to achieve this?

Upvotes: 1

Views: 51

Answers (1)

PaulS

Reputation: 25383

A possible solution, based on dplyr and lubridate::year, which counts the rows per year (n) and the rows per year with Weight above 0.5 (n05):

library(dplyr)
library(lubridate)

df %>% 
  group_by(year = year(Date)) %>%     # extract the year from Date and group by it
  summarise(n = n(),                  # rows per year
            n05 = sum(Weight > 0.5))  # rows per year with Weight above 0.5

#> # A tibble: 3 × 3
#>    year     n   n05
#>   <dbl> <int> <int>
#> 1  2018     2     0
#> 2  2019     2     1
#> 3  2020     2     0
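
Since the question asks about several bounds at once, here is a minimal base R sketch of one way to generalise the count above. It assumes Date is stored as a Date column; the sample data is copied from the question, and the names thresholds, year_counts, and counts, as well as the above_ column prefix, are illustrative choices rather than anything from the original post.

```r
# Sample data mirroring the question; Date is assumed to be of class Date
df <- data.frame(
  Date = as.Date(c("2018-01-03", "2018-01-09", "2019-01-10",
                   "2019-01-11", "2020-01-04", "2020-01-07")),
  Weight = c(0.05, 0.42, 0.275, 0.55, 0.25025991, 0.27000012)
)

# Year column, as in the question
df$Year <- format(df$Date, "%Y")

# Number of rows per year
year_counts <- table(df$Year)

# Illustrative bounds (assumed values, taken from the examples in the question)
thresholds <- c(0.1, 0.2, 0.5)

# For each bound, count the rows per year whose Weight exceeds it;
# the result is a matrix with one row per year and one column per bound
counts <- sapply(thresholds, function(t) tapply(df$Weight > t, df$Year, sum))
colnames(counts) <- paste0("above_", thresholds)

print(year_counts)
print(counts)
```

Passing year_counts to barplot() should then give the bar chart of data points per year; the same can be done with a single column of counts, e.g. counts[, "above_0.5"].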

Upvotes: 2
