Reputation: 319
I am working with an event-level df
on protest patterns from 2000-2017. I want to create a new df
that tells me the number of individual protest events for each month over this period. I have tried to do this with the following code but have been unsuccessful. There should thus be two columns: one listing the month/year and one listing the number of protests.
dates <- as.Date(dat$Date, '%m/%d/%Y')
yr <- year(dates)
monyr <- as.yearmon(dates)
lst <- lapply(list(dates, yr, monyr), function(x)
transform(dat, Count=ave(seq_along(x), x, FUN= length)))
names(lst) <- paste0('newdf', seq_along(lst))
Here is the data:
structure(list(Date = c("16-Mar-07", "19-Mar-07", "20-Mar-07",
"21-Mar-07", "21-Mar-07", "22-Mar-07"), Region = c("Kemerovo",
"Moscow City", "Saratov", "Novosibirsk", "Omsk", "Tyva")), row.names = c(NA,
6L), class = "data.frame")
Thanks for your help.
Upvotes: 0
Views: 86
Reputation: 269441
There are the following problems:
Assuming that what is wanted is a data frame with a yearmon column, yrmon, and a count, Freq or n column we can make these changes resulting in the following code which produces a data frame with one row per year/month. The library statement in the first line of code below is needed for all solutions.
library(zoo)
# 1 - aggregate
yrmon <- as.yearmon(dat$Date, '%d-%b-%y')
aggregate(dat[1], data.frame(yrmon), length)
## yrmon count
## 1 Mar 2007 6
# 2 - table
as.data.frame(table(yrmon)) # yrmon defined above
## yrmon Freq
## 1 Mar 2007 6
# 3 - xtabs
as.data.frame(xtabs(~ yrmon)) # yrmon defined above
## yrmon Freq
## 1 Mar 2007 6
# 4 - dplyr
library(dplyr)
dat %>%
mutate(yrmon = as.yearmon(Date, '%d-%b-%y')) %>%
count(yrmon)
## yrmon n
## 1 Mar 2007 6
The first 3 can also be expressed as a pipeline. The last one already is a pipeline.
# 1a - as a pipeline
dat |>
transform(yrmon = as.yearmon(Date, '%d-%b-%y')) |>
aggregate(Date ~ yrmon, length)
# 2a - table
dat |>
transform(yrmon = as.yearmon(Date, '%d-%b-%y')) |>
with(table(yrmon)) |>
as.data.frame()
# 3a - xtabs
dat |>
transform(yrmon = as.yearmon(Date, '%d-%b-%y')) |>
xtabs(~ yrmon, data = _) |>
as.data.frame()
Lines <- "Date,Region
16-Mar-07,Kemerovo
19-Mar-07,Moscow City
20-Mar-07,Saratov
21-Mar-07,Novosibirsk
21-Mar-07,Omsk
22-Mar-07,Tyva
16-Mar-08,Kemerovo
19-Mar-08,Moscow City
20-Mar-08,Saratov
21-Mar-08,Novosibirsk
21-Mar-08,Omsk
22-Mar-08,Tyva"
dat2 <- read.csv(text = Lines)
dat2 |>
transform(yrmon = as.yearmon(Date, '%d-%b-%y')) |>
with(table(yrmon)) |>
as.data.frame()
## yrmon Freq
## 1 Mar 2007 6
## 2 Mar 2008 6
Upvotes: 1
Reputation: 454
aggregate
should work.
dat$month <- substr(dat$Date, 4, 9)
aggregate(dat, Region ~ month, length)
But you only have one month in your example data:
> aggregate(dat, Region ~ month, length)
month Region
1 Mar-07 6
Upvotes: 0