Reputation: 3505
I have a data frame giving attendances at sports events:
Crowd matchDate
2345 1993-01-26
4567 1993-08-01
8888 1994-03-02
1298 1994-11-07
9876 1995-09-01
...
1237 2011-09-09
The matchDate column is of class POSIXct.
I want to create a season factor from the date, where each season runs from, say, 1 August to 31 July. For example, the level 1992/3 would cover dates from 1992-08-01 to 1993-07-31.
Ideally this would be a function that I could reuse for several analyses, not necessarily with the same start and end dates in the year.
Upvotes: 21
Views: 20053
Reputation: 44638
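An example of what I suggested in my comment:
x <- as.Date(1:1000, origin = "2000-01-01")
x <- cut(x, breaks = "quarter")  # each level is labelled with the start date of its quarter
Then relabel the levels as you please, if necessary:
# 1:4 is recycled across the levels; this lines up because the data start in the first quarter
labs <- paste(substr(levels(x), 1, 4), "/", 1:4, sep = "")
x <- factor(x, labels = labs)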
From ?cut.POSIXct:
breaks
a vector of cut points or number giving the number of intervals which x is to be cut into or an interval specification, one of "sec", "min", "hour", "day", "DSTday", "week", "month", "quarter" or "year", optionally preceded by an integer and a space, or followed by "s". (For "Date" objects only interval specifications using "day", "week", "month", "quarter" and "year" are allowed.)
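As a small illustration of the interval specifications described above, here is a sketch using three dates taken from the question. The variable names (`d`, `by_year`, `by_half`) are made up for this example. Note that the "year" specification starts its intervals on 1 January, so a season that starts on 1 August needs an explicit vector of cut points instead.

```r
# Three of the match dates from the question, as Date objects
d <- as.Date(c("1993-01-26", "1993-08-01", "1994-03-02"))

# Calendar years: each level is labelled with the first day of its year
by_year <- cut(d, breaks = "year")

# An integer prefix gives multiples of the unit, here six-month blocks
# counted from the first day of the earliest month in the data
by_half <- cut(d, breaks = "6 months")

print(data.frame(d, by_year, by_half))
```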
Upvotes: 17
Reputation: 58825
If your question is more about how to generate the breaks and labels automatically, maybe this will help:
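# Example data: 100 random match dates between 1993 and 2006
DF <- data.frame(matchDate = as.POSIXct(as.Date(sample(5000, 100, replace = TRUE), origin = "1993-01-01")))
years <- 1992:2011
# One break per 1 August; n breaks define n - 1 seasons, hence one fewer label
DF$season <- cut(DF$matchDate,
                 breaks = as.POSIXct(paste(years, "-08-01", sep = "")),
                 labels = paste(years[-length(years)], years[-length(years)] + 1, sep = "/"))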
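Since the question asks for something reusable with a configurable season start, here is one possible sketch of such a function. It is not part of the approach above: instead of calling cut(), it compares each date with the season start in its own calendar year. The names `season_factor`, `start_month` and `start_day` are hypothetical, and the code assumes the chosen start day exists in every year (so not 29 February).

```r
# Hypothetical helper: label each date with the season it falls in, where a
# season starts on start_month/start_day and runs for one year.
season_factor <- function(dates, start_month = 8, start_day = 1) {
  # Go via format() so a POSIXct value is read in its own time zone
  d <- as.Date(format(dates, "%Y-%m-%d"), format = "%Y-%m-%d")
  yr <- as.integer(format(d, "%Y"))
  # Season start within the same calendar year as each date
  start_this_year <- as.Date(sprintf("%d-%02d-%02d", yr, start_month, start_day),
                             format = "%Y-%m-%d")
  # Dates before that boundary belong to the season that began the year before
  first_year <- ifelse(d < start_this_year, yr - 1L, yr)
  # Include every season between the earliest and latest, even if empty
  all_years <- seq(min(first_year, na.rm = TRUE), max(first_year, na.rm = TRUE))
  factor(paste(first_year, first_year + 1L, sep = "/"),
         levels = paste(all_years, all_years + 1L, sep = "/"))
}

# Usage with a few of the dates from the question
matchDate <- as.POSIXct(c("1993-01-26", "1993-08-01", "1994-03-02", "1993-07-31"),
                        tz = "UTC")
season <- season_factor(matchDate)
print(data.frame(matchDate, season))
```

A different boundary can be passed per analysis, for example `season_factor(matchDate, start_month = 7)` for seasons starting on 1 July.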
Upvotes: 12