Reputation: 1341
I have some data in CSV like:
"Timestamp", "Count"
"2009-07-20 16:30:45", 10
"2009-07-20 16:30:45", 15
"2009-07-20 16:30:46", 8
"2009-07-20 16:30:46", 6
"2009-07-20 16:30:46", 8
"2009-07-20 16:30:47", 20
I can read it into R using read.csv. I'd like to plot:
1. The number of entries per second:
"2009-07-20 16:30:45", 2
"2009-07-20 16:30:46", 3
"2009-07-20 16:30:47", 1
2. The average Count per second:
"2009-07-20 16:30:45", 12.5
"2009-07-20 16:30:46", 7.333
"2009-07-20 16:30:47", 20
Is there some way to do this (collect by second/min/etc & plot) in R?
Upvotes: 9
Views: 9672
Reputation: 368261
Read your data, and convert it into a zoo object:
R> library(zoo)
R> X <- read.csv("/tmp/so.csv")
R> X <- zoo(X$Count, order.by=as.POSIXct(as.character(X[,1])))
Note that this will show warnings because of non-unique timestamps.
Task 1, using aggregate with length to count:
R> aggregate(X, force, length)
2009-07-20 16:30:45 2009-07-20 16:30:46 2009-07-20 16:30:47
2 3 1
Task 2, using aggregate with mean to average:
R> aggregate(X, force, mean)
2009-07-20 16:30:45 2009-07-20 16:30:46 2009-07-20 16:30:47
12.500 7.333 20.000
Task 3 can be done the same way by aggregating up to higher-order indices (for example, minutes or hours instead of seconds). You can call plot on the result from aggregate:
plot(aggregate(X, force, mean))
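For Task 3, here is a minimal sketch of what aggregating up to a minute index could look like. It assumes the six rows from the question, a UTC time zone, and a helper named to_minute that is made up for this example; the only zoo feature it relies on is that aggregate accepts a function of the index as its second argument.

```r
library(zoo)

# The six sample rows from the question, inlined so the example is self-contained.
csv_text <- '"Timestamp","Count"
"2009-07-20 16:30:45",10
"2009-07-20 16:30:45",15
"2009-07-20 16:30:46",8
"2009-07-20 16:30:46",6
"2009-07-20 16:30:46",8
"2009-07-20 16:30:47",20'
df <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# zoo warns about the duplicated timestamps; the warning is suppressed here.
# tz = "UTC" is an assumption, use whatever zone your data was recorded in.
X <- suppressWarnings(
  zoo(df$Count, order.by = as.POSIXct(df$Timestamp, tz = "UTC"))
)

# Hypothetical helper: map every timestamp to the start of its minute.
to_minute <- function(t) as.POSIXct(trunc(t, "mins"))

per_minute_n    <- aggregate(X, to_minute, length)  # entries per minute
per_minute_mean <- aggregate(X, to_minute, mean)    # average Count per minute
```

Swapping "mins" for "hours" or "days" in trunc should give the coarser groupings, and either result can be passed to plot as above. With only one minute of sample data the plot is a single point, so try it on a longer series.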
Upvotes: 7
Reputation: 18487
Averaging the data is easy with the plyr package.
library(plyr)
# one row per timestamp: mean of Count and number of entries
Second <- ddply(dataset, "Timestamp", function(x){
  c(Average = mean(x$Count), N = nrow(x))
})
To do the same thing by minute or hour, you need to add fields with that information.
library(chron)
dataset$Minute <- minutes(dataset$Timestamp)
dataset$Hour <- hours(dataset$Timestamp)
dataset$Day <- dates(dataset$Timestamp)
# aggregate by hour
Hour <- ddply(dataset, c("Day", "Hour"), function(x){
  c(Average = mean(x$Count), N = nrow(x))
})
# aggregate by minute
Minute <- ddply(dataset, c("Day", "Hour", "Minute"), function(x){
  c(Average = mean(x$Count), N = nrow(x))
})
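If you would rather not depend on plyr and chron, the same grouping idea can be sketched in base R by formatting the timestamp at the resolution you want and using that string as the grouping key. This is only an illustration: summarise_by is a name invented for this example, the data frame is the sample from the question, and the UTC time zone is an assumption.

```r
# The six sample rows from the question, inlined so the example is self-contained.
csv_text <- '"Timestamp","Count"
"2009-07-20 16:30:45",10
"2009-07-20 16:30:45",15
"2009-07-20 16:30:46",8
"2009-07-20 16:30:46",6
"2009-07-20 16:30:46",8
"2009-07-20 16:30:47",20'
dataset <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Parse the timestamps once; tz = "UTC" is an assumption.
ts <- as.POSIXct(dataset$Timestamp, tz = "UTC")

# Hypothetical helper: average and number of entries for each distinct key.
summarise_by <- function(key, count) {
  avg <- tapply(count, key, mean)
  n   <- tapply(count, key, length)
  data.frame(Key = names(avg), Average = as.vector(avg), N = as.vector(n),
             stringsAsFactors = FALSE)
}

# Keys are the timestamp formatted at second, minute and hour resolution.
by_second <- summarise_by(format(ts, "%Y-%m-%d %H:%M:%S"), dataset$Count)
by_minute <- summarise_by(format(ts, "%Y-%m-%d %H:%M"), dataset$Count)
by_hour   <- summarise_by(format(ts, "%Y-%m-%d %H"), dataset$Count)
```

Because the date is part of every key, there is no need for separate Day, Hour and Minute columns. The trade-off is that Key is a character string, so convert it back with as.POSIXct before plotting on a time axis.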
Upvotes: 2