Ole Tange

Reputation: 33740

R: time series with value

I have a log file with dates and sizes (of files). I would like to plot the bandwidth used per 1 minute and per 5 minutes. Input looks like this:

2014-08-08 06:37:34.610    639205638
2014-08-08 06:37:37.110    239205638
2014-08-08 06:38:58.810    635899318
2014-08-08 06:38:21.877   1420094614
2014-08-08 06:40:11.772    140034211

So I need to bin the values by date into 1-minute and 5-minute bins, sum each bin, average the sums by the number of minutes, and plot them against time.

But I have a feeling this has been done before and that I can use a generic plotting function.

Upvotes: 0

Views: 56

Answers (2)

G. Grothendieck

Reputation: 270195

It's not clear what "average them by the number of minutes" means, but ignoring that, this bins the data into 1-minute and 5-minute bins, sums each bin, and plots the sums. Note that the size column is read as "numeric" to avoid integer overflow when summing. Omit `facet = NULL` if you want the two series shown in separate panels:

library(zoo)
library(ggplot2)    
library(scales)

# read data from character variable Lines; Lines shown after graph
z <- read.zoo(text = Lines, index = 1:2, tz = "",
          colClasses = c(NA, NA, "numeric"))

ag1 <- aggregate(z, as.POSIXct(cut(time(z), "min")), sum)
ag5 <- aggregate(z, as.POSIXct(cut(time(z), "5 min")), sum)

autoplot(na.approx(cbind(ag1, ag5)), facet = NULL) + 
   scale_x_datetime(breaks = "1 min", labels = date_format("%H:%M"))

(screenshot of the resulting plot)

Here is `Lines` :

Lines <- "2014-08-08 06:37:34.610    639205638
2014-08-08 06:37:37.110    239205638
2014-08-08 06:38:58.810    635899318
2014-08-08 06:38:21.877   1420094614
2014-08-08 06:45:11.772    140034211"
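
To illustrate the overflow remark: each size fits in R's 32-bit integer type, but the sum of a bin may not. A minimal base R sketch using the first four sizes from `Lines` (the variable names are illustrative and not part of the code above):

```r
# First four sizes from Lines, stored as integers (note the L suffix)
sizes_int <- c(639205638L, 239205638L, 635899318L, 1420094614L)

# Largest value an R integer can hold
.Machine$integer.max

# Summing integers past that limit yields NA; R also emits an
# "integer overflow" warning, suppressed here to keep the output clean
total_int <- suppressWarnings(sum(sizes_int))

# Converting to double ("numeric") first gives the real total
total_num <- sum(as.numeric(sizes_int))

print(total_int)
print(total_num)
```

This is why `colClasses` requests "numeric" for the size column: the sums are then computed in double precision.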

Upvotes: 0

GSee

Reputation: 49820

You can do this easily with xts.

library(xts)

# read in the data
x <- read.table(text="2014-08-08 06:37:34.610    639205638
2014-08-08 06:37:37.110    239205638
2014-08-08 06:38:58.810    635899318
2014-08-08 06:38:21.877   1420094614
2014-08-08 06:40:11.772    140034211", stringsAsFactors=FALSE)

# convert to xts
xx <- xts(x[, 3], as.POSIXct(paste(x[,1], x[, 2])))

# find the 1 minute and 5 minute endpoints
ep1 <- endpoints(xx, "minutes", 1)
ep5 <- endpoints(xx, "minutes", 5)

period.sum(xx, ep1) # 1 minute sums
period.sum(xx, ep5) # 5 minute sums

More general (but slower):

period.apply(xx, ep1, sum)

For the last part of your question, just take the mean of these results:

mean(period.sum(xx, ep1))
#[1] 1024813140
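
If "average them by the number of minutes" is instead read as a per-minute rate for each bin (an assumption about the question, not something it states), the 5-minute sums can be divided by the bin width. A hedged sketch along those lines; `sum5` and `rate5` are illustrative names, and the sizes are converted to numeric and the time zone is fixed to UTC so that the result does not depend on the local settings:

```r
library(xts)

# same data as above
x <- read.table(text = "2014-08-08 06:37:34.610    639205638
2014-08-08 06:37:37.110    239205638
2014-08-08 06:38:58.810    635899318
2014-08-08 06:38:21.877   1420094614
2014-08-08 06:40:11.772    140034211", stringsAsFactors = FALSE)

# sizes as double, timestamps as POSIXct; xts() orders the rows by time
xx <- xts(as.numeric(x[, 3]), as.POSIXct(paste(x[, 1], x[, 2]), tz = "UTC"))

# total per 5-minute bin, then average per minute within each bin
sum5  <- period.sum(xx, endpoints(xx, "minutes", 5))
rate5 <- sum5 / 5

print(rate5)
```

Each row of the result is stamped with the last observation time in its bin, so bins containing no observations do not appear at all.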

Upvotes: 1
