Reputation: 601
I have the data for an arrival process and I want to convert it to a count process. This is what I did:
# inter-arrival time in milliseconds
# (rpareto() is not in base R; it needs a package such as EnvStats)
x <- rpareto(100000, location = 10, shape = 1.2)
# arrival time in milliseconds
x.cumsum <- cumsum(x)
# the last arrival
x.max <- max(x.cumsum)
# the time scale for the count data, in this case 1 second
kTimeScale <- 1000
count.length <- ceiling(x.max / kTimeScale)
counts <- rep(0, times = count.length)
for (i in x.cumsum) {
  counts[round(i / kTimeScale)] <- counts[round(i / kTimeScale)] + 1
}
This works, but for a very large dataset (a few million arrivals) it's slow. Is there a faster way to do this?
Upvotes: 0
Views: 504
Reputation: 15163
You can do this with table:
countsTable<-table(round(x.cumsum/kTimeScale))
counts[1:10]
## [1] 24 41 1 2 33 26 20 45 36 19
countsTable[1:10]
##
## 0 1 2 3 4 5 6 7 8 9
## 5 24 41 1 2 33 26 20 45 36
The difference is that your loop misses the arrivals in bin 0: R vectors are 1-indexed, so the assignment to counts[0] silently does nothing. The table function won't put in 0 for bins where there are no observations, but you can do something like this to fix that:
counts2<-rep(0,length(counts)+1)
counts2[as.integer(names(countsTable))+1]<-countsTable
identical(counts,counts2[-1])
## [1] TRUE
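If you'd rather skip the fill-in step, base R's tabulate may be a more direct fit, since it returns 0 for empty bins on its own. The following is only a sketch under some assumptions: the inter-arrival times are simulated with inverse-transform sampling in base R as a stand-in for rpareto, and the names n, bins, counts3 and tab are made up for illustration.

```r
# Assumption: synthetic Pareto inter-arrival times generated in base R,
# standing in for rpareto(location = 10, shape = 1.2) from the question.
set.seed(1)
n <- 10000
x <- 10 * runif(n)^(-1 / 1.2)   # inter-arrival times in milliseconds
x.cumsum <- cumsum(x)           # arrival times in milliseconds
kTimeScale <- 1000              # bin width: 1 second
count.length <- ceiling(max(x.cumsum) / kTimeScale)

# Bin index per arrival, using the same rounding rule as the question.
bins <- round(x.cumsum / kTimeScale)

# tabulate() only counts the positive integers 1..nbins, so shift every
# index by one to keep bin 0. Empty bins come back as 0 automatically.
counts3 <- tabulate(bins + 1, nbins = count.length + 1)

# Cross-check against table() on the non-empty bins.
tab <- table(bins)
stopifnot(all(counts3[as.integer(names(tab)) + 1] == as.vector(tab)))
stopifnot(sum(counts3) == n)
```

Here counts3[1] is the count for bin 0 and counts3[-1] lines up with the counts vector from the question, except that tabulate returns integers rather than doubles, so compare with all.equal or == instead of identical.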
Upvotes: 1