Reputation: 1016
I'm trying to calculate the DTW distance between two very long time series, but I get an error saying that memory cannot be allocated for the matrix.
Here is what I do:
library(dtw)
set.seed(1234)
N <- 300000
x <- rnorm(N)
y <- rnorm(N)
dtw(x,y,distance.only=TRUE)$distance
Error: cannot allocate vector of size 670.6 Gb
Is there an alternative way to calculate the dtw distance that does not need to allocate so much memory?
Upvotes: 1
Views: 600
Reputation: 121588
I don't know this package, but the companion paper for the package says:
Larger problems may be addressed by approximate strategies, e.g., computing a preliminary alignment between downsampled time series (Salvador and Chan 2004); indexing (Keogh and Ratanamahatana 2005); or breaking one of the sequences into chunks and then iterating subsequence matches.
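The last option can be implemented with something like:
## split y into 100 contiguous chunks (split(y, 1:100) would interleave the points instead)
chunks <- split(y, cut(seq_along(y), 100, labels = FALSE))
lapply(chunks, function(z) dtw(x, z, distance.only = TRUE)$distance)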
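If only the distance is needed, another way to avoid the full matrix is to keep just two rows of the cumulative cost table at a time. The following is a minimal sketch under stated assumptions, not the dtw package's implementation: `dtw_distance_lowmem` is a hypothetical helper, it uses the textbook recurrence (local cost plus the minimum of the three neighbouring cells) with absolute difference as the local cost, and its result may differ from `dtw()` when the package applies a different step pattern.

```r
# Hypothetical helper, not part of the dtw package.
# Memory use is O(length(y)) instead of O(length(x) * length(y)).
dtw_distance_lowmem <- function(x, y) {
  n <- length(x)
  m <- length(y)
  prev <- rep(Inf, m + 1)  # previous row of cumulative costs, plus a padding cell
  prev[1] <- 0             # cost of the empty alignment
  for (i in seq_len(n)) {
    cur <- rep(Inf, m + 1) # current row; the padding cell stays Inf
    for (j in seq_len(m)) {
      d <- abs(x[i] - y[j])               # local cost (assumed: absolute difference)
      cur[j + 1] <- d + min(prev[j + 1],  # step from (i - 1, j)
                            cur[j],       # step from (i, j - 1)
                            prev[j])      # step from (i - 1, j - 1)
    }
    prev <- cur
  }
  prev[m + 1]
}

dtw_distance_lowmem(c(1, 2, 3), c(2, 2, 2, 3))
```

This only removes the memory problem. The number of cell updates is still length(x) * length(y), which is 9e10 for N = 300000, so a plain R loop like this is only practical for much shorter series; the same recurrence would need compiled code, or one of the approximate strategies quoted above, to be usable at that size.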
PS: "Larger" here means problems that exceed 8000 × 8000 points (close to the virtual memory limit), which is the case here.
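The size in the error message can be checked with a quick calculation. This is a rough sketch that assumes a single N × N matrix of 8-byte doubles; the helper name `matrix_gib` is made up for illustration.

```r
# Hypothetical helper: size in GiB of an n-by-m matrix of 8-byte doubles.
matrix_gib <- function(n, m) n * m * 8 / 2^30

size_question <- matrix_gib(300000, 300000)  # the case in the question
size_paper    <- matrix_gib(8000, 8000)      # the size mentioned in the paper

round(size_question, 1)  # about 670.6, matching the error message
round(size_paper, 2)     # roughly half a GiB
```

Under the same assumption, splitting y into 100 chunks of 3000 points still asks for a 300000 × 3000 matrix per call, which is about 6.7 GiB, so more chunks may be needed on a machine with less memory.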
Upvotes: 1