Reputation: 854
I am exploring some alternatives to compute Dynamic Time Warping (DTW) distances in Python. And I am struggling with the limited documentation about the packages I am finding...
I have explored dtaidistance
's distance_fast
and distance
; fastdtw
's fastdtw
and dtw-python
's dtw
functions. They are leading to different results... I am sure it is related to some default parameters that sometimes are not so clear in the documentation, but I was not able to find any explanations about...
But my question at this moment here is about dtw-python
. Specifically, a comparison of the Python's dtw-python
package results with the R's dtw
package results.
They are direct equivalent each other, as indicated here. And the major function documentation is very similar. So it is really weird...
The R code:
data=read.table('synthetic_control.data',header=FALSE)
library(dtw)
dtw(x=data[1,], y=data[2,] ,
window.type= "sakoechiba", window.size= 5)$distance
dtw(x=data[2,], y=data[3,] ,
window.type= "sakoechiba", window.size= 5)$distance
dtw(x=data[2,], y=data[600,],
window.type= "sakoechiba", window.size= 5)$distance
dtw(x=data[1,], y=data[600,],
window.type= "sakoechiba", window.size= 5)$distance
dtw(x=data[3,], y=data[600,],
window.type= "sakoechiba", window.size= 5)$distance
Results:
[1] 42.181 [1] 35.09292
[1] 105.0999
[1] 110.9285
[1] 105.9934
(Edit: the above values are Euclidian distances)
The Python code:
import pandas as pd
data = pd.read_csv("synthetic_control.data",
header=None,delimiter=r"\s+")
from dtw import *
print(dtw(x=data_a.loc[0], y=data_a.loc[1] ,
window_args= {"window_type": "sakoechiba", "window_size": 5},
keep_internals=True).distance)
print(dtw(data_a.loc[1], data_a.loc[2] ,
window_args= {"window_type": "sakoechiba", "window_size": 5},
keep_internals=True).distance)
print(dtw(data_a.loc[1], data_a.loc[599] ,
window_args= {"window_type": "sakoechiba", "window_size": 5},
keep_internals=True).distance)
print(dtw(data_a.loc[0], data_a.loc[599] ,
window_args= {"window_type": "sakoechiba", "window_size": 5},
keep_internals=True).distance)
print(dtw(data_a.loc[2], data_a.loc[599] ,
window_args= {"window_type": "sakoechiba", "window_size": 5},
keep_internals=True).distance)
Results:
166.58120000000002 165.87640000000005 647.89322 604.29862 663.3971200000001
The example above use synthetic dataset available here, and each row represent a time serie.
As pointed above, the intention was to set the same parameters (windows type, windows size)... and the defaults appears to be the same for both python and R... What am I missing here?
Thanks in advance,
Upvotes: 0
Views: 606
Reputation: 126
I think I can shed some light on the differences (disclaimer: dtaidistance co-author).
The differences can be explained by a different interpretation of the DTW algorithm. The DTW algorithm as defined in multiple papers and on Wikipedia:
Given two series x and y with length m, n, respectively.
dtw[i,j] = d(x[i], y[j]) + min(dtw[i-1,j], dtw[i-1,j-1], dtw[i,j-1])
where d is often absolute cost |x[i]-yj]| or quadratic (x[i]-yj])^2 cost. In case of quadratic, we can do dtw[:,:] = sqrt(dtw[:,:]) to have the results again in the original magnitude. The final DTW distance is dtw[m,n].
Quadratic distance is the more popular choice as it better penalizes large deviations [1,2] and also the choice in the dtaidistance toolbox.
This can be verified by computing the cumulative cost on the warped time series. I'll use your series as example:
x = np.array([0, 0, 1, 2, 1, 0, 1, 0, 0])
y = np.array([0, 1, 2, 0, 0, 0, 0, 0, 0])
from dtaidistance import dtw
d = dtw.distance(x,y)
print(f'DTW = {d}') # DTW = 1.4142135623730951
x_warped = [0, 0, 1, 2, 1, 0, 1, 0, 0, 0]
y_warped = [0, 0, 1, 2, 0, 0, 0, 0, 0, 0]
d_warped = math.sqrt(sum([(xi - yi)**2 for xi,yi in zip(x_warped, y_warped)]))
print(f'DTW = {d_warped}') # DTW = 1.4142135623730951
Python-dtw choses absolute cost, because the default choice for dist_method
is 'euclidean'. Thus even if you give Euclidean as argument, because d(a,b)=sqrt((a-b)^2))=|a-b|, it is still absolute value. See this line in the code:
https://github.com/DynamicTimeWarping/dtw-python/blob/0aec342c558fbb601a530890e963e5106c60564c/dtw/dtw.py#L364
For example:
import python_dtw
d = python_dtw.dtw(x,y).distance
print(f'DTW = {d}') # DTW = 2.0
x_warped = [0, 0, 1, 2, 1, 0, 1, 0, 0, 0]
y_warped = [0, 0, 1, 2, 0, 0, 0, 0, 0, 0]
d_warped = sum([(xi - yi) for xi,yi in zip(x_warped, y_warped)])
print(f'DTW = {d_warped}') # DTW = 2
With respect to fastdtw, this is an approximate method. So the result will likely differ from the exact output. In most cases, however, standard DTW is faster than fastdtw because it is easier to optimize for a cpu (e.g. dtaidistance is >100x faster than fastdtw). See also the comparison by Wu and Keogh [3].
[1] Keogh, E. (2002). Exact indexing of dynamic time warping. In 28th International Conference on Very Large Data Bases.
[2] A. Mueen and E. Keogh. Extracting optimal performance from dynamic time warping. KDD Tutorial, 2016.
[3] https://arxiv.org/abs/2003.11246
Upvotes: 1