Dynamic Time Warping and Normalization

Question

I want to cluster time series. Say we have four series, t1 through t4:

t1={1,1,1,1,1,1,1}

t2={10,10,10,10,10,10,10}

t3={100,100,100,100,100,100,100}

t4={1,5,9,13,17,21,25}

My intention behind this example is to group t1, t2, and t3 together, because each of them is a constant line. t4, however, is an ascending line, so it should end up in a different group.

But if I compute the distances between t1 and the others using DTW (Python mlpy package), I get the following results:

t1-t1: 0 (as expected)

t1-t2: 63

t1-t3: 693

t1-t4: 84

As we can see, the distance between t1 and t3 is much greater than the distance between t1 and t4. I guess this is because the amplitude of t3 is much larger than that of the others.
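For reference, the numbers above can be reproduced with a minimal sketch of textbook DTW. This is not the mlpy implementation; it assumes an absolute-difference local cost, no warping window, and the unnormalized sum along the best path, which happens to match the reported values. The function name `dtw_distance` is made up for this illustration.

```python
def dtw_distance(a, b):
    """Textbook DTW: absolute-difference local cost, no window constraint."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


t1 = [1] * 7
t2 = [10] * 7
t3 = [100] * 7
t4 = [1, 5, 9, 13, 17, 21, 25]

for name, other in [("t1", t1), ("t2", t2), ("t3", t3), ("t4", t4)]:
    print(f"t1-{name}:", dtw_distance(t1, other))
```

On the raw values this gives 0, 63, 693 and 84, so the ordering is driven entirely by the offset between the series, not by their shape.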

In this situation, is it a good idea to apply min-max normalization (i.e., scaling to the range 0 to 1) to each time series before applying DTW? In other words, turning t1, t2, and t3 into {0,0,0,0,0,0,0} and t4 into {0, 0.17, ..., 1}? DTW then returns the result I want.

In short, I wonder whether normalizing before DTW is appropriate. I'm new to DTW, so sorry for bothering you with such a basic question! :)
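The min-max idea can be sketched as follows. One detail worth flagging: for a constant series the formula (x - min) / (max - min) is 0/0, so mapping t1, t2, and t3 to all zeros is a convention chosen here, not something the formula itself gives. The helper name `min_max_normalize` is hypothetical.

```python
def min_max_normalize(series):
    """Scale a series to [0, 1]; constant series map to zeros (assumed convention)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        # (x - lo) / (hi - lo) would be 0/0 here, so fall back to zeros
        return [0.0] * len(series)
    return [(x - lo) / (hi - lo) for x in series]


t1 = [1] * 7
t4 = [1, 5, 9, 13, 17, 21, 25]

print(min_max_normalize(t1))
print([round(v, 2) for v in min_max_normalize(t4)])
```

For t4 the second value is 4/24, about 0.17, which matches the example above.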

eamonn · Accepted Answer

No, you should do z-normalization.

Zero-one (min-max) normalization is very sensitive to a single outlier.
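A small illustration of the outlier point, using t4 from the question with its last value replaced by a made-up outlier of 1000. The helper `min_max_normalize` and the outlier value are assumptions for this sketch only.

```python
def min_max_normalize(series):
    """Scale a series to [0, 1]; assumes the series is not constant."""
    lo, hi = min(series), max(series)
    return [(x - lo) / (hi - lo) for x in series]


clean = [1, 5, 9, 13, 17, 21, 25]
spiked = [1, 5, 9, 13, 17, 21, 1000]  # same series, one hypothetical outlier

clean_scaled = min_max_normalize(clean)
spiked_scaled = min_max_normalize(spiked)

print([round(v, 3) for v in clean_scaled])
print([round(v, 3) for v in spiked_scaled])
```

With the outlier present, the six ordinary points are squeezed into roughly the bottom 2% of the range, so the ascending shape nearly disappears after scaling.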

Source: http://www.cs.unm.edu/~mueen/DTW.pdf
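As a rough sketch of what z-normalizing before DTW does to the four series from the question: each series is shifted to mean 0 and scaled to standard deviation 1. Two assumptions are made here that the answer does not spell out: the population standard deviation is used, and a constant series (standard deviation 0) is mapped to all zeros to avoid dividing by zero. The DTW below is a plain textbook version with absolute-difference cost, not mlpy's; both function names are made up.

```python
import math


def z_normalize(series):
    """Shift to mean 0 and scale to std 1; constant series map to zeros (assumption)."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)  # population std
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in series]


def dtw_distance(a, b):
    """Textbook DTW: absolute-difference local cost, no window constraint."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


raw = {
    "t1": [1] * 7,
    "t2": [10] * 7,
    "t3": [100] * 7,
    "t4": [1, 5, 9, 13, 17, 21, 25],
}
normed = {name: z_normalize(values) for name, values in raw.items()}

print(normed["t4"])  # [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
for name in ("t2", "t3", "t4"):
    print(f"t1-{name}:", dtw_distance(normed["t1"], normed[name]))
```

After z-normalization, t1-t2 and t1-t3 both come out as 0 while t1-t4 is 6.0, so the three constant series group together and t4 stands apart.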
