Joachim

Reputation: 3260

Understanding Dynamic Time Warping

We want to use the dtw library for R in order to shrink and expand certain time series data to a standard length.

Consider three time series with equivalent columns. moref has 105 rows, mobig has 130 and mosmall has 100. We want to project mobig and mosmall to a length of 105.

moref <- good_list[[2]]
mobig <- good_list[[1]]
mosmall <- good_list[[3]]

Therefore, we compute two alignments.

ali1 <- dtw(mobig, moref)
ali2 <- dtw(mosmall, moref)

If we print out the alignments the result is:

DTW alignment object
 Alignment size (query x reference): 130 x 105
 Call: dtw(x = mobig, y = moref)
DTW alignment object
 Alignment size (query x reference): 100 x 105
 Call: dtw(x = mosmall, y = moref)

So far this looks like what we want. As I understand it, we need to use the warping functions ali1$index1 or ali1$index2 to shrink or expand the time series. However, if we run the following commands

length(ali1$index1)
length(ali2$index1)
length(ali1$index2)
length(ali2$index2)

the result is

[1] 198
[1] 162
[1] 198
[1] 162

These are vectors of indices (probably referring to other vectors). Which of them can we use for the mapping? Aren't they all too long?

Upvotes: 2

Views: 2830

Answers (1)

B.Mr.W.

Reputation: 19638

First of all, note that index1 and index2 are two vectors of the same length that together map the query (input) data to the reference (stored) data and vice versa.

Since you did not provide any data, here is some dummy data to illustrate.

library(dtw)

# Reference data is the template that we compare against,
# say a perfect pronunciation from CNN
data_reference <- 1:10
# Query data is the input that we want to map onto the reference,
# say random YouTube audio (no seed is set, so the values vary between runs)
data_query <- seq(1, 10, 0.5) + rnorm(19)

# keep = TRUE partially matches keep.internals and stores the cost matrices
alignment <- dtw(x = data_query, y = data_reference, keep = TRUE)
alignment$index1
alignment$index2
# Cumulative cost matrix (only available because of keep = TRUE)
lcm <- alignment$costMatrix
image(x = 1:nrow(lcm), y = 1:ncol(lcm), lcm)
plot(alignment, type = "threeway")

Here are the outputs:

[Figures: image of the cost matrix and the three-way alignment plot]

> alignment$index1
 [1]  1  2  3  4  5  6  7  7  8  9 10 11 12 13 13 14 14 15 16 17 18 19
> alignment$index2
 [1]  1  1  1  2  2  3  3  4  5  6  6  6  6  6  7  8  9  9  9  9 10 10

So basically, the two vectors are read pairwise: index1[k] and index2[k] tell you which query point is matched to which reference point.

For example, the 10th data point of the input has been matched to the 6th data point of the template.
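
To look up such matches programmatically, one can filter one index vector by the other. The sketch below hardcodes the two index vectors printed above, so it runs without the dtw package; `matches_for_query` is a hypothetical helper name used only for illustration, not part of dtw.

```r
# Index vectors copied from the printed output (hardcoded for illustration)
index1 <- c(1, 2, 3, 4, 5, 6, 7, 7, 8, 9, 10, 11, 12, 13, 13, 14, 14,
            15, 16, 17, 18, 19)
index2 <- c(1, 1, 1, 2, 2, 3, 3, 4, 5, 6, 6, 6, 6, 6, 7, 8, 9, 9, 9, 9,
            10, 10)

# Hypothetical helper: reference indices matched to query point q
matches_for_query <- function(q) index2[index1 == q]

matches_for_query(10)  # query point 10 is matched to reference point 6
matches_for_query(7)   # query point 7 is matched to reference points 3 and 4

# The reverse lookup works the same way: query points matched to reference 6
index1[index2 == 6]
```

A single query point can appear more than once in index1, which is why the lookup may return several reference indices.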

index1: Warping function φx(k) for the query

index2: Warping function φy(k) for the reference

-- Toni Giorgino

As for your question about the length of the index vectors: they are the coordinates of the optimal warping path, so their length depends on the path's shape. With the default symmetric step pattern, the path can be as short as max(m, n) (as close to the diagonal as possible) or as long as m + n - 1 (a staircase that never steps diagonally). Clearly it is not a one-to-one mapping, which might bother people a little; from here you can decide how to pick the mapping you want.
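
As a quick sanity check on those lengths, here is a small sketch. It assumes the default symmetric step pattern, where every step advances the query, the reference, or both by one, so the path length lies between max(m, n) and m + n - 1. `path_length_bounds` is a hypothetical helper, not a dtw function.

```r
# Bounds on the warping-path length for series of lengths m and n,
# assuming steps of (1,0), (0,1) or (1,1) only
path_length_bounds <- function(m, n) c(min = max(m, n), max = m + n - 1)

path_length_bounds(19, 10)    # dummy data: observed path length 22
path_length_bounds(130, 105)  # mobig vs. moref: observed path length 198
path_length_bounds(100, 105)  # mosmall vs. moref: observed path length 162
```

All three observed lengths fall inside their bounds, so the index vectors in the question are not "too long"; they simply list every matched pair along the path.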


I don't know if there is a built-in function to pick the best one-to-one mapping, but here is one way.

library(plyr)
mapping <- data.frame(index1=alignment$index1, index2=alignment$index2) 
mapping <- ddply(mapping, .(index1), summarize, index2_new = max(index2))

Now mapping contains exactly one reference index for each query index (several query points may still share the same reference index). You can then map the query onto the reference and scale the mapped input in whatever way you want.
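
Since the goal in the question is a series with the reference's length, another option is to go in the opposite direction and collapse all query points matched to each reference index into a single value. The following is a minimal base-R sketch, not a built-in dtw feature. Averaging is an assumption on my part (taking the first or last matched value would be just as defensible), and the index vectors and `data_query` values are hardcoded placeholders so the example runs without dtw.

```r
# Index vectors copied from the printed output (hardcoded for illustration)
index1 <- c(1, 2, 3, 4, 5, 6, 7, 7, 8, 9, 10, 11, 12, 13, 13, 14, 14,
            15, 16, 17, 18, 19)
index2 <- c(1, 1, 1, 2, 2, 3, 3, 4, 5, 6, 6, 6, 6, 6, 7, 8, 9, 9, 9, 9,
            10, 10)

# Placeholder query of length 19 (a noise-free stand-in for the dummy data)
data_query <- seq(1, 10, 0.5)

# For each reference index, average the query values matched to it
warped_query <- as.vector(tapply(data_query[index1], index2, mean))

length(warped_query)  # same length as the reference (10)
```

Because the warping path visits every reference index at least once, the result always has one value per reference point. For multi-column data such as moref and mobig, the same idea would have to be applied column by column.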

I am not entirely sure about the one-to-one mapping part of this answer, and anyone is more than welcome to improve how the mapping and scaling should work.


Upvotes: 6
