Reputation: 1
For example, say we have data arriving live from a sensor every 5 seconds, and we have been collecting this data for the past year. Today we saw a spike from 5:00 PM to 5:10 PM, and we want to search the historical data for spikes with a similar amplitude and shape.
We tried DTW (Dynamic Time Warping), but on data sampled every 5 seconds it takes a very long time. Can anyone suggest a faster approach?
Upvotes: -1
Views: 41
Reputation: 46389
I would suggest creating "fingerprints" using exponentially damped moving averages to limit how many candidates to compare with, then compare as you do now.
An exponentially damped moving average is calculated by `avg[n] = a*value[n] + (1-a)*avg[n-1]`, where `a` is a constant. If `t` is the time between samples, this is roughly an average indicating where the trend was at time `t/a`. If you've ever seen Unix load averages from an `uptime` calculation, this is how they are calculated.
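As a rough illustration of that recurrence, here is a minimal sketch in plain Python. The function name `ema` and the choice to seed the average with the first sample are my own assumptions; the formula itself does not say how to initialize `avg`.

```python
def ema(values, a):
    """Exponentially damped moving average:
    avg[n] = a*value[n] + (1-a)*avg[n-1]
    """
    if not 0 < a <= 1:
        raise ValueError("a must be in (0, 1]")
    averages = []
    avg = None
    for v in values:
        # Assumption: seed the average with the first sample.
        avg = v if avg is None else a * v + (1 - a) * avg
        averages.append(avg)
    return averages


# With 5-second samples, a = 1/12 gives a time scale of about t/a = 60 seconds.
readings = [10.0] * 5 + [20.0] * 5   # made-up data: a step from 10 to 20
smoothed = ema(readings, 1 / 12)
```

The average only moves 1/12 of the way toward each new value, so a short glitch barely registers while a sustained change gradually pulls it along.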
So calculate, say, 3 averages. If they are calculated with `a` in `{1/12, 1/24, 1/60}`, then with 5-second samples they are basically 1 minute, 2 minute, and 5 minute moving averages. That combination of numbers is a point in 3-dimensional space, which represents a "fingerprint". You can toss that into a k-d tree. Now when you see a spike, look for past fingerprints in the tree that are close to this one, then do a more extensive comparison of that past peak.
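Here is a small sketch of the fingerprint lookup, under some assumptions of mine: one fingerprint per sample, averages seeded with the first value, Euclidean distance, and a search radius of `0.5` picked by hand for the made-up data. The helper names (`fingerprints`, `build_kdtree`, `within_radius`) are hypothetical. It uses a tiny pure-Python k-d tree to stay self-contained; for a year of 5-second data a library implementation such as `scipy.spatial.KDTree` would likely be a better fit.

```python
import math

ALPHAS = (1 / 12, 1 / 24, 1 / 60)  # ~1, 2 and 5 minutes at 5-second sampling


def fingerprints(values, alphas=ALPHAS):
    """Return one fingerprint (a tuple of damped averages) per sample."""
    avgs = [values[0]] * len(alphas)  # assumption: seed with the first sample
    out = []
    for v in values:
        avgs = [a * v + (1 - a) * prev for a, prev in zip(alphas, avgs)]
        out.append(tuple(avgs))
    return out


def build_kdtree(items, depth=0):
    """items: list of (point, payload). Returns (item, axis, left, right) or None."""
    if not items:
        return None
    axis = depth % len(items[0][0])
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2
    return (items[mid], axis,
            build_kdtree(items[:mid], depth + 1),
            build_kdtree(items[mid + 1:], depth + 1))


def within_radius(node, target, radius, found=None):
    """Collect payloads of all points within `radius` of `target`."""
    if found is None:
        found = []
    if node is None:
        return found
    (point, payload), axis, left, right = node
    if math.dist(point, target) <= radius:
        found.append(payload)
    diff = target[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    within_radius(near, target, radius, found)
    # The far side can only hold matches if the splitting plane is in range.
    if abs(diff) <= radius:
        within_radius(far, target, radius, found)
    return found


# Made-up history: flat baseline with two identical 2-minute spikes.
history = [10.0] * 200 + [30.0] * 24 + [10.0] * 200 + [30.0] * 24 + [10.0] * 200
prints = fingerprints(history)
tree = build_kdtree([(fp, i) for i, fp in enumerate(prints)])

now = 447                                   # last sample of the second spike
hits = within_radius(tree, prints[now], 0.5)
candidates = sorted(i for i in hits if i != now)
```

In this toy data the candidates include index 223, the end of the first spike, and nothing from the flat baseline. The radius is the knob to tune: too small and you miss real matches, too large and you are back to comparing against everything.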
You'll need to experiment with your actual data to answer questions like these:
But starting only with reasonable candidates should be a good performance win.
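To make the second stage concrete, here is a sketch of running the expensive comparison only on the shortlisted candidates. It assumes a plain textbook DTW with absolute-difference cost (not necessarily the variant you are using now) and that each candidate is identified by the index where its window ends; `dtw_distance` and `rank_candidates` are hypothetical names.

```python
def dtw_distance(a, b):
    """Textbook O(len(a) * len(b)) dynamic time warping distance."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)       # row 0 of the cost matrix
    for x in a:
        cur = [inf]
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            cur.append(cost + min(prev[j], cur[j - 1], prev[j - 1]))
        prev = cur
    return prev[-1]


def rank_candidates(history, query, candidate_ends):
    """Score windows of len(query) ending at each candidate index; best first."""
    n = len(query)
    scored = []
    for end in candidate_ends:
        start = end - n + 1
        if start < 0:
            continue                    # not enough history before this index
        scored.append((dtw_distance(query, history[start:end + 1]), end))
    return sorted(scored)


# Made-up data: a large spike ending at index 54, a small one ending at 109.
history = ([10.0] * 50 + [10.0, 20.0, 30.0, 20.0, 10.0] + [10.0] * 50
           + [10.0, 12.0, 14.0, 12.0, 10.0] + [10.0] * 50)
query = [10.0, 20.0, 30.0, 20.0, 10.0]
ranked = rank_candidates(history, query, [30, 54, 109])
```

The cost is now proportional to the number of candidates rather than to the length of the whole year of history, which is where the win comes from.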
Upvotes: 0