Markus Palme

Reputation: 689

Detect significant trend changes

I would like to detect, using R, the dates at which a trend curve changes significantly. The red dots mark the points in time where I see a significant change; these are the ones that should be detected. Small fluctuations should be ignored.

Trend curve

I have tried the breakpoints function, which finds the dates indicated by the dotted lines. I don't see how these lines correlate with the data.

Example data from the chart:

structure(c(431.510725286867, 421.634186460535, 379.627938613016, 
425.906255600274, -14.1367284804303, -384.10599618701, -611.193815166686, 
-460.535003689942, -309.875390598749, -84.9820334889592, 217.330882967973, 
437.111949107673, 738.919896124628, 752.79552200685, 804.851028725362, 
757.869760812822, 1197.91301915761, 1567.88256933466, 1794.97067632374, 
1644.31215300884, 1493.6528224525, 1268.75973855711, 968.432034953716, 
743.503624686386, 510.63191994943), .Tsp = c(2016.66666666667, 
2018.66666666667, 12), class = "ts")

Upvotes: 1

Views: 1225

Answers (1)

AlainD

Reputation: 6356

Compare the forward and backward finite differences, and filter out small values.

Explicitly: compute the forward difference ∆(t) = x(t+1) - x(t) and the backward difference ∇(t) = x(t) - x(t-1), then d(t) = ∆(t) - ∇(t) = x(t+1) - 2x(t) + x(t-1), which is the second difference. Keep the t for which |d(t)| > ε, where ε captures what you call a small fluctuation.

In your case, d = c(NA, -32.1, 88.3, -486.3, 70.1, 142.9, 377.7, 0.0, 74.2, 77.4, -82.5, 82.0, -287.9, 38.2, -99.0, 487.0, -70.1, -142.9, -377.7, -0.0, -74.2, -75.4, 75.4, -7.9, NA). Its absolute value exceeds ε = 200 for t = c(4, 7, 13, 16, 19), which correspond to your red dots.
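As a minimal sketch of the computation above in base R: the series is the one posted in the question, `diff(..., differences = 2)` yields x(t+1) - 2x(t) + x(t-1), and the year-month conversion through `time()` and `cycle()` is an illustrative addition rather than part of the original answer.

```r
# Series from the question (monthly, starting September 2016).
x <- structure(c(431.510725286867, 421.634186460535, 379.627938613016,
425.906255600274, -14.1367284804303, -384.10599618701, -611.193815166686,
-460.535003689942, -309.875390598749, -84.9820334889592, 217.330882967973,
437.111949107673, 738.919896124628, 752.79552200685, 804.851028725362,
757.869760812822, 1197.91301915761, 1567.88256933466, 1794.97067632374,
1644.31215300884, 1493.6528224525, 1268.75973855711, 968.432034953716,
743.503624686386, 510.63191994943), .Tsp = c(2016.66666666667,
2018.66666666667, 12), class = "ts")

v <- as.numeric(x)

# Second difference d(t) = x(t+1) - 2*x(t) + x(t-1).
# It is undefined at the first and last point, hence the NA padding.
d <- c(NA, diff(v, differences = 2), NA)

eps <- 200  # threshold used in the answer; tune it for other data
idx <- which(abs(d) > eps)
print(idx)
# [1]  4  7 13 16 19

# Translate positions into year-month labels using the ts time index.
# The small offset guards against floating-point error at year boundaries.
yr <- as.integer(floor(as.numeric(time(x)) + 1e-8))
mo <- as.integer(cycle(x))
change_dates <- sprintf("%d-%02d", yr[idx], mo[idx])
print(change_dates)
# [1] "2016-12" "2017-03" "2017-09" "2017-12" "2018-03"
```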

Of course, the threshold ε = 200 can be chosen more rigorously: on a histogram of d, a value of about 200 stands out clearly.
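One possible way to make that choice less ad hoc, sketched here as an assumption rather than as the answer's method, is to sort the |d| values and put ε in the middle of the widest gap between consecutive values. The histogram is drawn only in an interactive session.

```r
# Series from the question.
x <- structure(c(431.510725286867, 421.634186460535, 379.627938613016,
425.906255600274, -14.1367284804303, -384.10599618701, -611.193815166686,
-460.535003689942, -309.875390598749, -84.9820334889592, 217.330882967973,
437.111949107673, 738.919896124628, 752.79552200685, 804.851028725362,
757.869760812822, 1197.91301915761, 1567.88256933466, 1794.97067632374,
1644.31215300884, 1493.6528224525, 1268.75973855711, 968.432034953716,
743.503624686386, 510.63191994943), .Tsp = c(2016.66666666667,
2018.66666666667, 12), class = "ts")

v <- as.numeric(x)
d <- diff(v, differences = 2)  # second differences, interior points only

# Heuristic (an assumption, not from the answer): place eps in the middle
# of the widest gap between consecutive sorted |d| values.
a <- sort(abs(d))
k <- which.max(diff(a))
eps <- (a[k] + a[k + 1]) / 2
print(round(eps))  # about 215 for this series

# Visual check: small fluctuations on the left, real changes on the right.
if (interactive()) {
  hist(a, breaks = 20, main = "|d(t)|", xlab = "absolute second difference")
  abline(v = eps, lty = 2)
}
```

This heuristic only makes sense when the small and large values are clearly separated, as they are here; with noisier data a quantile of |d| may be a more robust starting point.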

You may also want to smooth out the fluctuations by using an average over a few points rather than only the previous and next values: dn(t) = [x(t+1) + ... + x(t+n)]/n - 2x(t) + [x(t-1) + ... + x(t-n)]/n.
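The averaging idea can be sketched as follows, under one reading of it: the next and previous values in the second difference are replaced by the means of the n following and n preceding points, so that n = 1 gives back x(t+1) - 2x(t) + x(t-1). The helper name `smoothed_d` is hypothetical.

```r
# Series from the question.
x <- structure(c(431.510725286867, 421.634186460535, 379.627938613016,
425.906255600274, -14.1367284804303, -384.10599618701, -611.193815166686,
-460.535003689942, -309.875390598749, -84.9820334889592, 217.330882967973,
437.111949107673, 738.919896124628, 752.79552200685, 804.851028725362,
757.869760812822, 1197.91301915761, 1567.88256933466, 1794.97067632374,
1644.31215300884, 1493.6528224525, 1268.75973855711, 968.432034953716,
743.503624686386, 510.63191994943), .Tsp = c(2016.66666666667,
2018.66666666667, 12), class = "ts")

v <- as.numeric(x)

# Hypothetical helper: mean of the next n points, minus twice the current
# point, plus mean of the previous n points. NA where the window does not fit.
smoothed_d <- function(v, n = 1) {
  len <- length(v)
  out <- rep(NA_real_, len)
  if (len < 2 * n + 1) return(out)
  for (t in (n + 1):(len - n)) {
    out[t] <- mean(v[(t + 1):(t + n)]) - 2 * v[t] + mean(v[(t - n):(t - 1)])
  }
  out
}

d1 <- smoothed_d(v, n = 1)  # identical to the plain second difference
d2 <- smoothed_d(v, n = 2)  # smoother, but loses one more point at each end
print(round(d2, 1))
```

With n > 1 a single change of slope also shows up, more weakly, at the neighbouring points, so it may be worth keeping only the local maxima of |dn(t)| above the threshold.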

Upvotes: 1
