user7310181

Reputation:

Finding a valley in noisy data

Date        Time_GMT Time_IST   Current
11/15/2016  5:12:27 10:42:27    26.61
11/15/2016  5:12:28 10:42:28    42.27
11/15/2016  5:12:29 10:42:29    25.48
11/15/2016  5:12:30 10:42:30    24.24
11/15/2016  5:12:31 10:42:31    25.91
11/15/2016  5:12:32 10:42:32    27.75
11/15/2016  5:12:33 10:42:33    24.46
11/15/2016  5:12:34 10:42:34    24.32
11/15/2016  5:12:35 10:42:35    24.81
11/15/2016  5:12:36 10:42:36    27.36
11/15/2016  5:12:37 10:42:37    28.2
11/15/2016  5:12:38 10:42:38    28.29
11/15/2016  5:12:39 10:42:39    26.52
11/15/2016  5:12:40 10:42:40    32.58
11/15/2016  5:12:41 10:42:41    24.24
11/15/2016  5:12:42 10:42:42    24.36
11/15/2016  5:12:43 10:42:43    26.48
11/15/2016  5:12:44 10:42:44    28.76
11/15/2016  5:12:45 10:42:45    24.51
11/15/2016  5:12:46 10:42:46    23.93
11/15/2016  5:12:47 10:42:47    25.23
11/15/2016  5:12:48 10:42:48    27.9
11/15/2016  5:12:49 10:42:49    27.84
11/15/2016  5:12:50 10:42:50    27.31
11/15/2016  5:12:51 10:42:51    29.17
11/15/2016  5:12:52 10:42:52    24
11/15/2016  5:12:53 10:42:53    32.51
11/15/2016  5:12:54 10:42:54    26.63
11/15/2016  5:12:55 10:42:55    22.34
11/15/2016  5:12:56 10:42:56    29.14
11/15/2016  5:12:57 10:42:57    46.62
11/15/2016  5:12:58 10:42:58    48.85
11/15/2016  5:12:59 10:42:59    30.59
11/15/2016  5:13:00 10:43:00    30.68
11/15/2016  5:13:01 10:43:01    30.82
11/15/2016  5:13:02 10:43:02    31.64
11/15/2016  5:13:03 10:43:03    43.91

The above is a sample of the data; the full data set runs for days. I have to find the depression in the current, as shown in the image. If the current stays below 30 amps for a long time, I need to detect that valley-like depression. I have been working on this for a while and cannot come up with logic that finds it precisely. Any suggestion is appreciated; a machine learning approach is also acceptable.

Upvotes: 0

Views: 2821

Answers (2)

Sandipan Dey

Reputation: 23101

We can try to find valleys using a similar idea, but with NumPy convolution:

  1. Pick a window and compute smoothed data e.g., with MA (moving average) using convolution.
  2. Compute the residual as the original data minus the smoothed data.
  3. Valley points are the consecutive points where residual values are small.

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    # df is assumed to be a DataFrame holding the data, with a 'Current' column
    w_sz = 3  # window size
    # moving average via convolution with a box kernel
    ma = np.convolve(df.Current, np.ones(w_sz) / w_sz, mode='same')
    resid = df.Current - ma
    threshold = 1  # 0.1
    # indices where the data stays close to its moving average
    prob_val = np.where(abs(resid) <= threshold)
    # positions (within prob_val) where a run of consecutive indices breaks
    val_indices = np.where(np.diff(prob_val) != 1)[1] + 1

    plt.plot(df.Current)
    plt.plot(ma)
    plt.plot(resid)
    plt.axhline(0)
    plt.plot(val_indices, np.zeros(len(val_indices)), 'o', color='red')
    plt.legend(['Current', 'MA-smoothed', 'Residual'], loc='upper center')
    plt.show()
    

There are 3 valleys shown in the figure, each lying between 2 consecutive red points. The first valley appears to have only one red point, but there are actually two consecutive points and the valley has length one. Short valleys like this can be filtered out too.
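Filtering out short valleys could be sketched as follows. This is a minimal illustration and not part of the code above: the helper name `find_runs` and the `min_len` parameter are assumptions, and the residual array is made up for the demonstration. It groups consecutive indices where a boolean mask is true and keeps only the runs that are long enough.

```python
import numpy as np

def find_runs(mask, min_len=3):
    """Return (start, end) index pairs (end exclusive) for runs of True
    in `mask` that are at least `min_len` samples long."""
    mask = np.asarray(mask, dtype=bool)
    # Pad with False on both sides so every run has a rising and a falling edge.
    padded = np.concatenate(([False], mask, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.where(edges == 1)[0]   # first index of each run
    ends = np.where(edges == -1)[0]    # one past the last index of each run
    return [(int(s), int(e)) for s, e in zip(starts, ends) if e - s >= min_len]

# Made-up residuals: small absolute values mark flat stretches.
resid = np.array([5.0, 0.2, 0.1, 0.3, 0.4, 6.0, 0.5, 7.0, 0.1, 0.2, 0.3])
runs = find_runs(np.abs(resid) <= 1, min_len=3)
print(runs)  # [(1, 5), (8, 11)]
```

The single flat sample at index 6 is dropped because its run is shorter than `min_len`; what counts as "long enough" depends on the data and has to be chosen by hand.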

Upvotes: 0

jbndlr

Reputation: 5210

You could just use a moving window average approach:

  1. Select an appropriate window width (in your case, consecutive entries are one second apart, so the width is measured in seconds)

  2. Iterate over your currents column and calculate the average of the currents within your chosen window width

  3. Check when the average drops below a threshold or rises above it, depending on its prior state

With your example data, this may look like the following. In this plot, your original currents data is depicted as a blue dotted line, the moving average is the thick green line and state changes are marked as red vertical lines.

Rolling window and state changes

The code I used to generate that image is:

    import os

    import matplotlib.pyplot as plt

    c = [26.61, 42.27, 25.48, 24.24, 25.91, 27.75, 24.46, 24.32, 24.81, 27.36, 28.2, 28.29, 26.52, 32.58, 24.24, 24.36, 26.48, 28.76, 24.51, 23.93, 25.23, 27.9, 27.84, 27.31, 29.17, 24, 32.51, 26.63, 22.34, 29.14, 46.62, 48.85, 30.59, 30.68, 30.82, 31.64, 43.91]

    if __name__ == '__main__':
        # Choose window width and threshold
        window = 5
        thres = 27.0

        # Iterate and collect state changes with regard to previous state
        changes = []
        rolling = [None] * window  # no mean for the first `window` samples
        old_state = None
        for i in range(window, len(c)):
            # Current sample plus the `window` samples before it
            slc = c[i - window:i + 1]
            mean = sum(slc) / float(len(slc))
            state = 'good' if mean > thres else 'bad'

            rolling.append(mean)
            if old_state != state:
                print('Changed to {:>4s} at position {:>3d} ({:5.3f})'.format(state, i, mean))
                changes.append((i, state))
                old_state = state

        # Plot results and state changes
        plt.figure(frameon=False, figsize=(10, 8))
        currents, = plt.plot(c, ls='--', label='Current')
        rollwndw, = plt.plot(rolling, lw=2, label='Rolling Mean')
        plt.axhline(thres, xmin=.0, xmax=1.0, c='grey', ls='-')
        plt.text(40, thres, 'Threshold: {:.1f}'.format(thres), horizontalalignment='right')
        # Use `pos` rather than `c` so the data list is not shadowed
        for pos, s in changes:
            plt.axvline(pos, ymin=.0, ymax=.7, c='red', ls='-')
            plt.text(pos, 41.5, s, color='red', rotation=90, verticalalignment='bottom')
        plt.legend(handles=[currents, rollwndw], fontsize=11)
        plt.grid(True)
        os.makedirs('local', exist_ok=True)  # make sure the output directory exists
        plt.savefig('local/plot.png', dpi=72, bbox_inches='tight')
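For recordings that span days and live in a pandas object, the same moving-average idea might be sketched with `Series.rolling`. This is only an illustration under assumptions: the function name `find_valleys` and the `min_len` duration are hypothetical, the 30 A default threshold is taken from the question, and the demo series is made up. It assumes one sample per second, as in the sample data, so `min_len` is effectively a duration in seconds.

```python
import pandas as pd

def find_valleys(current, window=5, thres=30.0, min_len=10):
    """Return (first_index, last_index, length) for each stretch where the
    centered rolling mean of `current` stays below `thres` for at least
    `min_len` consecutive samples."""
    smooth = current.rolling(window, center=True, min_periods=1).mean()
    below = smooth < thres
    # A new run id starts whenever the below/above flag flips.
    run_id = (below != below.shift()).cumsum()
    valleys = []
    for _, run in below.groupby(run_id):
        if run.iloc[0] and len(run) >= min_len:
            valleys.append((run.index[0], run.index[-1], len(run)))
    return valleys

# Made-up series: 5 high samples, a 12-sample dip, then 5 high samples.
demo = pd.Series([40.0] * 5 + [20.0] * 12 + [40.0] * 5)
valleys = find_valleys(demo)
for first, last, length in valleys:
    print('valley from {} to {} ({} samples)'.format(first, last, length))
```

Requiring a minimum run length is what keeps isolated dips from being reported; both `window` and `min_len` need tuning against real data.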

Upvotes: 4
