henry
henry

Reputation: 965

python Automatic Resampling of Data

Let's say that I have the following data (measurements):

enter image description here

As you can see, there are a lot of sharp points (i.e. where the slope changes a lot). It would therefore, be good to take some more measurements around those points. To do that I wrote a script:

  1. I calculate the curvature of 3 consecutive points: Menger curvature: https://en.wikipedia.org/wiki/Menger_curvature#Definition

  2. Then I decide which values I should resample, based on the curvature.

...and I iterate until the average curvature goes down... but it does not work, because, it goes up. Do you know why ?

Here is the complete code (stopped it after the length of the x values get 60):

import numpy as np
import matplotlib.pyplot as plt

def curvature(A,B,C):
    """Calculates the Menger curvature fro three Points, given as numpy arrays.
    Sources:
    Menger curvature: https://en.wikipedia.org/wiki/Menger_curvature#Definition
    Area of a triangle given 3 points: https://math.stackexchange.com/questions/516219/finding-out-the-area-of-a-triangle-if-the-coordinates-of-the-three-vertices-are
    """

    # Pre-check: Making sure that the input points are all numpy arrays
    if any(x is not np.ndarray for x in [type(A),type(B),type(C)]):
        print("The input points need to be a numpy array, currently it is a ", type(A))

    # Augment Columns
    A_aug = np.append(A,1)
    B_aug = np.append(B,1)
    C_aug = np.append(C,1)

    # Caclulate Area of Triangle
    matrix = np.column_stack((A_aug,B_aug,C_aug))
    area = 1/2*np.linalg.det(matrix)

    # Special case: Two or more points are equal 
    if np.all(A == B) or  np.all(B == C):
        curvature = 0
    else:
        curvature = 4*area/(np.linalg.norm(A-B)*np.linalg.norm(B-C)*np.linalg.norm(C-A))

    # Return Menger curvature
    return curvature


def values_to_calulate(x,curvature_list, max_curvature):
    """Calculates the new x values which need to be calculated
    Middle point between the three points that were used to calculate the curvature """
    i = 0
    new_x = np.empty(0)
    for curvature in curvature_list:
        if curvature > max_curvature:
            new_x = np.append(new_x, x[i]+(x[i+2]-x[i])/3 )
        i = i+1
    return new_x


def plot(x,y, title, xLabel, yLabel):
    """Just to visualize"""

    # Plot
    plt.scatter(x,y)
    plt.plot(x, y, '-o')

    # Give a title for the sine wave plot
    plt.title(title)

    # Give x axis label for the sine wave plot
    plt.xlabel(xLabel)

    # Give y axis label for the sine wave plot
    plt.ylabel(yLabel)
    plt.grid(True, which='both')
    plt.axhline(y=0, color='k')


    # Display the sine wave
    plt.show
    plt.pause(0.05)

### STARTS HERE


# Get x values of the sine wave
x = np.arange(0, 10, 1);

# Amplitude of the sine wave is sine of a variable like time
def function(x):
    return 1+np.sin(x)*np.cos(x)**2
y = function(x)

# Plot it
plot(x,y, title='Data', xLabel='Time', yLabel='Amplitude')


continue_Loop = True

while continue_Loop == True :
    curvature_list = np.empty(0)
    for i in range(len(x)-2):
        # Get the three points
        A = np.array([x[i],y[i]])
        B = np.array([x[i+1],y[i+1]])
        C = np.array([x[i+2],y[i+2]])

        # Calculate the curvature
        curvature_value = abs(curvature(A,B,C))
        curvature_list = np.append(curvature_list, curvature_value)



    print("len: ", len(x) )
    print("average curvature: ", np.average(curvature_list))

    # Calculate the points that need to be added 
    x_new = values_to_calulate(x,curvature_list, max_curvature=0.3)

    # Add those values to the current x list:
    x = np.sort(np.append(x, x_new))

    # STOPED IT AFTER len(x) == 60
    if len(x) >= 60:
        continue_Loop = False

    # Amplitude of the sine wave is sine of a variable like time
    y = function(x)

    # Plot it
    plot(x,y, title='Data', xLabel='Time', yLabel='Amplitude')

This is how it should look:

enter image description here

EDIT:

If you let it run even further... :

enter image description here

Upvotes: 2

Views: 460

Answers (1)

Julien
Julien

Reputation: 15071

So summarize my comments above:

  • you are computing the average curvature of your curve which has no reason to go to 0. At every point, no matter how close your points get, the circle radius will converge to whatever the curvature is at that point, not 0.

  • an alternative would be to use the absolute derivative change between two points: keep sampling until abs(d(df/dx)) < some_threshold where d(df/dx) = (df/dx)[n] - (df/dx)[n-1]

Upvotes: 2

Related Questions