Reputation: 1192
I'm trying to resize an array in Python and fill the new positions with mean values. Example:
More advanced: I've got an array with e.g. 1000 samples, but I know it should be 1300 samples long. How can I scale the array to the new length and fill it with well-distributed mean values? A solution with interpolation would make me happy too.
Edit: I was asked for an example of what I mean by well-distributed values. E.g.: a sensor should deliver data at 100 Hz, but sometimes the sensor cannot sustain the full sampling frequency. Instead of getting 1300 samples in 13 seconds, I get a random amount between 900 and 1300 samples, and I don't know when a value is missing. I want to distribute the missing values uniformly over the whole array and assign them a meaningful value.
Thank you
Upvotes: 2
Views: 289
Reputation: 1192
I've written a solution that works even better for me. I had some problems with floating-point errors on large arrays; to correct for those, I insert some of the missing values at random indices. Maybe someone knows how to avoid this. I'm sure the code can be optimized further; feel free to do so.
import numpy as np

def resizeArray(data, newLength):
    datalength = len(data)
    if datalength == newLength:
        return data
    appendIndices = []
    appendNow = 0
    step = newLength / datalength
    increase = step % 1
    for i in np.arange(0, datalength - 2, step):
        appendNow += increase
        if appendNow >= 1:
            # int() is needed: round(i, 0) returns a float, which is not a valid index
            appendIndices.append(int(round(i)))
            appendNow = appendNow % 1
    # still missing values due to floating-point errors?
    diff = newLength - datalength - len(appendIndices)
    if diff > 0:
        for i in range(0, diff):
            appendIndices.append(np.random.randint(1, datalength - 2))
    # insert the average of the two neighbours at the specified indices
    appendVals = [(data[i] + data[i + 1]) / 2 for i in appendIndices]
    a = np.insert(data, appendIndices, appendVals)
    return a
Upvotes: 0
Reputation: 221664
You can use a differentiation trick here with np.diff. Thus, assuming A as the input array, you can do -
out = np.empty(2*A.size-1)
out[0::2] = A
out[1::2] = (np.diff(A) + 2*A[:-1]).astype(float)/2 # Interpolated values
The trick here is that the difference between two consecutive elements, added to twice the previous element, gives the sum of those two elements; halving that sum yields their mean. We just use this trick throughout the extent of the input 1D array to get our desired interpolated array.
Sample run -
In [34]: A
Out[34]: array([ 2, 3, -20, 10, 4])
In [35]: out = np.empty(2*A.size-1)
...: out[0::2] = A
...: out[1::2] = (np.diff(A) + 2*A[:-1]).astype(float)/2
...:
In [36]: out
Out[36]: array([ 2. , 2.5, 3. , -8.5, -20. , -5. , 10. , 7. , 4. ])
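Since np.diff(A) is just A[1:] - A[:-1], the diff-based line simplifies algebraically to a plain pairwise mean, which may read more directly. A minimal check with the same sample values:

```python
import numpy as np

# np.diff(A) == A[1:] - A[:-1], so
# (np.diff(A) + 2*A[:-1]) / 2 == (A[1:] + A[:-1]) / 2,
# i.e. the pairwise mean of neighbouring elements.
A = np.array([2, 3, -20, 10, 4])
out = np.empty(2 * A.size - 1)
out[0::2] = A                      # originals at even slots
out[1::2] = (A[1:] + A[:-1]) / 2   # pairwise means at odd slots
print(out)  # [  2.    2.5   3.   -8.5 -20.   -5.   10.    7.    4. ]
```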
I think @thomas's solution would be the go-to approach here, as we are basically doing interpolation with a specific case in mind. But since I am mostly interested in performance, here's a runtime test comparing the two solutions -
In [62]: def interp_based(A): # @thomas's solution
...: new_length = 2*A.size-1
...: return np.interp(np.linspace(0,len(A)-1,new_length),range(len(A)),A)
...:
...: def diff_based(A):
...: out = np.empty(2*A.size-1)
...: out[0::2] = A
...: out[1::2] = (np.diff(A) + 2*A[:-1]).astype(float)/2
...: return out
...:
In [63]: A = np.random.randint(0,10000,(10000))
In [64]: %timeit interp_based(A)
1000 loops, best of 3: 932 µs per loop
In [65]: %timeit diff_based(A)
10000 loops, best of 3: 148 µs per loop
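Besides timing, the two functions can be cross-checked for agreement: at the doubled resolution, linear interpolation evaluates exactly at the original points and their midpoints, so both approaches should produce the same values up to floating-point rounding. A quick sketch:

```python
import numpy as np

def interp_based(A):  # @thomas's solution
    new_length = 2 * A.size - 1
    return np.interp(np.linspace(0, len(A) - 1, new_length), range(len(A)), A)

def diff_based(A):
    out = np.empty(2 * A.size - 1)
    out[0::2] = A
    out[1::2] = (np.diff(A) + 2 * A[:-1]).astype(float) / 2
    return out

A = np.random.randint(0, 10000, (10000,))
# The half-integer sample points make np.interp return exact midpoints,
# so the results agree up to rounding.
print(np.allclose(interp_based(A), diff_based(A)))  # True
```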
Upvotes: 1
Reputation: 1813
It depends on what you mean by well-distributed values. Assuming your values lie on an evenly spaced grid, the following solution using interpolation could make sense:
>>> import numpy as np
>>> a = np.array([2, 3, -20, 10, 4])
>>> new_length = 9
>>> b = np.interp(np.linspace(0, len(a) - 1, new_length), range(len(a)), a)
>>> b
array([ 2. , 2.5, 3. , -8.5, -20. , -5. , 10. , 7. , 4. ])
This will also work if len(a) = 1000 and new_length = 1300.
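That case from the question can be sketched directly; the sensor-like signal below is just a made-up placeholder:

```python
import numpy as np

# Hypothetical sensor trace: 1000 samples that should have been 1300.
a = np.sin(np.linspace(0, 13, 1000))
new_length = 1300
# Resample onto 1300 evenly spaced positions over the same index range.
b = np.interp(np.linspace(0, len(a) - 1, new_length), range(len(a)), a)

print(len(b))                        # 1300
print(b[0] == a[0], b[-1] == a[-1])  # endpoints are preserved exactly
```

Each new sample is a linear blend of its two original neighbours, which matches the "fill with mean values, well distributed" requirement.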
Upvotes: 3