Reputation: 53
Before I implement a Neural Network, I need to preprocess some data. But I'm a noob in math and I can't find a function in Python to do what I want.
I have a matrix like this:
[[0 4 ... 0 ]
[0 3 ... 6 ]
[0 3 ... 10]]
And I have a number, for example 7, which determines how many rows I must have in my new matrix after the transformation. What I want to achieve is this:
[[0 4 ... 0 ]
[0 3.66 ... 2 ]
[0 3.33 ... 4 ]
[0 3 ... 6 ]
[0 3 ... 7.33]
[0 3 ... 8.66]
[0 3 ... 10 ]]
You can see that the first column doesn't change, because the first element of every row in the original matrix is zero. In the second column, the value decreases gradually from 4 to 3 over the first four rows and then stays constant. Finally, the last column increases from 0 to 10, passing through 6.
A math student told me that this is an interpolation, but I can't find in scipy's documentation how to do it correctly.
Do you have any idea how I can do that?
Upvotes: 2
Views: 3030
Reputation: 1686
You can use numpy.interp. As it only works on 1D data, I used a for loop over the columns.
import numpy as np
# Your input matrix:
a = np.array([[0, 4, 0], [0, 3, 6], [0, 3, 10]])
# Put the shape you need here
# (a.shape[1] only works because a is square; see the edit below):
old_dim, new_dim = a.shape[1], 7
# Define the new matrix
b = np.zeros((new_dim, a.shape[1]))
# Define the linspaces that will serve for the interpolation
nls, ols = np.linspace(0, 1, new_dim), np.linspace(0, 1, old_dim)
# Interpolate each column
for col in range(old_dim):
    b[:, col] = np.interp(nls, ols, a[:, col])
print(b)
Output:
[[ 0. 4. 0. ]
[ 0. 3.66666667 2. ]
[ 0. 3.33333333 4. ]
[ 0. 3. 6. ]
[ 0. 3. 7.33333333]
[ 0. 3. 8.66666667]
[ 0. 3. 10. ]]
It is not a 2D interpolation function, but I am not very familiar with scipy (and numpy does not have any).
Edit: Fixed an issue with non-square matrices.
import numpy as np
a = np.array([[0, 4, 0], [0, 3, 6], [0, 3, 10]])
# old_dim is the number of rows, n_col the number of columns
old_dim, n_col, new_dim = a.shape[0], a.shape[1], 7
b = np.zeros((new_dim, n_col))
nls, ols = np.linspace(0, 1, new_dim), np.linspace(0, 1, old_dim)
# Interpolate each column along the row axis
for col in range(n_col):
    b[:, col] = np.interp(nls, ols, a[:, col])
print(b)
My mistake: I swapped the number of columns and the number of rows at some point.
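If you want to avoid the per-column loop, the same linear interpolation can be written with plain NumPy indexing. This is only a sketch, not a built-in: resample_rows is a hypothetical helper name, and it assumes the input has at least two rows and that the rows are evenly spaced (the same assumption the linspace calls above make).

```python
import numpy as np

def resample_rows(a, new_rows):
    """Linearly interpolate a 2D array along its rows (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    old_rows = a.shape[0]  # assumed to be >= 2
    # Fractional position of each new row on the old row index axis
    pos = np.linspace(0, old_rows - 1, new_rows)
    # Index of the old row just below each position, clipped so lo + 1 stays valid
    lo = np.minimum(np.floor(pos).astype(int), old_rows - 2)
    # Weight of the upper neighbour, shaped to broadcast across columns
    w = (pos - lo)[:, None]
    return (1 - w) * a[lo] + w * a[lo + 1]

a = np.array([[0, 4, 0], [0, 3, 6], [0, 3, 10]])
b = resample_rows(a, 7)
print(b)
```

For the example matrix this should give the same 7 rows as the loop version, and it works the same way when the number of rows and columns differ.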
Upvotes: 3