Reputation: 159
I have an array of length n and I want to resize it to a given length while preserving its proportions.
I would like a function like this:
def resize_my_array(array, new_length):
For example, the input would be an array of length 9:
l = [1,2,3,4,5,6,7,8,9]
If I resize it to length 5, the output would be:
[1,3,5,7,9]
or vice versa.
I need this to build a linear regression model in PySpark, since all the feature arrays must have the same length.
Upvotes: 5
Views: 1198
Reputation: 221664
Here's one way: use np.linspace to generate newlen evenly spaced positions along the length of the input, round them to the nearest integer indices, and index into the input array with those indices to get the required output:
import numpy as np

def resize_down(a, newlen):
    a = np.asarray(a)
    # Pick newlen evenly spaced positions and round them to integer indices.
    return a[np.round(np.linspace(0, len(a) - 1, newlen)).astype(int)]
Sample runs -
In [23]: l # larger one than given sample
Out[23]: array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
In [24]: resize_down(l, 2)
Out[24]: array([ 1, 11])
In [25]: resize_down(l, 3)
Out[25]: array([ 1, 6, 11])
In [26]: resize_down(l, 4)
Out[26]: array([ 1, 4, 8, 11])
In [27]: resize_down(l, 5)
Out[27]: array([ 1, 3, 6, 9, 11])
In [28]: resize_down(l, 6)
Out[28]: array([ 1, 3, 5, 7, 9, 11])
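The question also asks about going the other way (a short array stretched to a longer one). As a rough sketch, the same index-selection idea works when the new length is larger than the input, in which case elements are repeated rather than interpolated. The name resize_nearest is hypothetical and only used here for illustration. Note that np.round rounds halves to the nearest even number, so the repeats are not perfectly regular:

```python
import numpy as np

def resize_nearest(a, newlen):
    # Same index-selection idea as resize_down; the name is hypothetical.
    a = np.asarray(a)
    idx = np.round(np.linspace(0, len(a) - 1, newlen)).astype(int)
    return a[idx]

# Downsizing the example from the question.
down = resize_nearest([1, 2, 3, 4, 5, 6, 7, 8, 9], 5)
print(down)  # [1 3 5 7 9]

# Upsizing: positions 0.5, 1.5, 2.5, 3.5 round to 0, 2, 2, 4 (half to even),
# so some elements are repeated more often than others.
up = resize_nearest([1, 3, 5, 7, 9], 9)
print(up)    # [1 1 3 5 5 5 7 9 9]
```

If evenly interpolated values are wanted when upsizing, an interpolation-based approach is probably the better fit.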
Timings on a large array with 900000 elements, resizing to 500000:
In [43]: np.random.seed(0)
...: l = np.random.randint(0,1000,(900000))
# @jdehesa's soln
In [44]: %timeit resize_proportional(l, 500000)
10 loops, best of 3: 22.2 ms per loop
In [45]: %timeit resize_down(l, 500000)
100 loops, best of 3: 5.58 ms per loop
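For anyone wanting to reproduce a comparison like this outside IPython, here is a minimal sketch using the standard timeit module. The two functions are copied from the answers in this thread; the timings dictionary, the repeat counts, and the use of RandomState are assumptions of this sketch, and the absolute numbers will vary by machine. Keep in mind that the two functions do not return the same values: one selects existing elements, the other interpolates.

```python
import timeit
import numpy as np

def resize_down(a, newlen):
    a = np.asarray(a)
    return a[np.round(np.linspace(0, len(a) - 1, newlen)).astype(int)]

def resize_proportional(arr, n):
    return np.interp(np.linspace(0, 1, n), np.linspace(0, 1, len(arr)), arr)

# Random integer test data of the same size as in the timings above.
rng = np.random.RandomState(0)
l = rng.randint(0, 1000, 900000)

timings = {}
for fn in (resize_proportional, resize_down):
    # Best of 3 repeats, 10 calls each, reported per call.
    runs = timeit.repeat(lambda: fn(l, 500000), number=10, repeat=3)
    timings[fn.__name__] = min(runs) / 10
    print(f"{fn.__name__}: {timings[fn.__name__] * 1e3:.2f} ms per call")
```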
Upvotes: 2
Reputation: 59731
You can do something like this:
import numpy as np
def resize_proportional(arr, n):
    return np.interp(np.linspace(0, 1, n), np.linspace(0, 1, len(arr)), arr)
arr = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(resize_proportional(arr, 5))
# [1. 3. 5. 7. 9.]
The result is a floating-point array, but you can round it or cast it to integers if you need to.
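As a small sketch of that rounding step, the wrapper below rounds the interpolated values to the nearest integer before casting. The name resize_proportional_int is hypothetical, and using np.rint is one choice among several (a plain astype(int) would truncate instead of round):

```python
import numpy as np

def resize_proportional(arr, n):
    return np.interp(np.linspace(0, 1, n), np.linspace(0, 1, len(arr)), arr)

def resize_proportional_int(arr, n):
    # Hypothetical helper: round to the nearest integer, then cast.
    return np.rint(resize_proportional(arr, n)).astype(int)

shrunk = resize_proportional_int([1, 2, 3, 4, 5, 6, 7, 8, 9], 5)
print(shrunk)  # [1 3 5 7 9]

# The same function also covers the "vice versa" case from the question.
grown = resize_proportional_int([1, 3, 5, 7, 9], 9)
print(grown)   # [1 2 3 4 5 6 7 8 9]
```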
Upvotes: 3