Reputation: 25629
Starting with this:
x = range(30,60,2)[::-1];
x = np.asarray(x); x
array([58, 56, 54, 52, 50, 48, 46, 44, 42, 40, 38, 36, 34, 32, 30])
Create an array like this: (Notice, first item repeats) But if I can get this faster without the first item repeating, I can np.hstack
first item.
[[58 58 56 54 52]
[56 56 54 52 50]
[54 54 52 50 48]
[52 52 50 48 46]
[50 50 48 46 44]
[48 48 46 44 42]
[46 46 44 42 40]
[44 44 42 40 38]
[42 42 40 38 36]
[40 40 38 36 34]
[38 38 36 34 32]
[36 36 34 32 30]
[34 34 32 30 None]
[32 32 30 None None]
[30 30 None None None]]
The code below works, want it faster without 'for' loop and enumerate.
arr = np.empty((0,5), int)
for i,e in enumerate(x):
arr2 = np.hstack((x[i], x[i:i+4], np.asarray([None]*5)))[:5]
arr = np.vstack((arr,arr2))
Upvotes: 1
Views: 728
Reputation: 221514
Approach #1
Here's a vectorized approach using NumPy broadcasting
-
N = 4 # width factor
x_ext = np.concatenate((x,[None]*(N-1)))
arr2D = x_ext[np.arange(N) + np.arange(x_ext.size-N+1)[:,None]]
out = np.column_stack((x,arr2D))
Approach #2
Here's another one using hankel
-
from scipy.linalg import hankel
N = 4 # width factor
x_ext = np.concatenate((x,[None]*(N-1)))
out = np.column_stack((x,hankel(x_ext[:4], x_ext[3:]).T))
Runtime test
Here's a modified version of @Aaron's benchmarking script
using an input format for this post identical to the one used for his post in that script for a fair benchmarking and focusing just on these two approaches -
upper_limit = 58 # We will edit this to vary the dataset sizes
print "Timings are : "
t = time()
for _ in range(1000): #1000 iterations of @Aaron's soln.
width = 3
x = np.array(range(upper_limit,28,-2) + [float('nan')]*width)
arr = np.empty([len(x)-width, width+2])
arr[:,0] = x[:len(x)-width]
for i in xrange(len(x)-width):
arr[i,1:] = x[i:i+width+1]
print(time()-t)
t = time()
for _ in range(1000):
N = 4 # width factor
x_ext = np.array(range(upper_limit,28,-2) + [float('nan')]*(N-1))
arr2D = x_ext[np.arange(N) + np.arange(x_ext.size-N+1)[:,None]]
out = np.column_stack((x_ext[:len(x_ext)-N+1],arr2D))
print(time()-t)
Case #1 (upper_limit = 58 ) :
Timings are :
0.0316879749298
0.0322730541229
Case #2 (upper_limit = 1058 ) :
Timings are :
0.680443048477
0.124517917633
Case #3 (upper_limit = 5058 ) :
Timings are :
3.28129291534
0.47504901886
Upvotes: 5
Reputation: 231335
Starting with Divaker's padded x
N = 4 # width factor
x_ext = np.concatenate((x,[None]*(N-1)))
Since we aren't doing math on it, padding with None (which makes an object array) or np.nan
(which makes a float) shouldn't make much difference.
The column stack could be eliminated with a little change to the indexing:
idx = np.r_[0,np.arange(N)] + np.arange(x_ext.size-N+1)[:,None]
this produces
array([[ 0, 0, 1, 2, 3],
[ 1, 1, 2, 3, 4],
[ 2, 2, 3, 4, 5],
[ 3, 3, 4, 5, 6],
[ 4, 4, 5, 6, 7],
...
so the full result is
x_ext[idx]
================
A different approach is to use striding to create a kind of rolling window.
as_strided = np.lib.stride_tricks.as_strided
arr2D = as_strided(x_ext, shape=(15,4), strides=(4,4))
This is one of easier applications of as_strided
. shape
is straight forward - the shape of the desired result (without the repeat column) (x.shape[0],N)
.
In [177]: x_ext.strides
Out[177]: (4,)
For 1d array of this type, the step to the next item is 4 bytes. If I reshape the array to 2d with 3 columns, the stride to the next row is 12 - 3*4 (3 offset).
In [181]: x_ext.reshape(6,3).strides
Out[181]: (12, 4)
Using strides=(4,4)
means that the step to next row is just 4 bytes, one element in original.
as_strided(x_ext,shape=(8,4),strides=(8,4))
produces a 2 item overlap
array([[58, 56, 54, 52],
[54, 52, 50, 48],
[50, 48, 46, 44],
[46, 44, 42, 40],
....
The potentially dangerous part of as_strided
is that it is possible to create an array that samples memory outside of the original data buffer. Usually that appears as large random numbers where None
appears in this example. It's the same sort of error that you would encounter if C code if you were careless in using array pointers and indexing.
The as_strided
array is a view (the repeated values are not copied). So writing to that array could be dangerous. The column_stack
with x
will make a copy, replicating the repeated values as needed.
Upvotes: 3
Reputation: 1423
I suggest to contruct an initial matrix with equal columns and then use np.roll()
to rotate them:
import numpy as np
import timeit as ti
import numpy.matlib
x = range(30,60,2)[::-1];
x = np.asarray(x);
def sol1():
# Your solution, for comparison
arr = np.empty((0,5), int)
for i,e in enumerate(x):
arr2 = np.hstack((x[i], x[i:i+4], np.asarray([None]*5)))[:5]
arr = np.vstack((arr,arr2))
return arr
def sol2():
# My proposal
x2 = np.hstack((x, [None]*3))
mat = np.matlib.repmat(x2, 5, 1)
for i in range(3):
mat[i+2, :] = np.roll(mat[i+2, :], -(i+1))
return mat[:,:-3].T
print(ti.timeit(sol1, number=100))
print(ti.timeit(sol2, number=100))
which guives:
0.026760146000015084
0.0038611710006080102
It uses a for loop but it only iterates over the shorter axis. Also, it should not be hard to adapt this code for other configurations instead of using hardcoded numbers.
Upvotes: 2
Reputation: 11075
I got about an order of magnitude faster by avoiding _stack()
and only using floats...
edit: added @Divakar's post to time trial...
import numpy as np
from time import time
t = time()
for _ in range(1000): #1000 iterations of my soln.
width = 3
x = np.array(range(58,28,-2) + [float('nan')]*width)
arr = np.empty([len(x)-width, width+2])
arr[:,0] = x[:len(x)-width]
for i in xrange(len(x)-width):
arr[i,1:] = x[i:i+width+1]
print(time()-t)
t = time()
for _ in range(1000): #1000 iterations of OP code
x = range(30,60,2)[::-1];
x = np.asarray(x)
arr = np.empty((0,5), int)
for i,e in enumerate(x):
arr2 = np.hstack((x[i], x[i:i+4], np.asarray([None]*5)))[:5]
arr = np.vstack((arr,arr2))
print(time()-t)
t = time()
for _ in range(1000):
x = np.array(range(58,28,-2))
N = 4 # width factor
x_ext = np.hstack((x,[None]*(N-1)))
arr2D = x_ext[np.arange(N) + np.arange(x_ext.size-N+1)[:,None]]
out = np.column_stack((x,arr2D))
print(time()-t)
prints out:
>>> runfile('...temp.py', wdir='...')
0.0160000324249
0.374000072479
0.0319998264313
>>>
Upvotes: 3