Reputation: 83
I'm working with 2D numpy arrays which exhibit variable sizes, in terms of the number of rows and columns. I'd like to pad this array with zeros both before the start of the first row and at the end of the last row, but I'd like the start/end of the zeros to be offset in a different way for each column of data.
So the original 2D array:
1 2 3
4 5 6
7 8 9
A Normal example of padding:
0 0 0
0 0 0
1 2 3
4 5 6
7 8 9
0 0 0
Modified Padding with offsets (what I'm trying to do):
0 0 0
1 0 0
4 0 3
7 2 6
0 5 9
0 8 0
Does numpy possess any functions which can replicate the last example in an extendable manner for variables numbers of rows/columns, that avoids the use of for loops/other computationally slow approaches?
Upvotes: 3
Views: 2854
Reputation: 221614
Here's a vectorized one with broadcasting
and boolean-indexing
-
def create_padded_array(a, row_start, n_rows):
r = np.arange(n_rows)[:,None]
row_start = np.asarray(row_start)
mask = (r >= row_start) & (r < row_start+a.shape[0])
out = np.zeros(mask.shape, dtype=a.dtype)
out.T[mask.T] = a.ravel('F')
return out
Sample run -
In [184]: a
Out[184]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
In [185]: create_padded_array(a, row_start=[1,3,2], n_rows=6)
Out[185]:
array([[0, 0, 0],
[1, 0, 0],
[4, 0, 3],
[7, 2, 6],
[0, 5, 9],
[0, 8, 0]])
Upvotes: 4
Reputation: 83
Sorry for the trouble, but I think I found the answer that I was looking for.
I can use numpy.pad to create an arbitrary number of filler zeros at the end of my original array. There is also a function called numpy.roll which can then be used to shift all array elements along a given axis by a set number of positions down the column.
After a quick test, it looks like this is extendable for an arbitrary number of matrix elements and allows a unique offset along each column.
Thanks to everyone for their responses to this question!
Upvotes: 2
Reputation: 303
To my knowledge there is no such numpy function with those exact specific requirements, however what you can do is have your array:
`
In [10]: arr = np.array([(1,2,3),(4,5,6),(7,8,9)])
In [11]: arr
Out[11]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])`
Then pad it:
In [12]: arr = np.pad(arr, ((2,1),(0,0)), 'constant', constant_values=(0))
In [13]: arr
Out[13]:
array([[0, 0, 0],
[0, 0, 0],
[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[0, 0, 0]])
Then you can randomize with shuffle (which I assume is what you want to do): But np.random.shuffle only shuffles rows if this is satisfactory for your needs then:
In [14]: np.random.shuffle(arr)
In [15]: arr
Out[15]:
array([[7, 8, 9],
[4, 5, 6],
[0, 0, 0],
[0, 0, 0],
[0, 0, 0],
[1, 2, 3]])
If this is not satisfactory you can do this:
First create a 1D array:
In [16]: arr = np.arange(1,10)
In [17]: arr
Out[17]: array([1, 2, 3, 4, 5, 6, 7, 8, 9])
Then pad your array with zeros:
In [18]: arr = np.pad(arr, (6,3), 'constant', constant_values = (0))
In [19]: arr
Out[19]: array([0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 0, 0])
Then you shuffle the array:
In [20]: np.random.shuffle(arr)
In [21]: arr
Out[21]: array([4, 0, 0, 5, 0, 0, 3, 0, 0, 0, 8, 0, 7, 2, 1, 6, 0, 9])
Finally you reshape to the desired format:
In [22]: np.reshape(arr,[6,3])
Out[22]:
array([[4, 0, 0],
[5, 0, 0],
[3, 0, 0],
[0, 8, 0],
[7, 2, 1],
[6, 0, 9]])
Although this may seem lengthy this is much quicker for large data sets than it will be using for loops, or any other python control structures. When you say offsets if you want to change the amount of randomness you can choose to only shuffle portions of the 1D array then combine it to the rest of the data so that way the whole data set is not shuffled but a portion you want to be shuffled is. (If what you mean by offsets is different from my assumption above please clarify in a comment)
Upvotes: 0