Reputation: 3991
I have a list of 2D NumPy arrays with the same height but different widths:
list_of_arrays = [np.random.rand(3,4),np.random.rand(3,5),np.random.rand(3,6)]
I want to build a new array where each column is a random column taken from the corresponding array in my list. I can do this with a for loop, e.g.:
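new_array = np.zeros((3,3))
for x in range(3):
    # np.random.randint excludes the upper bound; the standard library's random.randint includes it and could index out of range
    new_array[:,x] = list_of_arrays[x][:,np.random.randint(0,list_of_arrays[x].shape[1])]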
This does not feel clean to me. I would like to use a list-comprehension-like approach, e.g.:
new_array = [list_of_arrays[x][:,np.random.randint(0,list_of_arrays[x].shape[1])] for x in range(3)]
This obviously returns a list, not an array as desired. I could convert the list into an array, but that adds an extraneous intermediate. Is there a simple way to do this? Similar questions that I have seen for 1D arrays use numpy.fromiter, but that will not work in 2 dimensions.
If anyone wants to suggest entirely different/cleaner/more efficient ways to solve this problem, that would be appreciated as well.
Upvotes: 3
Views: 7498
Reputation: 18521
You could simplify your list comprehension by iterating over the arrays themselves instead of over indices:
new_array = np.array([x[:,np.random.randint(0, x.shape[1])] for x in list_of_arrays]).T
In [32]: %timeit np.array([x[:,np.random.randint(0, x.shape[1])] for x in a]).T
100000 loops, best of 3: 10.2 us per loop
The transposes (.T) are needed because iterating through an array yields its rows, so iterating through arr.T yields its columns. Likewise, when constructing an array, each element is treated as a row, so after construction we need to transpose the result so that the lists we fed to the array constructor become columns.
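To make the row/column behaviour concrete, here is a minimal sketch using a made-up 2x3 array (the name `arr` and its values are only for illustration, not from the question):

```python
import numpy as np

# Hypothetical example array, not from the question.
arr = np.array([[1, 2, 3],
                [4, 5, 6]])

# Iterating over a 2-D array yields its rows...
rows = [r.tolist() for r in arr]      # [[1, 2, 3], [4, 5, 6]]

# ...while iterating over the transpose yields its columns.
cols = [c.tolist() for c in arr.T]    # [[1, 4], [2, 5], [3, 6]]

# np.array treats each list element as a row, so a list of columns
# has to be transposed to put them back into column position.
rebuilt = np.array([c for c in arr.T]).T
print(np.array_equal(rebuilt, arr))   # True
```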
If you import the standard random module, you could do:
new_array = np.array([random.choice(x.T) for x in list_of_arrays]).T
In [36]: %timeit np.array([random.choice(x.T) for x in a]).T
100000 loops, best of 3: 9.18 us per loop
which is slightly faster.
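If the transposes feel awkward, one possible variation is to let np.column_stack place each picked 1-D column directly. This is only a sketch: it assumes NumPy 1.17 or later for np.random.default_rng, the name `rng` is introduced here for illustration, and it has not been timed against the versions above.

```python
import numpy as np

rng = np.random.default_rng()  # assumption: NumPy >= 1.17

list_of_arrays = [np.random.rand(3, 4), np.random.rand(3, 5), np.random.rand(3, 6)]

# rng.integers(n) draws from [0, n), so the column index is always valid.
# np.column_stack turns each 1-D array in the list into one output column.
new_array = np.column_stack(
    [x[:, rng.integers(x.shape[1])] for x in list_of_arrays]
)
print(new_array.shape)  # (3, 3)
```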
Upvotes: 2
Reputation: 31050
Could you combine the arrays into another array rather than a list?
>>> b= np.hstack((np.random.rand(3,4),np.random.rand(3,5),np.random.rand(3,6)))
>>> b.shape
(3, 15)
Then you can use integer-array (fancy) indexing, rather than a list comprehension, to pick random columns:
new_array=b[:,np.random.randint(0,b.shape[1],3)]
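Note that indexing the stacked array this way draws 3 columns from the pooled 15, so two of them may come from the same source array. If exactly one column per original array is required, a possible sketch is to shift a per-array random index by that array's starting offset in b. The names `widths`, `offsets`, `cols` and `rng` are introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng()  # assumption: NumPy >= 1.17

list_of_arrays = [np.random.rand(3, 4), np.random.rand(3, 5), np.random.rand(3, 6)]
b = np.hstack(list_of_arrays)                              # shape (3, 15)

# Width of each source array and the column where it starts inside b.
widths = np.array([x.shape[1] for x in list_of_arrays])    # [4, 5, 6]
offsets = np.concatenate(([0], np.cumsum(widths)[:-1]))    # [0, 4, 9]

# One random local column index per array (upper bound exclusive),
# shifted into b's column numbering.
cols = offsets + rng.integers(0, widths)
new_array = b[:, cols]
print(new_array.shape)  # (3, 3)
```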
Upvotes: 0