Reputation: 1539
I have an empty 1D numpy array (b) of size 4 into which I want to stack columns. The values in each column depend on another 1D numpy array (a) containing True/False bool values.
I have managed to fill it the way I want using for loops, but I think it can be done more efficiently using slices.
Here is the working code giving me the correct result:
import numpy as np
import random
b = np.empty(4, dtype=object)  # The array we are trying to fill
for i in range(5):
    # a contains 4 random True/False values
    a = np.random.randint(0, 2, size=(4), dtype=bool)
    # If a row is true in a then b should contain data, otherwise nan
    data = random.random()
    iteration = 0
    for value in a:
        if b[iteration] is None:  # We check if b is empty, if yes we initialize
            if value:  # If the row in a is true we fill with the value
                b[iteration] = np.array([data])
            else:
                b[iteration] = np.array([np.nan])
        else:  # If b is not empty then we just stack
            if value:
                b[iteration] = np.hstack([b[iteration], data])
            else:
                b[iteration] = np.hstack([b[iteration], np.nan])
        iteration += 1
print(b)
Output:
array([array([ nan, 0.04209371, 0.03540539, nan, 0.59604905]),
array([0.66677989, nan, 0.03540539, nan, nan]),
array([0.66677989, 0.04209371, 0.03540539, nan, 0.59604905]),
array([0.66677989, 0.04209371, 0.03540539, nan, nan])],
dtype=object)
I have tried the following code using slices of numpy arrays but it gives me an error:
b = np.empty(4, dtype=object)
for i in range(5):
    a = np.random.randint(0, 2, size=(4), dtype=bool)
    data = random.random()
    b[a] = np.vstack([b[a], np.zeros(len(b[a])) + data])
print(b)
Output:
TypeError: NumPy boolean array indexing assignment requires a 0 or 1-dimensional input, input has 2 dimensions
I am trying to find the most efficient way of solving this problem. Any suggestions?
Upvotes: 1
Views: 3839
Reputation: 231385
I haven't tried to figure out what is wrong with your 2nd approach.
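That said, the shapes involved in the failing line can be checked in isolation. The following is only a minimal sketch, assuming a fixed mask and a placeholder value of 0.5 (both chosen for illustration, not taken from the question): the value built by np.vstack is 2-dimensional, while a boolean-mask assignment into a 1D array expects a 0- or 1-dimensional value, which is what the error message reports.

```python
import numpy as np

b = np.empty(4, dtype=object)             # four None entries, as in the question
a = np.array([True, False, True, False])  # fixed mask, assumed for illustration
data = 0.5                                # placeholder for random.random()

selected = b[a]                                                  # 1D, shape (2,)
stacked = np.vstack([selected, np.zeros(len(selected)) + data])  # 2D, shape (2, 2)

print(selected.shape, stacked.shape)

try:
    b[a] = stacked  # a 2D value assigned through a 1D boolean mask
except TypeError as err:
    print("TypeError:", err)
```

So the mask selects a flat set of slots, but the right-hand side has one dimension too many for them.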
From the output, your first approach creates a 4-element array where each element is a 5-element array, with np.nan placed at random positions.
Here's a direct 2d array approach to generating the same sort of array:
A 4x4 array of random floats:
In [29]: b = np.random.rand(4,4)
In [30]: b
Out[30]:
array([[0.12820464, 0.41477273, 0.35926356, 0.15205777],
[0.28082327, 0.76574665, 0.2489097 , 0.17054426],
[0.20950568, 0.78342284, 0.14498205, 0.52107821],
[0.74684041, 0.83661847, 0.29467814, 0.66062565]])
Same size boolean array:
In [31]: a = np.random.randint(0,2, size=(4,4), dtype=bool)
In [32]: a
Out[32]:
array([[False, True, False, True],
[ True, True, False, True],
[False, False, False, False],
[False, False, True, False]])
Using a as a mask or boolean index, replace each corresponding element of b with nan:
In [33]: b[a]=np.nan
In [34]: b
Out[34]:
array([[0.12820464, nan, 0.35926356, nan],
[ nan, nan, 0.2489097 , nan],
[0.20950568, 0.78342284, 0.14498205, 0.52107821],
[0.74684041, 0.83661847, nan, 0.66062565]])
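Note that in the question True means "keep the data" and False means nan, which is the opposite of the masking above. A minimal sketch of that variant, assuming a seeded generator (rng) and a copy named kept, neither of which appears in the original:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible

b = rng.random((4, 4))                            # 4x4 array of random floats
a = rng.integers(0, 2, size=(4, 4)).astype(bool)  # same-shape boolean mask

kept = b.copy()      # leave b itself untouched
kept[~a] = np.nan    # ~a inverts the mask: nan wherever a is False

print(kept)
```

The ~ operator flips each bool in the mask, so the values survive exactly where a is True.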
This is a real 2d array of floats, not an array of arrays. That object array approach works for lists, but it is not good numpy coding.
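To get closer to the layout in the question, where each iteration contributes one column and the same value is shared by every True row of that column, something along these lines might work. It is a sketch under assumptions: all masks and values are drawn up front rather than inside a loop, and the names rng, n_rows and n_cols are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible

n_rows, n_cols = 4, 5  # 4 entries in b, 5 loop iterations in the question

# Column j plays the role of the mask `a` drawn in iteration j.
a = rng.integers(0, 2, size=(n_rows, n_cols)).astype(bool)
# One value per iteration, i.e. one value per column.
data = rng.random(n_cols)

# data has shape (5,) and broadcasts across the 4 rows of a:
# take data where a is True, nan elsewhere.
b = np.where(a, data, np.nan)

print(b)
```

Each row of b then corresponds to one of the inner arrays in the question's output, and b[i] gives that row without any object array.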
Upvotes: 1