jBloodless

Reputation: 113

Issue converting list to NumPy array

I have a list that consists of 2000 rows and 88200 columns:

testlist = list(split_audio_to_parts(audio, self.sample_rate, self.audio_index))

debugging output of testlist gives

[array([-0.00683594, -0.00689697, -0.00708008, ...,  0.        ,
    0.        ,  0.        ]), array([-0.01287842, -0.01269531, -0.01257324, ...,  0.        ,
    0.        ,  0.        ]), array([0.02288818, 0.01940918, 0.01409912, ..., 0.        , 0.        ,
   0.        ]), array([0.00772095, 0.00671387, 0.00695801, ..., 0.        , 0.        ,
   0.        ]),

and so on. split_audio_to_parts is a function:

def split_audio_to_parts(x, sample_rate, audio_index):
    for i, row in audio_index.iterrows():
        x_part = x[int(row['start_samples']):int(row['end_samples'])]
        yield x_part

When I try to convert it to a NumPy array using samples = np.array(testlist) or samples = np.asarray(testlist), I get an array of shape (2000,), although debugging shows that testlist consists of 2000 entries with 88200 positions each. Why is that? I'm using 64-bit NumPy and 64-bit Python 3.6.5.

Upvotes: 3

Views: 2515

Answers (2)

kabanus

Reputation: 25895

The problem is that testlist is a list of arrays of different sizes. For example, check out this code:

>>> import numpy as np
>>> import random
>>> random.seed(3240324324)
>>> y = [np.array(list(range(random.randint(1, 3)))) for _ in range(3)]
>>> y
[array([0, 1, 2]), array([0, 1, 2]), array([0])]
>>> np.array(y)
array([array([0, 1, 2]), array([0, 1, 2]), array([0])], dtype=object)
>>> np.array(y).shape
(3,)

The resulting array has dtype object instead of float, with one element per inner array. To get a two-dimensional array, all inner arrays must have the same length.

If you really need to stuff these rows somehow into an array you can pad with zeros, for example:

>>> size = max(len(x) for x in y)
>>> z = [np.append(x, np.zeros(size - len(x))) for x in y]
>>> z
[array([0., 1., 2.]), array([0., 1., 2.]), array([0., 0., 0.])]
>>> np.array(z).shape
(3, 3)

You would have to decide how to do this padding, though.
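The padding idea above can be wrapped in a small helper. This is only a sketch: pad_rows and its fill_value parameter are hypothetical names, not part of NumPy, and it assumes every row is a 1-D numeric array that should be right-padded and stored as float.

```python
import numpy as np

def pad_rows(rows, fill_value=0.0):
    """Stack 1-D arrays of unequal length into a 2-D float array.

    Hypothetical helper: shorter rows are right-padded with fill_value.
    """
    width = max(len(r) for r in rows)
    # Pre-fill the output with the padding value, then copy each row in.
    out = np.full((len(rows), width), fill_value, dtype=float)
    for i, r in enumerate(rows):
        out[i, :len(r)] = r
    return out

rows = [np.array([0, 1, 2]), np.array([0, 1, 2]), np.array([0])]
padded = pad_rows(rows)
print(padded.shape)
```

Filling a preallocated array avoids building an intermediate list of padded copies, which may matter for 2000 rows of 88200 samples.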

Upvotes: 1

jpp

Reputation: 164613

You have a list of arrays. If the arrays in your list do not all have the same length, NumPy cannot build a 2-D array and you get a 1-D array of dtype object instead.

Here is a minimal example.

A = [np.array([1, 2]), np.array([4, 5, 6])]

A_2 = np.array(A)
# array([array([1, 2]), array([4, 5, 6])], dtype=object)

A_2.shape
# (2,)

If all your arrays have the same length, the conversion works as expected:

B = [np.array([1, 2, 3]), np.array([4, 5, 6])]

B_2 = np.array(B)
# array([[1, 2, 3],
#        [4, 5, 6]])

B_2.shape
# (2, 3)
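If you would rather have the conversion fail loudly than produce an object array, np.stack is one option, since it raises ValueError when the input shapes differ. A minimal sketch reusing the A and B lists from above (the stacked_ragged flag is only there for illustration):

```python
import numpy as np

A = [np.array([1, 2]), np.array([4, 5, 6])]     # unequal lengths
B = [np.array([1, 2, 3]), np.array([4, 5, 6])]  # equal lengths

# Equal lengths: np.stack builds a regular 2-D array.
B_2 = np.stack(B)
print(B_2.shape)

# Unequal lengths: np.stack refuses instead of returning an object array.
try:
    np.stack(A)
    stacked_ragged = True
except ValueError as exc:
    stacked_ragged = False
    print("cannot stack:", exc)
```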

To check the sizes of your arrays, you can use set:

array_sizes = set(map(len, A))
# {2, 3}
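The set tells you which lengths occur, but not which entries are the odd ones out. A possible extension, assuming you want to treat the most common length as the expected one (length_report is a hypothetical helper name):

```python
from collections import Counter

import numpy as np

def length_report(arrays):
    """Return (length counts, indices of entries whose length is not the most common one)."""
    lengths = [len(a) for a in arrays]
    counts = Counter(lengths)
    # Assumption: the most frequent length is the "expected" one.
    common = counts.most_common(1)[0][0]
    odd = [i for i, n in enumerate(lengths) if n != common]
    return counts, odd

A = [np.array([1, 2]), np.array([4, 5, 6]), np.array([7, 8, 9])]
counts, odd = length_report(A)
print(counts)
print(odd)
```

For the list in the question, the indices in odd would point at the parts whose slice came out shorter or longer than the rest.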

Upvotes: 0
