Astarno

Reputation: 323

Numpy: map a list of arrays to an array of arrays

I am trying to map a list of numpy arrays (containing a single string) to an array of arrays. I want this because I need it to be in that specific format to save it to a .mat file.

I currently have the following:

var1 = [array(['String1'], dtype='<U9'), array(['String2'], dtype='<U9'), ...]
var2 = np.asarray(var1)

Output when printing var2:
[['String1']
 ['String2']
 ['String3']
 ...]

It seems like it is creating a list of lists instead of an array of arrays of some sort. Might it be that .asarray simply can't handle 2D arrays and I need another function? Or am I making a simple mistake here?

Expected output:
array([[array(['String1'], dtype='<U9'),
        array(['String2'], dtype='<U9'),
        array(['String3'], dtype='<U9'),
        ...]], dtype=object)

Upvotes: 1

Views: 228

Answers (2)

hpaulj

Reputation: 231395

You start with a list of arrays:

In [49]: var1 = [np.array(['String1'], dtype='<U9'), np.array(['String2'], dtype='<U9')]               
In [50]: var1                                                                                          
Out[50]: [array(['String1'], dtype='<U9'), array(['String2'], dtype='<U9')]

Making an array from that gives a 2d array with a string dtype (the default np.array behavior):

In [51]: var2 = np.array(var1)                                                                         
In [52]: var2                                                                                          
Out[52]: 
array([['String1'],
       ['String2']], dtype='<U9')      # (2,1) shape

Specifying object dtype still produces a (2,1) array:

In [53]: var3 = np.array(var1, object)                                                                 
In [54]: var3                                                                                          
Out[54]: 
array([['String1'],
       ['String2']], dtype=object)      # the objects are python strings

To create an array of arrays, you have to first create a 'blank' array, and then fill it:

In [55]: var3 = np.empty(2, object)                                                                    
In [56]: var3                                                                                          
Out[56]: array([None, None], dtype=object)
In [57]: var3[:] = var1                                                                                
In [58]: var3                                                                                          
Out[58]: 
array([array(['String1'], dtype='<U9'), array(['String2'], dtype='<U9')],
      dtype=object)
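
For a list of any length, the same create-then-fill pattern can be wrapped in a small helper. This is only a sketch: to_object_array is a hypothetical name (not a NumPy function), and the final reshape to (1, N) is an assumption based on the shape shown in the question's expected output.

```python
import numpy as np

def to_object_array(arrays):
    """Pack a list of arrays into a 1-D object array, one array per element."""
    out = np.empty(len(arrays), dtype=object)
    # Assign element by element so NumPy stores each inner array as-is
    # instead of trying to merge the inner arrays into extra dimensions.
    for i, a in enumerate(arrays):
        out[i] = a
    return out

var1 = [np.array(['String1'], dtype='<U9'),
        np.array(['String2'], dtype='<U9'),
        np.array(['String3'], dtype='<U9')]

var3 = to_object_array(var1)   # shape (3,), dtype=object
row = var3.reshape(1, -1)      # shape (1, 3), as in the question's expected output

print(var3.shape, var3.dtype)
print(row.shape)
print(type(var3[0]))
```

Each element of var3 is still the original string array, so var3[1][0] gives back 'String2'.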

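If the list contained arrays of differing lengths, np.array itself would produce an object array (NumPy 1.24 and later require an explicit dtype=object for this and raise a ValueError otherwise). But as shown above, that's not a robust construct (though common, if only by mistake):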

In [61]: np.array([np.array(['String1'], dtype='<U9'), np.array(['String2', 'string3'], dtype='<U9')]) 
Out[61]: 
array([array(['String1'], dtype='<U9'),
       array(['String2', 'string3'], dtype='<U9')], dtype=object)
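
To illustrate why relying on np.array here is fragile, the sketch below runs the same call on a ragged list and on a uniform list. The variable names are made up for the example, and dtype=object is passed explicitly in both calls.

```python
import numpy as np

ragged = [np.array(['String1']), np.array(['String2', 'string3'])]
uniform = [np.array(['String1']), np.array(['String2'])]

# Lengths differ: the result is a 1-D object array whose elements are arrays.
a = np.array(ragged, dtype=object)

# Lengths match: NumPy merges the inner arrays into a 2-D array of strings.
b = np.array(uniform, dtype=object)

print(a.shape, type(a[0]))
print(b.shape, type(b[0, 0]))
```

The result depends on the data rather than on the call, which is why the create-then-fill approach is the safer choice when an array of arrays is required.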

A numpy array of arrays is an odd bird, and requires a more convoluted construction.

Upvotes: 0

javidcf

Reputation: 59701

var2 is a NumPy array. When you print a NumPy array, the output happens to look similar to a list, although a nested list with the same content would not be printed in that vertical format.

Printing var1 shows array(...) around each element because var1 is a list, not a NumPy array. When you print a list, each element is shown using its repr, which does not necessarily match what print shows for that element on its own. If you do print(repr(var2)) you will see the array(...) around it.

In any case, you can always use type to check what type your object is.
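
A quick way to check this yourself, using the two-element list from the question as a stand-in for the real data:

```python
import numpy as np

var1 = [np.array(['String1'], dtype='<U9'), np.array(['String2'], dtype='<U9')]
var2 = np.asarray(var1)

print(type(var1))              # <class 'list'>
print(type(var2))              # <class 'numpy.ndarray'>
print(var2.shape, var2.dtype)  # (2, 1) <U9

print(var2)        # str form: no array(...) wrapper
print(repr(var2))  # repr form: wrapped in array(...)
```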

Upvotes: 1
