palash

Reputation: 539

Cannot assign string elements to NumPy array?

I am trying to create an array of zeros with three fields of different types (integer, float, character).

Doubt
Why does dtype='S' create an (empty) bytes string here?

import numpy as np

arr = np.zeros((3,), dtype=('i4,f4,S'))
arr

>>array([[(0, 0., b''), (0, 0., b''), (0, 0., b'')]],
      dtype=[('f0', '<i4'), ('f1', '<f4'), ('f2', 'S')])

Issue
Assigning characters does not work; the string field stays as the empty bytes string b''.

arr[:] = [(1, 2., 'A'),
               (2, 2., 'B'),
               (3, 3., 'C')]
arr

>>array([[(1, 2., b''), (2, 2., b''), (3, 3., b'')]],
      dtype=[('f0', '<i4'), ('f1', '<f4'), ('f2', 'S')])

Doubt
Why is the problem solved by using dtype='O' (a Python object) or dtype='a40'?

x = np.zeros((3,), dtype=('i4,f4,O')) # same result goes with dtype='a40' 
new_data = [(1, 2., "A"), (2, 2., "B"), (3, 3., "C")]
x[:] = new_data
print(x)

>>[(1, 2., 'A') (2, 2., 'B') (3, 3., 'C')]

How is a40 different from the S, O and U dtypes for NumPy string elements?

Upvotes: 0

Views: 911

Answers (1)

hpaulj

Reputation: 231385

In some contexts, such as a plain (non-compound) dtype, S or str is understood to mean 'a string long enough to hold the values':

In [389]: np.array('foobar', dtype='S')                                                        
Out[389]: array(b'foobar', dtype='|S6')
In [390]: np.array('foobar', dtype='str')                                                      
Out[390]: array('foobar', dtype='<U6')
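
The same width inference can be checked on a list of strings. This is only a minimal sketch; the variable names are made up for illustration, and the itemsize values assume NumPy's usual 1 byte per character for S and 4 bytes per character for U:

```python
import numpy as np

# With a plain (non-compound) dtype, NumPy sizes the string from the data:
# the longest value here is 'foobar', so 6 characters are reserved.
as_bytes = np.array(['a', 'foobar'], dtype='S')
as_text = np.array(['a', 'foobar'], dtype=str)

print(as_bytes.dtype.itemsize)  # 6 bytes per element (1 byte per character)
print(as_text.dtype.itemsize)   # 24 bytes per element (4 bytes per character)
```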

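But a compound dtype isn't one of those cases. There a bare 'S' keeps a width of zero, so whatever is assigned to the field is truncated to b''. Giving the field an explicit width fixes it: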

In [392]: np.array('foobar', dtype=[('x','S')])                                                
Out[392]: array((b'',), dtype=[('x', 'S')])
In [393]: np.array('foobar', dtype=[('x','S10')])                                              
Out[393]: array((b'foobar',), dtype=[('x', 'S10')])
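
Applied to the array from the question, that suggests giving the third field an explicit width. The following is a sketch rather than the only way to do it: 'S1' and 'U1' are my choice for single characters, and as far as I know the 'a40' in the question is just an older alias for 'S40' (a 40-byte bytes string), which is why it also worked:

```python
import numpy as np

data = [(1, 2., 'A'), (2, 2., 'B'), (3, 3., 'C')]

# Bytes field with room for one character.
arr = np.zeros((3,), dtype='i4,f4,S1')
arr[:] = data

# Unicode field with room for one character, if str values are wanted back.
arr_u = np.zeros((3,), dtype='i4,f4,U1')
arr_u[:] = data

print(arr['f2'])    # bytes values
print(arr_u['f2'])  # str values

# Fixed-width fields silently truncate anything longer than their width.
arr['f2'][0] = 'AB'
print(arr['f2'][0])
```

The width has to be chosen up front, so pick one large enough for the longest value you expect to store.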

'O' creates a different kind of array: like a list, it holds references to Python objects, so each string is stored as-is at whatever length it has:

In [401]: np.array('foobar', 'O')                                                              
Out[401]: array('foobar', dtype=object)
In [405]: np.array('foobar', [('x','O')])                                                      
Out[405]: array(('foobar',), dtype=[('x', 'O')])
In [406]: np.array(b'foobar', [('x','O')])                                                     
Out[406]: array((b'foobar',), dtype=[('x', 'O')])
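
A small sketch of what that means for the question's array: an object field accepts strings of any length and keeps str and bytes values exactly as given, at the cost of storing pointers instead of the characters themselves. The values below are made up for illustration:

```python
import numpy as np

x = np.zeros((3,), dtype='i4,f4,O')
x[:] = [(1, 2., 'A'),
        (2, 2., 'a much longer string'),
        (3, 3., b'C')]

# Each element of the object field is an ordinary Python object.
for value in x['f2']:
    print(type(value).__name__, repr(value))
```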

Upvotes: 1
