TIM

Reputation: 125

Structured data type conversion produces an extra dimension

I'm trying to write a NumPy array to a binary PLY file, which requires a type conversion. What I did was:

import numpy as np
A = np.array([0.        , 0.00333476, 0.29804176, 0.66598558])
A.astype([('idx', '<i4'), ('x', '<f4'), ('y', '<f4'), ('z', '<f4')])

Output:

array([(0, 0.        , 0.        , 0.        ),
       (0, 0.00333476, 0.00333476, 0.00333476),
       (0, 0.29804176, 0.29804176, 0.29804176),
       (0, 0.6659856 , 0.6659856 , 0.6659856 )],
      dtype=[('idx', '<i4'), ('x', '<f4'), ('y', '<f4'), ('z', '<f4')])

Expecting:

array((0, 0.00333476, 0.29804176, 0.6659856), dtype=[('idx', '<i4'), ('x', '<f4'), ('y', '<f4'), ('z', '<f4')])

Btw, I'm on NumPy version 1.24.4. Is this a bug?

Upvotes: 0

Views: 57

Answers (1)

hpaulj

Reputation: 231625

Your array and desired dtype:

In [489]: A = np.array([0.        , 0.00333476, 0.29804176, 0.66598558])
     ...: dt = np.dtype([('idx', '<i4'), ('x', '<f4'), ('y', '<f4'), ('z', '<f4')])

In [490]: A
Out[490]: array([0.        , 0.00333476, 0.29804176, 0.66598558])

As you note, astype copies each element of A into every field of a record, so you get as many records as A has elements:

In [491]: A.astype(dt)
Out[491]: 
array([(0, 0.        , 0.        , 0.        ),
       (0, 0.00333476, 0.00333476, 0.00333476),
       (0, 0.29804176, 0.29804176, 0.29804176),
       (0, 0.6659856 , 0.6659856 , 0.6659856 )],
      dtype=[('idx', '<i4'), ('x', '<f4'), ('y', '<f4'), ('z', '<f4')])

The correct way to provide data to a compound dtype is to write all values for one record in a tuple, or to use a list of tuples for a 1d structured array:

In [493]: A1 = np.array([(0.        , 0.00333476, 0.29804176, 0.66598558)], dtype=dt); A1
Out[493]: 
array([(0, 0.00333476, 0.29804176, 0.6659856)],
      dtype=[('idx', '<i4'), ('x', '<f4'), ('y', '<f4'), ('z', '<f4')])

The recfunctions module has a convenience function that can convert A to a structured array:

In [494]: import numpy.lib.recfunctions as rf

In [495]: rf.unstructured_to_structured(A,dt)
Out[495]: 
array((0, 0.00333476, 0.29804176, 0.6659856),
      dtype=[('idx', '<i4'), ('x', '<f4'), ('y', '<f4'), ('z', '<f4')])

So here the 1d array converts to a 0d structured array holding a single record.
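For a PLY file you will probably have many vertices rather than one. As a minimal sketch, assuming the data sits in an (N, 4) array whose columns are idx, x, y, z (the verts array and its second row are made up for illustration), the same function turns each row into one record:

```python
import numpy as np
import numpy.lib.recfunctions as rf

dt = np.dtype([('idx', '<i4'), ('x', '<f4'), ('y', '<f4'), ('z', '<f4')])

# Hypothetical (N, 4) vertex table: one row per record, columns idx, x, y, z.
verts = np.array([[0, 0.00333476, 0.29804176, 0.66598558],
                  [1, 0.5,        0.25,       0.125]])

# The last axis (length 4) is consumed by the 4 fields, so (2, 4) becomes (2,).
rec = rf.unstructured_to_structured(verts, dt)

print(rec.shape)           # (2,)
print(rec['idx'])          # [0 1]
print(len(rec.tobytes()))  # 32, i.e. 2 records of 16 bytes each
```

The last axis has to match the number of fields in dt; the values are cast to each field's type, which is why the float index column ends up as int32.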

People also try to use view. Since A.dtype is 'f8', A holds 32 bytes, and the 16-byte dt splits those into 2 records whose 4-byte fields just reinterpret the raw float64 bytes:

In [499]: A.view(dt)
Out[499]: 
array([(          0, 0.       , -5.4416844e-17, 0.9192123),
       (-1090341701, 1.6490208, -3.0346762e+37, 1.7914963)],
      dtype=[('idx', '<i4'), ('x', '<f4'), ('y', '<f4'), ('z', '<f4')])

If I first make an 'f4' array then I get a better result, but note that it is 1d with shape (1,), whereas unstructured_to_structured returned a 0d array. unstructured_to_structured takes care of these messy details.

In [500]: A.astype('f4').view(dt)
Out[500]: 
array([(0, 0.00333476, 0.29804176, 0.6659856)],
      dtype=[('idx', '<i4'), ('x', '<f4'), ('y', '<f4'), ('z', '<f4')])
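One more caveat with view: it only looks right here because the index is 0, and float32 0.0 and int32 0 share the same all-zero bytes. A small sketch with a made-up record whose index is 1 (the array B is hypothetical) shows that view reinterprets bytes instead of converting values:

```python
import numpy as np

dt = np.dtype([('idx', '<i4'), ('x', '<f4'), ('y', '<f4'), ('z', '<f4')])

# Hypothetical record whose index is 1 instead of 0.
B = np.array([1.0, 0.00333476, 0.29804176, 0.66598558])

# view reinterprets the float32 bytes of 1.0 as an int32.
viewed = B.astype('<f4').view(dt)
print(viewed.shape)        # (1,)
print(viewed['idx'][0])    # 1065353216, the bit pattern of float32 1.0

# Building from a tuple converts each value to its field's type.
converted = np.array([tuple(B)], dtype=dt)
print(converted['idx'][0]) # 1
```

So the view route is only safe when every field has the same type as the source array, which is not the case for a mixed int/float dtype like this one.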

Another way is to convert each row of a 2d array into a tuple and use the list-of-tuples method (here [A] stands in for a 2d array with one row):

In [504]: np.array([tuple(i) for i in [A]] , dt)
Out[504]: 
array([(0, 0.00333476, 0.29804176, 0.6659856)],
      dtype=[('idx', '<i4'), ('x', '<f4'), ('y', '<f4'), ('z', '<f4')])

edit

Your astype produces the same thing as:

In [507]: np.array(A, dt)
Out[507]: 
array([(0, 0.        , 0.        , 0.        ),
       (0, 0.00333476, 0.00333476, 0.00333476),
       (0, 0.29804176, 0.29804176, 0.29804176),
       (0, 0.6659856 , 0.6659856 , 0.6659856 )],
      dtype=[('idx', '<i4'), ('x', '<f4'), ('y', '<f4'), ('z', '<f4')])

The data has to be a tuple (or a list of tuples) to fill the dt records the way you want.
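If you want exactly the 0d result from the question, a sketch along these lines should work: pass tuple(A) directly for a single 0d record, or wrap it in a list for a 1d array with one record (rec0 and rec1 are just illustrative names):

```python
import numpy as np

dt = np.dtype([('idx', '<i4'), ('x', '<f4'), ('y', '<f4'), ('z', '<f4')])
A = np.array([0., 0.00333476, 0.29804176, 0.66598558])

rec0 = np.array(tuple(A), dtype=dt)    # a single tuple: 0d array holding one record
rec1 = np.array([tuple(A)], dtype=dt)  # a list of one tuple: 1d array, shape (1,)

print(rec0.shape)  # ()
print(rec1.shape)  # (1,)
```

Either form writes the same 16 bytes per record, so for the PLY output the choice mostly depends on whether you are writing one vertex or a list of them.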

Upvotes: 1
