user2309803

Reputation: 645

NumPy: how to reshape when some data is missing?

With the following source data -

In [53]: source_data = np.array([ 
    ...: [0, 0, 0, 10], 
    ...: [0, 0, 1, 11], 
    ...: [0, 1, 0, 12], 
    ...: [0, 1, 1, 13], 
    ...: [1, 0, 0, 14],  
    ...: [1, 0, 1, 15],  
    ...: [1, 1, 0, 16],  
    ...: [1, 1, 1, 17] 
    ...: ])

I can reshape as follows to make indexing more convenient -

In [62]: max = np.max(source_data, axis=0).astype(int)                                               

In [63]: max                                                                                         
Out[63]: array([ 1,  1,  1, 17])

In [64]: three_d = np.ravel(source_data[:,3]).reshape((max[0]+1, max[1]+1, max[2]+1))                

In [65]: three_d                                                                                     
Out[65]: 
array([[[10, 11],
        [12, 13]],

       [[14, 15],
        [16, 17]]])

But if rows are missing from the source data, for example -

In [68]: source_data2 = np.array([ 
    ...: [0, 0, 0, 10], 
    ...: [0, 0, 1, 11], 
    ...: [0, 1, 1, 13], 
    ...: [1, 1, 0, 16],  
    ...: [1, 1, 1, 17] 
    ...: ]) 

what is the most efficient way to transform it to the following?

array([[[10, 11],
        [nan, 13]],

       [[nan, nan],
        [16, 17]]])

Upvotes: 0

Views: 672

Answers (1)

hpaulj

Reputation: 231510

In [512]: source_data = np.array([
     ...:     [0, 0, 0, 10],
     ...:     [0, 0, 1, 11],
     ...:     [0, 1, 0, 12],
     ...:     [0, 1, 1, 13],
     ...:     [1, 0, 0, 14],
     ...:     [1, 0, 1, 15],
     ...:     [1, 1, 0, 16],
     ...:     [1, 1, 1, 17]
     ...: ])

The reshape works because the source_data is complete and in order; you are ignoring the coordinates in the first 3 columns.

But we can use them as indices into a target array:

In [513]: arr = np.zeros((2,2,2), int)                                                         
In [514]: arr[source_data[:,0], source_data[:,1], source_data[:,2]] = source_data[:,3]         
In [515]: arr                                                                                  
Out[515]: 
array([[[10, 11],
        [12, 13]],

       [[14, 15],
        [16, 17]]])
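
As a quick check that the indexed assignment depends on the coordinates and not on row order, here is a minimal sketch that reverses the rows first. The names `shuffled` and `reshaped` are only for illustration:

```python
import numpy as np

source_data = np.array([
    [0, 0, 0, 10],
    [0, 0, 1, 11],
    [0, 1, 0, 12],
    [0, 1, 1, 13],
    [1, 0, 0, 14],
    [1, 0, 1, 15],
    [1, 1, 0, 16],
    [1, 1, 1, 17],
])

# Reverse the rows so the values are no longer in coordinate order.
shuffled = source_data[::-1]

# Indexed assignment places each value at its own (i, j, k) coordinate.
arr = np.zeros((2, 2, 2), int)
arr[shuffled[:, 0], shuffled[:, 1], shuffled[:, 2]] = shuffled[:, 3]

# A plain reshape only sees the values in the order they arrive.
reshaped = shuffled[:, 3].reshape(2, 2, 2)

print(arr[0, 0, 0])       # 10, placed by coordinate
print(reshaped[0, 0, 0])  # 17, placed by position in the input
```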

We can do the same with the next source:

In [516]: source_data2 = np.array([
     ...:     [0, 0, 0, 10],
     ...:     [0, 0, 1, 11],
     ...:     [0, 1, 1, 13],
     ...:     [1, 1, 0, 16],
     ...:     [1, 1, 1, 17]
     ...: ])

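First fill the target with nan (this makes it a float array, since nan has no integer representation), then assign by coordinates: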

In [517]: arr = np.full((2,2,2), np.nan)                                                       
In [518]: arr                                                                                  
Out[518]: 
array([[[nan, nan],
        [nan, nan]],

       [[nan, nan],
        [nan, nan]]])
In [519]: arr[source_data2[:,0], source_data2[:,1], source_data2[:,2]] = source_data2[:,3]     
In [520]: arr                                                                                  
Out[520]: 
array([[[10., 11.],
        [nan, 13.]],

       [[nan, nan],
        [16., 17.]]])
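
The steps above could be wrapped up roughly as follows. This is only a sketch: `scatter_to_grid` and its `shape` parameter are hypothetical names, not part of NumPy, and it assumes the last column holds the values and all preceding columns are non-negative integer coordinates.

```python
import numpy as np

def scatter_to_grid(rows, shape=None, fill_value=np.nan):
    """Hypothetical helper: place the last column of `rows` into an
    N-d float grid addressed by the leading coordinate columns."""
    rows = np.asarray(rows)
    coords = rows[:, :-1].astype(int)  # coordinate columns
    values = rows[:, -1]               # value column
    if shape is None:
        # Assumption: the largest coordinate along each axis is present.
        shape = tuple(coords.max(axis=0) + 1)
    grid = np.full(shape, fill_value, dtype=float)
    # One index array per axis, as in the manual assignment.
    grid[tuple(coords.T)] = values
    return grid

source_data2 = np.array([
    [0, 0, 0, 10],
    [0, 0, 1, 11],
    [0, 1, 1, 13],
    [1, 1, 0, 16],
    [1, 1, 1, 17],
])

three_d = scatter_to_grid(source_data2)
print(three_d)
```

Inferring the shape from the maximum coordinates gives too small a grid if the rows with the highest index along some axis are among the missing ones, so passing `shape=(2, 2, 2)` explicitly is safer when the extent is known in advance.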

Upvotes: 1
