How to convert .mat binary format to 2D numpy array?

Question

I'm converting the hand_dataset annotations into a format I can use with YOLOv3. The annotations are in .mat binary format, and I want to extract them, convert them, and reorganize them into a .csv file.

I have tried mat = scipy.io.loadmat(file), but the extracted data comes out in a strange format.

import numpy as np
import scipy.io

mat = scipy.io.loadmat(file)
bboxes = np.transpose(mat['boxes'])

Gives output:

[[array([[(array([[488.42954942, 345.62261032]]), array([[461.57045058, 348.37738968]]), array([[465.57045058, 387.37738968]]), array([[492.42954942, 384.62261032]]))]],
      dtype=[('a', 'O'), ('b', 'O'), ('c', 'O'), ('d', 'O')])]

where the shape is (2,1) and the type is numpy.ndarray.

I can extract the points by iterating over the whole dataset's annotations like this: points = np.array([point[0] for point in bboxes[0][0][0][0]])

Where hierarchy is:

print(bboxes[0])
print(bboxes[0][0])
print(bboxes[0][0][0])
print(bboxes[0][0][0][0])
print(bboxes[0][0][0][0][0])
print(bboxes[0][0][0][0][0][0][1])

Is there a "nicer" way to extract the needed points?

Output of the prints above:

[[(array([[488.42954942, 345.62261032]]), array([[461.57045058, 348.37738968]]), array([[465.57045058, 387.37738968]]), array([[492.42954942, 384.62261032]]))]]
[(array([[488.42954942, 345.62261032]]), array([[461.57045058, 348.37738968]]), array([[465.57045058, 387.37738968]]), array([[492.42954942, 384.62261032]]))]
(array([[488.42954942, 345.62261032]]), array([[461.57045058, 348.37738968]]), array([[465.57045058, 387.37738968]]), array([[492.42954942, 384.62261032]]))
[[488.42954942 345.62261032]]
345.6226103157693

Any help would be appreciated! Thanks!

hpaulj · Accepted Answer

I think I can recreate your array with

In [38]: array = np.array
In [43]: data = np.zeros((1,1), object)
In [44]: data[0,0] = array([[(array([[488.42954942, 345.62261032]]),
    ...:                      array([[461.57045058, 348.37738968]]),
    ...:                      array([[465.57045058, 387.37738968]]),
    ...:                      array([[492.42954942, 384.62261032]]))]],
    ...:       dtype=[('a', 'O'), ('b', 'O'), ('c', 'O'), ('d', 'O')])
In [45]: data                                                                   
Out[45]: 
array([[array([[(array([[488.42954942, 345.62261032]]), array([[461.57045058, 348.37738968]]), array([[465.57045058, 387.37738968]]), array([[492.42954942, 384.62261032]]))]],
      dtype=[('a', 'O'), ('b', 'O'), ('c', 'O'), ('d', 'O')])]],
      dtype=object)

This is a (1,1) object dtype array, that contains another array. That array is also (1,1) shape, but with a compound dtype (a structured array).

In [51]: data.shape, data.dtype                                                 
Out[51]: ((1, 1), dtype('O'))

In MATLAB everything is at least 2d. loadmat has a squeeze_me parameter that tells it to remove the unnecessary size-1 dimensions. Without that we get a lot of (1,1) shaped arrays.
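As a rough sketch of what those loadmat options change, the snippet below writes a small made-up file and loads it twice. squeeze_me and struct_as_record are real loadmat keyword arguments; the file name example.mat, the make_box helper, and the coordinate values are hypothetical stand-ins, not the actual hand_dataset layout.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat


def make_box(offset):
    """Hypothetical box: four corner fields, each a (1, 2) float array."""
    return {name: np.array([[10.0 + offset + i, 20.0 + offset + i]])
            for i, name in enumerate("abcd")}


# An object array of dicts is written as a MATLAB cell array of structs.
cells = np.empty((1, 2), dtype=object)
cells[0, 0] = make_box(0.0)
cells[0, 1] = make_box(100.0)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.mat")  # made-up file name
    savemat(path, {"boxes": cells})

    # Default load: everything stays 2-D, structs become structured arrays.
    raw = loadmat(path)
    # Squeezed load: size-1 dimensions are dropped, structs become objects
    # whose fields are plain attributes.
    tidy = loadmat(path, squeeze_me=True, struct_as_record=False)

print(raw["boxes"].shape)

first = tidy["boxes"][0]
points = np.vstack([first.a, first.b, first.c, first.d])
print(points.shape)
```

With struct_as_record=False each struct is read as an object with attribute access, which avoids most of the [0][0] indexing, at the cost of no longer getting a structured array.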

MATLAB objects like cell and struct are returned as object dtype arrays of some sort. Regular MATLAB matrices are returned as numeric numpy arrays.

We can extract the one element from data with a 2d index (more idiomatic than data[0][0]):

In [52]: data1 = data[0,0]                                                      
In [53]: data1.shape, data1.dtype                                               
Out[53]: ((1, 1), dtype([('a', 'O'), ('b', 'O'), ('c', 'O'), ('d', 'O')]))

item() also works to remove the one item from an array:

In [54]: data.item().dtype                                                      
Out[54]: dtype([('a', 'O'), ('b', 'O'), ('c', 'O'), ('d', 'O')])

At this level the array is a structured array with 4 (named) fields, each object dtype.

Fields are (normally) indexed by name. But because each field is object dtype, there is yet another layer to unwrap:

In [74]: data1['a']                                                             
Out[74]: array([[array([[488.42954942, 345.62261032]])]], dtype=object)
In [75]: data1['a'].item()                                                      
Out[75]: array([[488.42954942, 345.62261032]])
In [76]: data1['a'].item().squeeze()                                            
Out[76]: array([488.42954942, 345.62261032])
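The same field-by-field unwrapping can be written as a loop over the field names. This is only a sketch: it rebuilds the (1,1) structured array by hand with NumPy rather than loading the real file, and the names corner_dtype and corners are introduced here for illustration.

```python
import numpy as np

corner_dtype = np.dtype([("a", "O"), ("b", "O"), ("c", "O"), ("d", "O")])
corners = [
    (488.42954942, 345.62261032),
    (461.57045058, 348.37738968),
    (465.57045058, 387.37738968),
    (492.42954942, 384.62261032),
]

# Rebuild the (1, 1) structured array: each field holds a (1, 2) float array.
data1 = np.empty((1, 1), dtype=corner_dtype)
for name, xy in zip(corner_dtype.names, corners):
    data1[name][0, 0] = np.array([xy])

# Index each field by name; .item() unwraps the single object stored in it.
points = np.vstack([data1[name].item() for name in data1.dtype.names])
print(points.shape)  # (4, 2)
```

Reading the names from data1.dtype.names keeps the loop independent of how many corner fields the struct has.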

@aparpara's idea of using tolist() may be the cleanest way of extracting those nested object fields:

In [85]: data1.tolist()                                                         
Out[85]: 
[[(array([[488.42954942, 345.62261032]]),
   array([[461.57045058, 348.37738968]]),
   array([[465.57045058, 387.37738968]]),
   array([[492.42954942, 384.62261032]]))]]

On a structured array, tolist() creates a list (or nested list) of tuples, one tuple per 'record' of the array.
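A tiny illustration of that behavior on a plain numeric structured array (the field names id and score are made up for this example):

```python
import numpy as np

records = np.array([(1, 2.5), (3, 4.5)], dtype=[("id", "i4"), ("score", "f8")])

# 1-D structured array -> flat list of tuples, one per record.
flat = records.tolist()
print(flat)    # [(1, 2.5), (3, 4.5)]

# 2-D structured array -> nested list that follows the array shape.
nested = records.reshape(1, 2).tolist()
print(nested)  # [[(1, 2.5), (3, 4.5)]]
```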

Then we can use np.array or concatenate to join the arrays into one, and squeeze to remove excess dimensions:

In [87]: np.array(data1.tolist()).squeeze()                                     
Out[87]: 
array([[488.42954942, 345.62261032],
       [461.57045058, 348.37738968],
       [465.57045058, 387.37738968],
       [492.42954942, 384.62261032]])
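To apply the tolist() approach to every box in a file, something like the following might work. It is a sketch under assumptions: make_record and record_to_points are hypothetical helpers, the boxes array is built by hand to mimic a loadmat result, the second box is invented, and nothing here assumes whether the two columns are (x, y) or (y, x).

```python
import numpy as np

CORNER_DTYPE = np.dtype([("a", "O"), ("b", "O"), ("c", "O"), ("d", "O")])


def make_record(corners):
    """Build one (1, 1) structured array shaped like a loadmat struct."""
    rec = np.empty((1, 1), dtype=CORNER_DTYPE)
    for name, xy in zip(CORNER_DTYPE.names, corners):
        rec[name][0, 0] = np.array([xy], dtype=float)
    return rec


def record_to_points(rec):
    """Flatten one record into a (4, 2) float array via tolist()."""
    return np.array(rec.tolist(), dtype=float).reshape(-1, 2)


# Hand-built stand-in for a (1, n) object array of box structs.
boxes = np.empty((1, 2), dtype=object)
boxes[0, 0] = make_record([(488.42954942, 345.62261032),
                           (461.57045058, 348.37738968),
                           (465.57045058, 387.37738968),
                           (492.42954942, 384.62261032)])
boxes[0, 1] = make_record([(10.0, 20.0), (30.0, 20.0),
                           (30.0, 60.0), (10.0, 60.0)])  # invented box

# Shape (n_boxes, 4, 2): four corner points per box.
all_points = np.stack([record_to_points(rec) for rec in boxes.ravel()])

# Axis-aligned extents per box: [min_col0, min_col1, max_col0, max_col1].
extents = np.hstack([all_points.min(axis=1), all_points.max(axis=1)])
print(all_points.shape, extents.shape)  # (2, 4, 2) (2, 4)
```

Using reshape(-1, 2) instead of squeeze() keeps the result 2-D regardless of how many size-1 dimensions surround the points. From here, np.savetxt with delimiter="," is one way to write extents to a .csv file.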

The MATLAB source isn't a simple 2d numeric matrix. So the translation to a different language isn't going to be simple either. Some loadmat parameters can simplify the return structure. Beyond that we have to work our way down through layers, with item or [0,0] kind of indexing.
