Saeed

Reputation: 718

How to convert mat file including 4916 pictures (375x375x3) into a numpy array?

I want to extract pictures in 5000BOLD which is a Matlab file including 4916 color pictures. When I import them in Python using the following:

import scipy.io as sio
mat = sio.loadmat('all_imgs.mat')

I get a dictionary (type(mat) is dict). How can I turn it into a numpy array that lets me access each picture and plot it? Since each picture is 375x375x3, I should end up with a 375x375x3x4916 array.

Also, mat.keys() returns the following, and I have no idea what each element means:

dict_keys(['__header__', '__version__', '__globals__', 'all_imgs'])

Is there a direct way to use the mat dictionary to plot a specific picture with imshow?

Upvotes: 0

Views: 642

Answers (1)

hpaulj

Reputation: 231665

Looking at a smaller file:

In [33]: from scipy import io                                                                                   
In [34]: io.loadmat('../Downloads/all_img_names.mat')     

Loading without assignment prints the file contents (I don't recommend this with the big image file):

Out[34]: 
{'__header__': b'MATLAB 5.0 MAT-file, Platform: GLNXA64, Created on: Thu Oct 19 14:20:48 2017',
 '__version__': '1.0',
 '__globals__': [],
 'all_img_names': array([[array(['airplanecabin1.jpg'], dtype='<U18'),
         array(['scenes'], dtype='<U6')],
        [array(['airplanecabin3.jpg'], dtype='<U18'),
         array(['scenes'], dtype='<U6')],
        [array(['airplanecabin5.jpg'], dtype='<U18'),
         array(['scenes'], dtype='<U6')],
        ...,
        [array(['yogastudio2.jpg'], dtype='<U15'),
         array(['scenes'], dtype='<U6')],
        [array(['yogastudio3.jpg'], dtype='<U15'),
         array(['scenes'], dtype='<U6')],
        [array(['yogastudio4.jpg'], dtype='<U15'),
         array(['scenes'], dtype='<U6')]], dtype=object)}

Look specifically at the all_img_names key. It corresponds to a variable of that name in the source MATLAB workspace. Note that it is object dtype; like a MATLAB cell, it can contain other arrays:

In [36]: io.loadmat('../Downloads/all_img_names.mat')['all_img_names'].shape                                    
Out[36]: (4916, 2)

And looking at the first 'row' of that array, it too is object dtype, with 2 string arrays - the name of the file, and some sort of category label:

In [37]: io.loadmat('../Downloads/all_img_names.mat')['all_img_names'][0]                                       
Out[37]: 
array([array(['airplanecabin1.jpg'], dtype='<U18'),
       array(['scenes'], dtype='<U6')], dtype=object)
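
As a rough sketch of how that nested structure could be unwrapped into plain Python strings: the helper name unwrap_names and the small demo array below are hypothetical stand-ins that only mimic the layout printed above, so the snippet runs without the real file.

```python
import numpy as np

def unwrap_names(cell):
    """Turn an (N, 2) object array of 1-element string arrays into (name, label) tuples."""
    return [(str(row[0].item()), str(row[1].item())) for row in cell]

# Hypothetical stand-in mimicking the structure of mat['all_img_names']
demo = np.empty((2, 2), dtype=object)
demo[0, 0] = np.array(['airplanecabin1.jpg'])
demo[0, 1] = np.array(['scenes'])
demo[1, 0] = np.array(['yogastudio4.jpg'])
demo[1, 1] = np.array(['scenes'])

names = unwrap_names(demo)
print(names)
```

With the real file, the same helper would presumably be called on the 'all_img_names' entry of the loaded dictionary, assuming every cell holds exactly one string.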

Hopefully that gives you an idea of how to explore the contents of mat['all_imgs'].
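
For the big file, one way to look before loading might be scipy.io.whosmat, which lists each variable's name, shape and class from the file headers. The sketch below writes a tiny hypothetical stand-in for all_imgs.mat into memory (4 images of 5x5x3, not the real data) so that it is self-contained; it also shows that the keys wrapped in double underscores are file metadata rather than MATLAB variables.

```python
import io
import numpy as np
from scipy.io import loadmat, savemat, whosmat

# Hypothetical stand-in for all_imgs.mat: 4 tiny RGB images laid out H x W x C x N
buf = io.BytesIO()
savemat(buf, {'all_imgs': np.zeros((5, 5, 3, 4), dtype=np.uint8)})

# whosmat reports (name, shape, class) per variable from the file headers
buf.seek(0)
info = whosmat(buf)
print(info)

# loadmat returns a dict; keys wrapped in double underscores are file metadata
buf.seek(0)
mat = loadmat(buf)
user_vars = [k for k in mat if not k.startswith('__')]
print(user_vars)
```

On the real file, whosmat('all_imgs.mat') should indicate whether all_imgs is a numeric array or a cell, which decides how it needs to be unpacked.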

mat['all_imgs'][0,0]

may be a 2d array. If it is instead some sort of image encoding, you may have to open it with cv2. Without downloading that big file, I can't help further.
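
Since the real layout of all_imgs is unknown here, the following is only a sketch under two assumptions: either it is a MATLAB cell (object dtype) with one H x W x 3 image per cell, or it is a single numeric H x W x 3 x N array as the question expects. The helper name to_image_stack and the tiny demo arrays are hypothetical.

```python
import numpy as np

def to_image_stack(all_imgs):
    """Return the images as one array of shape (N, H, W, C).

    Assumes either an object array with one H x W x C image per cell,
    or a numeric array laid out as H x W x C x N.
    """
    if all_imgs.dtype == object:
        # Cell array case: collect the per-cell images along a new first axis
        return np.stack(list(all_imgs.ravel()))
    # Numeric case: move the image index from the last axis to the first
    return np.moveaxis(all_imgs, -1, 0)

# Hypothetical stand-ins: 4 images of 5x5x3 in each layout
numeric = np.arange(5 * 5 * 3 * 4, dtype=np.uint8).reshape(5, 5, 3, 4)
cells = np.empty((4, 1), dtype=object)
for i in range(4):
    cells[i, 0] = numeric[..., i]

stack_from_numeric = to_image_stack(numeric)
stack_from_cells = to_image_stack(cells)
print(stack_from_numeric.shape, stack_from_cells.shape)
```

With a stack in this layout, stack[i] is a single H x W x 3 picture, so matplotlib's plt.imshow(stack[i]) should display it, provided the values are uint8 or floats in the range 0 to 1.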

Upvotes: 2
