yusuf

Reputation: 3781

"TypeError: buffer is too small for requested array" when attempting to read a .mat file using scipy.io.loadmat

I have the following code:

import tempfile
import subprocess
import shlex
import os
import numpy as np
import scipy.io

script_dirname = os.path.abspath(os.path.dirname(__file__))


def get_windows(image_fnames, cmd='selective_search'):
    f, output_filename = tempfile.mkstemp(suffix='.mat')
    os.close(f)
    fnames_cell = '{' + ','.join("'{}'".format(x) for x in image_fnames) + '}'
    command = "{}({}, '{}')".format(cmd, fnames_cell, output_filename)
    print(command)

    mc = "matlab -nojvm -r \"try; {}; catch; exit; end; exit\"".format(command)
    pid = subprocess.Popen(
        shlex.split(mc), stdout=open('/dev/null', 'w'), cwd=script_dirname)
    retcode = pid.wait()
    if retcode != 0:
        raise Exception("Matlab script did not exit successfully!")

    all_boxes = list(scipy.io.loadmat(output_filename)['all_boxes'][0])
    subtractor = np.array((1, 1, 0, 0))[np.newaxis, :]
    all_boxes = [boxes - subtractor for boxes in all_boxes]

    os.remove(output_filename)
    if len(all_boxes) != len(image_fnames):
        raise Exception("Something went wrong computing the windows!")
    return all_boxes

if __name__ == '__main__':

    import time

    image_filenames = [
        script_dirname + '/000015.jpg',
        script_dirname + '/cat.jpg'
    ] * 4
    t = time.time()
    boxes = get_windows(image_filenames)
    print(boxes[:2])
    print("Processed {} images in {:.3f} s".format(
        len(image_filenames), time.time() - t))

The code has been tested elsewhere, so it should work.

When I run the code, I get the following error:

Traceback (most recent call last):
  File "selective_search.py", line 62, in <module>
    boxes = get_windows(image_filenames)

  File "selective_search.py", line 42, in get_windows
    all_boxes = list(scipy.io.loadmat(output_filename)['all_boxes'][0])

  File "/usr/lib/python2.7/dist-packages/scipy/io/matlab/mio.py", line 131, in loadmat
    MR = mat_reader_factory(file_name, appendmat, **kwargs)

  File "/usr/lib/python2.7/dist-packages/scipy/io/matlab/mio.py", line 55, in mat_reader_factory
    mjv, mnv = get_matfile_version(byte_stream)

  File "/usr/lib/python2.7/dist-packages/scipy/io/matlab/miobase.py", line 218, in get_matfile_version
    buffer = fileobj.read(4))

TypeError: buffer is too small for requested array

I'm using MATLAB 2015.

How can I fix the problem?

Upvotes: 3

Views: 12094

Answers (2)

yusuf

Reputation: 3781

I have solved the problem.

As I mentioned, I ran the same code on a different computer with MATLAB 2014, and it worked.

On that computer the gcc version was 4.7, but on the current computer, which has MATLAB 2015, the gcc version is 4.9.

I removed gcc 4.9 and installed 4.7, and the problem was solved. :)

Upvotes: 2

hpaulj

Reputation: 231665

This script constructs a command line, calls MATLAB with it, and then reads the .mat file that MATLAB produces.

I think you need to test the pieces:

  • does the MATLAB calling command look right?

  • does MATLAB run without errors?

  • is the .mat valid (read with MATLAB)?

  • can you read it from Python - just a plain loadmat?

How long ago did it last run, and with which MATLAB version? You may need to change this script so that it does not delete the temporary file, giving you a chance to test it interactively.

One possibility is that the .mat is in a format that loadmat can't handle. MATLAB keeps changing the .mat format; the latest (v7.3) is based on HDF5. You might need to change the MATLAB script so it saves in an earlier format. I don't recall what kind of error loadmat produces when loading a newer, incompatible version.
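A quick way to rule the format question in or out is to look at the first bytes of the file before handing it to loadmat. The sketch below is only an illustration: sniff_mat_format is a hypothetical helper, and the header strings are an assumption based on the text header that MATLAB writes at the start of v5 and v7.3 files.

```python
import os
import tempfile


def sniff_mat_format(path):
    """Guess the .mat flavour from the first bytes of the file (hypothetical helper)."""
    with open(path, "rb") as f:
        head = f.read(19)
    if len(head) < 4:
        # Nothing usable was written, e.g. MATLAB never saved the file.
        return "empty or truncated"
    if head.startswith(b"MATLAB 7.3 MAT-file"):
        return "v7.3 (HDF5-based)"
    if head.startswith(b"MATLAB 5.0 MAT-file"):
        return "v5 (covers -v6 and -v7 saves)"
    return "unknown (possibly v4)"


# Demo on two hand-made files, so no MATLAB is needed to try it.
demo_dir = tempfile.mkdtemp()
empty_path = os.path.join(demo_dir, "empty.mat")
open(empty_path, "wb").close()
v73_path = os.path.join(demo_dir, "v73.mat")
with open(v73_path, "wb") as f:
    f.write(b"MATLAB 7.3 MAT-file, Platform: demo")

print(sniff_mat_format(empty_path))
print(sniff_mat_format(v73_path))
```

As far as I know, loadmat raises NotImplementedError for a v7.3 file rather than a TypeError, so a TypeError at this point hints at something other than a too-new format.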


The TypeError: buffer is too small for requested array error is something I'd expect from a np.ndarray() call, not the usual np.array. But digging into the loadmat code (scipy/io/matlab/mio5.py), I see that it does use ndarray. That does point to some sort of file format incompatibility, either the MATLAB file version, or maybe a 32/64 bit machine difference.


The error is in the

def get_matfile_version(fileobj):

function, right at the start, where it tries to read the first 4 bytes of the file:

# Mat4 files have a zero somewhere in first 4 bytes
fileobj.seek(0)
mopt_bytes = np.ndarray(shape=(4,),
                       dtype=np.uint8,
                       buffer = fileobj.read(4))

It's reading the bytes and trying to create an array directly from them. That looks like a straightforward operation. Except, what would happen if the file were empty? The buffer would be 0 bytes, too small for a 4-element array. If so, then the problem is that MATLAB failed to run or to save its file. I'll have to experiment.


BINGO - the .mat file is empty

In [270]: with open('test.mat','w') as f:pass  # empty file
In [271]: matlab.loadmat('test.mat')
---------------------------------------------------------------------------
...
/usr/lib/python3/dist-packages/scipy/io/matlab/miobase.py in get_matfile_version(fileobj)
    216     mopt_bytes = np.ndarray(shape=(4,),
    217                            dtype=np.uint8,
--> 218                            buffer = fileobj.read(4))
    219     if 0 in mopt_bytes:
    220         fileobj.seek(0)

TypeError: buffer is too small for requested array

So for some reason, the MATLAB script failed. mkstemp creates an empty temporary file. Normally the MATLAB script would overwrite it (or append?). But if the script fails, or never runs at all, this file remains empty, producing this error when you try to read it.
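One reason the failure goes unnoticed: in the question's command the catch branch calls a plain exit, which, as far as I can tell, returns status 0, so the retcode check passes even when the script threw an error. A minimal sketch of a wrapper that reports the error and returns a non-zero status instead; build_matlab_argv is a hypothetical helper, and exit(1) and getReport are assumed to be available in the installed MATLAB version.

```python
def build_matlab_argv(command):
    """Wrap a MATLAB command so that a failure yields a non-zero exit status."""
    # Assumption: exit(1) sets the process exit status and
    # getReport(err) returns the error text of the caught MException.
    matlab_code = (
        "try; {}; "
        "catch err; disp(getReport(err)); exit(1); "
        "end; exit(0)"
    ).format(command)
    # Returning a list avoids the quoting pitfalls of shlex.split.
    return ["matlab", "-nojvm", "-r", matlab_code]


argv = build_matlab_argv("selective_search({'cat.jpg'}, '/tmp/out.mat')")
print(argv[-1])
```

The list can be passed straight to subprocess.Popen. Leaving stdout connected (rather than sending it to /dev/null) while debugging lets the MATLAB error message show up.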

If you used tempfile to get a file name, rather than create the file, you'd get an OSError, 'no such file'.
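One way to get a name without creating the file is to make a private temporary directory and build the path inside it. This is a sketch, not the only option; reserve_output_path and the default file name are hypothetical.

```python
import os
import tempfile


def reserve_output_path(name="windows.mat"):
    """Return a path inside a fresh private directory; the file itself is NOT created."""
    tmp_dir = tempfile.mkdtemp()
    return os.path.join(tmp_dir, name)


output_filename = reserve_output_path()
# The file does not exist until MATLAB writes it.
print(os.path.exists(output_filename))
```

With this approach the cleanup step removes the directory (for example with shutil.rmtree) instead of a single file.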

I don't think the scipy developers anticipated someone would try to load an empty .mat file, otherwise they would have caught and translated this error.
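Until then, the calling script can do that translation itself. A minimal sketch, where require_nonempty is a hypothetical helper and the messages are only suggestions:

```python
import os
import tempfile


def require_nonempty(path):
    """Return path unchanged, or raise a descriptive error if it is missing or empty."""
    if not os.path.isfile(path):
        raise IOError("MATLAB did not create {}".format(path))
    if os.path.getsize(path) == 0:
        raise IOError(
            "MATLAB left {} empty; the script probably failed".format(path))
    return path


# Demo: mkstemp leaves behind exactly the kind of empty file discussed above.
fd, empty_path = tempfile.mkstemp(suffix=".mat")
os.close(fd)
message = None
try:
    require_nonempty(empty_path)
except IOError as exc:
    message = str(exc)
    print(message)
finally:
    os.remove(empty_path)
```

In get_windows this would read scipy.io.loadmat(require_nonempty(output_filename)), so an empty file is reported as a MATLAB failure rather than as a buffer error.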

Upvotes: 4
