Reputation: 104
I'm struggling with this problem: I have two large numpy arrays (about 5 GB) that I want to save in a .mat file loadable from Matlab. I tried scipy.io and wrote
from scipy.io import savemat
data = {'A': a, 'B': b}
savemat('myfile.mat', data, appendmat=True, format='5',
        long_field_names=False, do_compression=False, oned_as='row')
but I get the error: OverflowError: Python int too large to convert to C long
EDIT: Python 3.8, Matlab 2017b
Here is the traceback:
a.shape (600, 1048261) of type <class 'numpy.float64'>
b.shape (1048261,) of type <class 'numpy.float64'>
data = {'A': a, 'B': b}
savemat('myfile.mat', data, appendmat=True, format='5',
long_field_names=False, do_compression=False, oned_as='row')
---------------------------------------------------------------------------
OverflowError Traceback (most recent call last)
<ipython-input-19-4d1d08a54148> in <module>
1 data = {'A': a, 'B': b}
----> 2 savemat('myfile.mat', data, appendmat=True, format='5',
3 long_field_names=False, do_compression=False, oned_as='row')
~\miniconda3\envs\work\lib\site-packages\scipy\io\matlab\mio.py in savemat(file_name, mdict, appendmat, format, long_field_names, do_compression, oned_as)
277 else:
278 raise ValueError("Format should be '4' or '5'")
--> 279 MW.put_variables(mdict)
280
281
~\miniconda3\envs\work\lib\site-packages\scipy\io\matlab\mio5.py in put_variables(self, mdict, write_header)
847 self.file_stream.write(out_str)
848 else: # not compressing
--> 849 self._matrix_writer.write_top(var, asbytes(name), is_global)
~\miniconda3\envs\work\lib\site-packages\scipy\io\matlab\mio5.py in write_top(self, arr, name, is_global)
588 self._var_name = name
589 # write the header and data
--> 590 self.write(arr)
591
592 def write(self, arr):
~\miniconda3\envs\work\lib\site-packages\scipy\io\matlab\mio5.py in write(self, arr)
627 self.write_char(narr, codec)
628 else:
--> 629 self.write_numeric(narr)
630 self.update_matrix_tag(mat_tag_pos)
631
~\miniconda3\envs\work\lib\site-packages\scipy\io\matlab\mio5.py in write_numeric(self, arr)
653 self.write_element(arr.imag)
654 else:
--> 655 self.write_element(arr)
656
657 def write_char(self, arr, codec='ascii'):
~\miniconda3\envs\work\lib\site-packages\scipy\io\matlab\mio5.py in write_element(self, arr, mdtype)
494 self.write_smalldata_element(arr, mdtype, byte_count)
495 else:
--> 496 self.write_regular_element(arr, mdtype, byte_count)
497
498 def write_smalldata_element(self, arr, mdtype, byte_count):
~\miniconda3\envs\work\lib\site-packages\scipy\io\matlab\mio5.py in write_regular_element(self, arr, mdtype, byte_count)
508 tag = np.zeros((), NDT_TAG_FULL)
509 tag['mdtype'] = mdtype
--> 510 tag['byte_count'] = byte_count
511 self.write_bytes(tag)
512 self.write_bytes(arr)
OverflowError: Python int too large to convert to C long
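If I understand the traceback correctly, the failure is in the tag that format '5' writes in front of each variable: its byte count is a 32-bit field (and on Windows the assignment goes through a C long), so a single variable bigger than roughly 2 GB cannot be written in this format at all. A quick size check (plain numpy, nothing else assumed):
import numpy as np
# format '5' stores each variable's byte count in a 32-bit tag field,
# so anything above ~2**31 bytes per variable overflows
a_bytes = 600 * 1048261 * np.dtype(np.float64).itemsize  # 5031652800
b_bytes = 1048261 * np.dtype(np.float64).itemsize        # 8386088
print(a_bytes > 2**31 - 1)  # True  -> 'A' alone is too big for format '5'
print(b_bytes > 2**31 - 1)  # False -> 'B' would be fine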
I also tried hdf5storage:
hdf5storage.write(data, 'myfile.mat', matlab_compatible=True)
but it fails too.
EDIT: it gives this warning:
\miniconda3\envs\work\lib\site-packages\hdf5storage\__init__.py:1306:
H5pyDeprecationWarning: The default file mode will change to 'r' (read-only)
in h5py 3.0. To suppress this warning, pass the mode you need to
h5py.File(), or set the global default h5.get_config().default_file_mode, or
set the environment variable H5PY_DEFAULT_READONLY=1. Available modes are:
'r', 'r+', 'w', 'w-'/'x', 'a'. See the docs for details.
f = h5py.File(filename)
Anyway, it creates a 5 GB file, but when I load it in Matlab I get a variable named after the file path and apparently without data.
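Looking at the hdf5storage docs, I suspect my call is partly to blame: if I read the signature of hdf5storage.write right, its second positional argument is the HDF5 path inside the file, not the file name (the file name is a separate filename keyword), which would explain the strange variable name. A sketch of what I probably should have written (path and filename are my assumption from the docs):
import hdf5storage
# pass the file name as the 'filename' keyword and keep the data at the
# root path '/', so each dict key becomes its own top-level variable
hdf5storage.write(data, path='/', filename='myfile.mat',
                  matlab_compatible=True)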
Lastly I tried with h5py:
import h5py
hf = h5py.File('C:/Users/flavio/Desktop/STRA-pattern.mat', 'w')
hf.create_dataset('A', data=a)
hf.create_dataset('B', data=b)
hf.close()
but the output file is not recognized/readable in Matlab.
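If I understand the v7.3 format correctly, a real v7.3 .mat file is HDF5 plus a 512-byte user block whose text starts with "MATLAB 7.3 MAT-file", and Matlab's load refuses plain HDF5 files without it. That seems easy to check on the file h5py produced (the path is just the one from my script):
# a plain h5py file starts with the HDF5 signature b'\x89HDF...',
# a real v7.3 MAT-file starts with b'MATLAB 7.3 MAT-file ...'
with open('C:/Users/flavio/Desktop/STRA-pattern.mat', 'rb') as f:
    print(f.read(64))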
Is splitting the arrays the only solution? I hope there is a better way to fix this issue.
Upvotes: 2
Views: 2251
Reputation: 21
For anyone still looking for an answer: this works with hdf5storage
hdf5storage.savemat(save_path, data_dict, format='7.3', matlab_compatible=True, compress=False)
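A minimal end-to-end sketch of the same call (the array shapes are taken from the question; save_path and data_dict are just placeholder names):
import numpy as np
import hdf5storage

a = np.zeros((600, 1048261), dtype=np.float64)  # stand-ins for the real data
b = np.zeros(1048261, dtype=np.float64)

data_dict = {'A': a, 'B': b}
save_path = 'myfile.mat'

# format '7.3' writes an HDF5-based MAT-file, which is not subject to the
# ~2 GB per-variable limit of format '5'; load('myfile.mat') in Matlab
# should then give variables A and B
hdf5storage.savemat(save_path, data_dict, format='7.3',
                    matlab_compatible=True, compress=False)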
Upvotes: 2