swagner

Reputation: 51

How to save many np arrays of different size in one file (eg one np array)?

I want to save several numpy arrays with different shape in one file (using python 2.7).

a_1.shape = [4,130]

a_2.shape = [4,39]

a_3.shape = [4,60]

I can create a list with all arrays like so:

list=[a_1, a_2, a_3]

But then when I try to save it or make a np.array out of it...

all=np.array(list)

np.savetxt('./a_list',list)

... it returns the error:

could not broadcast input array from shape (4,39) into shape (4)

Is there another way to do this while keeping the shape of each individual array?

Thank you!

Upvotes: 3

Views: 1334

Answers (3)

swagner

Reputation: 51

As @hpaulj mentioned, the problem is that you cannot create a non-rectangular numpy array. There are more threads covering this issue, such as this one. Given the example above, one could either use a different data structure or, for instance, pad the smaller arrays with zeros to match the largest one using this function (by @Multihunter, found in the linked thread):

import numpy as np
def stack_uneven(arrays, fill_value=0.):
    '''
    Fits arrays into a single numpy array, even if they are
    different sizes. Positions not covered by an input array
    are set to `fill_value`.

    Args:
        arrays: list of np arrays of various sizes
            (must be same rank, but not necessarily same size)
        fill_value (float, optional): value used for padding

    Returns:
        np.ndarray of shape (len(arrays),) + the per-axis maximum shape
    '''
    sizes = [a.shape for a in arrays]
    max_sizes = np.max(list(zip(*sizes)), -1)
    # The resultant array has stacked on the first dimension
    result = np.full((len(arrays),) + tuple(max_sizes), fill_value)
    for i, a in enumerate(arrays):
        # The shape of this array `a`, turned into slices
        slices = tuple(slice(0, s) for s in sizes[i])
        # Overwrite a block slice of `result` with this array `a`
        result[i][slices] = a
    return result

To then save the result, np.savetxt is not suitable because it only supports 1D or 2D arrays, and the stacked array is 3D. np.save writes arrays of any dimension to a binary .npy file; pandas is another alternative.
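If padding is not wanted at all, one option worth considering is np.savez, which stores several named arrays in a single .npz archive and leaves each shape untouched. The following is only a minimal sketch: the array contents, the key names and the temporary output path are assumptions made for illustration.

```python
import os
import tempfile

import numpy as np

# Stand-in arrays with the shapes from the question (contents are arbitrary).
a_1 = np.arange(4 * 130, dtype=float).reshape(4, 130)
a_2 = np.arange(4 * 39, dtype=float).reshape(4, 39)
a_3 = np.arange(4 * 60, dtype=float).reshape(4, 60)

# Hypothetical output path inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "arrays.npz")

# Each keyword argument becomes the name of one array inside the archive.
np.savez(path, a_1=a_1, a_2=a_2, a_3=a_3)

# np.load on an .npz file returns a dict-like object keyed by those names.
with np.load(path) as data:
    loaded = {name: data[name] for name in data.files}

print({name: arr.shape for name, arr in loaded.items()})
```

Because every array is stored under its own key, no zeros have to be stripped again after loading.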

Upvotes: 2

Yasin Yousif

Reputation: 967

Credits to: https://tonysyu.github.io/ragged-arrays.html

For your example,

import numpy as np

def stack_ragged(array_list, axis=0):
    lengths = [np.shape(a)[axis] for a in array_list]
    idx = np.cumsum(lengths[:-1])
    stacked = np.concatenate(array_list, axis=axis)
    return stacked, idx

def save_stacked_array(fname, array_list, axis=0):
    stacked, idx = stack_ragged(array_list, axis=axis)
    np.savez(fname, stacked_array=stacked, stacked_index=idx)

def load_stacked_arrays(fname, axis=0):
    npzfile = np.load(fname)
    idx = npzfile['stacked_index']
    stacked = npzfile['stacked_array']
    return np.split(stacked, idx, axis=axis)

Then

>>> array_list = [a_1,a_2,a_3]
>>> save_stacked_array('file.npz',array_list,axis=1)

# for loading
>>> array_list = load_stacked_arrays('file.npz',axis=1)

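The arrays must agree in every dimension except the one given by axis; here they differ along axis 1. For other layouts or higher dimensions, change the axis accordingly.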
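To make the bookkeeping visible, here is a self-contained round trip of the same idea written out inline. It is a sketch under assumptions: the array contents are made up to match the shapes in the question, and the file goes to a temporary directory.

```python
import os
import tempfile

import numpy as np

# Stand-in arrays with the shapes from the question.
a_1 = np.zeros((4, 130))
a_2 = np.ones((4, 39))
a_3 = np.full((4, 60), 2.0)
array_list = [a_1, a_2, a_3]

# Column counts, and the offsets at which to split later: 130 and 130 + 39 = 169.
lengths = [a.shape[1] for a in array_list]
idx = np.cumsum(lengths[:-1])

# One rectangular (4, 229) array holds all the data.
stacked = np.concatenate(array_list, axis=1)

path = os.path.join(tempfile.mkdtemp(), "file.npz")  # hypothetical path
np.savez(path, stacked_array=stacked, stacked_index=idx)

# Split the stacked array back at the stored offsets.
with np.load(path) as npzfile:
    restored = np.split(npzfile["stacked_array"], npzfile["stacked_index"], axis=1)

print([a.shape for a in restored])  # [(4, 130), (4, 39), (4, 60)]
```

Only the split offsets need to be stored next to the data, so the file contains no padding.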

Upvotes: 1

Nick Rogers

Reputation: 328

I use pickle to save a large number of numpy arrays, collected in one np array, to an external .pickle file. Below is an example:

import numpy
import pickle

arr = numpy.array([numpy.array([1, 2, 3, 4]), numpy.array([5, 6, 7, 8])])

# Save arr to an external .pickle file ("wb" = write binary)
with open("C:/Users/nick/Desktop/2darrays.pickle", "wb") as f:
    pickle.dump(arr, f)

# Read and retrieve its contents ("rb" = read binary).
# encoding="latin1" is a Python 3 argument for reading pickles written
# by Python 2; on Python 2.7 call pickle.load(f) without it.
with open("C:/Users/nick/Desktop/2darrays.pickle", "rb") as f:
    X = pickle.load(f, encoding="latin1")
print(X)

The output obtained is,

[[1 2 3 4]
 [5 6 7 8]]

Hope this helps.
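The example above uses two arrays of equal length. For arrays of different shapes, as in the question, a plain Python list can be pickled instead of wrapping everything in one np array. This is a minimal sketch: the array contents and the temporary file path are assumptions for illustration.

```python
import os
import pickle
import tempfile

import numpy as np

# Stand-in arrays with the shapes from the question.
array_list = [np.zeros((4, 130)), np.ones((4, 39)), np.full((4, 60), 2.0)]

# Hypothetical output path inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "arrays.pickle")

# Pickle the list itself; each array keeps its own shape and dtype.
with open(path, "wb") as f:
    pickle.dump(array_list, f, protocol=pickle.HIGHEST_PROTOCOL)

# Load the list back.
with open(path, "rb") as f:
    restored = pickle.load(f)

print([a.shape for a in restored])
```

Unpickling can execute arbitrary code, so this approach is best kept to files you created yourself.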

Upvotes: 1
