Gulzar

Reputation: 27946

How to save a list of numpy arrays of different sizes to disk?

I have numpy arrays of different lengths, for example:

a = np.array([1, 2, 3, 4])
b = np.array([5, 6])
c = np.array([7, 7, 7])
d = np.array([12, 24, 43, 54, 66, 77, 88])

They are packed together in a list (or a dictionary)

the_list = [a,b,c,d]

Each array is about 500 elements long, and the list is about 1000-10000 arrays long.

I want to save this list to a single file on disk with the following requirements in order of importance:

  1. Runtime On Read
  2. Human readable file format
  3. Runtime on Write

Using pandas like so:

df = pd.DataFrame(the_list)
df.to_csv(path, header=None, index=False)

only writes the first element of every array. I'm guessing there is a better (working) way, either with pandas, pickle, or something else.

Upvotes: 1

Views: 3847

Answers (2)

mgilson

Reputation: 309881

I'd probably go with numpy.savez. This isn't a human-readable format, so maybe it won't work for you, but it is really easy to use (you read the file back with numpy.load).
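A minimal sketch of the round trip (the filename is arbitrary; `savez` stores each positional argument under an automatic key `arr_0`, `arr_1`, ...):

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([5, 6])
the_list = [a, b]

# Write all arrays into one .npz archive; each keeps its own length.
np.savez("the_list.npz", *the_list)

# Read them back in the original order.
with np.load("the_list.npz") as data:
    loaded = [data[f"arr_{i}"] for i in range(len(data.files))]
```

Reads are fast because each array is stored in its native binary layout; there is no parsing step.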

If having it legible for humans is really important, I'd go with json -- it's a language-agnostic interchange format that is well known and widely used (probably due to its popularity in web development). You can write your own encoder/decoder using the built-in facilities in the json module (it's really quite easy), or you can let something like json-tricks do that work for you.
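For example, a hand-rolled version (the arrays need converting to plain lists first, since the stock json encoder doesn't handle numpy types; the filename is arbitrary):

```python
import json
import numpy as np

the_list = [np.array([1, 2, 3, 4]), np.array([5, 6])]

# Write: turn each array into a plain Python list so json can serialize it.
with open("the_list.json", "w") as f:
    json.dump([arr.tolist() for arr in the_list], f)

# Read: rebuild the numpy arrays from the nested lists.
with open("the_list.json") as f:
    loaded = [np.array(lst) for lst in json.load(f)]
```

The file is plain text like `[[1, 2, 3, 4], [5, 6]]`, so it is easy to inspect by eye, at the cost of slower parsing than a binary format.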

Upvotes: 2

WEN WEN

Reputation: 126

This works on my PC:

the_list = [a,b,c,d]
df_list = pd.DataFrame({i: pd.Series(value) for i, value in enumerate(the_list)})
df_list.to_csv('./df_list.csv')

The resulting CSV file stores each array as a column, with empty cells where the shorter arrays run out.
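Reading the file back takes one extra step, since the shorter columns are padded with NaN. A sketch of the full round trip (writing with `index=False` here, an assumption that keeps the read side simpler than the answer's original call):

```python
import pandas as pd

a = [1, 2, 3, 4]
b = [5, 6]
the_list = [a, b]

# Each array becomes one column; pd.Series pads the short ones with NaN.
df_list = pd.DataFrame({i: pd.Series(v) for i, v in enumerate(the_list)})
df_list.to_csv("./df_list.csv", index=False)

# Read back: drop the NaN padding per column and restore the integer dtype
# (the NaNs force the columns to float on read).
df = pd.read_csv("./df_list.csv")
loaded = [df[col].dropna().astype(int).to_list() for col in df.columns]
```

This recovers the original ragged lists, at the cost of CSV parsing on every read.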

Upvotes: 0
