JALAJ GAMBHIR

Reputation: 33

Handling extremely large Numpy arrays

I want to create a NumPy kernel matrix of dimensions 25000×25000. I want to know the most efficient way to handle such a large matrix in terms of saving it to disk and loading it back. I tried dumping it with pickle, but it threw an error saying it cannot serialize objects larger than 4 GiB.

Upvotes: 0

Views: 345

Answers (2)

GILO

Reputation: 2623

Why not try saving the array to a file instead of using pickle:

np.savetxt("filename", array)

It can then be read back with

np.genfromtxt("filename")
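
For context, a minimal sketch of the full round trip (using a small demo array and a hypothetical filename kernel.txt; note that plain text is a verbose format, so a 25000×25000 float matrix stored this way would occupy many gigabytes and be slow to parse):

import numpy as np

array = np.random.rand(100, 100)      # small stand-in for the real kernel matrix
np.savetxt("kernel.txt", array)       # write as plain text, one row per line
loaded = np.genfromtxt("kernel.txt")  # parse the text file back into an array
assert np.allclose(array, loaded)     # the round trip preserves the values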

Upvotes: 1

Zihan Yang

Reputation: 31

You could try saving it in an HDF5 file via pandas.HDFStore():

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(25000, 25000).astype('float16'))
memory_use = round(df.memory_usage(deep=True).sum() / 1024**3, 2)  # bytes -> GiB
print('uses {} GiB'.format(memory_use))
store = pd.HDFStore('test.h5', 'w')  # open the HDF5 file for writing
store['data'] = df                   # store the DataFrame under the key 'data'
store.close()
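
Loading it back is symmetric; a minimal sketch, assuming the same test.h5 file and the 'data' key from above:

import pandas as pd

store = pd.HDFStore('test.h5', 'r')  # open the file read-only
df = store['data']                   # load the DataFrame stored under 'data'
store.close()

# or equivalently in one call
df = pd.read_hdf('test.h5', 'data')

arr = df.to_numpy()                  # back to a plain NumPy array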

Upvotes: 1
