Reputation: 1
I have some dataframes which are loaded from different npz files. I combine all the data into a single dataframe and apply some processing to it. Now I want to save the new combined dataframe into a new npz file. How do I do that?
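Roughly, the loading and combining step looks like this (the file names and the array key are placeholders):
import numpy as np
import pandas as pd
paths = ['part1.npz', 'part2.npz', 'part3.npz']
frames = []
for p in paths:
    with np.load(p) as npz:
        # each npz file holds one array under the key 'data'
        frames.append(pd.DataFrame(npz['data']))
df = pd.concat(frames, ignore_index=True)
# ... processing on df ...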
Since the dataframe is large (5000 rows, 30 columns), I would also like to know the most efficient way of doing so.
I tried searching the internet for a solution, but the results are only about how to convert a pandas dataframe to numpy data.
Upvotes: 0
Views: 2279
Reputation: 121
If the data is huge and has numpy arrays as entries, it is recommended to store it in one of the following formats. Which one to use depends on your requirements, but any of them will serve the need here.
Here is a way to store it as a pickle file and read it back:
df.to_pickle('df.pkl')
df = pd.read_pickle('df.pkl')
Here is a way to store it as an HDF5 file and read it back:
df.to_hdf('df.h5', key='df', mode='w')
df = pd.read_hdf('df.h5', 'df')
Here is a way to store it as a Parquet file and read it back:
df.to_parquet('df.parquet.gzip', compression='gzip')
df = pd.read_parquet('df.parquet.gzip')
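Note that to_hdf requires the PyTables package (tables) and to_parquet requires a Parquet engine such as pyarrow or fastparquet; to_pickle needs no extra dependency, but pickle files should only be loaded from trusted sources.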
Upvotes: 3
Reputation: 5949
If the columns of df have distinct dtypes, you need to pass them as separate arrays:
np.savez('out', **{c: df[c].values for c in df.columns})
data = np.load('out.npz')
df = pd.DataFrame({file: data[file] for file in data.files})
For string dtypes you also need to allow pickling when loading:
data = np.load('out.npz', allow_pickle=True)
For more compact files you can also replace np.savez with np.savez_compressed.
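Putting those pieces together, here is a minimal round-trip sketch (the column names and values are just illustrative):
import numpy as np
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3], 'b': [0.5, 1.5, 2.5], 'c': ['x', 'y', 'z']})
# save each column as its own array so the distinct dtypes survive
np.savez_compressed('out.npz', **{c: df[c].values for c in df.columns})
# rebuild the dataframe; allow_pickle=True is needed for the object-dtype string column
with np.load('out.npz', allow_pickle=True) as data:
    df2 = pd.DataFrame({name: data[name] for name in data.files})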
Upvotes: 0
Reputation: 726
It seems that the best solution for your problem is to convert your dataframe to a numpy array and then save it:
np.savez(file, df.to_numpy())
Here file is the path of the file you want to save your data to, and df is the dataframe holding your data.
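To read it back, note that np.savez stores arrays passed positionally under the keys 'arr_0', 'arr_1', and so on, and that df.to_numpy() drops the column names and index, so you have to restore those yourself. A minimal sketch, assuming the file name data.npz:
import numpy as np
import pandas as pd
df = pd.DataFrame({'a': [1, 2], 'b': [3.0, 4.0]})  # placeholder data
np.savez('data.npz', df.to_numpy())
# positional arrays are stored under 'arr_0'; column names must be supplied again
with np.load('data.npz') as data:
    restored = pd.DataFrame(data['arr_0'], columns=df.columns)
Be aware that to_numpy() on a frame with mixed dtypes upcasts everything to a common dtype (often object), in which case np.load also needs allow_pickle=True.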
Upvotes: 1