Reputation: 1963
I want to append a pandas DataFrame object to an existing h5py file, whether as a subgroup or dataset, with all the index and header information. Is that possible? I tried the following:
import pandas as pd
import h5py
f = h5py.File('f.h5', 'r+')
df = pd.DataFrame([[1,2,3],[4,5,6]], columns=['A', 'B', 'C'], index=['X', 'Y'])
f['df'] = df
From another script, I would like to access f.h5, but the output of f['df'][()] is array([[1, 2, 3],[4, 5, 6]]), which doesn't contain the header information.
Upvotes: 1
Views: 2222
Reputation: 380
You can write to an existing HDF5 file directly with pandas via pd.DataFrame.to_hdf() and read it back in with pd.read_hdf(). By default to_hdf() opens the file in append mode, so whatever is already stored in it is kept. You just have to make sure to read and write with the same key.
To write to the h5 file:
import pandas as pd
existing_hdf5 = "f.h5"
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  columns=['A', 'B', 'C'], index=['X', 'Y'])
df.to_hdf(existing_hdf5, key='df')
Then you can read it back with:
df2 = pd.read_hdf(existing_hdf5, key='df')
print(df2)
   A  B  C
X  1  2  3
Y  4  5  6
Note that you can also make the stored DataFrame appendable by passing format="table". Either format relies on PyTables (the tables package), which is an optional dependency of pandas.
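As a rough sketch of the appendable variant mentioned above, assuming PyTables is installed: the scratch path and the extra row `more` are made up for illustration, and in practice the path would be your existing f.h5.

```python
import os
import tempfile

import pandas as pd

# Hypothetical scratch file so the example does not touch a real f.h5.
path = os.path.join(tempfile.mkdtemp(), "f.h5")

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  columns=["A", "B", "C"], index=["X", "Y"])
# Illustrative extra row to append later (not from the question).
more = pd.DataFrame([[7, 8, 9]], columns=["A", "B", "C"], index=["Z"])

# format="table" stores the frame in the appendable PyTables layout.
df.to_hdf(path, key="df", format="table")

# append=True adds rows under the same key instead of replacing it.
more.to_hdf(path, key="df", format="table", append=True)

# Reading with the same key returns both writes, index and columns included.
combined = pd.read_hdf(path, key="df")
print(combined)
```

Without append=True, the second to_hdf() call would replace what is stored under 'df' rather than extend it.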
Upvotes: 2