I realise now that the experimental PanelND objects are going to fit my need brilliantly, except it appears I can't save them:
p4d = pd.Panel4D(np.random.randn(2, 2, 5, 4),
labels=['Label1','Label2'],
items=['Item1', 'Item2'],
major_axis=pd.date_range('1/1/2000', periods=5),
minor_axis=['A', 'B', 'C', 'D'])
p4d.save('p4d')
...
PicklingError: Can't pickle <class 'pandas.core.panelnd.Panel4D'>: attribute lookup pandas.core.panelnd.Panel4D failed
And if I try to write it to a HDFStore, I get:
TypeError: cannot properly create the storer for: [_STORER_MAP] [group->/p4d (Group) u'',value-><class 'pandas.core.panelnd.Panel4D'>,table->None,append->False,kwargs->{}]
Other than saving the individual DataFrames and stitching them together, how can I persist the higher dimensional objects?
Edit: I see that store.append() works for Panel4D, but save() doesn't, and neither does store.append() for the example Panel5D. I really am after higher than 4D, so the problem still persists.
Edit: more info:
I am trying to build a panel of arbitrary dimension inside nested loops across the dimensions, and then slice that data, again arbitrarily, so I can process it (collate, plot, optimise).
In (rough) code:
panel5ddict = {}  # must exist before the outer loop fills it
for a in range(1, 10):
    panel4ddict = {}
    for b in range(101, 150):
        paneldict = {}
        for c in range(500, 501):
            df = MakeDataFrame(a, b, c)  # returns processed df
            paneldict[c] = df
        p3d = Panel(paneldict)
        panel4ddict[b] = p3d
    p4d = Panel4D(panel4ddict)
    panel5ddict[a] = p4d
panel5d = Panel5D(panel5ddict)
sliced = panel5d[:,3,5:6]
# and then do some plotting of my sliced DF
Upvotes: 4
Views: 519
Here is a way to store a Panel5D: store each Panel4D slice as a separate group in the HDFStore, then reconstruct the Panel5D from those groups when reading back.
Note that you might be better off storing this as a DataFrame with a multi-level index (3 or more levels), which in effect contains the same information as a Panel5D, but unrolled into long form.
In [1]: import pandas as pd
   ...: from pandas.core import panelnd, panel4d
   ...: from pandas.util import testing as tm
In [2]: Panel5D = panelnd.create_nd_panel_factory(
...: klass_name='Panel5D',
...: axis_orders=['cool', 'labels', 'items', 'major_axis',
...: 'minor_axis'],
...: axis_slices={'labels': 'labels', 'items': 'items',
...: 'major_axis': 'major_axis',
...: 'minor_axis': 'minor_axis'},
...: slicer=panel4d.Panel4D,
...: axis_aliases={'major': 'major_axis', 'minor': 'minor_axis'},
...: stat_axis=2)
In [4]: p4d = panel4d.Panel4D(dict(L1=tm.makePanel(), L2=tm.makePanel()))
In [5]: p5d = Panel5D(dict(C1 = p4d, C2 = p4d+1))
In [6]: p5d
Out[6]:
<class 'pandas.core.panelnd.Panel5D'>
Dimensions: 2 (cool) x 2 (labels) x 3 (items) x 30 (major_axis) x 4 (minor_axis)
Cool axis: C1 to C2
Labels axis: L1 to L2
Items axis: ItemA to ItemC
Major_axis axis: 2000-01-03 00:00:00 to 2000-02-11 00:00:00
Minor_axis axis: A to D
In [7]: store = pd.HDFStore('test.h5',mode='w')
In [9]: for x in p5d.cool:
   ...:     store.append(x, p5d[x])
   ...:
In [10]: store
Out[10]:
<class 'pandas.io.pytables.HDFStore'>
File path: test.h5
/C1 wide_table (typ->appendable,nrows->360,ncols->2,indexers->[items,major_axis,minor_axis])
/C2 wide_table (typ->appendable,nrows->360,ncols->2,indexers->[items,major_axis,minor_axis])
In [11]: store.close()
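The multi-level DataFrame alternative mentioned in the note above can be sketched roughly as follows. This is only an illustration under some assumptions: `make_frame` is a hypothetical stand-in for the question's `MakeDataFrame`, the loop ranges are shortened, and the index level names (`a`, `b`, `c`, `major`) are invented here. Pickle is used for the round trip just to keep the sketch self-contained; the same frame could presumably be written to an HDFStore instead.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def make_frame(a, b, c):
    # Hypothetical stand-in for the question's MakeDataFrame(a, b, c).
    rng = np.random.default_rng(a * 1_000_000 + b * 1_000 + c)
    index = pd.date_range("2000-01-01", periods=5)
    return pd.DataFrame(rng.standard_normal((5, 4)),
                        index=index, columns=list("ABCD"))


# One frame per (a, b, c) combination, keyed by a tuple
# (ranges shortened compared with the question).
frames = {
    (a, b, c): make_frame(a, b, c)
    for a in range(1, 4)
    for b in range(101, 104)
    for c in range(500, 502)
}

# Concatenating a dict with tuple keys prepends the key parts as outer
# index levels, so each loop dimension becomes one level of a MultiIndex.
long_df = pd.concat(frames)
long_df.index.names = ["a", "b", "c", "major"]
long_df = long_df.sort_index()  # sorted index is needed for label slicing

# Arbitrary slicing: everything for a == 2 ...
a2 = long_df.xs(2, level="a")

# ... or b == 102 with c in 500..501, across all a and all dates.
idx = pd.IndexSlice
sliced = long_df.loc[idx[:, 102, 500:501, :], :]

# Round trip through a file; the whole structure lives in one object.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "long_df.pkl")
    long_df.to_pickle(path)
    restored = pd.read_pickle(path)
```

Because every dimension is just an index level, adding a sixth or seventh dimension only means adding another element to the key tuple, with no need for a custom panel class.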
Upvotes: 1