Reputation: 1303
This should be quick, but none of the pivot/groupby work I'm doing is coming up with what I need.
I have a table like this:
Letter Period Amount
YrMnth
2014-12 B 6 0
2014-12 C 8 1
2014-12 C 9 2
2014-12 C 10 3
2014-12 C 6 4
2014-12 C 12 5
2014-12 C 7 6
2014-12 C 11 7
2014-12 D 9 8
2014-12 D 10 9
2014-12 D 1 10
2014-12 D 8 11
2014-12 D 6 12
2014-12 D 12 13
2014-12 D 7 14
2014-12 D 11 15
2014-12 D 4 16
2014-12 D 3 17
2015-01 B 7 18
2015-01 B 8 19
2015-01 B 1 20
2015-01 B 10 21
2015-01 B 11 22
2015-01 B 6 23
2015-01 B 9 24
2015-01 B 3 25
2015-01 B 5 26
2015-01 C 10 27
I want to pivot it, so that the Index is basically YrMonth and Letter, the Period are the columns, and the Amount are the values.
I understand Pivot in general, but am getting errors when I try to do it with multiple indexes. I made the index a column, and tried this:
In [76]: df.pivot(index=['YrMnth','Letter'], values='Amount', columns='Period')
But I came out with this error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-76-fc2a4c5f244d> in <module>()
----> 1 df.pivot(index=['YrMnth','Letter'], values='Amount', columns='Period')
/Users/chaseschwalbach/anaconda/lib/python2.7/site-packages/pandas/core/frame.pyc in pivot(self, index, columns, values)
3761 """
3762 from pandas.core.reshape import pivot
-> 3763 return pivot(self, index=index, columns=columns, values=values)
3764
3765 def stack(self, level=-1, dropna=True):
/Users/chaseschwalbach/anaconda/lib/python2.7/site-packages/pandas/core/reshape.pyc in pivot(self, index, columns, values)
331 indexed = Series(self[values].values,
332 index=MultiIndex.from_arrays([index,
--> 333 self[columns]]))
334 return indexed.unstack(columns)
335
/Users/chaseschwalbach/anaconda/lib/python2.7/site-packages/pandas/core/series.pyc in __init__(self, data, index, dtype, name, copy, fastpath)
225 raise_cast_failure=True)
226
--> 227 data = SingleBlockManager(data, index, fastpath=True)
228
229 generic.NDFrame.__init__(self, data, fastpath=True)
/Users/chaseschwalbach/anaconda/lib/python2.7/site-packages/pandas/core/internals.pyc in __init__(self, block, axis, do_integrity_check, fastpath)
3734 block = make_block(block,
3735 placement=slice(0, len(axis)),
-> 3736 ndim=1, fastpath=True)
3737
3738 self.blocks = [block]
/Users/chaseschwalbach/anaconda/lib/python2.7/site-packages/pandas/core/internals.pyc in make_block(values, placement, klass, ndim, dtype, fastpath)
2452
2453 return klass(values, ndim=ndim, fastpath=fastpath,
-> 2454 placement=placement)
2455
2456
/Users/chaseschwalbach/anaconda/lib/python2.7/site-packages/pandas/core/internals.pyc in __init__(self, values, placement, ndim, fastpath)
85 raise ValueError('Wrong number of items passed %d,'
86 ' placement implies %d' % (
---> 87 len(self.values), len(self.mgr_locs)))
88
89 @property
ValueError: Wrong number of items passed 138, placement implies 2
Upvotes: 6
Views: 2693
Reputation: 180401
If I understand you correctly,pivot_table
might be closer to what you need:
df = df.pivot_table(index=["YrMnth", "Letter"], columns="Period", values="Amount")
Which gives you:
Period 1 3 4 5 6 7 8 9 10 11 12
YrMnth Letter
2014-12 B NaN NaN NaN NaN 0 NaN NaN NaN NaN NaN NaN
C NaN NaN NaN NaN 4 6 1 2 3 7 5
D 10 17 16 NaN 12 14 11 8 9 15 13
2015-01 B 20 25 NaN 26 23 18 19 24 21 22 NaN
C NaN NaN NaN NaN NaN NaN NaN NaN 27 NaN NaN
As suggested in the comments:
df = pd.pivot_table(df, index=["YrMnth", "Letter"], columns="Period", values="Amount")
Period 1 3 4 5 6 7 8 9 10 11 12
YrMnth Letter
2014-12 B NaN NaN NaN NaN 0 NaN NaN NaN NaN NaN NaN
C NaN NaN NaN NaN 4 6 1 2 3 7 5
D 10 17 16 NaN 12 14 11 8 9 15 13
2015-01 B 20 25 NaN 26 23 18 19 24 21 22 NaN
C NaN NaN NaN NaN NaN NaN NaN NaN 27 NaN NaN
Also yields the same, if someone wants to clarify how the former will fail that would be great.
Upvotes: 5