Reputation: 18547
Using this dataframe:
import pandas as pd

df = pd.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo",
                         "bar", "bar", "bar", "bar"],
                   "B": ["one", "one", "one", "two", "two",
                         "one", "one", "two", "two"],
                   "C": ["small", "large", "large", "small",
                         "small", "large", "small", "small",
                         "large"],
                   "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
                   "E": [2, 4, 5, 5, 6, 6, 8, 9, 9]})
'''
     A    B      C  D  E
0  foo  one  small  1  2
1  foo  one  large  2  4
2  foo  one  large  2  5
3  foo  two  small  3  5
4  foo  two  small  3  6
5  bar  one  large  4  6
6  bar  one  small  5  8
7  bar  two  small  6  9
8  bar  two  large  7  9
'''
When I run
print(pd.pivot_table(df, values='C', index=['A', 'B'],
columns=['C'], aggfunc='count'))
to count the number of small/large values in column C for each combination of A and B (for example, for (A, B) = (foo, one) there is 1 small and 2 large in column C), I get this error:
ValueError: Grouper for 'C' not 1-dimensional
What is the problem, and how can I resolve it?
Upvotes: 5
Views: 13808
Reputation: 31011
You cannot use column C as both values and columns. You should probably change the call to:
print(pd.pivot_table(df, index=['A', 'B'], columns=['C'], aggfunc='count'))
Then the result is:
            D           E
C       large small large small
A   B
bar one   1.0   1.0   1.0   1.0
    two   1.0   1.0   1.0   1.0
foo one   2.0   1.0   2.0   1.0
    two   NaN   2.0   NaN   2.0
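Since D and E give identical counts here, one possible refinement is to count a single non-null column and fill the missing combination with 0. This is only a sketch: the choice of D as the counted column and the name `counts` are assumptions, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo",
                         "bar", "bar", "bar", "bar"],
                   "B": ["one", "one", "one", "two", "two",
                         "one", "one", "two", "two"],
                   "C": ["small", "large", "large", "small",
                         "small", "large", "small", "small",
                         "large"],
                   "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
                   "E": [2, 4, 5, 5, 6, 6, 8, 9, 9]})

# Count rows per (A, B) group and C value, using D only as the column to count.
# D is assumed to contain no missing values, so its count equals the row count.
counts = pd.pivot_table(df, values="D", index=["A", "B"],
                        columns="C", aggfunc="count", fill_value=0)
print(counts)
```

With this data, the (foo, two) row shows 0 under large instead of NaN, and there is only one block of columns.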
Upvotes: 6
Reputation: 13989
It sounds like what you're after is actually a groupby:
df.groupby(['A', 'B', 'C']).size()
A    B    C
bar  one  large    1
          small    1
     two  large    1
          small    1
foo  one  large    2
          small    1
     two  small    2
dtype: int64
If you want to then place 'C' back in the columns, you can unstack:
df.groupby(['A', 'B', 'C']).size().unstack().fillna(0)
C        large  small
A   B
bar one    1.0    1.0
    two    1.0    1.0
foo one    2.0    1.0
    two    0.0    2.0
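The floats above come from the NaN that unstack introduces before fillna runs. Two variants that should keep the counts as integers are sketched below: passing fill_value=0 directly to unstack, or using pd.crosstab. The names `by_unstack` and `by_crosstab` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo",
                         "bar", "bar", "bar", "bar"],
                   "B": ["one", "one", "one", "two", "two",
                         "one", "one", "two", "two"],
                   "C": ["small", "large", "large", "small",
                         "small", "large", "small", "small",
                         "large"]})

# Variant 1: fill missing combinations while unstacking, so no NaN is created.
by_unstack = df.groupby(["A", "B", "C"]).size().unstack(fill_value=0)

# Variant 2: crosstab counts co-occurrences of the (A, B) rows and C columns.
by_crosstab = pd.crosstab([df["A"], df["B"]], df["C"])

print(by_unstack)
print(by_crosstab)
```

Both variants give the same table of counts for this data, with 0 for the (foo, two) / large combination.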
Upvotes: 2