Reputation: 15843
I'd like to use pivot_table to show an arbitrary value of a column in each cell. For example, given a DataFrame like this:
df = pd.DataFrame({'x': ['x1', 'x1', 'x2'],
                   'y': ['a', 'b', 'c']})
To count the values of y for each value of x:
df.pivot_table(index='x', values='y', aggfunc=len)
    y
x
x1  2
x2  1
So in place of [2, 1], I'd like to get ['a', 'c'] or ['b', 'c'].
I tried these approaches, but all produce errors (notebook):
df.pivot_table(index='x', values='y', aggfunc=sample)
df.pivot_table(index='x', values='y', aggfunc=head)
df.pivot_table(index='x', values='y', aggfunc=lambda x: x[0])
Per https://stackoverflow.com/a/38982172/1840471, an alternative is using groupby and agg, and this produces the desired result in this case:
df.groupby(['x']).y.agg('head')
However, I'm looking to use pivot_table because my full use case involves getting values in rows and columns.
Upvotes: 1
Views: 981
Reputation: 25259
How about using first as follows:
df.pivot_table(index='x', values='y', aggfunc='first')
Out[67]:
    y
x
x1  a
x2  c
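Since the full use case involves values in both rows and columns, here is a minimal sketch of how the same idea might extend to that layout. The column z and the extra row are hypothetical, added only for illustration. The lambda variant uses positional .iloc[0]; x[0] looks up the label 0, which is presumably why the lambda attempt in the question raised an error.

```python
import pandas as pd

# Hypothetical data: column 'z' and the fourth row are assumptions,
# added so there is something to spread across the columns.
df = pd.DataFrame({'x': ['x1', 'x1', 'x2', 'x2'],
                   'z': ['z1', 'z2', 'z1', 'z1'],
                   'y': ['a', 'b', 'c', 'd']})

# 'first' keeps the first y encountered for each (x, z) pair.
first = df.pivot_table(index='x', columns='z', values='y', aggfunc='first')

# Equivalent pick with a lambda: .iloc[0] is positional, so it does not
# depend on which index labels happen to fall in each group.
also_first = df.pivot_table(index='x', columns='z', values='y',
                            aggfunc=lambda s: s.iloc[0])

print(first)
print(also_first)
```

Here the (x2, z2) cell has no matching rows, so it comes out as NaN; passing fill_value to pivot_table is one way to replace it if needed.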
Upvotes: 1