Reputation: 67
I need to pivot a dataframe (dfM) which looks something like
Task Question Answer analystID
x a 1 u
y b 2 i
z c 3 o
at first I thought my pivot
dfM = pd.pivot_table(dfM, index = ['Task', 'Question'], columns = 'analystID',
values = ['Answer'])
was getting that error (No numeric types to aggregate) because the numbers under the Answer column might have been strings so I tried
dfM.apply(pd.to_numeric, errors='ignore')
but I still ended up getting the same error.
Is there a way I can fix this error?
Upvotes: 1
Views: 2491
Reputation: 862771
I think you need convert Answer
column to int
or float
only, then remove []
from values
parameter for no MultiIndex
in columns:
dfM['Answer'] = dfM['Answer'].astype(int)
df = pd.pivot_table(dfM, index = ['Task', 'Question'], columns = 'analystID',
values = 'Answer')
print (df)
analystID i o u
Task Question
x a NaN NaN 1.0
y b 2.0 NaN NaN
z c NaN 3.0 NaN
If first solution failed, there are some non numeric values. So need to_numeric
with parameter errors='coerce'
for replace non numeric to NaN
s.
dfM['Answer'] = pd.to_numeric(dfM['Answer'], errors='coerce')
df = pd.pivot_table(dfM, index = ['Task', 'Question'], columns = 'analystID',
values = 'Answer')
print (df)
analystID i o u
Task Question
x a NaN NaN 1.0
y b 2.0 NaN NaN
z c NaN 3.0 NaN
One row solution with assign
:
df = pd.pivot_table(dfM.assign(Answer=dfM['Answer'].astype(int)),
index = ['Task', 'Question'],
columns = 'analystID',
values = 'Answer')
print (df)
analystID i o u
Task Question
x a NaN NaN 1.0
y b 2.0 NaN NaN
z c NaN 3.0 NaN
EDIT:
If use parameter errors='ignore'
and some non numeric value, still get error:
print (dfM)
Task Question Answer analystID
0 x a r u <-first ansswer value was changed to `r`
1 y b 2 i
2 z c 3 o
dfM['Answer'] = pd.to_numeric(dfM['Answer'], errors='ignore')
DataError: No numeric types to aggregate
Upvotes: 2
Reputation: 153460
Try to reassign your apply statement back to dfM.
dfM = dfM.apply(pd.to_numeric, errors='ignore')
dfM = pd.pivot_table(dfM, index = ['Task', 'Question'], columns = 'analystID',
values = ['Answer'])
Answer
analystID i o u
Task Question
x a NaN NaN 1.0
y b 2.0 NaN NaN
z c NaN 3.0 NaN
Upvotes: 2