Matthew
Matthew

Reputation: 67

No numeric types to aggregate in pivot table after changing data to floats

I need to pivot a dataframe (dfM) which looks something like

Task Question Answer analystID
x    a        1      u
y    b        2      i
z    c        3      o

at first I thought my pivot

 dfM = pd.pivot_table(dfM, index = ['Task', 'Question'], columns = 'analystID', 
                     values = ['Answer'])

was getting that error (No numeric types to aggregate) because the numbers under the Answer column might have been strings so I tried

dfM.apply(pd.to_numeric, errors='ignore')

but I still ended up getting the same error.

Is there a way I can fix this error?

Upvotes: 1

Views: 2491

Answers (2)

jezrael
jezrael

Reputation: 862771

I think you need convert Answer column to int or float only, then remove [] from values parameter for no MultiIndex in columns:

dfM['Answer'] = dfM['Answer'].astype(int)
df = pd.pivot_table(dfM, index = ['Task', 'Question'], columns = 'analystID', 
                     values = 'Answer')
print (df)
analystID        i    o    u
Task Question               
x    a         NaN  NaN  1.0
y    b         2.0  NaN  NaN
z    c         NaN  3.0  NaN

If first solution failed, there are some non numeric values. So need to_numeric with parameter errors='coerce' for replace non numeric to NaNs.

dfM['Answer'] = pd.to_numeric(dfM['Answer'], errors='coerce')
df = pd.pivot_table(dfM, index = ['Task', 'Question'], columns = 'analystID', 
                     values = 'Answer')
print (df)
analystID        i    o    u
Task Question               
x    a         NaN  NaN  1.0
y    b         2.0  NaN  NaN
z    c         NaN  3.0  NaN

One row solution with assign:

df = pd.pivot_table(dfM.assign(Answer=dfM['Answer'].astype(int)), 
                     index = ['Task', 'Question'], 
                     columns = 'analystID', 
                     values = 'Answer')
print (df)
analystID        i    o    u
Task Question               
x    a         NaN  NaN  1.0
y    b         2.0  NaN  NaN
z    c         NaN  3.0  NaN

EDIT:

If use parameter errors='ignore' and some non numeric value, still get error:

print (dfM)
  Task Question Answer analystID
0    x        a      r         u <-first ansswer value was changed to `r`
1    y        b      2         i
2    z        c      3         o

dfM['Answer'] = pd.to_numeric(dfM['Answer'], errors='ignore')

DataError: No numeric types to aggregate

Upvotes: 2

Scott Boston
Scott Boston

Reputation: 153460

Try to reassign your apply statement back to dfM.

dfM = dfM.apply(pd.to_numeric, errors='ignore')
dfM = pd.pivot_table(dfM, index = ['Task', 'Question'], columns = 'analystID', 
                     values = ['Answer'])

              Answer          
analystID          i    o    u
Task Question                 
x    a           NaN  NaN  1.0
y    b           2.0  NaN  NaN
z    c           NaN  3.0  NaN

Upvotes: 2

Related Questions