acolls_badger
acolls_badger

Reputation: 463

Pandas to numeric returning NaN

I have a pandas dataframe as per example:

mydf.head()

    Date        Merchant/Description Debit/Credit
0   10/05/2018  FAKE TRANSACTION 1  -£7.50
1   09/05/2018  FAKE TRANSACTION 2  -£5.79
2   09/05/2018  FAKE TRANSACTION 3  -£28.50
3   08/05/2018  FAKE TRANSACTION 4  -£3.99
4   08/05/2018  FAKE TRANSACTION 5  -£17.99

The column ['Debit/Credit'] has data type 'object'; it is a mixture of strings and NaN.

I wish to convert the strings to numbers. I use pandas.to_numeric to try to achieve this:

    cols = ['Debit/Credit']
    hsbcraw[cols] = hsbcraw[cols].apply(pd.to_numeric, errors='coerce')

This is turning all items in the column ['Debit/Credit'] to NaN:

mydf.head()

    Date        Merchant/Description Debit/Credit
0   10/05/2018  FAKE TRANSACTION 1   NaN
1   09/05/2018  FAKE TRANSACTION 2   NaN
2   09/05/2018  FAKE TRANSACTION 3   NaN
3   08/05/2018  FAKE TRANSACTION 4   NaN
4   08/05/2018  FAKE TRANSACTION 5   NaN

What is the error in my code or my approach?

Upvotes: 0

Views: 2002

Answers (3)

Ankur Sinha
Ankur Sinha

Reputation: 6639

I usually do it like by converting to floats like this:

df['Debit/Credit'] = df['Debit/Credit'].replace('£', '', regex = True).astype('float')

Upvotes: 0

Rakesh
Rakesh

Reputation: 82755

You can also use regex.

Ex:

import pandas as pd
df = pd.DataFrame({"Debit/Credit": ["-£7.50", "-£5.79", "-£28.50", "-£3.99", "-£17.99"]})
df["Debit/Credit"] = df["Debit/Credit"].str.extract("(\d*\.\d+)", expand=True).apply(pd.to_numeric)
print(df)

Output:

   Debit/Credit
0          7.50
1          5.79
2         28.50
3          3.99
4         17.99

Upvotes: 0

jezrael
jezrael

Reputation: 862511

Need replace £ by empty string before converting to numeric:

hsbcraw[cols]=hsbcraw[cols].replace('£','', regex=True).apply(pd.to_numeric, errors='coerce')

Upvotes: 3

Related Questions