Aarushi Goyal

Reputation: 55

not able to change object to float in pandas dataframe

I just started learning Python. I am trying to change a column's data type from object to float so that I can compute the mean. I have tried changing [] to () and even changing the quotation marks, but I don't know whether that makes a difference. Please help me figure out what the issue is. Thanks!

My code:

df["normalized-losses"]=df["normalized-losses"].astype(float)

The error I see is attached as an image.

Upvotes: 1

Views: 3739

Answers (4)

Rik

Reputation: 477

Use:

df['normalized-losses'] = df['normalized-losses'][~(df['normalized-losses'] == '?' )].astype(float)

Writing df.normalized-losses makes the interpreter evaluate df.normalized, which doesn't exist: the statement is parsed as (df.normalized) - (losses.astype(float)). There also appears to be a question mark in your data, which can't be converted to float. The statement above converts only the rows that don't contain a question mark; because the assignment aligns on the index, the remaining rows end up as NaN. If you would rather fill those rows with 0, you can use:

df['normalized-losses'] = df['normalized-losses'].replace('?', 0.0)
df['normalized-losses'] = df['normalized-losses'].astype(float)
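Since the goal in the question is the mean, filling with 0 would pull the average down. As an alternative sketch (not what this answer proposes), pd.to_numeric with errors="coerce" turns anything unparseable into NaN, and mean() skips NaN by default. The sample values below are made up for illustration; the real column is assumed to hold strings, with '?' marking missing entries.

```python
import pandas as pd

# Hypothetical sample data standing in for the real column.
df = pd.DataFrame({"normalized-losses": ["164", "?", "158", "?"]})

# Convert to numbers; values that cannot be parsed (here '?') become NaN.
df["normalized-losses"] = pd.to_numeric(df["normalized-losses"], errors="coerce")

# mean() ignores NaN by default, so only the valid rows contribute.
mean_loss = df["normalized-losses"].mean()

print(df["normalized-losses"].dtype)
print(mean_loss)
```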

Upvotes: 3

Josh Friedlander

Reputation: 11657

Welcome to Stack Overflow, and good luck on your Python journey! An important part of coding is learning how to interpret error messages. In this case, the traceback is quite helpful: it is telling you that you cannot access normalized on df, since a dataframe does not have an attribute of this name.

Of course you weren't trying to call something called normalized, but rather the normalized-losses column. The way to do this is as you already did once - df["normalized-losses"].

As to your main problem: if even one of your values can't be converted to a float, the column-wide operation will fail. This is very common. You first need to eliminate all of the non-numerical items in the column. One way to find them is with df[~df['normalized-losses'].str.isnumeric()].
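One caveat with str.isnumeric() is that it also reports strings such as '158.5' or '-3' as non-numeric, because it only accepts digit characters. A rough alternative for locating the offending values, assuming the column holds strings, is to let pd.to_numeric attempt the conversion and see what fails. The data below is invented for the example.

```python
import pandas as pd

# Hypothetical sample data; includes a decimal and a negative number on purpose.
df = pd.DataFrame({"normalized-losses": ["164", "?", "158.5", "-3"]})

col = df["normalized-losses"]

# Try to convert every value; anything that fails becomes NaN.
converted = pd.to_numeric(col, errors="coerce")

# Rows that were not missing originally but failed to convert.
bad_rows = df[converted.isna() & col.notna()]
bad_values = bad_rows["normalized-losses"].unique().tolist()

print(bad_values)
```

Once the problem values are known, they can be replaced or dropped before calling astype(float).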

Upvotes: 2

Harikrishna

Reputation: 1140

The expression "df.normalized-losses" does not mean what you intend in Python here. You can replace it with df["normalized-losses"]. Usually, if you try

df["normalized-losses"]=df["normalized-losses"].astype(float)

This should work. It takes the normalized-losses column from the dataframe, converts it to float, and assigns it back to the normalized-losses column in the same dataframe. Sometimes, though, the data needs some cleaning before the statement above will succeed.

Upvotes: 1

Daniel Roseman

Reputation: 599610

You can't use - in an attribute or variable name. Perhaps you mean normalized_losses?
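If attribute-style access is preferred, one option is to rename the columns so they are valid Python identifiers. This is only a sketch with made-up data; whether renaming is appropriate depends on the rest of your code.

```python
import pandas as pd

# Hypothetical sample data with a hyphenated column name.
df = pd.DataFrame({"normalized-losses": [164.0, 158.0]})

# Replace hyphens with underscores in every column name.
df.columns = df.columns.str.replace("-", "_", regex=False)

# Attribute access now works because the name is a valid identifier.
total = df.normalized_losses.sum()

print(list(df.columns), total)
```

Bracket access, df["normalized-losses"], works without any renaming and is usually the safer choice.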

Upvotes: 0
