Reputation: 4428
I have a dataset df with a column Gross.
I am completely new to Python.
I am trying to convert this column to float
and display its sum():
dollarGross = lambda x: float(x[1:-1])
df.Gross = df.Gross.apply(dollarGross)
df.Gross.sum()
But I am getting this error:
<ipython-input-294-a9010792122a> in <lambda>(x)
----> 1 dollarGross = lambda x: float(x[1:-1])
2 df.Gross = df.Gross.apply(dollarGross)
3 df.Gross.sum()
TypeError: 'int' object is not subscriptable
What am I missing?
Upvotes: 2
Views: 9026
Reputation: 294338
Your error starts here:
df.Gross.apply(dollarGross)
df.Gross is a pandas.Series, and when you use the apply method, pandas iterates through each member of the series and passes that member to the "callable" (also known as a function, more on this in a bit) named dollarGross. The critical thing to understand is what the members of the pandas.Series are. In this case, they are integers. So each integer in the series gets passed to dollarGross and gets called like this:
dollarGross(184)
This in turn looks like this:
float(184[1:-1])
Which makes no sense. You are trying to use [1:-1], which is subscripting/slicing syntax, on an integer. And that is what the error is telling you: Hey, you can't subscript an integer!
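To make the failure concrete, here is a minimal sketch in plain Python with no pandas involved. The string "$184M" is a made-up value, included only to show the kind of input the slice would work on; the question does not say what the data originally looked like.

```python
def dollar_gross(x):
    # Same logic as the lambda in the question: drop the first and
    # last character, then convert what is left to a float.
    return float(x[1:-1])

# Hypothetical string input: "$184M"[1:-1] is "184", so this works.
from_string = dollar_gross("$184M")
print(from_string)

# Integer input: integers cannot be sliced, so this raises TypeError.
try:
    dollar_gross(184)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The second call fails before float() is ever reached, because the slice is evaluated first.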
That is why it's good to tell us what you are trying to do, because now we can help you do that. Remember I said you can pass a "callable" to apply. Well, float is the name of the class of float objects... It's also a "callable" because we can do this: float(184). So....
df.Gross.apply(float)
Should get things done. However, it's still probably better to do this
df.Gross.astype(float)
Or, if some of the members of df.Gross cannot be interpreted as a float value, it's probably better to use @MaxU's answer.
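As a rough illustration of the two conversions above, the sketch below builds a small DataFrame of integers and converts the column both ways. The values are invented for the example; only the column name Gross comes from the question.

```python
import pandas as pd

# Made-up integer data standing in for the real Gross column.
df = pd.DataFrame({"Gross": [184, 250, 66]})

# Option 1: call float on every element, one at a time.
via_apply = df["Gross"].apply(float)

# Option 2: convert the whole column in one vectorized step.
via_astype = df["Gross"].astype(float)

print(via_apply.tolist())
print(via_astype.dtype)
print(via_astype.sum())
```

Both give the same numbers here; astype avoids the per-element Python call, which is why it is usually preferred for a plain dtype change.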
Upvotes: 3
Reputation: 210852
AFAIK, the pd.to_numeric() function is the most idiomatic way to convert strings to numerical values:
df['Gross'] = pd.to_numeric(df['Gross'], errors='coerce')
print(df['Gross'].sum())
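A small sketch of what errors='coerce' does when a value cannot be parsed. The sample strings, including the unparseable "n/a", are assumptions made up for this example.

```python
import pandas as pd

# Made-up column of strings; "n/a" cannot be parsed as a number.
df = pd.DataFrame({"Gross": ["184", "250", "n/a"]})

# errors="coerce" turns unparseable entries into NaN instead of raising.
df["Gross"] = pd.to_numeric(df["Gross"], errors="coerce")

missing = int(df["Gross"].isna().sum())
total = df["Gross"].sum()  # sum() skips NaN by default

print(missing)
print(total)
```

Checking the NaN count after coercion is a cheap way to see how many values were silently dropped from the total.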
Upvotes: 3
Reputation: 43
I think you should pull the column out into its own variable:
dollarGross = df['Gross']  # a new name for the Gross column (a pandas Series)
print(dollarGross.sum())
Upvotes: 0
Reputation: 356
I think you just have to write dollarGross = lambda x: float(x). Square brackets index or slice a sequence, and an integer is not a sequence.
Upvotes: 2