Reputation: 11
In a pandas DataFrame, I am trying to convert a numerical value into a categorical value:
df['SalesPrice_band']=0
df.loc[df['SalePrice']<50000 , 'SalesPrice_band']=1
df.loc[df['SalePrice']>=50000 & df['SalePrice']<100000 , 'SalesPrice_band'] = 2
df.loc[df['SalePrice']>=100000 & df['SalePrice']<125000 , 'SalesPrice_band'] = 3
df.loc[df['SalePrice']>=125000 & df['SalePrice']<150000 , 'SalesPrice_band'] = 4
df.loc[df['SalePrice']>=150000 & df['SalePrice']<175000 , 'SalesPrice_band'] = 5
But the code above raises an error: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
So I read the error message and checked each part separately:
df.loc[df['SalePrice']<50000 , 'SalesPrice_band']=1
The line above works fine.
df['SalePrice']>=50000 & df['SalePrice']<100000
But here, where I combine two boolean conditions, I get the error.
So I tried this:
(df['SalePrice']>=50000 & df['SalePrice']<100000).all()
But that doesn't work either; I still get the same error: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
How can I handle it?
Upvotes: 1
Views: 158
Reputation: 10624
In pandas, you must wrap each comparison in parentheses before combining them with &, like below:
df.loc[(df['SalePrice']>=50000) & (df['SalePrice']<100000) , 'SalesPrice_band'] = 2
instead of this:
df.loc[df['SalePrice']>=50000 & df['SalePrice']<100000 , 'SalesPrice_band'] = 2
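The latter leads to the error you reported. In Python, & binds more tightly than >= and <, so the condition is parsed as the chained comparison df['SalePrice'] >= (50000 & df['SalePrice']) < 100000, and the implicit "and" in a chained comparison asks for the truth value of a whole Series.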
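As a minimal sketch of the fix applied to every band, assuming a small made-up DataFrame (the sample prices are hypothetical; only the column names come from the question):

```python
import pandas as pd

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({"SalePrice": [40000, 75000, 110000, 130000, 160000, 200000]})

# Default band for rows that match none of the ranges below.
df["SalesPrice_band"] = 0

# Each comparison is wrapped in parentheses before being combined with &.
df.loc[df["SalePrice"] < 50000, "SalesPrice_band"] = 1
df.loc[(df["SalePrice"] >= 50000) & (df["SalePrice"] < 100000), "SalesPrice_band"] = 2
df.loc[(df["SalePrice"] >= 100000) & (df["SalePrice"] < 125000), "SalesPrice_band"] = 3
df.loc[(df["SalePrice"] >= 125000) & (df["SalePrice"] < 150000), "SalesPrice_band"] = 4
df.loc[(df["SalePrice"] >= 150000) & (df["SalePrice"] < 175000), "SalesPrice_band"] = 5

bands = df["SalesPrice_band"].tolist()

# Without parentheses the expression itself raises, before any result exists.
try:
    df["SalePrice"] >= 50000 & df["SalePrice"] < 100000
    raised = False
except ValueError:
    raised = True

print(bands)
print(raised)
```

This also suggests why appending .all() did not help in the question: the ValueError is raised while the unparenthesized expression is being evaluated, so .all() is never reached.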
Upvotes: 1