Reputation: 57
I'm having trouble converting this column to numeric. I tried this:
pricesquare['PRICE_SQUARE'] = pd.to_numeric(pricesquare['PRICE_SQUARE'])
ValueError: Unable to parse string "13 312 " at position 0
df["PRICE_SQUARE"] = df["PRICE_SQUARE"].astype(str).astype(int)
ValueError: invalid literal for int() with base 10: '13\xa0312'
Upvotes: 0
Views: 149
Reputation: 19250
You can replace the \xa0 Unicode character (a non-breaking space) with an empty string before converting to int.
import pandas as pd
data = ["13\xa0312", "14\xa01234"]
pd.Series(data).str.replace("\xa0", "").astype(int)
0 13312
1 141234
dtype: int64
You can also use unicodedata.normalize with the NFKC form, which maps the non-breaking space to a regular space; then remove the space and convert to int.
import unicodedata
import pandas as pd
data = ["13\xa0312", "14\xa01234"]
pd.Series(data).apply(lambda s: unicodedata.normalize("NFKC", s)).str.replace(" ", "").astype(int)
0 13312
1 141234
dtype: int64
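If the column may contain other kinds of whitespace as well, a regex-based variant might be more robust. The following is only a sketch: the DataFrame contents are made-up sample data that reuses the PRICE_SQUARE column name from the question, and it assumes every value is a whole number once the whitespace is removed.

```python
import pandas as pd

# Hypothetical sample data standing in for the real column.
df = pd.DataFrame({"PRICE_SQUARE": ["13\xa0312", "14\xa01234"]})

# \s matches Unicode whitespace, including the non-breaking space \xa0.
cleaned = df["PRICE_SQUARE"].astype(str).str.replace(r"\s+", "", regex=True)

# errors="coerce" turns anything still unparsable into NaN instead of raising.
df["PRICE_SQUARE"] = pd.to_numeric(cleaned, errors="coerce")
```

With errors="coerce", rows that still fail to parse become NaN, so it is worth checking df["PRICE_SQUARE"].isna().sum() afterwards.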
Upvotes: 1
Reputation: 23815
Use apply (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html?highlight=apply#pandas.DataFrame.apply) and, in the function you pass to it, remove the space before converting to int.
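A minimal sketch of that idea, assuming the values are strings like the ones in the question. The helper name to_int and the sample data are hypothetical, and apply is called on the column (a Series) rather than on the whole DataFrame:

```python
import pandas as pd

def to_int(value):
    # str.split() with no arguments splits on any whitespace,
    # including the non-breaking space \xa0, so joining the
    # pieces removes all of it.
    return int("".join(str(value).split()))

# Hypothetical sample data standing in for the real column.
df = pd.DataFrame({"PRICE_SQUARE": ["13\xa0312", "14\xa01234"]})
df["PRICE_SQUARE"] = df["PRICE_SQUARE"].apply(to_int)
```

This runs a Python function per row, so on large columns the vectorized .str.replace approach is likely faster.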
Upvotes: 0