Reputation: 2024
Sample data in pandas dataframe:
df['Revenue']
0 $0.00
1 $12,982,681.00
2 NaN
3 $10,150,623.00
4 NaN
...
1713 NaN
1714 NaN
1715 NaN
1716 NaN
1717 NaN
Name: Revenue, Length: 1718, dtype: object
I need to change the column from currency format to integer so that I can run computations and aggregations.
# Fix format currency
if df['Revenue'].dtype == 'object':
    df['Revenue'] = df['Revenue'].apply(lambda x: x.replace('$','')).apply(lambda x: x.replace(',','')).astype(np.int64)
When I run the above line of code to transform the datatype, I run into the following error:
3 # Fix format currency
4 if df['Revenue'].dtype == 'object':
5 df['Revenue'] = df['Revenue'].apply(lambda x: x.replace('$','')).apply(lambda x: x.replace(',','')).astype(np.int64)
AttributeError: 'float' object has no attribute 'replace'
Upvotes: 1
Views: 763
Reputation: 26676
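You can try removing everything except digits and the dot. The AttributeError arises because the NaN entries are floats, which have no replace method, so fill them (here with 0) and cast to str first. If you are reading the file in as CSV, you can also handle this at the read stage.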
df['Revenue'].fillna(0).astype(str).replace(r'[^0-9.]', '', regex=True).str.split(r'\.').str[0].astype(int)
Revenue
0 0
1 12982681
2 0
3 10150623
4 0
1713 0
1714 0
1715 0
1716 0
1717 0
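For the read-stage option, here is a minimal sketch using the converters argument of pd.read_csv. The helper name parse_currency, the column layout, and the inline CSV text are all made up for illustration, and blank cells are mapped to 0 to match the fillna(0) above; adjust that if missing revenue should stay distinguishable from $0.00.

```python
import io

import pandas as pd


def parse_currency(text):
    """Turn a string like '$12,982,681.00' into an int (hypothetical helper)."""
    # Defensive: anything that is not a string is treated as missing.
    if not isinstance(text, str):
        return 0
    cleaned = text.replace('$', '').replace(',', '').strip()
    # Blank cells (or a literal 'nan') are treated as 0, mirroring fillna(0).
    if cleaned == '' or cleaned.lower() == 'nan':
        return 0
    # Go through float so the trailing '.00' is accepted, then drop the cents.
    return int(float(cleaned))


# Made-up CSV contents standing in for the real file.
csv_data = io.StringIO(
    'Name,Revenue\n'
    'a,$0.00\n'
    'b,"$12,982,681.00"\n'
    'c,\n'
    'd,"$10,150,623.00"\n'
)

# The converter is applied to every cell of the Revenue column while parsing.
df = pd.read_csv(csv_data, converters={'Revenue': parse_currency})
print(df['Revenue'].tolist())
```

With a real file, pass its path in place of csv_data; the column then arrives already numeric, so no clean-up step is needed afterwards.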
Upvotes: 1