Reputation: 988
I would like to remove the decimal part from a pandas column which is of string type.
Df
Net Sales
123.45
34.89
65.00
I am trying to remove the decimals from that column, so that it looks like this:
Net Sales
123
34
65
df['Net Sales'] is already a string type.
I tried as below:
df['Net Sales'] = df['Net Sales'].astype(str).replace('\.0', '', regex=True)
But I get the error below:
ValueError: invalid literal for int() with base 10: '1567.55'
Thanks.
Upvotes: 3
Views: 8706
Reputation: 11128
If the column holds floats, you can coerce the datatype to int. Just a note: if you have NaNs in your data, the conversion to int doesn't work, because NaN is a float, so the regex solution might be better.
df['Net Sales'] = df['Net Sales'].astype('int')
or in case of regex:
df['Net Sales'] = df['Net Sales'].astype('str').replace(r'\.\d+$', '', regex=True).astype('int')
Example:
import pandas as pd
df = pd.DataFrame({"Net Sales" : [1.5, 2.5]})
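# Option 1: cast the floats directly (drops the fractional part)
df['Net Sales'] = df['Net Sales'].astype('int')
# Option 2 (alternative to option 1): strip the decimals with a regex, then cast
df['Net Sales'] = df['Net Sales'].astype('str').replace(r'\.\d+$', '', regex=True).astype('int')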
Output:
# Net Sales
#0 1
#1 2
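For the NaN case mentioned above, one possible workaround is pandas' nullable integer dtype. This is only a minimal sketch, assuming that missing entries should stay missing instead of raising; the sample data and the as_float name are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical data: a string column with one missing value.
df = pd.DataFrame({"Net Sales": ["123.45", "34.89", None]})

# Parse to float (missing values become NaN), drop the fractional part,
# then cast to the nullable integer dtype "Int64" (note the capital I).
as_float = pd.to_numeric(df["Net Sales"], errors="coerce")
df["Net Sales"] = np.trunc(as_float).astype("Int64")

print(df)
```

The np.trunc step matters here: casting floats that still have a fractional part straight to "Int64" is rejected by pandas as an unsafe cast.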
Upvotes: 3
Reputation: 863741
If values are strings first convert to floats and then to integers:
df['Net Sales'] = df['Net Sales'].astype(float).astype(int)
If values are floats use:
df['Net Sales'] = df['Net Sales'].astype(int)
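Note that casting float to int truncates toward zero rather than rounding. A small sketch with made-up values (the negative number is added only to show the difference) that compares truncating with rounding first:

```python
import pandas as pd

# Hypothetical sample values, stored as strings.
s = pd.Series(["123.45", "34.89", "65.00", "-2.70"])

# float -> int drops the fractional part (truncation toward zero).
truncated = s.astype(float).astype(int)

# Rounding first gives the nearest integer instead.
rounded = s.astype(float).round().astype(int)

print(truncated.tolist())  # [123, 34, 65, -2]
print(rounded.tolist())    # [123, 35, 65, -3]
```

Which of the two is wanted depends on the data; the question asks for the decimals to be removed, which matches the truncating version.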
Your solution should be changed to use \d+ to match the digits after the dot:
df['Net Sales'] = df['Net Sales'].astype(str).replace(r'\.\d+', '', regex=True).astype(int)
print (df)
Net Sales
0 123
1 34
2 65
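To see why the pattern from the question is not enough, here is a short sketch on the sample values from the question. It assumes the column really holds strings; the variable names are only for illustration:

```python
import pandas as pd

s = pd.Series(["123.45", "34.89", "65.00"])

# The original pattern removes only a literal ".0", so values such as
# "123.45" are left untouched and "65.00" silently becomes "650".
attempt = s.replace(r"\.0", "", regex=True)
print(attempt.tolist())  # ['123.45', '34.89', '650']

# Matching the dot plus all following digits removes the whole fraction.
fixed = s.replace(r"\.\d+", "", regex=True).astype(int)
print(fixed.tolist())  # [123, 34, 65]
```

The untouched "123.45" style values are consistent with the ValueError shown in the question, which appears as soon as an int conversion is applied to them.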
Or you can use split by dot and select the first element of each list by indexing:
df['Net Sales'] = df['Net Sales'].astype(str).str.split('.').str[0].astype(int)
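A quick check of the split approach, as a sketch with made-up values that include one entry without a decimal point:

```python
import pandas as pd

# Hypothetical values; "34" has no dot at all.
s = pd.Series(["123.45", "34", "65.00"])

# Split on the literal dot and keep the part before it.
result = s.str.split(".").str[0].astype(int)

print(result.tolist())  # [123, 34, 65]
```

Values without a dot come back as a one-element list, so taking the first element still works for them.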
Upvotes: 9