Lilly

Reputation: 988

Remove decimals from pandas column (string type)

I would like to remove the decimal part from a pandas column that is of string type.

Df

Net Sales
123.45
34.89
65.00

I am trying to remove the decimals from that column, so that it becomes:

Net Sales
123
34
65

df['Net Sales'] is already a string type.

I tried as below:

df['Net Sales'] = df['Net Sales'].astype(str).replace('\.0', '', regex=True)

But I get the error below:

ValueError: invalid literal for int() with base 10: '1567.55'

Thanks.

Upvotes: 3

Views: 8706

Answers (2)

PKumar

Reputation: 11128

You can coerce the datatype to int. Note that if your data contains NaNs, the conversion to int fails, because NaN is a float value, so the regex solution might be better.

df['Net Sales'] = df['Net Sales'].astype('int') 

or in case of regex:

df['Net Sales'] = df['Net Sales'].astype('str').replace(r'\.\d+$', '', regex=True).astype('int')

Example:

import pandas as pd

df = pd.DataFrame({"Net Sales" : [1.5, 2.5]})

df['Net Sales'] = df['Net Sales'].astype('int')

df['Net Sales'] = df['Net Sales'].astype('str').replace(r'\.\d+$', '', regex=True).astype('int')

Output:

#   Net Sales
#0  1
#1  2
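
For the NaN case mentioned above, one possible variant is sketched below. It is not part of this answer: it assumes string input with a missing entry (the sample data is hypothetical), parses with `pd.to_numeric`, truncates with `np.trunc`, and casts to pandas' nullable `Int64` dtype so the missing value survives as `<NA>`.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: string values with one missing entry
df = pd.DataFrame({"Net Sales": ["123.45", "34.89", None]})

# Parse strings to floats; missing or unparseable values become NaN
as_float = pd.to_numeric(df["Net Sales"], errors="coerce")

# Drop the fractional part, then cast to the nullable integer dtype,
# which can hold <NA> alongside integers
df["Net Sales"] = np.trunc(as_float).astype("Int64")

print(df)
```

Truncating first matters here, because casting floats with a non-zero fractional part straight to `Int64` raises an error instead of silently dropping the decimals.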

Upvotes: 3

jezrael

Reputation: 863741

If the values are strings, first convert them to floats and then to integers:

df['Net Sales'] = df['Net Sales'].astype(float).astype(int)

If the values are floats, use:

df['Net Sales'] = df['Net Sales'].astype(int)

Your solution should be changed to use \d+ to match all the digits after the dot:

df['Net Sales'] = df['Net Sales'].astype(str).replace(r'\.\d+', '', regex=True).astype(int)
print (df)
   Net Sales
0        123
1         34
2         65
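
To see why the pattern from the question is not enough, here is a small sketch comparing the two patterns. The two sample values are taken from the question (one from the error message, one from the sample column); the variable names are only illustrative.

```python
import pandas as pd

s = pd.Series(["1567.55", "65.00"])

# Pattern from the question: only a literal ".0" is removed.
# "1567.55" contains no ".0", so it is left unchanged and a later
# conversion to int fails on it; "65.00" loses only ".0" and becomes "650".
question_pattern = s.replace(r"\.0", "", regex=True)

# Pattern with \d+: the dot and every digit after it are removed.
answer_pattern = s.replace(r"\.\d+", "", regex=True)

print(question_pattern.tolist())  # ['1567.55', '650']
print(answer_pattern.tolist())    # ['1567', '65']
```

Note that the original pattern does not only fail on values like '1567.55'; it also silently turns '65.00' into '650'.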

Or you can split by the dot and select the first element of each list by indexing:

df['Net Sales'] = df['Net Sales'].astype(str).str.split('.').str[0].astype(int)
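
As a quick check, the three approaches can be run side by side on the sample data from the question. This is only a sketch; the names `via_float`, `via_regex` and `via_split` are made up for illustration.

```python
import pandas as pd

# Sample data from the question, stored as strings
df = pd.DataFrame({"Net Sales": ["123.45", "34.89", "65.00"]})

# 1. Parse to float, then truncate to int
via_float = df["Net Sales"].astype(float).astype(int)

# 2. Strip the dot and the digits after it, then convert
via_regex = df["Net Sales"].replace(r"\.\d+", "", regex=True).astype(int)

# 3. Keep only the text before the first dot, then convert
via_split = df["Net Sales"].str.split(".").str[0].astype(int)

print(via_float.tolist())  # [123, 34, 65]
print(via_regex.tolist())  # [123, 34, 65]
print(via_split.tolist())  # [123, 34, 65]
```

All three truncate rather than round. If the column should stay a string, as in the question, the final `.astype(int)` in the regex and split variants can simply be dropped.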

Upvotes: 9
