Reputation: 11
In a dataframe, one specific column holds sizes such as 19M and 2.8M. M means millions, and the other possible suffixes (m, K) work the same way.
I'm trying to convert these into numbers with a regex, but the converted values come out like 19000000.0. I need to get rid of the trailing .0.
Here is the code:
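import re

# amount: digits with an optional decimal part; unit: at most one trailing word character
conversion = re.compile(r'(?P<amount>\d+\.{0,1}\d*)(?P<unit>\w{0,1})')

def unita(unit):
    # Map a unit suffix to its multiplier (the comparison is case-sensitive)
    if unit == 'M':
        return 1000000
    if unit == 'k':
        return 1000
    return 1

def to_numeric(elem):
    m = conversion.search(elem)
    if m is None:
        return None
    unit = m.group('unit')
    mult = unita(unit)
    amount = float(m.group('amount'))
    return int(amount * mult)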
Upvotes: 1
Views: 74
Reputation: 630
For the columns in the dataframe that you want to convert to integers, use
df['column'] = df['column'].astype(int)
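A likely reason the .0 shows up even though to_numeric returns an int: if any cell fails to match, the function returns None, pandas stores that as NaN, and the whole column becomes float64. The sketch below uses made-up sample values (not the asker's data) to show the plain cast for a column with no missing values, and the nullable "Int64" dtype as one option when missing values are present, since astype(int) raises on NaN.

```python
import pandas as pd

# Hypothetical parsed results: the last entry stands for a cell that did not match.
parsed = pd.Series([19000000, 2800000, None])
print(parsed.dtype)  # the None is stored as NaN, so the column is float

# No missing values: the plain cast is enough.
clean = pd.Series([19000000.0, 2800000.0]).astype(int)

# Missing values present: the nullable integer dtype keeps them as <NA>.
nullable = parsed.astype("Int64")
print(nullable)
```

Whether to keep the missing entries as <NA> or fill them first (for example with fillna(0)) depends on what an unparsed size should mean in your data.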
Upvotes: 1