kaminsknator

Reputation: 1183

convert pandas dataframe column from hex string to int

I have a very large dataframe and want to convert an entire column from hex strings to ints without iterating over every row. astype fails on the column, although int() has no problem with a single entry. Is there a way to tell astype that the values are base 16?

IN:
import pandas as pd
df = pd.DataFrame(['1C8','0C3'], columns=['Command0'])
df['Command0'].astype(int)
OUT:
ValueError: invalid literal for int() with base 10: '1C8'

This works, but I want to avoid row iteration:

for index, row in df.iterrows():
    # Parse each hex string individually with an explicit base of 16.
    print(int(row['Command0'], 16))

I'm reading this in from a CSV with pd.read_csv(open_csv, nrows=20), so if there is a way to tell it the format explicitly at read time, that would be even better!

Upvotes: 22

Views: 37632

Answers (3)

mirekphd

Reputation: 6831

The reverse operation (lossless float-to-hex conversion) would be the following. Note that float.hex expects float values, so the column must have a float dtype:

df['Command0'].apply(float.hex)
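
If the column holds integers rather than floats (as it would after the hex-to-int conversion in the question), a minimal sketch of the reverse direction could use format instead. The sample values and the 3-digit zero padding are assumptions chosen to match the question's data:

```python
import pandas as pd

# Hypothetical integer column, as produced by the hex-to-int conversion.
ints = pd.Series([456, 195], name="Command0")

# Format each integer as an uppercase hex string, zero-padded to 3 digits.
hex_strings = ints.apply(lambda n: format(n, "03X"))

print(hex_strings.tolist())
```

Adjust the width in the format spec ("03X") to whatever your original strings used.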

Upvotes: 2

jpp

Reputation: 164773

You can use apply as per @Andrew's solution, but the lambda isn't necessary and adds overhead. Instead, pass base as a keyword argument to apply:

res = df['Command0'].apply(int, base=16)

print(res)

0    456
1    195
Name: Command0, dtype: int64

With pd.read_csv, you can use functools.partial:

from functools import partial

df = pd.read_csv(open_csv, nrows=20, converters={'Command0': partial(int, base=16)})
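
To try the converters approach without a file on disk, here is a small self-contained sketch. The in-memory CSV text is an assumption standing in for open_csv:

```python
import io
from functools import partial

import pandas as pd

# Hypothetical in-memory CSV standing in for the real file (open_csv).
csv_text = "Command0\n1C8\n0C3\n"

# Each raw cell string in Command0 is passed to int(..., base=16) while parsing.
df = pd.read_csv(
    io.StringIO(csv_text),
    converters={"Command0": partial(int, base=16)},
)

print(df["Command0"].tolist())
```

The column arrives already parsed as integers, so no second pass over the dataframe is needed.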

Upvotes: 20

andrew

Reputation: 4089

You could use apply.

df.Command0.apply(lambda x: int(x, 16))
>>>
0    456
1    195
Name: Command0, dtype: int64

You can do the same in the pd.read_csv call using the converters parameter:

df = pd.read_csv("path.txt", converters={"Command0": lambda x: int(x, 16)})
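
One thing to watch for: a converter receives the raw cell text, so an empty cell reaches it as an empty string, and int('', 16) raises ValueError. A rough sketch of one way to cope, assuming blank cells should become missing values; the helper name hex_or_na, the Label column, and the sample data are all made up for illustration:

```python
import io

import pandas as pd


def hex_or_na(text):
    """Hypothetical helper: parse a hex string, mapping blank cells to pd.NA."""
    text = text.strip()
    return int(text, 16) if text else pd.NA


# Hypothetical CSV with one blank Command0 cell in the second data row.
csv_text = "Command0,Label\n1C8,a\n,b\n"

df = pd.read_csv(io.StringIO(csv_text), converters={"Command0": hex_or_na})

# Use the nullable integer dtype so the missing value does not force floats.
df["Command0"] = df["Command0"].astype("Int64")

print(df)
```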

Upvotes: 15
