Reputation: 1183
I have a very large dataframe and want to convert an entire column from hex strings to int without iterating over every row. astype doesn't parse the strings correctly, although converting a single entry works fine. Is there a way to tell astype that the data is base 16?
IN:
import pandas as pd
df = pd.DataFrame(['1C8','0C3'], columns=['Command0'])
df['Command0'].astype(int)
OUT:
ValueError: invalid literal for int() with base 10: '1C8'
This works, but I want to avoid row iteration:
for index, row in df.iterrows():
    print(int(row['Command0'], 16))
I'm reading this in from a CSV with pd.read_csv(open_csv, nrows=20), so if there is a way to explicitly specify the format while reading, that would be even better!
Upvotes: 22
Views: 37632
Reputation: 6831
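The reverse operation (a lossless float-to-hex conversion) would be the following. Note that float.hex only accepts float values, so an integer column needs astype(float) first: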
df['Command0'].apply(float.hex)
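If the column holds integers rather than floats, a minimal sketch of the reverse direction could use the built-in format instead. The sample values and the zero-padded width of 3 are assumptions chosen to match the strings in the question, not something the answer above specifies.

```python
import pandas as pd

# Assumed sample data: the integers that '1C8' and '0C3' parse to.
df = pd.DataFrame({'Command0': [456, 195]})

# format(n, '03X') renders an int as uppercase hex, zero-padded to 3 digits
# (the width is an assumption taken from the question's sample strings).
hex_upper = df['Command0'].apply(lambda n: format(n, '03X'))

# Parsing the strings back with base 16 should recover the original integers.
round_trip = hex_upper.apply(int, base=16)

print(hex_upper.tolist())
print(round_trip.tolist())
```

Drop the width (use 'X') if the original strings were not zero-padded.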
Upvotes: 2
Reputation: 164773
You can use apply as per @Andrew's solution, but lambda isn't necessary and adds overhead. Instead, use apply with a keyword argument:
res = df['Command0'].apply(int, base=16)
print(res)
0    456
1    195
Name: Command0, dtype: int64
With pd.read_csv, you can use functools.partial:
from functools import partial
df = pd.read_csv(open_csv, nrows=20, converters={'Command0': partial(int, base=16)})
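To try the converters approach without a file on disk, here is a small self-contained sketch. The in-memory CSV text built with io.StringIO is an assumption standing in for open_csv; only the column name and the two hex values come from the question.

```python
import io
from functools import partial

import pandas as pd

# Assumed stand-in for the real CSV file: one column of hex strings.
csv_text = "Command0\n1C8\n0C3\n"

# Each cell of 'Command0' is passed to int(value, base=16) while parsing.
df = pd.read_csv(
    io.StringIO(csv_text),
    nrows=20,
    converters={'Command0': partial(int, base=16)},
)

print(df['Command0'].tolist())
```

Because the conversion happens during parsing, the column already holds integers and no separate apply pass is needed afterwards.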
Upvotes: 20
Reputation: 4089
You could use apply.
df.Command0.apply(lambda x: int(x, 16))
>>>
0    456
1    195
Name: Command0, dtype: int64
And you can do this in the pd.read_csv call using the converters parameter:
df = pd.read_csv("path.txt", converters={"Command0": lambda x: int(x, 16)})
Upvotes: 15