CptSnuggles

Reputation: 137

Convert single values of object column from hex to int with pandas

So, I have this DataFrame based on an imported .csv file which sort of looks like this:

import numpy as np

index              ip     port    ...
    0     192.168.0.1        0
    1     192.168.0.1   0xcc09
                  ...
    n-1   192.168.0.1    0x1bb
    n     192.168.0.1      443

 #   Column            Non-Null Count   Dtype
---  ------            --------------   -----
 0   ip                x     non-null  object
 1   port              y     non-null  object
...

It has a lot of rows and columns, one of which is a port number. Unfortunately, as you can see in rows 1 and n-1, the source .csv data set has some of these ports written in hex rather than decimal, so pandas throws an error when I try df = df.astype({"port": np.int64}).

My question: How can I locate all the values in the port column that are written in hex and convert them in place to their int(x, 16) value, so that I can finally convert the whole column to np.int64? The final DataFrame should look like this:

import numpy as np

index              ip     port    ...
    0     192.168.0.1        0
    1     192.168.0.1    52233
                  ...
    n-1   192.168.0.1      443
    n     192.168.0.1      443

 #   Column            Non-Null Count   Dtype
---  ------            --------------   -----
 0   ip                x     non-null  object
 1   port              y     non-null  int64
...

Upvotes: 1

Views: 182

Answers (1)

jezrael

Reputation: 863246

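Use a custom lambda function that tests whether 'x' occurs in the value and, if so, parses it as hex. Decimal values are left as they are, so the column can afterwards be converted with astype: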

df['port'] = df['port'].apply(lambda x: int(x, 16) if 'x' in x else x)
print (df)
  index           ip   port
0     0  192.168.0.1      0
1     1  192.168.0.1  52233
2   n-1  192.168.0.1    443
3     n  192.168.0.1    443
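
For completeness, here is a minimal sketch of the whole round trip, including the final cast to np.int64 that the question asks for. The sample data and the helper name parse_port are assumptions made up for illustration, not part of the original post. It relies on int(text, 0), which lets Python infer the base from the prefix instead of testing for 'x' by hand. One caveat: with base 0, a decimal string with leading zeros (for example "0080") raises ValueError, so this only fits if the data contains no such values.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the layout shown in the question.
df = pd.DataFrame({
    "ip": ["192.168.0.1"] * 4,
    "port": ["0", "0xcc09", "0x1bb", "443"],
})

def parse_port(value):
    """Parse a decimal or 0x-prefixed hex string into a Python int."""
    text = str(value).strip()
    # Base 0 makes int() infer the base from the prefix:
    # "0x..." is read as hex, a plain digit string as decimal.
    return int(text, 0)

# Convert every value, then cast the whole column to int64.
df["port"] = df["port"].apply(parse_port).astype(np.int64)
ports = df["port"].tolist()
print(df)
```

If only the hex rows need to be located first, df["port"].str.startswith("0x") gives a boolean mask for them, assuming all values are still strings at that point.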

Upvotes: 2
