Reputation: 137
So, I have this DataFrame based on an imported .csv file which sort of looks like this:
import numpy as np
index ip port ...
0 192.168.0.1 0
1 192.168.0.1 0xcc09
...
n-1 192.168.0.1 0x1bb
n 192.168.0.1 443
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 ip x non-null object
1 port y non-null object
...
So it basically has a lot of rows and columns, with one column being a port number. Unfortunately, as you can see in rows 1 and n-1, the base .csv data set has some of these ports written in hex rather than decimal, so pandas throws an error if I try df = df.astype({"port": np.int64,}).
My question: How can I locate all the rows of the DataFrame which are written in hex and convert them in-place to their int(x, 16) value, to finally be able to convert the whole column to np.int64? The final DataFrame should look like this:
import numpy as np
index ip port ...
0 192.168.0.1 0
1 192.168.0.1 52233
...
n-1 192.168.0.1 443
n 192.168.0.1 443
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 ip x non-null object
1 port y non-null int64
...
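For reference, the failure described above can be reproduced with a small stand-in frame. This is only a sketch: the four sample rows are made up to mirror the table, and the real data would come from the .csv file.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring the table above (object dtype, as in df.info()).
df = pd.DataFrame(
    {
        "ip": ["192.168.0.1"] * 4,
        "port": ["0", "0xcc09", "0x1bb", "443"],
    },
    dtype=object,
)

# Casting directly fails because "0xcc09" is not a base-10 literal.
cast_failed = False
try:
    df.astype({"port": np.int64})
except ValueError as exc:
    cast_failed = True
    print(f"conversion failed: {exc}")
```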
Upvotes: 1
Views: 182
Reputation: 863246
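Use a custom lambda function that checks whether each value contains the character x and parses it as hex if it does (other values are returned unchanged):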
df['port'] = df['port'].apply(lambda x: int(x, 16) if 'x' in x else x)
print (df)
index ip port
0 0 192.168.0.1 0
1 1 192.168.0.1 52233
2 n-1 192.168.0.1 443
3 n 192.168.0.1 443
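If you also want to locate the hex rows explicitly and end up with an int64 column, one possible variant is sketched below. It assumes every port is a string and that hex values always carry a 0x (or 0X) prefix. The sample frame and the name is_hex are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the imported .csv file.
df = pd.DataFrame(
    {
        "ip": ["192.168.0.1"] * 4,
        "port": ["0", "0xcc09", "0x1bb", "443"],
    },
    dtype=object,
)

# Boolean mask that locates the hex rows (assumes a "0x" prefix marks hex).
is_hex = df["port"].str.lower().str.startswith("0x")

# Parse hex rows as base 16 and all remaining rows as base 10.
df["port"] = [
    int(value, 16) if hex_flag else int(value)
    for value, hex_flag in zip(df["port"], is_hex)
]

# Every value is now a Python int, so the cast to int64 succeeds.
df["port"] = df["port"].astype(np.int64)
print(df)
```

The mask is_hex can also be used on its own to inspect the affected rows before converting anything.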
Upvotes: 2