Reputation: 117
Below is my df.
import pandas as pd
df = pd.DataFrame({
    'IP': ['10.140.34.210;0.0.0.0', '0.0.0.0;0.0.0.0;10.0.1.87;0.0.0.0;0.0.0.0', '0.0.0.0;172.31.48.174',
           '10.140.67.244;0.0.0.0', '1.1.1.1', '3.3.3.3'],
})
print(df)
IP
0 10.140.34.210;0.0.0.0
1 0.0.0.0;0.0.0.0;10.0.1.87;0.0.0.0;0.0.0.0
2 0.0.0.0;172.31.48.174
3 10.140.67.244;0.0.0.0
4 1.1.1.1
5 3.3.3.3
I would like to keep only the real IP address in the IP column and drop every 0.0.0.0 entry. This is the expected output:
IP
0 10.140.34.210
1 10.0.1.87
2 172.31.48.174
3 10.140.67.244
4 1.1.1.1
5 3.3.3.3
I tried str.split, but it only spreads the values across columns and leaves the 0.0.0.0 entries in place:
df = df['IP'].str.split(';',expand=True)
print(df)
0 1 2 3 4
0 10.140.34.210 0.0.0.0 None None None
1 0.0.0.0 0.0.0.0 10.0.1.87 0.0.0.0 0.0.0.0
2 0.0.0.0 172.31.48.174 None None None
3 10.140.67.244 0.0.0.0 None None None
4 1.1.1.1 None None None None
5 3.3.3.3 None None None None
Any idea? Thank you!
Upvotes: 0
Views: 48
Reputation: 22493
If that's the only special case you need to get rid of, use replace with regex (a raw string keeps the \. escapes from triggering an invalid-escape warning):
print(df["IP"].replace(r";?0\.0\.0\.0;?", "", regex=True))
0 10.140.34.210
1 10.0.1.87
2 172.31.48.174
3 10.140.67.244
4 1.1.1.1
5 3.3.3.3
Name: IP, dtype: object
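One caveat with the pattern above: it matches 0.0.0.0 anywhere in the string, so an address such as 10.0.0.0 would be cut down to 1, and a 0.0.0.0 sitting between two real addresses would glue them together. If that can happen in your data, a rough sketch of a token-based alternative is shown here. The helper name drop_placeholder and the constant PLACEHOLDER are made up for illustration and are not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({
    "IP": [
        "10.140.34.210;0.0.0.0",
        "0.0.0.0;0.0.0.0;10.0.1.87;0.0.0.0;0.0.0.0",
        "0.0.0.0;172.31.48.174",
        "10.140.67.244;0.0.0.0",
        "1.1.1.1",
        "3.3.3.3",
    ],
})

PLACEHOLDER = "0.0.0.0"  # hypothetical name for the value to discard


def drop_placeholder(value: str) -> str:
    """Remove whole ';'-separated tokens equal to PLACEHOLDER."""
    # Comparing complete tokens means 10.0.0.0 is never touched.
    kept = [token for token in value.split(";") if token != PLACEHOLDER]
    # Re-join with ';' so several real addresses stay separated.
    return ";".join(kept)


cleaned = df["IP"].apply(drop_placeholder)
print(cleaned)
```

This is slower than a vectorised replace on very large frames, but it never edits the inside of a real address.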
Upvotes: 3
Reputation: 651
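If you only want the rows that contain no 0.0.0.0 at all, try this (regex=False makes the dots match literally):
df[~df.IP.str.contains("0.0.0.0", regex=False)]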
IP
4 1.1.1.1
5 3.3.3.3
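Note that str.contains builds one mask entry per row, so any row that mixes a real address with 0.0.0.0 is dropped whole. To get the output requested in the question, one option is to apply the same boolean-mask idea per address instead of per row. The following is only a sketch, assuming pandas 0.25 or later for Series.explode; the variable names tokens, real and result are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "IP": [
        "10.140.34.210;0.0.0.0",
        "0.0.0.0;0.0.0.0;10.0.1.87;0.0.0.0;0.0.0.0",
        "0.0.0.0;172.31.48.174",
        "10.140.67.244;0.0.0.0",
        "1.1.1.1",
        "3.3.3.3",
    ],
})

# One address per row; the original row label is repeated in the index.
tokens = df["IP"].str.split(";").explode()

# Boolean mask on whole tokens (exact comparison, no regex involved).
real = tokens[tokens != "0.0.0.0"]

# Re-assemble the surviving addresses for each original row.
result = real.groupby(level=0).agg(";".join)

# Rows that held nothing but 0.0.0.0 would vanish above; restore them as "".
result = result.reindex(df.index, fill_value="")

print(result)
```

Every original row is kept, and only the individual 0.0.0.0 tokens are filtered out.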
Upvotes: 0