Reputation: 1
I am very new to both Python and Stack Overflow.
I want to extract the destination port from lines like these:
2629 > 0 [SYN] Seq=0 Win=512 Len=100
0 > 2629 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
0 > 2633 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
I want to retrieve the destination port from every line ('0', '2629', '2633') using a Python regex and ignore the rest. The destination port is the number that appears after '>' and before '['. So far I have tried:
re.findall("\d\d\d\d\d|\d\d\d\d|\d\d\d|\d\d|\d", str)
but this is far too generic: it matches every number in the line, not just the port. What is the best regex for this scenario?
Upvotes: 0
Views: 408
Reputation: 7204
You could use a regex like this:
import io
import re

dff = io.StringIO("""2629 > 0 [SYN] Seq=0 Win=512 Len=100
0 > 2629 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
0 > 2622 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
0 > 2633 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0""")
dff.seek(0)  # not strictly needed: a new StringIO starts at position 0
for line in dff:
    # Skip the source port and '>', then capture the destination port.
    print(re.search(r'^\d+\s+>\s+(\d+)', line).group(1))
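If the whole capture is already in one string, the loop can probably be dropped. The following is only a sketch of that idea, using re.findall with re.MULTILINE and a single capture group; the variable name capture is made up for illustration, and the pattern assumes every line has the "source > destination [flags]" shape shown in the question.

```python
import re

# Sample text copied from the question; `capture` is an illustrative name.
capture = """2629 > 0 [SYN] Seq=0 Win=512 Len=100
0 > 2629 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
0 > 2633 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0"""

# With exactly one capture group, findall returns only the captured text,
# i.e. the digits between '>' and '['. MULTILINE makes '^' match at each line start.
dest_ports = re.findall(r"^\d+\s*>\s*(\d+)\s*\[", capture, flags=re.MULTILINE)
print(dest_ports)  # ['0', '2629', '2633']
```

Lines that do not fit the assumed shape are simply skipped by findall rather than raising an error.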
Upvotes: 0
Reputation: 662
You could use the split function on string for this specific case. A quick implementation would be:
dest_ports = []
lines = [
    "2629 > 0 [SYN] Seq=0 Win=512 Len=100",
    "0 > 2629 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0",
    "0 > 2633 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0"
]
for line in lines:
    dest_ports.append(line.split('> ')[1].split(' [')[0])
Which would yield the answer:
dest_ports = ['0', '2629', '2633']
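The chained split raises an IndexError on any line that has no '> ' in it. A slightly more defensive take on the same idea might look like the sketch below; the helper name dest_port is hypothetical, and it assumes the port is always the first whitespace-separated field after '>'.

```python
def dest_port(line):
    """Return the destination port as an int, or None if the line does not fit."""
    # partition never raises: sep is '' when the line contains no '>'.
    _, sep, rest = line.partition(">")
    if not sep:
        return None
    fields = rest.split()
    # Assumption: the first field after '>' is the port, as in the sample lines.
    if fields and fields[0].isdigit():
        return int(fields[0])
    return None


lines = [
    "2629 > 0 [SYN] Seq=0 Win=512 Len=100",
    "0 > 2629 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0",
    "0 > 2633 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0",
]
dest_ports = [dest_port(line) for line in lines]
print(dest_ports)  # [0, 2629, 2633]
```

Returning integers rather than strings is a design choice here, not something the question asked for; drop the int() call to keep the ports as strings.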
Upvotes: 1