Clyde

Reputation: 143

Joining the output from multiple regular expressions

I have a log file from a firewall in txt form, for example:

src=10.10.10.1 srcPort=15003 dst=20.20.20.1 service=443 host=FirewalName proto=tcp
src=30.30.30.1 srcPort=18003 dst=40.40.40.1 service=8080 host=FirewalName proto=tcp

I have built regular expressions to extract the information I need (src, dst, service). For each line in the log file, I need to join their output and write it to a new file, with a tab between each value and "TCP" before each service= value, so that the new file looks like:

10.10.10.1    20.20.20.1    TCP 443
30.30.30.1    40.40.40.1    TCP 8080

I also need to distinguish between "TCP" and "UDP" for each service= value, based on the proto= field of the input file, so that what is written to the output file is correct. For example, the third line of the input file might be:

src=50.50.50.1 srcPort=21003 dst=60.60.60.1 service=161 host=FirewalName proto=udp

I'm stuck here and need help.

import re
import sys

with open("SFD-IPs.txt", "r") as file:
    text = file.read()

sources = re.findall(r'src=(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})', text)
dest = re.findall(r'dst=(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})', text)
service = re.findall(r'service=(\d+)', text)


with open("output.txt", "w") as TufinReq:
    TufinReq.write(sIP)

f=open("output.txt", "r")
del_list = ["src=", "dst=", "service="]
list = []
for line in f:
    for word in del_list:
        if word in line:
            line = line.replace(word, "")
    list.append(line)
f.close()
f=open("output.txt", "w+")
for line in list:
    f.write(line)
f.close()

Upvotes: 1

Views: 46

Answers (2)

Scott Weaver

Reputation: 7361

You can accomplish this with a single regex search and replace, using a much simpler pattern.

The regular expression:

src=([\d\.]+).*dst=([\d\.]+).*service=(\d+).*proto=(.*)

The replacement string:

$1    $2    $4 $3

demo
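
For reference, here is a minimal sketch of how the same search and replace might look in Python. Two things differ from the pattern above: Python's re module writes backreferences as \1 or \g<1> rather than $1, and the sketch passes a function as the replacement so the protocol can be upper-cased (a plain replacement string would emit tcp/udp exactly as logged). The names LINE_RE, to_row and convert are made up for illustration, and proto=(\S+) is used instead of proto=(.*) on the assumption that the protocol is a single token.

```python
import re

# Same four fields as the pattern above. "." does not match newlines by
# default, so each match stays within one log line.
LINE_RE = re.compile(r'src=([\d.]+).*dst=([\d.]+).*service=(\d+).*proto=(\S+)')


def to_row(match):
    """Build 'src<TAB>dst<TAB>PROTO service' from one matched log line."""
    src, dst, service, proto = match.groups()
    return '{}\t{}\t{} {}'.format(src, dst, proto.upper(), service)


def convert(text):
    """Replace every matching log line in text with its output row."""
    return LINE_RE.sub(to_row, text)


sample = (
    'src=10.10.10.1 srcPort=15003 dst=20.20.20.1 service=443 host=FirewalName proto=tcp\n'
    'src=50.50.50.1 srcPort=21003 dst=60.60.60.1 service=161 host=FirewalName proto=udp\n'
)
print(convert(sample))
```

Lines that do not match the pattern are left unchanged by sub, so they would pass through to the output as-is.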

Upvotes: 0

ettanany

Reputation: 19816

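Because the src= and dst= patterns contain only non-capturing groups, findall returns the whole match, prefix included. So first you need to split each item of sources and dest on '=' to retrieve the desired data, like this: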

sources = [item.split('=')[1] for item in sources]
dest = [item.split('=')[1] for item in dest]
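
As an alternative to splitting afterwards, the address could be wrapped in a single capturing group so that findall returns it without the prefix. This is only a sketch; the IP helper string is a name introduced here, not something from the question.

```python
import re

text = 'src=10.10.10.1 srcPort=15003 dst=20.20.20.1 service=443 host=FirewalName proto=tcp'

# One capturing group around the whole address; the repeated inner group is
# non-capturing, so findall returns a flat list of address strings.
IP = r'(\d{1,3}(?:\.\d{1,3}){3})'

sources = re.findall(r'src=' + IP, text)
dest = re.findall(r'dst=' + IP, text)
print(sources, dest)
```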

Now you can use the built-in zip() function as follows:

with open('output.txt', 'w') as f:
    for item in zip(sources, dest, service):
        f.write('{}\t{}\tTCP\t{}\n'.format(*item))

If you also want to include the protocol, you can do the following:

proto = re.findall(r'proto=(\w+)', text)
proto = [item.upper() for item in proto]

with open('output.txt', 'w') as f:
    for item in zip(sources, dest, proto, service):
        f.write('{}\t{}\t{}\t{}\n'.format(*item))

Output:

For the following input:

text = '''src=10.10.10.1 srcPort=15003 dst=20.20.20.1 service=443 host=FirewalName proto=tcp
src=30.30.30.1 srcPort=18003 dst=40.40.40.1 service=8080 host=FirewalName proto=tcp
src=50.50.50.1 srcPort=21003 dst=60.60.60.1 service=161 host=FirewalName proto=udp'''

the content of the output.txt file is:

10.10.10.1  20.20.20.1  TCP 443
30.30.30.1  40.40.40.1  TCP 8080
50.50.50.1  60.60.60.1  UDP 161
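
One caveat with running findall over the whole file and then zipping the lists: if any line lacks one of the fields, the lists end up with different lengths and the values from different lines get paired together. A possible way around this, sketched below under the assumption that every field has the form key=value with no spaces in the value, is to parse one line at a time. FIELD_RE, parse_line and convert_lines are hypothetical names used only for this example.

```python
import re

# Matches every key=value pair on a line, e.g. ('src', '10.10.10.1').
FIELD_RE = re.compile(r'(\w+)=(\S+)')


def parse_line(line):
    """Return 'src<TAB>dst<TAB>PROTO<TAB>service', or None if a field is missing."""
    fields = dict(FIELD_RE.findall(line))
    try:
        return '{}\t{}\t{}\t{}'.format(
            fields['src'], fields['dst'], fields['proto'].upper(), fields['service'])
    except KeyError:
        return None


def convert_lines(lines):
    """Convert an iterable of log lines, skipping lines that lack a field."""
    rows = (parse_line(line) for line in lines)
    return [row for row in rows if row is not None]


sample = [
    'src=10.10.10.1 srcPort=15003 dst=20.20.20.1 service=443 host=FirewalName proto=tcp',
    'src=30.30.30.1 srcPort=18003 service=8080 host=FirewalName proto=tcp',  # no dst=
    'src=50.50.50.1 srcPort=21003 dst=60.60.60.1 service=161 host=FirewalName proto=udp',
]
for row in convert_lines(sample):
    print(row)
```

To process a file, the open file object can be passed directly as lines, and each returned row written out followed by '\n'.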

Upvotes: 1
