Stanislav Aleksiev

Reputation: 13

Regex Patterns with Python

I'm trying to write a script that finds matching IP addresses in two text files.

I want to use regex, but I'm not sure how to do it.

Example:

import re


def check(fname1, fname2):
    f2 = open(fname2)
    f1 = open(fname1)
    pattern = ('\d{1,3}\.\d{1,3}\.\d{1,3}')

    for line in f1:
        p1 = re.match(pattern, line)
        out_p1 = p1.group(0)
        for item in f2:
            p2 = re.match(pattern, item)
            out_p2 = p2.group(0)
            if out_p1 in out_p2:
                print(line, item)

So I'm trying to match each IP address from the first text file against the subnets in the second text file. Then I want to output the IP address with its matching subnet.

Like so:

#IP      #Subnet
1.1.1.1, 1.1.1.0/28
8.8.10.5, 8.8.8.0/23

Upvotes: 1

Views: 90

Answers (2)

Shawn Mehan

Reputation: 4568

Leaving aside reading the lines from your two text files into memory (for example, f1 = open(fname1).readlines()), assume you have two lists of lines.

import re

f1 = ['1.1.1.1', '192.168.1.1', '192.35.192.1', 'some other line not desired']


f2 = ['1.1.1.0/28', '1.2.2.0/28', '192.168.1.1/8', 'some other line not desired']



def get_ips(text):
    # this matches any string that starts with three dot-separated octets
    pattern = re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}')
    out = []
    for row in text:
        match = pattern.match(row)
        if match:
            out.append(match.group(0))
    return out


def is_match(ip, s):
    # return True if ip occurs anywhere in string s (a plain substring test)
    return ip in s


def check(first, second):
    # First iterate over each IP found in the first file
    for ip in get_ips(first):
        # now check that against each subnet line in the second file
        for subnet in second:
            if is_match(ip, subnet):
                print(f'IP: {ip} matches subnet: {subnet}')

Note that I have split the functionality into separate functions so that you can modify each one independently. This assumes you have already read your lines into lists of strings. I'm also not certain what you really want to match in the second file, so this layout lets you change is_match() while leaving the other parts untouched.
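If what you want from is_match() is real subnet membership rather than a substring test, one option is the standard-library ipaddress module instead of a regex. This is only a minimal sketch and not part of the code above: it assumes each subnet line is written in CIDR notation, and the helper names is_in_subnet and find_subnets are made up for illustration.

```python
import ipaddress


def is_in_subnet(ip, subnet):
    """Return True if ip falls inside subnet; False for unparsable input."""
    try:
        address = ipaddress.ip_address(ip.strip())
        # strict=False accepts entries with host bits set, e.g. '192.168.1.1/8'
        network = ipaddress.ip_network(subnet.strip(), strict=False)
    except ValueError:
        # Lines that are not an IP address or a CIDR subnet are skipped
        return False
    return address in network


def find_subnets(ips, subnets):
    """Return (ip, subnet) pairs for every ip contained in a subnet."""
    return [(ip.strip(), subnet.strip())
            for ip in ips
            for subnet in subnets
            if is_in_subnet(ip, subnet)]


ips = ['1.1.1.1', '192.168.1.1', '192.35.192.1', 'some other line not desired']
subnets = ['1.1.1.0/28', '1.2.2.0/28', '192.168.1.1/8', 'some other line not desired']

for ip, subnet in find_subnets(ips, subnets):
    print(f'{ip}, {subnet}')
```

Under this test 1.1.1.1 is inside 1.1.1.0/28, but 8.8.10.5 is not inside 8.8.8.0/23, which only covers 8.8.8.0 through 8.8.9.255.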

Good luck.

Upvotes: 0

jmcgriz

Reputation: 3358

Running that nested loop does a lot of unnecessary processing. It makes more sense to collect all of the matches from the first file into a list, then check the matches from the second file against that list. Here is an approximation of the process using two local lists:

import re

input1 = ['1.1.1.1', '233.123.4.125']
input2 = ['1.1.1.1/123', '123.55.2.235/236']
pattern = r'^(\d{1,3}\.?){4}'
matchlist = []


for line in input1:
  p1 = re.match(pattern, line)
  matchlist.append(p1.group(0))

print(matchlist)

for item in input2:
  p2 = re.match(pattern, item)
  t = p2.group(0)
  if t in matchlist:
    print(t)
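
Because t in matchlist scans the whole list on every check, a set gives the same result with constant-time lookups on average. The following is a minimal sketch along the same lines, using the same sample data. The names extract_ip and common_ips are hypothetical, the pattern is a slightly stricter variant that requires the dots between groups, and lines that do not match are skipped instead of raising AttributeError on a None match:

```python
import re

# Four dot-separated groups of 1-3 digits at the start of the line
IP_PATTERN = re.compile(r'^\d{1,3}(?:\.\d{1,3}){3}')


def extract_ip(line):
    """Return the leading IP-like string in line, or None if there is none."""
    match = IP_PATTERN.match(line)
    return match.group(0) if match else None


def common_ips(lines1, lines2):
    """Return leading IPs from lines2 that also appear in lines1, in order."""
    seen = {ip for ip in map(extract_ip, lines1) if ip is not None}
    return [ip for ip in map(extract_ip, lines2) if ip in seen]


input1 = ['1.1.1.1', '233.123.4.125']
input2 = ['1.1.1.1/123', '123.55.2.235/236']
print(common_ips(input1, input2))
```

Like the original, this only compares the address text and ignores whatever follows the slash.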

Upvotes: 1
