Reputation: 103
I want to take a list and filter it (in this case each entry holds a record, a domain name and an IP). The entries look something like this:
10.0.0.10 ansible0 ben1.com
ansible1 ben1.com 10.0.0.10
In other words, the IP, the zone and the record can appear in any order and should still be caught.
So far I have two regexes, one that catches the domain (with the dot) and one that catches the IP:
Domain: [a-zA-Z0-9][a-zA-Z0-9-]{1,61}[a-zA-Z0-9]\.[a-zA-Z]{2,}
Simple IP: (?:[0-9]{1,3}\.){3}[0-9]{1,3}
With these I can catch all the domain names and all the IPs in Python and put them into lists.
Now I only need to catch the "subdomain" (in this case ansible1 and ansible0).
It should be able to contain numbers and characters like - _ * and so on, anything but a dot (.).
How can I do this with a regex?
Upvotes: 1
Views: 67
Reputation: 785276
You can use this regex with 3 alternatives and 3 named groups (split over three lines here for readability; in the code it is a single line):
(?P<domain>[a-zA-Z0-9][a-zA-Z0-9-]{1,61}[a-zA-Z0-9]\.[a-zA-Z]{2,})|
(?P<ip>(?:[0-9]{1,3}\.){3}[0-9]{1,3})|
(?P<sub>[^\s.]+)
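The named groups domain and ip use the regexes you've provided. The third group, (?P<sub>[^\s.]+), matches one or more characters that are neither a dot nor whitespace. Because the alternatives are tried from left to right at each position, sub only picks up tokens where neither the domain nor the IP pattern matches.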
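If you would rather keep the three-line layout in the source itself, one option is Python's re.VERBOSE flag, which makes the engine ignore whitespace and line breaks in the pattern. This is only a sketch of that idea; the name TOKEN_RX and the sample line are my own choices, not something from the question.

```python
import re

# Same three alternatives as above, one per line.
# re.VERBOSE tells the engine to skip the line breaks and indentation.
TOKEN_RX = re.compile(
    r"""
    (?P<domain>[a-zA-Z0-9][a-zA-Z0-9-]{1,61}[a-zA-Z0-9]\.[a-zA-Z]{2,}) |
    (?P<ip>(?:[0-9]{1,3}\.){3}[0-9]{1,3}) |
    (?P<sub>[^\s.]+)
    """,
    re.VERBOSE,
)

line = "10.0.0.10 ansible0 ben1.com"  # sample input (assumption)

# Keep only the matches where the third alternative (sub) was the one that hit.
subs = [m.group("sub") for m in TOKEN_RX.finditer(line) if m.group("sub")]
print(subs)
```

Without re.VERBOSE the line breaks would be treated as literal characters in the pattern, so the single-line form is the one to use if you do not set the flag.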
Code:
import re

arr = ['10.0.0.10 ansible0 ben1.com', 'ansible1 ben1.com 10.0.0.10']
rx = re.compile(r'(?P<domain>[a-zA-Z0-9][a-zA-Z0-9-]{1,61}[a-zA-Z0-9]\.[a-zA-Z]{2,})|(?P<ip>(?:[0-9]{1,3}\.){3}[0-9]{1,3})|(?P<sub>[^\s.]+)')

subs = []
for line in arr:
    for m in rx.finditer(line):
        # the 'sub' group is only set when the third alternative matched
        if m.group('sub'):
            subs.append(m.group('sub'))
print(subs)
Output:
['ansible0', 'ansible1']
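Since you also want the domains and IPs in lists, the same pattern can sort every token in one pass. Here is a rough sketch, assuming you want one dict per input line; parse_line and RX are hypothetical names I made up for illustration. It relies on m.lastgroup, which holds the name of the named group that matched.

```python
import re

RX = re.compile(
    r'(?P<domain>[a-zA-Z0-9][a-zA-Z0-9-]{1,61}[a-zA-Z0-9]\.[a-zA-Z]{2,})'
    r'|(?P<ip>(?:[0-9]{1,3}\.){3}[0-9]{1,3})'
    r'|(?P<sub>[^\s.]+)'
)

def parse_line(line):
    """Sort the tokens of one line into 'domain', 'ip' and 'sub' lists."""
    found = {"domain": [], "ip": [], "sub": []}
    for m in RX.finditer(line):
        # lastgroup is 'domain', 'ip' or 'sub', depending on which alternative matched
        found[m.lastgroup].append(m.group())
    return found

arr = ['10.0.0.10 ansible0 ben1.com', 'ansible1 ben1.com 10.0.0.10']
records = [parse_line(s) for s in arr]
print(records)
```

One caveat: a dotted token that matches neither the domain nor the IP pattern (for example a.b) is not rejected but split at the dot into separate sub matches, so you may want to validate the input first if that can occur.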
Upvotes: 1