Matching a Repeating Pattern After Specific String

Question

I am trying to match a repeated pattern (IP numbers), but only after a specific string has occurred. I could do this by splitting first and running the regex on the second part, but I am wondering whether it can be done with a single regex. An example:

import re

s4 = """
ddddddhhhhhhf jjjjjj 111.222.33.444 dddddd ddddddddddd
ccccccccccc
xxxxxxxxxx xxxxxxxxxj kkkkkk kkkkkk xxxxx111.222.888.444yyyy
xxxxxxxxxx xxxxxxxxxj kkkkkk kkkkkk xxxxx111.555.888.444yyyy
dddddd jjjjjjj 333.222.33.444 111.222.33.444 jjjjjjjjjjjj
"""

I would like to match all IP numbers that appear after ccccc. If I do

regex = r"cccccc.*?(\d+\.\d+\.\d+\.\d+)+"
res = re.findall(regex, s4, re.DOTALL)

I only get 111.222.888.444. If I use

regex = r"(\d+\.\d+\.\d+\.\d+)+"

I would get all IP numbers, which is not what I need. What regex syntax is needed to make this work?
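
For comparison, the two-step approach mentioned above (split first, then run the regex on the second part) might look roughly like the sketch below. The sample text and the names `marker`, `sample`, and `ips_after` are made up for illustration; they are not taken from the real data.

```python
import re

# Hypothetical marker and sample text, shaped like the example above
marker = "c" * 11
sample = (
    "aaa 10.0.0.1 bbb\n"
    + marker + "\n"
    + "xxx10.0.0.2yyy\n"
    + "zzz 10.0.0.3 10.0.0.4 zzz\n"
)

# partition() splits at the first occurrence of the marker;
# `found` is an empty string when the marker is absent
_, found, tail = sample.partition(marker)

# Run the plain pattern only on the text after the marker
ips_after = re.findall(r"\d+\.\d+\.\d+\.\d+", tail) if found else []
print(ips_after)
```

This works, but it needs two passes over the text, which is why I am asking for a single-regex version.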

Thanks,

anubhava · Accepted Answer

You can use a regex built on alternation: the left-hand side matches (and discards) everything up to and including the first run of eleven c characters, and the right-hand side captures each IP number in a group:

(?s)^.*?c{11}|(\d+\.\d+\.\d+\.\d+)

Code:

>>> print(list(filter(None, re.findall(r'(?s)^.*?c{11}|(\d+\.\d+\.\d+\.\d+)', s4))))
['111.222.888.444', '111.555.888.444', '333.222.33.444', '111.222.33.444']

filter discards the empty string that findall returns for the left-hand alternative, which matches without capturing anything.
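
As a rough illustration of the same alternation idea in Python 3, here is a minimal, self-contained sketch. The sample text and the names `marker`, `text`, `pattern`, and `ips_after_marker` are hypothetical; only the regex itself comes from the answer above.

```python
import re

# Hypothetical marker and sample text, shaped like s4 in the question
marker = "c" * 11
text = (
    "aaa 10.0.0.1 bbb\n"
    + marker + "\n"
    + "xxx10.0.0.2yyy\n"
    + "zzz 10.0.0.3 10.0.0.4 zzz\n"
)

# Left alternative: consume everything from the start of the string up to
# and including the marker, capturing nothing.
# Right alternative: capture one dotted group of four numbers.
# The inline (?s) flag is placed at the very start of the pattern.
pattern = re.compile(r"(?s)^.*?c{11}|(\d+\.\d+\.\d+\.\d+)")

# findall yields '' for the left alternative's match; drop it
ips_after_marker = [ip for ip in pattern.findall(text) if ip]
print(ips_after_marker)
```

One caveat: if the marker never occurs in the input, the left alternative cannot match, so the right alternative returns every IP number in the text. Check for the marker first if that case matters.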