regular expression to capture groups in selected lines

Question

I have a multi-line string (in Python) and am looking for a regex to extract src, dst and severity. So in the example below, group 1 would be '10.4.180.25', group 2 '34.23.21.10' and group 3 'critical'.

    src: 10.4.180.25
    dst: 34.23.21.10
    natsrc: 20.160.129.5
    natdst: 34.33.21.10
... more lines
    severity: critical
... more lines

If I try a regex like /src: (\b\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}\b) dst: (\b\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}\b) / with the gm flags, it finds src and dst but not severity, which is a few lines further down (lines omitted for clarity). Is there a way to do it without including all of the lines between src, dst and severity?

Wiktor Stribiżew · Accepted Answer

After what your pattern matches, you also need to match any number of lines that do not start with severity. Besides, you may shorten the pattern by using the {3} limiting quantifier so that \.\d{1,3} is not repeated so many times. Note that between a whitespace and a digit the word boundary is implicit: it is already there, so there is no need to use \b.

Use

src:\s*(\d{1,3}(?:\.\d{1,3}){3})
dst:\s*(\d{1,3}(?:\.\d{1,3}){3})(?:
(?!severity).+)*?
severity:\s*(.+)

See the regex demo

Details

src: - a literal substring
\s* - 0+ whitespaces
(\d{1,3}(?:\.\d{1,3}){3}) - Group 1: IP-like pattern
\n (the literal line break in the pattern) - a newline
dst:\s* - dst: with 0+ whitespaces after it
(\d{1,3}(?:\.\d{1,3}){3}) - Group 2: IP-like pattern
(?:\n(?!severity).+)*? - 0+ sequences (as few as possible) of
- \n(?!severity) - a newline not followed with severity
- .+ - the rest of that line (1 or more chars other than line break chars)
\nseverity:\s* - a newline, severity: substring and 0+ whitespaces
(.+) - Group 3: 1 or more chars up to the end of the line
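
The breakdown above can be mirrored in Python with re.VERBOSE, which lets each part of the pattern carry its own comment. This is only a minimal sketch under a few assumptions: the sample text reuses the lines from the question, the lines start at column 0 (the indentation in the question is treated as formatting), and the two "placeholder:" lines are made-up stand-ins for the omitted "... more lines". Line breaks are written as \n because verbose mode ignores literal whitespace in the pattern.

```python
import re

# Same pattern as in the answer, spelled out with comments (re.VERBOSE).
PATTERN = re.compile(
    r"""
    src:\s*(\d{1,3}(?:\.\d{1,3}){3})   # Group 1: IP-like value after src:
    \n                                 # the line break between src and dst
    dst:\s*(\d{1,3}(?:\.\d{1,3}){3})   # Group 2: IP-like value after dst:
    (?:\n(?!severity).+)*?             # lazily skip lines not starting with severity
    \nseverity:\s*(.+)                 # Group 3: rest of the severity line
    """,
    re.VERBOSE,
)

# Sample input: question lines plus made-up placeholder lines (assumption).
SAMPLE = (
    "src: 10.4.180.25\n"
    "dst: 34.23.21.10\n"
    "natsrc: 20.160.129.5\n"
    "natdst: 34.33.21.10\n"
    "placeholder: made-up line\n"
    "severity: critical\n"
    "placeholder: another made-up line\n"
)

match = PATTERN.search(SAMPLE)
if match:
    src, dst, severity = match.groups()
    print(src, dst, severity)  # 10.4.180.25 34.23.21.10 critical
```

If the input holds several such records, re.finditer with the same compiled pattern should yield one match per record.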

Note that you do not need any DOTALL modifier with this regex: each .+ is meant to stop at the end of its line.
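
As a rough illustration of the DOTALL point, the sketch below runs the same pattern with and without re.DOTALL. The sample text is an assumption: it reuses the question's lines (unindented) and adds one made-up "placeholder:" line after severity. With re.DOTALL the dot also matches line breaks, so the last group no longer stops at the end of the severity line.

```python
import re

# The answer's pattern, with line breaks written as \n.
PATTERN = (
    r"src:\s*(\d{1,3}(?:\.\d{1,3}){3})\n"
    r"dst:\s*(\d{1,3}(?:\.\d{1,3}){3})"
    r"(?:\n(?!severity).+)*?"
    r"\nseverity:\s*(.+)"
)

# Question lines plus one made-up trailing line (assumption).
text = (
    "src: 10.4.180.25\n"
    "dst: 34.23.21.10\n"
    "natsrc: 20.160.129.5\n"
    "natdst: 34.33.21.10\n"
    "severity: critical\n"
    "placeholder: made-up trailing line\n"
)

plain = re.search(PATTERN, text)              # default: . stops at line breaks
dotall = re.search(PATTERN, text, re.DOTALL)  # . also matches line breaks

print(repr(plain.group(3)))   # only the severity value
print(repr(dotall.group(3)))  # the severity value plus the following text
```

Without the flag, group 3 is just the severity value; with it, the group spills over into the lines that follow.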
