Reputation: 2646
Input strings:
-line[8qWWQ5-swd-WER-DWDS]]<-SUCCESS#[14][MY_SAMPLE_TEST]
-line[8qWWQ5-swd-WER-DWDS]]<-SUCCESS#[4][MY_SAMPLE_TEST2]
-line[8qWWQ5-swd-WER-DWDS]]<Failed#[17][[14]SERVERERROR(TYPE-241)
Expected output:
MY_SAMPLE_TEST
MY_SAMPLE_TEST2
SERVERERROR
My regular expression: (?<=#).*
With the expression above I get everything after #. I also tried:
rex = (?<=#\[...\[).*(?=])
This gives the correct output for the first line, i.e. MY_SAMPLE_TEST. It does not match the second line, because the number there has only one digit (4) while the lookbehind expects exactly three characters between the first [ and the second [. The third line has a similar problem.
Is it possible to write a single expression that gives the expected output? Any help would be great.
Upvotes: 1
Views: 113
Reputation: 1
I assumed that the text to extract ends just before the first closing ] or opening ( character. Here is a working regex:
#(?:\[+\d+\]+)*\[?([^\(\]]+)(?:\(.+\))?\]?
It works on the samples provided and makes no assumption about spaces or underscores in the text to be extracted. Here is a demo link: https://regexr.com/47muk
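The regex can also be checked against the three sample lines with a short script. This is only a sketch: the question does not name a language, so using Python's re module is an assumption, and the names samples, pattern and extract are made up for illustration.

```python
import re

# Sample lines copied from the question.
samples = [
    "-line[8qWWQ5-swd-WER-DWDS]]<-SUCCESS#[14][MY_SAMPLE_TEST]",
    "-line[8qWWQ5-swd-WER-DWDS]]<-SUCCESS#[4][MY_SAMPLE_TEST2]",
    "-line[8qWWQ5-swd-WER-DWDS]]<Failed#[17][[14]SERVERERROR(TYPE-241)",
]

# The regex from this answer; group 1 holds the extracted text.
pattern = re.compile(r"#(?:\[+\d+\]+)*\[?([^\(\]]+)(?:\(.+\))?\]?")


def extract(line):
    """Return the captured text, or None when the line has no match."""
    match = pattern.search(line)
    return match.group(1) if match else None


results = [extract(s) for s in samples]
print(results)
```

The value of interest is in capture group 1, not in the whole match, since the whole match also includes the # and the bracketed numbers.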
Upvotes: 0
Reputation: 626738
You may capture these values using
#(?:\[+\d+]+)*\[*([^][()]+)
See the regex demo
Details
# - a hash sign
(?:\[+\d+]+)* - 0 or more repetitions of:
    \[+ - 1+ [ chars
    \d+ - 1+ digits
    ]+ - 1+ ] chars
\[* - 0+ [ chars
([^][()]+) - Group 1: one or more chars other than (, ), [ and ]
import re

# Sample lines from the question
strs = ['-line[8qWWQ5-swd-WER-DWDS]]<-SUCCESS#[14][MY_SAMPLE_TEST]', '-line[8qWWQ5-swd-WER-DWDS]]<-SUCCESS#[4][MY_SAMPLE_TEST2]', '-line[8qWWQ5-swd-WER-DWDS]]<Failed#[17][[14]SERVERERROR(TYPE-241)']
# Group 1 captures the text that follows the bracketed numbers
rx = re.compile(r'#(?:\[+\d+]+)*\[*([^][()]+)')
for s in strs:
    m = rx.search(s)
    if m:
        print(m.group(1))
Output:
MY_SAMPLE_TEST
MY_SAMPLE_TEST2
SERVERERROR
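If the input arrives as one multi-line string instead of a list, the same pattern can probably be applied in a single pass with findall. This is a minimal sketch under that assumption; the variable names text and values are only illustrative.

```python
import re

# The three sample lines from the question, joined into one string.
text = """-line[8qWWQ5-swd-WER-DWDS]]<-SUCCESS#[14][MY_SAMPLE_TEST]
-line[8qWWQ5-swd-WER-DWDS]]<-SUCCESS#[4][MY_SAMPLE_TEST2]
-line[8qWWQ5-swd-WER-DWDS]]<Failed#[17][[14]SERVERERROR(TYPE-241)"""

rx = re.compile(r'#(?:\[+\d+]+)*\[*([^][()]+)')

# With exactly one capturing group, findall returns the group 1 values.
values = rx.findall(text)
print(values)
```

Note that the negated character class also matches line breaks, so if a value can run to the end of a line without a closing ] or (, adding \n to the class keeps the match on one line.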
Upvotes: 1