Reputation: 123
I have two strings, each of which is one record:
string1 = "abc/BS-QANTAS\\/DS-12JUL15\\dfd"
string2 = "/DS-10JUN15\\/BS-AIRFRANCE\\dfdsfsdf"
BS is the booking source (the airline) and DS is the date.
I want to use a single regex to extract the booking source and the date from either record. Please let me know if this is feasible. I have tried lookaheads and still couldn't get it to work.
The target language is Splunk, not JavaScript. Whatever language you use, please post your answer and I'll try it in Splunk.
Upvotes: 1
Views: 716
Reputation: 75222
Here's a more scalable (and more readable, IMO) alternative to miroxlav's answer:
(?:\/BS-(?P<source>\w+)|\/DS-(?P<date>\w+)|[^\/\v]+)+
I'm assuming the fields you're interested in always start with a slash. That allows me to use [^/]+ to safely consume the junk between/around them.
This is effectively three regexes in one, wrapped in a group to give each one a chance to match in turn, and applied multiple times. If the first alternative matches, you're looking at a "source airline" field, and the name is captured in the group named "source". If the second alternative matches, you're looking at the date, which is captured in the "date" group.
But because the fields aren't in a predetermined order, the regex has to match the whole string to be sure of matching both fields. (In fact, I should have used start and end anchors, ^ and $, to enforce that; I've added them below.) The third alternative, [^/]+, allows it to consume the parts that the first two can't, thus making an overall match possible. Here's the updated regex:
^(?:\/BS-(?P<source>\w+)|\/DS-(?P<date>\w+)|[^\/\v]+)+$
...and the updated demo. As noted in the comment, the \v is there only because I'm combining your two examples into one multiline string and doing two matches. You shouldn't need it in real life.
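For anyone who wants to try the idea outside Splunk, here is a minimal Python sketch of the anchored pattern. The names FIELD_RE and extract_fields are mine, not part of the answer. I dropped the \v because each record is matched on its own, and I'm relying on Python's re module keeping the last capture of a group across iterations of the outer repetition, so treat it as an illustration rather than a drop-in Splunk solution.

```python
import re

# Port of the anchored pattern. \v is omitted because every record is matched
# separately rather than as part of one multiline string (an assumption).
FIELD_RE = re.compile(r"^(?:/BS-(?P<source>\w+)|/DS-(?P<date>\w+)|[^/]+)+$")


def extract_fields(record):
    """Return (source, date) for one record, or None if it does not match."""
    m = FIELD_RE.match(record)
    if m is None:
        return None
    # A named group that never took part in the match comes back as None.
    return m.group("source"), m.group("date")


# The two sample records from the question (each \\ is one literal backslash).
string1 = "abc/BS-QANTAS\\/DS-12JUL15\\dfd"
string2 = "/DS-10JUN15\\/BS-AIRFRANCE\\dfdsfsdf"

for record in (string1, string2):
    print(extract_fields(record))
```

Because both fields land in the same two named groups regardless of order, the calling code does not need to know which field came first.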
Upvotes: 1
Reputation: 12194
This gives you both strings, filled either in match groups airline1 + date1 or in airline2 + date2:
((BS-(?<airline1>\w+).*DS-(?<date1>[\w]+))|(DS-(?<date2>[\w]+).*BS-(?<airline2>\w+)))
Since there are only 2 groups, I used simple permutation.
If a field occurs more than once, this regex takes the last occurrence. If you need the earliest one (using lookbehind), let me know.
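A rough Python sketch of the permutation approach, in case it helps with testing. Python spells named groups (?P<name>...), and I left out the unnamed wrapper groups since they don't change what the named groups capture. PAIR_RE and extract_pair are hypothetical names for illustration only.

```python
import re

# One alternative per field order; only one pair of named groups is filled.
PAIR_RE = re.compile(
    r"BS-(?P<airline1>\w+).*DS-(?P<date1>\w+)"
    r"|DS-(?P<date2>\w+).*BS-(?P<airline2>\w+)"
)


def extract_pair(record):
    """Return (airline, date) from whichever alternative matched, else None."""
    m = PAIR_RE.search(record)
    if m is None:
        return None
    # Merge the two possible group pairs into a single result.
    airline = m.group("airline1") or m.group("airline2")
    date = m.group("date1") or m.group("date2")
    return airline, date


# The two sample records from the question (each \\ is one literal backslash).
string1 = "abc/BS-QANTAS\\/DS-12JUL15\\dfd"
string2 = "/DS-10JUN15\\/BS-AIRFRANCE\\dfdsfsdf"

for record in (string1, string2):
    print(extract_pair(record))
```

The merging step is the cost of this approach: the caller has to check two group pairs, and the number of alternatives grows quickly if more fields are added.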
Upvotes: 0