Reputation: 71
I've got the following text:
instance=hostname1, topic="AB_CD_EF_12345_ZY_XW_001_000001"
instance=hostname2, topic="AB_CD_EF_1345_ZY_XW_001_00001"
instance=hostname1, topic="AB_CD_EF_1235_ZY_XW_001_000001"
instance=hostname2, topic="AB_CD_EF_GH_4567_ZY_XW_01_000001"
instance=hostname1, topic="AB_CD_EF_35678_ZY_XW_001_00001"
instance=hostname2, topic="AB_CD_EF_56789_ZY_XW_001_000001"
I would like to capture numbers from the sample above. I've tried to do so with the regular expressions below and they work well as separate queries:
Regex: .*topic="AB_CD_EF_([^_]+).*
Matches: 12345 1345 1235
Regex: .*topic="AB_CD_EF_GH_([^_]+).*
Matches: 4567 35678 56789
But I need a single regex that gives me all the numbers, i.e.:
12345 1345 1235 4567 35678 56789
Upvotes: 1
Views: 2100
Reputation: 71
This regex worked for me:
/.*topic="(?:[AB_CD_EF_(GH_)]{2,3}_)+([^_]+).*/
Upvotes: 0
Reputation: 27723
Another option would be an expression similar to:
topic=".*?[A-Z]_([0-9]+)_.*?"
and our desired digits are in the capturing group ([0-9]+).
Please see the demo for additional explanation.
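As a rough illustration, here is how the expression above might be applied in Python. The choice of Python and the names `SAMPLE`, `LAZY` and `lazy_numbers` are my assumptions, not part of the question; the sample lines are copied from it.

```python
import re

# Sample lines copied from the question.
SAMPLE = "\n".join([
    'instance=hostname1, topic="AB_CD_EF_12345_ZY_XW_001_000001"',
    'instance=hostname2, topic="AB_CD_EF_1345_ZY_XW_001_00001"',
    'instance=hostname1, topic="AB_CD_EF_1235_ZY_XW_001_000001"',
    'instance=hostname2, topic="AB_CD_EF_GH_4567_ZY_XW_01_000001"',
    'instance=hostname1, topic="AB_CD_EF_35678_ZY_XW_001_00001"',
    'instance=hostname2, topic="AB_CD_EF_56789_ZY_XW_001_000001"',
])

# The lazy .*? stops at the first "letter, underscore, digits, underscore"
# sequence after topic=", so group 1 holds the first all-digit segment.
LAZY = re.compile(r'topic=".*?[A-Z]_([0-9]+)_.*?"')

lazy_numbers = [m.group(1) for m in LAZY.finditer(SAMPLE)]
print(" ".join(lazy_numbers))  # 12345 1345 1235 4567 35678 56789
```

Because `.` does not match a newline by default, each match stays within its own line even when the whole text is searched at once.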
Upvotes: 1
Reputation: 425063
Make GH_ optional:
.*topic="AB_CD_EF_(GH_)?([^_]+).*
which matches all your target numbers. The number is captured in group 2, because group 1 is the optional GH_.
See live demo.
You could be more general by allowing any number of "letter letter underscore" sequences using:
.*topic="(?:[A-Z]{2}_)+([^_]+).*
See live demo.
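If it helps, both patterns can be checked against the sample from the question with a short Python sketch. Python itself and the helper name `extract` are assumptions on my part; the question does not say which regex engine is in use.

```python
import re

# Sample lines copied from the question.
SAMPLE_LINES = [
    'instance=hostname1, topic="AB_CD_EF_12345_ZY_XW_001_000001"',
    'instance=hostname2, topic="AB_CD_EF_1345_ZY_XW_001_00001"',
    'instance=hostname1, topic="AB_CD_EF_1235_ZY_XW_001_000001"',
    'instance=hostname2, topic="AB_CD_EF_GH_4567_ZY_XW_01_000001"',
    'instance=hostname1, topic="AB_CD_EF_35678_ZY_XW_001_00001"',
    'instance=hostname2, topic="AB_CD_EF_56789_ZY_XW_001_000001"',
]

# Optional GH_: group 1 is the optional prefix, group 2 is the number.
OPTIONAL_GH = re.compile(r'.*topic="AB_CD_EF_(GH_)?([^_]+).*')
# General form: any run of "two letters + underscore", number in group 1.
GENERAL = re.compile(r'.*topic="(?:[A-Z]{2}_)+([^_]+).*')


def extract(pattern, group):
    """Return the requested capture group for every line that matches."""
    found = []
    for line in SAMPLE_LINES:
        match = pattern.match(line)
        if match:
            found.append(match.group(group))
    return found


optional_numbers = extract(OPTIONAL_GH, 2)
general_numbers = extract(GENERAL, 1)
print(optional_numbers)
print(general_numbers)
```

Both calls return the same six numbers for this sample. Writing the prefix as a non-capturing group, `(?:GH_)?`, would move the number to group 1, if that is more convenient.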
Upvotes: 2
Reputation: 1943
From the examples and conditions you've given, I think you're going to need a very restrictive regex, but this may depend on how you want to adapt it. Take a look at the following regex and read the breakdown for more information on what it does. Use the first group (the only one in this regex) to retrieve the numbers you are looking for.
Regex
^instance\=hostname[0-9]+\,\s*topic\=\"[A-Z_]+([0-9]+)_[A-Z_]+[0-9_]+\"$
Try it out in this DEMO.
Breakdown
^ # Asserts position at start of the line
hostname[0-9]+ # Matches any and all hostname numbers
\s* # Matches whitespace characters (between 0 and unlimited times)
[A-Z_]+ # Matches any upper-case letter or underscore (between 1 and unlimited times)
([0-9]+) # This captures the number you want
$ # Asserts position at end of the line
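To see the anchored pattern run over several lines at once, here is a minimal Python sketch. It assumes Python's `re` module with the `re.MULTILINE` flag so that `^` and `$` match at every line boundary, and it writes the quotes as plain `"` and drops the backslashes before `=`, `,` and `"`, which are not needed there. The names `LOG_TEXT`, `STRICT` and `strict_numbers` are made up for the example.

```python
import re

# Sample lines copied from the question, joined into one multi-line string.
LOG_TEXT = "\n".join([
    'instance=hostname1, topic="AB_CD_EF_12345_ZY_XW_001_000001"',
    'instance=hostname2, topic="AB_CD_EF_1345_ZY_XW_001_00001"',
    'instance=hostname1, topic="AB_CD_EF_1235_ZY_XW_001_000001"',
    'instance=hostname2, topic="AB_CD_EF_GH_4567_ZY_XW_01_000001"',
    'instance=hostname1, topic="AB_CD_EF_35678_ZY_XW_001_00001"',
    'instance=hostname2, topic="AB_CD_EF_56789_ZY_XW_001_000001"',
])

# re.MULTILINE makes ^ and $ anchor at each line, not just the whole string.
STRICT = re.compile(
    r'^instance=hostname[0-9]+,\s*topic="[A-Z_]+([0-9]+)_[A-Z_]+[0-9_]+"$',
    re.MULTILINE,
)

# With a single capturing group, findall returns just the captured numbers.
strict_numbers = STRICT.findall(LOG_TEXT)
print(strict_numbers)
```

Any line that deviates from the expected layout (for example, lowercase letters in the topic) is skipped rather than partially matched, which is the point of keeping the pattern this strict.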
Although this does answer the question you have asked, I fear it might not be exactly what you're looking for, but without further information this is the best I can give you. In any case, after you've studied the breakdown and played around with the demo a bit, it should prove to be of some help.
Upvotes: 0