Reputation: 11
I'm trying to use RegEx to capture some information between two 'tags'. Example: Some text and some more text Error message: http 404 not found Procedures: some text some text
What I need is to analyse this entire field, find the phrases "Error message:" and "Procedures:" and capture what is between them, in this case "http 404 not found". I need to show only the text between the tags and it is not necessary to show the tags.
I tried many things using RegEx and Grok but my attempts were not successful.
Does anyone have any idea how I can accomplish this?
Thanks a lot!
Upvotes: 1
Views: 1362
Reputation: 1367
Since you also asked for a Grok pattern, I thought it was worth submitting another answer. My solution is a grok pattern that uses regular expressions to match the irrelevant parts, with the expected tag included at the end of the first part and at the beginning of the last part. The solution is:
(?<notImportant1>[A-Za-z ]* Error message:) (?<textBetweenTags>[A-Za-z0-9 ]*) (?<notImportant2>Procedures: [A-Za-z ]*)
It will provide you something like:
Here the substring you were interested in is extracted as 'textBetweenTags'. Note that if the text before or after the tags includes numbers or other symbols, the character classes in the regex will have to change.
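To see which fields the pattern produces without a Grok setup, here is a minimal sketch of the same expression in Python. It assumes the sample string from the question; the only change to the pattern is the named-group spelling, since Python's `re` module writes `(?P<name>...)` where Grok uses `(?<name>...)`. The variable names `PATTERN`, `sample` and `fields` are just for illustration.

```python
import re

# Same expression as the grok pattern above; Python spells named groups
# (?P<name>...) where Grok uses (?<name>...).
PATTERN = re.compile(
    r"(?P<notImportant1>[A-Za-z ]* Error message:) "
    r"(?P<textBetweenTags>[A-Za-z0-9 ]*) "
    r"(?P<notImportant2>Procedures: [A-Za-z ]*)"
)

# Sample input taken from the question.
sample = (
    "Some text and some more text Error message: http 404 not found "
    "Procedures: some text some text"
)

match = PATTERN.search(sample)
# groupdict() maps each named group to the substring it matched.
fields = match.groupdict() if match else {}

for name, value in fields.items():
    print(f"{name}: {value}")
```

Only `textBetweenTags` is of interest; the other two groups exist so that the tags themselves are consumed and kept out of the captured field.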
EDIT: By the way, don't know if you are aware of the tool, but you can test the pattern here.
Upvotes: 0
Reputation: 1156
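# Capture both markers and the text between them (/s lets . match newlines).
my @capture = $text =~ m/(Error message:)(.*?)(Procedures:)/s;
my $capture = join '', @capture;
# Strip the two marker phrases and the whitespace around the captured text.
$capture =~ s/Error message:\s*|\s*Procedures://g;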
Just capture it, save it in an array, and do whatever you wish with it. Here I convert the matches back to a string so I can apply another regex, which removes the tags. You could of course apply the substitution to each element of the array instead.
I hope this code does not contain errors; I didn't run it. I also hope you can find the equivalents if you are using a language other than Perl 5.
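For anyone not using Perl, the same idea can be sketched in Python by capturing only the part between the two markers, so no second substitution is needed. This is a rough sketch under my own assumptions: `extract_between` is a hypothetical helper name, the markers are treated as literal text, and the first occurrence of each marker is the one that counts.

```python
import re
from typing import Optional


def extract_between(text: str, start: str, end: str) -> Optional[str]:
    """Return the text between the first `start` marker and the next `end` marker.

    Hypothetical helper for illustration; returns None when either marker is missing.
    """
    # re.escape treats the markers as literal text; (.*?) is non-greedy so it
    # stops at the first `end` marker; DOTALL lets the capture span newlines.
    pattern = re.escape(start) + r"\s*(.*?)\s*" + re.escape(end)
    match = re.search(pattern, text, flags=re.DOTALL)
    return match.group(1) if match else None


# Sample input taken from the question.
sample = (
    "Some text and some more text Error message: http 404 not found "
    "Procedures: some text some text"
)
print(extract_between(sample, "Error message:", "Procedures:"))
```

Because the capture is `.*?` rather than a fixed character class, it also accepts digits and punctuation in the message, at the cost of being less strict about what counts as a valid message.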
Upvotes: 0