Reputation: 4614
I have the following text:
<pattern name="pattern1"/>
<success>success case 1</success>
<failed> failure 1</failed>
<failed> failure 2</failed>
<unknown> unknown </unknown>
<pattern name="pattern4"/>
<pattern name="pattern5"/>
<success>success case 3</success>
<pattern name="pattern2"/>
<success>success case 2</success>
<otherTag>There are many other tags.</otherTag>
<failed> failure 3</failed>
<pattern name="pattern3"/>
<unknown>unkown</unknown>
The regular expression <failed>[\w|\W]*?</failed>
matches every line that contains a failed tag.
What do I need to do if I want all failed tags along with the pattern tag above each of them? If there is no failed tag underneath a pattern tag, that pattern tag should not be matched. Basically, I want the following output:
<pattern name="pattern1"/>
<failed> failure 1</failed>
<failed> failure 2</failed>
<pattern name="pattern2"/>
<failed> failure 3</failed>
I am doing this in JavaScript, and I do not mind doing some intermediate steps.
edit start
Almost all of the replies suggest that I take a different approach, and I am unsure which one to choose: jQuery, regex, or something else. Here is more information to help with the decision.
The data format may change, but not often. The data comes from a Schematron validation report (file type ".SVRL"). The structure of the file follows this schema, defined in RELAX NG compact syntax:
schematron-output = element schematron-output {
attribute title { text }?,
attribute phase { xsd:NMTOKEN }?,
attribute schemaVersion { text }?,
human-text*,
ns-prefix-in-attribute-values*,
(active-pattern,
(fired-rule, (failed-assert | successful-report)*)+)+
}
The <pattern> tag maps to active-pattern, and the <failed> and <success> tags map to failed-assert and successful-report respectively.
Now with additional information, which approach should I be taking? Thanks very much for helping out. :)
edit end
Upvotes: 1
Views: 1397
Reputation: 1745
Here is the RegExp you need:
<(pattern|failed)\b[^>]*(?:/>|>[^<]*</\1>)
Just escape the slashes when using it in a JavaScript regular expression literal:
var regExp = /<(pattern|failed)\b[^>]*(?:\/>|>[^<]*<\/\1>)/gi;
var matchesArray = testString.match(regExp);
This regular expression finds whole <pattern> and <failed> tags, whether they are empty (<empty/>) or not (<notEmpty></notEmpty>). It also allows for element attributes.
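Note that the match alone still returns every <pattern> tag, including those with no <failed> tag beneath them. One possible intermediate step, sketched here under the assumption that the tags appear in document order as in the question's sample, is to filter the match list afterwards. The function name failedWithPatterns is hypothetical, not part of any library:

```javascript
// Same expression as above, stored once for reuse.
const TAG_RE = /<(pattern|failed)\b[^>]*(?:\/>|>[^<]*<\/\1>)/gi;

// Returns the <failed> tags plus only those <pattern> tags that are
// directly followed by at least one <failed> tag (hypothetical helper).
function failedWithPatterns(text) {
  const tags = text.match(TAG_RE) || [];
  const kept = [];
  let pendingPattern = null; // most recent <pattern> not yet emitted

  for (const tag of tags) {
    if (/^<pattern\b/i.test(tag)) {
      // Remember it, but emit it only if a <failed> tag follows.
      pendingPattern = tag;
    } else {
      if (pendingPattern !== null) {
        kept.push(pendingPattern);
        pendingPattern = null;
      }
      kept.push(tag);
    }
  }
  return kept;
}
```

Calling failedWithPatterns(testString).join("\n") on the sample from the question should give the five lines listed as the desired output.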
Upvotes: 1
Reputation: 38112
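You can use the regex "|" operator (meaning "or") to create a regex that matches any one of several alternatives. For example, applied to the whole text with the "g" and "m" flags so that "^" matches at the start of each line ...
/^<failed>[\w|\W]*?<\/failed>|^<pattern[^>]*>/gm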
... should do what you're asking (based on the example you've given above).
But, as other commenters have said, parsing XML with regexes is a slippery slope. You'll probably want to look into other options, like using the DocumentFragment class to parse your string for you.
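As a rough illustration of the DOM route, here is a minimal sketch that assumes the text has already been parsed into an element whose direct children are the <pattern>, <failed> and other tags. It uses the browser's DOMParser rather than DocumentFragment, which is a swapped-in choice, and the function name failedByPattern is hypothetical:

```javascript
// Walks the child elements in document order and keeps each <pattern>
// only if at least one <failed> follows it before the next <pattern>.
function failedByPattern(root) {
  const groups = [];
  let current = null; // group for the most recent <pattern>

  for (const el of Array.from(root.children)) {
    if (el.tagName === "pattern") {
      current = { pattern: el.getAttribute("name"), failures: [] };
    } else if (el.tagName === "failed" && current) {
      // Register the group on its first failure only.
      if (current.failures.length === 0) groups.push(current);
      current.failures.push(el.textContent.trim());
    }
  }
  return groups;
}

// Browser-only usage. The sample has no single root element, so it is
// wrapped in an assumed <root> element before parsing.
if (typeof DOMParser !== "undefined") {
  const sample =
    '<pattern name="pattern1"/><failed> failure 1</failed><pattern name="pattern4"/>';
  const doc = new DOMParser().parseFromString(
    "<root>" + sample + "</root>",
    "application/xml"
  );
  console.log(failedByPattern(doc.documentElement));
}
```

The result is a list of objects rather than raw tag strings, which may be easier to work with if the format changes later.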
Upvotes: 1
Reputation: 117333
You should look into methods other than regular expressions to parse XML, particularly if:
See this answer for information about XML parsing in Javascript.
The easy solution is "use jQuery". If for some reason you don't want to load jQuery to do this, then start here.
Upvotes: 1