Reputation: 17883
I have an XML file with the following (simplified) contents:
<CollectionItems>
  <CollectionItem>
    <Element1>Value1</Element1>
    <Element2>
      <SubElement1>SubValue1</SubElement1>
      <SubElement2>SubValue2</SubElement2>
      <SubElement3>SubValue3</SubElement3>
    </Element2>
    <Element3>Value3</Element3>
  </CollectionItem>
  <CollectionItem>
    <Element1>Value1</Element1>
    <Element2>
      <SubElement1>SubValue1</SubElement1>
      <SubElement2 />
      <SubElement3>SubValue3</SubElement3>
    </Element2>
    <Element3>Value3</Element3>
  </CollectionItem>
  <CollectionItem>
    <Element1>Value1</Element1>
    <Element2>
      <SubElement1>SubValue1</SubElement1>
      <SubElement2>SubValue2</SubElement2>
      <SubElement3>SubValue3</SubElement3>
    </Element2>
    <Element3>Value3</Element3>
  </CollectionItem>
</CollectionItems>
I am attempting to write a regex in .NET that matches any CollectionItem where SubElement2 is empty (the middle CollectionItem in this example).
I have the following regex so far (with RegexOptions.Singleline enabled):
<CollectionItem>.+?<SubElement2 />.+?</CollectionItem>
The problem is that it matches from the opening tag of the first CollectionItem through the closing tag of the second CollectionItem. I understand why it does this, but I don't know how to modify the regex so that it matches only the middle CollectionItem.
Edit: As to why regex as opposed to something else:
Thanks!
Upvotes: 0
Views: 118
Reputation: 336428
You could use
<CollectionItem>((?!<CollectionItem>).)+?<SubElement2 />.+?</CollectionItem>
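The negative lookahead (?!<CollectionItem>) is checked before each character is consumed, which ensures that no further <CollectionItem> opening tag comes between the starting tag and the <SubElement2 /> tag.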
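As a rough illustration of how this pattern might be applied from C#, here is a minimal sketch. The class name RegexSketch, the helper FindItems, and the shortened sample XML are made up for this example and are not from the question; the only part taken from the thread is the pattern itself and the use of RegexOptions.Singleline.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexSketch
{
    // Pattern from the answer above. Singleline lets '.' match newlines,
    // so a CollectionItem that spans several lines can still be matched.
    private static readonly Regex EmptySubElement2 = new Regex(
        @"<CollectionItem>((?!<CollectionItem>).)+?<SubElement2 />.+?</CollectionItem>",
        RegexOptions.Singleline);

    // Hypothetical helper: returns the text of every matching CollectionItem.
    public static string[] FindItems(string xml)
    {
        return EmptySubElement2.Matches(xml)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }

    public static void Main()
    {
        // Shortened stand-in for the question's file (assumption, not the real data).
        string xml =
            "<CollectionItems>\n" +
            "<CollectionItem><Element2><SubElement2>SubValue2</SubElement2></Element2></CollectionItem>\n" +
            "<CollectionItem><Element2><SubElement2 /></Element2></CollectionItem>\n" +
            "<CollectionItem><Element2><SubElement2>SubValue2</SubElement2></Element2></CollectionItem>\n" +
            "</CollectionItems>";

        foreach (string item in FindItems(xml))
        {
            Console.WriteLine(item);
        }
    }
}
```

Note that the pattern depends on the exact spelling <SubElement2 /> (with the space), so a file that writes the empty element as <SubElement2/> or <SubElement2></SubElement2> would need a different pattern.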
Upvotes: 2
Reputation: 72920
This is XML - why are you trying to do this with Regex? Wouldn't XPath make more sense?
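For what it's worth, an XPath version might look something like the sketch below. The XPath expression, the class name XPathSketch, the helper FindItems, and the shortened sample XML are all my own assumptions for illustration; XPathSelectElements is the LINQ to XML extension method from System.Xml.XPath.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XPathSketch
{
    // Hypothetical helper: selects every CollectionItem that has a
    // SubElement2 descendant with no child nodes at all.
    public static List<XElement> FindItems(XDocument document)
    {
        return document
            .XPathSelectElements("//CollectionItem[.//SubElement2[not(node())]]")
            .ToList();
    }

    public static void Main()
    {
        // Shortened stand-in for the question's file (assumption, not the real data).
        XDocument document = XDocument.Parse(
            "<CollectionItems>" +
            "<CollectionItem><Element2><SubElement2>SubValue2</SubElement2></Element2></CollectionItem>" +
            "<CollectionItem><Element2><SubElement2 /></Element2></CollectionItem>" +
            "<CollectionItem><Element2><SubElement2>SubValue2</SubElement2></Element2></CollectionItem>" +
            "</CollectionItems>");

        foreach (XElement item in FindItems(document))
        {
            Console.WriteLine(item);
        }
    }
}
```

Because not(node()) only asks whether the element has child nodes, this treats <SubElement2 /> and <SubElement2></SubElement2> the same way, which a text-based match would not.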
Upvotes: 3
Reputation: 1502845
Why are you trying to use a regular expression? You've got a perfectly good domain model (XML) - why not search that instead? So for example in LINQ to XML:
var collectionsWithEmptySubElement2 =
    document.Descendants("SubElement2")
            .Where(x => x.IsEmpty)
            .Select(x => x.Ancestors("CollectionItem").FirstOrDefault());
or
var collectionsWithEmptySubElement2 =
    document.Descendants("CollectionItem")
            .Where(x => x.Descendants("SubElement2").Any(sub => sub.IsEmpty));
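One detail that may matter here: XElement.IsEmpty is documented as true only for the self-closing form (<SubElement2 />), not for a start/end pair with nothing in between (<SubElement2></SubElement2>). The sketch below compares the second query with a variant that checks for the absence of child nodes instead. The class name LinqToXmlSketch, the two helper methods, and the sample XML (including the extra <SubElement2></SubElement2> item) are assumptions added for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class LinqToXmlSketch
{
    // Same filter as the second query above: only self-closing SubElement2 counts.
    public static int CountSelfClosing(XDocument document)
    {
        return document.Descendants("CollectionItem")
            .Count(x => x.Descendants("SubElement2").Any(sub => sub.IsEmpty));
    }

    // Variant (assumption): any SubElement2 without child nodes counts,
    // whichever way the empty element was written.
    public static int CountWithoutChildNodes(XDocument document)
    {
        return document.Descendants("CollectionItem")
            .Count(x => x.Descendants("SubElement2").Any(sub => !sub.Nodes().Any()));
    }

    public static void Main()
    {
        // Hypothetical sample: one filled, one self-closing, one start/end pair.
        XDocument document = XDocument.Parse(
            "<CollectionItems>" +
            "<CollectionItem><Element2><SubElement2>SubValue2</SubElement2></Element2></CollectionItem>" +
            "<CollectionItem><Element2><SubElement2 /></Element2></CollectionItem>" +
            "<CollectionItem><Element2><SubElement2></SubElement2></Element2></CollectionItem>" +
            "</CollectionItems>");

        Console.WriteLine(CountSelfClosing(document));
        Console.WriteLine(CountWithoutChildNodes(document));
    }
}
```

If the file only ever contains the self-closing form, as in the question, the IsEmpty check should be sufficient on its own.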
Upvotes: 5