Keithin8a

Reputation: 961

How to filter an xml list based on whether it has a particular attribute

I have been trying to wrap my brain around something for a few hours now.

I am writing an app that strips the comments out of a Word document and writes them into a table in another document for auditing purposes. As a requirement, each entry needs to contain a line reference to where the comment came from, and if the comment is a reply it also needs to contain a reference to the parent comment.

I have managed to find all three document parts in the Word document using the DocumentFormat.OpenXml library. However, I am getting stuck when trying to get the reply comments.

The XML that contains the references to the comments and their parents is as follows:

<w15:commentsEx xmlns:wpc="http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas" xmlns:cx="http://schemas.microsoft.com/office/drawing/2014/chartex" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml" xmlns:w15="http://schemas.microsoft.com/office/word/2012/wordml" xmlns:w16se="http://schemas.microsoft.com/office/word/2015/wordml/symex" xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup" xmlns:wpi="http://schemas.microsoft.com/office/word/2010/wordprocessingInk" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml" xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape" mc:Ignorable="w14 w15 w16se wp14">
  <w15:commentEx w15:paraId="739FE385" w15:done="0" />
  <w15:commentEx w15:paraId="64E7F09D" w15:done="0" />
  <w15:commentEx w15:paraId="04DC26C3" w15:done="0" />
  <w15:commentEx w15:paraId="55A4D8B0" w15:paraIdParent="04DC26C3" w15:done="0" />
</w15:commentsEx>

I think my problem is that the elements and attributes are all namespaced, so I have to use a Where clause to match on the local name. For example:

CommentsEx.Descendants().Where(x => x.Name.LocalName == "commentEx")

I have a list of type MyComment, which holds the comment text, author, xmlId (paraId in the XML) and a reference to its parent (paraIdParent in the XML), and I now want to get a list of all the comments that have parents. I have tried getting a list of commentEx elements and then calling the following LINQ statement on each one:

var replyComments = comment.Attributes()
                .Where(x => x.Name.LocalName == "paraIdParent").ToList();

but that just returns a list of the attributes themselves, not a list of the commentEx elements that contain the attribute.

If I try to just get the value of the attribute, it crashes because the attribute doesn't exist on all tags.

So, in summary: I need to traverse commentsEx and find the comments that have parents. I then need to use the paraId attribute to find the matching comment in my list, so that I can add a link to its parent using paraIdParent. But I can't get it to work. Am I using the wrong tools? Should I not be using LINQ?

Upvotes: 0

Views: 493

Answers (2)

Rahul Singh

Reputation: 21795

I think LINQ to XML will make your task much easier. You can specify the w15 namespace along with the node name by using the XNamespace class:

XDocument xdoc = XDocument.Load(@"YourXMLPath");
XNamespace ns = "http://schemas.microsoft.com/office/word/2012/wordml";
IEnumerable<XElement> replyComments = xdoc.Root.Elements(ns + "commentEx")
                    .Where(x => (string)x.Attribute(ns + "paraIdParent") != null);

Update:

You can just check for null, because the cast (string)x.Attribute(ns + "paraIdParent") returns null when the attribute is not found, whereas reading .Value on a missing attribute throws.
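
To go one step further and link each reply to its parent, something along these lines might work. This is only a sketch: the MyComment class here is a hypothetical stand-in with just XmlId and Parent properties (your real type will differ), and ReplyLinker and LinkReplies are names made up for illustration. It assumes every XmlId in your list is unique and non-null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical stand-in for the asker's MyComment type.
public class MyComment
{
    public string XmlId { get; set; }
    public MyComment Parent { get; set; }
}

public static class ReplyLinker
{
    static readonly XNamespace W15 =
        "http://schemas.microsoft.com/office/word/2012/wordml";

    // Sets Parent on every comment whose commentEx element has w15:paraIdParent.
    public static void LinkReplies(XElement commentsEx, IEnumerable<MyComment> comments)
    {
        // Index the existing comments by paraId (assumes unique, non-null ids).
        Dictionary<string, MyComment> byId = comments.ToDictionary(c => c.XmlId);

        foreach (XElement ex in commentsEx.Elements(W15 + "commentEx"))
        {
            // The string cast yields null when the attribute is absent, so nothing throws.
            string id = (string)ex.Attribute(W15 + "paraId");
            string parentId = (string)ex.Attribute(W15 + "paraIdParent");
            if (id == null || parentId == null) continue;

            // Skip silently if either id is missing from the list.
            if (byId.TryGetValue(id, out MyComment reply) &&
                byId.TryGetValue(parentId, out MyComment parent))
            {
                reply.Parent = parent;
            }
        }
    }
}
```

You would call it as ReplyLinker.LinkReplies(xdoc.Root, myComments) once your list has been populated from the comments part.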

Upvotes: 1

Caleb Mauer

Reputation: 672

Try something like this:

var replyComments = (from comment in CommentsEx.Descendants()
                    where comment.Name.LocalName == "commentEx"
                    from attrib in comment.Attributes()
                    where attrib.Name.LocalName == "paraIdParent"
                    select comment).ToList();
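
If you prefer method syntax, the same filter could be sketched roughly as below. ReplyFilter and FindReplies are hypothetical names, and I am assuming your CommentsEx is an XDocument or XElement (both derive from XContainer). Using Any also means each element is returned at most once.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ReplyFilter
{
    // Returns each commentEx element that has a paraIdParent attribute,
    // matching on local names only (the namespace is ignored).
    public static List<XElement> FindReplies(XContainer commentsEx)
    {
        return commentsEx.Descendants()
            .Where(e => e.Name.LocalName == "commentEx")
            .Where(e => e.Attributes().Any(a => a.Name.LocalName == "paraIdParent"))
            .ToList();
    }
}
```

Matching on LocalName is convenient, but it would also match an attribute with the same name in a different namespace, so the XNamespace approach is stricter.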

Upvotes: 1
