Jonah

Reputation: 1545

Accessing a .docx file's XML programmatically?

If you take a .docx file, rename it to .zip, and unzip it, you can view its .xml files. I'm building a program to inspect these XML properties programmatically. No existing API seems to suffice, because our company uses a third-party program that attaches custom XML to files, and that program does not have an API.

Is there a clean way to access this XML without programmatically saving copies of files as .zip files, opening them, taking out only the XML and then deleting the rest?

Upvotes: 1

Views: 354

Answers (2)

Pavithran R

Reputation: 66

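Use the Open XML SDK to fetch the XML elements:

    // Requires: using System.Collections.Generic; using System.Linq;
    //           using DocumentFormat.OpenXml; using DocumentFormat.OpenXml.Packaging;
    List<OpenXmlElement> ParagraphElements = new List<OpenXmlElement>();

    // Pass false to open the file read-only; the using block closes it afterwards.
    using (WordprocessingDocument document = WordprocessingDocument.Open(this.FilePath, false))
    {
        MainDocumentPart mainPart = document.MainDocumentPart;

        // The first child of the document element is normally the body.
        foreach (var i in mainPart.Document.ChildElements.FirstOrDefault().ChildElements)
        {
            ParagraphElements.Add(i);
        }
    }

After this runs, ParagraphElements holds the top-level elements of the document body, and the rest of the XML can be reached from them. This is an easy way to access the XML elements in the file without unzipping it yourself.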
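To see the raw XML behind each element, something like the following sketch might help. It assumes the DocumentFormat.OpenXml NuGet package is referenced; the class name `BodyXmlDump` and the method `Dump` are made up for illustration and are not part of the SDK.

```csharp
using System;
using DocumentFormat.OpenXml.Packaging;

// Hypothetical helper, not part of the Open XML SDK.
public static class BodyXmlDump
{
    // Prints the local name and raw XML of each top-level element in the body.
    public static void Dump(string path)
    {
        // false = open read-only, so the file on disk is not modified.
        using (WordprocessingDocument document = WordprocessingDocument.Open(path, false))
        {
            var body = document.MainDocumentPart?.Document?.Body;
            if (body == null)
            {
                return; // No main document part or no body to inspect.
            }

            foreach (var element in body.ChildElements)
            {
                Console.WriteLine(element.LocalName); // e.g. "p" for a paragraph
                Console.WriteLine(element.OuterXml);  // the element's markup as a string
            }
        }
    }
}
```

Calling `BodyXmlDump.Dump(@"C:\path\to\file.docx")` (path is a placeholder) would write each element's XML to the console.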

Upvotes: 1

Stephen Wilson

Reputation: 1514

Have you tried the Open XML SDK for Office?

It allows you to access the XML files inside .docx files.
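Since the question is about custom XML attached by a third-party tool, here is a minimal sketch of reading custom XML parts with the SDK. It assumes the tool stores its data as custom XML parts related to the main document part, which may not be the case for every tool; the class name `CustomXmlReader` and the method `PrintCustomXml` are hypothetical.

```csharp
using System;
using System.IO;
using DocumentFormat.OpenXml.Packaging;

// Hypothetical helper, not part of the Open XML SDK.
public static class CustomXmlReader
{
    // Prints the URI and content of each custom XML part of the main document part.
    public static void PrintCustomXml(string path)
    {
        // false = open read-only; nothing is extracted to disk.
        using (WordprocessingDocument document = WordprocessingDocument.Open(path, false))
        {
            MainDocumentPart mainPart = document.MainDocumentPart;
            if (mainPart == null)
            {
                return; // Nothing to read.
            }

            // Assumption: the third-party XML lives in these parts.
            foreach (CustomXmlPart part in mainPart.CustomXmlParts)
            {
                using (var reader = new StreamReader(part.GetStream()))
                {
                    Console.WriteLine(part.Uri);           // location of the part inside the package
                    Console.WriteLine(reader.ReadToEnd()); // the part's XML as text
                }
            }
        }
    }
}
```

If the XML is attached somewhere else in the package, the same pattern of opening the document read-only and calling `GetStream()` on the relevant part should still apply.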

Upvotes: 1
