Hultan

Reputation: 125

Replacing text with a certain style in a Word document does not work for all OpenXmlParts

I am trying to replace a number of "fields" in a Word document that is stored in a database. The "fields" are really just text formatted with a style (I believe Word calls it a Quick Style) whose name we have chosen ourselves.

The class below works fine for all header and footer parts, but for some reason it does not work for the body (MainDocumentPart). When I debug, I can see that the styles are found in the body, just as they are in the header and footer parts, and that the new texts are inserted. But when I check the Word document afterwards, only the headers and footers are updated. The body still contains the old values.

The XML in the Word document might look like this:

  <w:p w:rsidR="00394599" w:rsidRPr="00162F1F" w:rsidRDefault="00394599" w:rsidP="000663BC">
    <w:pPr>
      <w:pStyle w:val="NovaIssuedBy"/>
    </w:pPr>
    <w:r>
      <w:t>NovaIssuedBy</w:t>
    </w:r>
  </w:p>

It is of course the text NovaIssuedBy in the w:t element that should be replaced and, as I said, this code works for similar "fields" in the headers and footers.

The sub UpdateNOVAFieldsInternal goes through all parts (I think) in the document: all headers, the body, and all footers. Each part (called a section in this code) is checked for certain styles, and the text is replaced where needed.

The sub CheckSection checks a section for all the styles that we have predefined and replaces text if needed.

The sub FindStyleReplaceTextInSection does the real work: it finds all elements marked with the style styleName and replaces their text with the text argument.

Does anybody have any idea why this code works for the header and footer parts but not for the body (MainDocumentPart)? And does anyone know a better way to update certain texts at specific places in a Word document (repeatedly, not just once) than using styles and style names as we do here?

Option Strict On
Option Infer On

Imports Nova.Datasets
Imports DocumentFormat.OpenXml.Packaging
Imports DocumentFormat.OpenXml.Wordprocessing
Imports DocumentFormat.OpenXml
Imports System.Collections.Generic
Imports System.Xml
Imports System.IO
Imports System.Text
Imports System.Xml.Linq
Imports System.Linq

Imports <xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">

Public Class NovaFields
    Private m_Document As EmptyDocument.Data_DocumentRow = Nothing
    Private m_Data As Byte()

    Public Sub New(ByRef document As EmptyDocument.Data_DocumentRow)
        m_Document = document

        With m_Document
            If Not .FileExtension.ToUpper() = "DOCX" Then
                'Exception!
                'This is not a DOCX file!
                Throw New ApplicationException("This is not a DOCX file!")
            End If

            m_Data = .FileData
        End With

    End Sub

    Public Sub UpdateNOVAFields(ByVal parameters As NovaParameters)
        UpdateNOVAFieldsInternal(parameters)

        m_Document.FileData = m_Data
    End Sub

    ''' <summary>
    ''' This will replace all "fields" that are set in parameters in the document in m_data
    ''' </summary>
    ''' <param name="parameters"></param>
    ''' <remarks></remarks>
    Private Sub UpdateNOVAFieldsInternal(ByVal parameters As NovaParameters)
        Using documentStream As New MemoryStream()
            ' Read all the bytes, except the last Zero-byte that "closes the file", hence the -1
            documentStream.Write(m_Data, 0, m_Data.Length - 1)

            Using document As WordprocessingDocument = WordprocessingDocument.Open(documentStream, True)
                ' Assign a reference to the existing document body. 
                Dim body As Body = document.MainDocumentPart.Document.Body

                Dim headerPart As OpenXmlPart
                Dim footerPart As OpenXmlPart

                ' Check each Header-part
                For Each headerPart In document.MainDocumentPart.HeaderParts
                    CheckSection(parameters, headerPart)
                Next headerPart

                ' Check the Body-part
                CheckSection(parameters, document.MainDocumentPart)

                ' Check each Footer-part
                For Each footerPart In document.MainDocumentPart.FooterParts
                    CheckSection(parameters, footerPart)
                Next footerPart

                ' Close and save the document
                document.Close()
            End Using

            ' We must add an extra Zero-byte at the end of the stream to "close the file"
            documentStream.Position = documentStream.Length
            documentStream.WriteByte(0)
            m_Data = documentStream.ToArray()

        End Using
    End Sub
    ''' <summary>
    ''' Check the section provided for all parameters(styles)
    ''' </summary>
    ''' <param name="parameters">The parameters to use</param>
    ''' <param name="section">The section to check</param>
    ''' <remarks></remarks>
    Private Sub CheckSection(parameters As NovaParameters, ByRef section As OpenXmlPart)
        ' A bunch of if-statements like the one below are removed just to shorten the text

        ' IssuedBy
        If (parameters.IssuedBySet) Then
            FindStyleReplaceTextInSection(parameters.IssuedByStyleName, parameters.IssuedBy, section)
        End If

    End Sub

    ''' <summary>
    ''' This function will replace the text in a section formatted with a style called styleName in the section provided
    ''' </summary>
    ''' <param name="styleName">The name of the style to replace the text in</param>
    ''' <param name="text">The new text that will be replacing the old text in the document</param>
    ''' <param name="section">The section to scan for a style with the name styleName</param>
    ''' <remarks></remarks>
    Private Sub FindStyleReplaceTextInSection(styleName As String, text As String, ByRef section As OpenXmlPart)
        Try
            Dim xDoc As XDocument = XDocument.Load(XmlReader.Create(section.GetStream()))

            ' Get all elements whose w:val attribute starts with styleName (sometimes Word adds "Char" after the style name)
            Dim foundStyles As IEnumerable(Of XElement) = _
            From element In xDoc.Root.Descendants() _
            Where Not String.IsNullOrEmpty(element.@w:val) AndAlso element.@w:val.StartsWith(styleName) _
            Select element

            Dim w As XNamespace = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

            For Each item In foundStyles
                ' Get the style element's grandparent (the enclosing run or paragraph)
                Dim parent As XElement = item.Parent.Parent

                ' Check if it is a Run element or Paragraph element
                If parent.Name.LocalName = "r" Then
                    ' Run

                    ' Remove old text elements
                    parent...<w:t>.Remove()
                    ' Add a new text element with the text provided
                    parent.Add(<w:t><%= text %></w:t>)
                Else
                    ' Paragraph, has an extra layer around the Run element

                    ' Remove old text elements
                    parent...<w:t>.Remove()

                    ' Tried different ways of doing it here

                    ' First way of doing it, seems to work only for Header and Footer
                    Dim run As XElement = parent.Element(w + "r")
                    run.Add(<w:t><%= text %></w:t>)

                    ' Second way of doing it, this works too for Header and Footer
                    'parent.<w:r>.FirstOrDefault().Add(<w:t><%= text %></w:t>)
                End If
            Next

            ' Save the XML into the package.
            Using writer As XmlWriter = XmlWriter.Create(section.GetStream(FileMode.Create, FileAccess.Write))
                xDoc.Save(writer)
            End Using
        Catch ex As Exception
            ' Include the exception message so that failures are not silently hidden
            Debug.Print("Error in FindStyleReplaceTextInSection: " & ex.Message)
        End Try
    End Sub
End Class

Edit: Visual Studio 2010 + .NET Framework 3.5

Upvotes: 1

Views: 267

Answers (1)

Hultan

Reputation: 125

For some reason the body part must be checked before the header and footer parts. I simply moved the body check up, before the header loop, and now it works!

' Check the Body-part
CheckSection(parameters, document.MainDocumentPart)

' Check each Header-part
For Each headerPart In document.MainDocumentPart.HeaderParts
    CheckSection(parameters, headerPart)
Next headerPart

' Check each Footer-part
For Each footerPart In document.MainDocumentPart.FooterParts
    CheckSection(parameters, footerPart)
Next footerPart
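
On the second question (a better way than editing the XML by hand), below is a minimal sketch of the same replacement done with the SDK's strongly typed classes instead of XDocument. It assumes Open XML SDK 2.0 and has not been tested against the documents in this thread. The module name StyledTextReplacer and the sub ReplaceTextInStyledParagraphs are hypothetical. It only handles paragraph styles (not the run-level "Char" variants), and it relies on the SDK saving the modified object tree when the document is closed. The alias W is used so that the SDK's Text class cannot be confused with the System.Text namespace.

```vbnet
Option Strict On

Imports System.Collections.Generic
Imports System.Linq
Imports DocumentFormat.OpenXml
Imports DocumentFormat.OpenXml.Packaging
Imports W = DocumentFormat.OpenXml.Wordprocessing

' Hypothetical helper, not part of the original NovaFields class.
Public Module StyledTextReplacer

    ' Replaces the text of every paragraph in the part whose paragraph style
    ' id starts with styleName. Assumption: the part was opened for editing
    ' and the SDK writes the modified tree back when the document is closed.
    Public Sub ReplaceTextInStyledParagraphs(part As OpenXmlPart, styleName As String, newText As String)
        If part.RootElement Is Nothing Then Return

        For Each para As W.Paragraph In part.RootElement.Descendants(Of W.Paragraph)()
            Dim props As W.ParagraphProperties = para.ParagraphProperties
            If props Is Nothing OrElse props.ParagraphStyleId Is Nothing Then Continue For

            Dim styleVal As StringValue = props.ParagraphStyleId.Val
            If styleVal Is Nothing OrElse styleVal.Value Is Nothing Then Continue For
            If Not styleVal.Value.StartsWith(styleName) Then Continue For

            ' The new text goes into the first run that is a direct child of the paragraph
            Dim runs As List(Of W.Run) = para.Elements(Of W.Run)().ToList()
            If runs.Count = 0 Then Continue For

            ' Copy to a list first, because elements are removed while iterating
            For Each oldText As W.Text In para.Descendants(Of W.Text)().ToList()
                oldText.Remove()
            Next

            runs(0).AppendChild(New W.Text(newText))
        Next
    End Sub

End Module
```

It would be called in place of FindStyleReplaceTextInSection, for example ReplaceTextInStyledParagraphs(document.MainDocumentPart, parameters.IssuedByStyleName, parameters.IssuedBy), and the same call works for each header and footer part. Staying within one API also avoids mixing stream-level edits with the SDK's object tree for the same part.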

Upvotes: 1
