DKE

Reputation: 11

Issue with Document Splitting VBA Code - Incorrect Number of Documents Extracted

I'm encountering an issue with some VBA code that I'm using in MS Word to split a document into separate files based on a specific criterion. The document I'm working with, "WHITGL0T1.docx," has 47 pages. The goal is to start a new extracted document wherever the phrase "Ntra Ref" is found.

I have tried using the following code:

Sub SplitDocument()
    Dim SourcePath As String
    Dim DestinationPath As String
    Dim SourceDoc As Document
    Dim i As Integer
    Dim isDocumentStart As Boolean
    Dim docNumber As Integer
    Dim startPage As Integer
    Dim endPage As Integer

    SourcePath = "C:\Users\u1285829\OneDrive - MMC\Desktop\Spain\Eurosys PDF\"
    DestinationPath = "C:\Users\u1285829\OneDrive - MMC\Desktop\Spain\Splitted documents\"
    docNumber = 1
    isDocumentStart = False

    Set SourceDoc = Documents.Open(SourcePath & "WHITGL0T1.docx")

    For i = 1 To SourceDoc.ComputeStatistics(wdStatisticPages)
        If SourceDoc.Range.GoTo(wdGoToPage, wdGoToAbsolute, i).Find.Execute(FindText:="Ntra Ref", MatchWholeWord:=True) Then
            If Not isDocumentStart Then
                startPage = i
                isDocumentStart = True
            Else
                endPage = i - 1
                SourceDoc.ExportAsFixedFormat2 OutputFileName:=DestinationPath & "Document_" & docNumber & ".pdf", ExportFormat:=wdExportFormatPDF, Range:=wdExportFromTo, From:=startPage, To:=endPage
                docNumber = docNumber + 1
                startPage = i
            End If
        End If
    Next i

    If isDocumentStart Then
        endPage = SourceDoc.ComputeStatistics(wdStatisticPages)
        SourceDoc.ExportAsFixedFormat2 OutputFileName:=DestinationPath & "Document_" & docNumber & ".pdf", ExportFormat:=wdExportFormatPDF, Range:=wdExportFromTo, From:=startPage, To:=endPage
    End If

    SourceDoc.Close SaveChanges:=False

    MsgBox "The document has been split into separate documents based on the 'Ntra Ref' criteria and saved in the destination folder.", vbInformation
End Sub

The code seems to be working, but it is extracting all 47 pages as separate documents, even though the phrase "Ntra Ref" appears only 44 times in the document. I expected to have 44 separate documents extracted.

I have reviewed the code and made sure that the logic is correct. However, I'm unable to identify the issue causing this behavior. I suspect that there might be a problem with the condition or loop that determines when to start and end a document extraction.

Could someone please review the code and provide any insights or suggestions on how to fix this issue? I would greatly appreciate any help or guidance.

Thank you in advance!

Upvotes: 1

Views: 74

Answers (1)

jonsson

Reputation: 1301

The problems with your code are outlined below the code. Instead, try starting with:

Sub SplitDocument()
    Dim SourcePath As String
    Dim DestinationPath As String
    Dim SourceDoc As Document
    Dim i As Integer
    Dim isDocumentStart As Boolean
    Dim docNumber As Integer
    Dim startPage As Integer
    Dim endPage As Integer
    Dim r As Range

    SourcePath = "C:\Users\u1285829\OneDrive - MMC\Desktop\Spain\Eurosys PDF\"
    DestinationPath = "C:\Users\u1285829\OneDrive - MMC\Desktop\Spain\Splitted documents\"
    docNumber = 0

    Set SourceDoc = Documents.Open(SourcePath & "WHITGL0T1.docx")
    Set r = SourceDoc.Range
    ' Each successful Execute redefines r as the match that was found
    Do While r.Find.Execute(FindText:="Ntra Ref", MatchWholeWord:=True)
      docNumber = docNumber + 1
      SourceDoc.ExportAsFixedFormat2 _
        OutputFileName:=DestinationPath & "Document_" & docNumber & ".pdf", _
        ExportFormat:=wdExportFormatPDF, _
        Range:=wdExportFromTo, _
        From:=r.Information(wdActiveEndPageNumber), _
        To:=r.Information(wdActiveEndPageNumber)
    Loop

    SourceDoc.Close SaveChanges:=False

    MsgBox "The document has been split into separate documents based on the 'Ntra Ref' criteria and saved in the destination folder.", vbInformation
End Sub

(Or if that is picking up the wrong pages, you could probably resort to using Selection.Find and wdExportCurrentPage in which case you do not need to retrieve the page numbers).
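
A rough sketch of that Selection-based alternative, in case it helps. It assumes SourceDoc and DestinationPath are set up as in the code above; the Sub name and its parameters are hypothetical, and it has not been tested against your file:

```vba
' Hypothetical helper: exports the page containing each "Ntra Ref" hit.
Sub ExportPagesWithSelectionFind(ByVal SourceDoc As Document, _
                                 ByVal DestinationPath As String)
    Dim docNumber As Long

    SourceDoc.Activate
    Selection.HomeKey Unit:=wdStory            ' start searching from the top

    With Selection.Find
        .ClearFormatting
        .Text = "Ntra Ref"
        .MatchWholeWord = True
        .Forward = True
        .Wrap = wdFindStop                     ' stop at the end, do not wrap round
        Do While .Execute
            docNumber = docNumber + 1
            ' wdExportCurrentPage exports the page the selection is on,
            ' so no From/To page numbers are needed
            SourceDoc.ExportAsFixedFormat2 _
                OutputFileName:=DestinationPath & "Document_" & docNumber & ".pdf", _
                ExportFormat:=wdExportFormatPDF, _
                Range:=wdExportCurrentPage
            Selection.Collapse Direction:=wdCollapseEnd   ' continue after this hit
        Loop
    End With
End Sub
```

Because this moves the Selection, it is best run with nothing else going on in the Word window.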

I think you are assuming that Range.Find will only find text in the currently specified Range. In fact, if (say) the first occurrence of "Ntra Ref" is on page 2, the first .Execute will still succeed while i is 1, so startPage is set to 1, the code will output a PDF for page 1, and so on.
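
If you would rather keep your page-by-page loop, the search has to run over a Range that covers exactly one page. A minimal sketch, assuming a hypothetical helper named PageContains: GoTo gives a collapsed range at the start of the page, so the helper stretches it to the start of the next page before searching, and Wrap:=wdFindStop keeps Find inside that range:

```vba
' Hypothetical helper: True if searchText occurs on page pageNum of doc.
Function PageContains(ByVal doc As Document, ByVal pageNum As Long, _
                      ByVal searchText As String) As Boolean
    Dim pageRange As Range
    Dim pageCount As Long

    pageCount = doc.ComputeStatistics(wdStatisticPages)

    ' Collapsed range at the start of the requested page
    Set pageRange = doc.GoTo(What:=wdGoToPage, Which:=wdGoToAbsolute, Count:=pageNum)

    ' Extend it to the start of the next page, or to the end of the document
    If pageNum < pageCount Then
        pageRange.End = doc.GoTo(What:=wdGoToPage, Which:=wdGoToAbsolute, _
                                 Count:=pageNum + 1).Start
    Else
        pageRange.End = doc.Content.End
    End If

    PageContains = pageRange.Find.Execute(FindText:=searchText, _
                                          MatchWholeWord:=True, _
                                          Forward:=True, Wrap:=wdFindStop)
End Function
```

The condition in your loop would then become If PageContains(SourceDoc, i, "Ntra Ref") Then. Note that a hit split across two pages would not be found on either page with this approach.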

NB: if an "Ntra Ref" spans an automatic page break, this code will emit the second of the two pages.
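
If that case matters to you, one way to spot it is to compare the page number at the start of the match with the page number at its end. A small sketch, assuming r is the Range that Find has just redefined; the function name is hypothetical:

```vba
' Hypothetical helper: True if the matched range r starts on one page
' and ends on another.
Function MatchSpansPages(ByVal r As Range) As Boolean
    Dim startPoint As Range

    ' Work on a copy so the caller's range is left untouched
    Set startPoint = r.Duplicate
    startPoint.Collapse Direction:=wdCollapseStart

    ' Compare the page of the match's start with the page of its end
    MatchSpansPages = (startPoint.Information(wdActiveEndPageNumber) <> _
                       r.Information(wdActiveEndPageNumber))
End Function
```

Inside the Do While loop you could call MatchSpansPages(r) and, when it returns True, export both pages rather than just the second.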

Upvotes: 0
