Jebathon
Jebathon

Reputation: 4561

VBA Excel - Delimit and Parse sections in Word Document in-order to input data into Excel

I am trying to figure out a way using VBA to parse a Word Document so I can put its contents into an array. For this example I have two companies in a Word Document (as seen below the code) and I want to put the fields into an array.

Public Sub ParseCompanies()

Dim Company_Array(1 To 2) As String 'stores individual company fields
Dim Companies_Array() 'array for all companies

Dim oWord As Object, oDoc As Object
Set oWord = CreateObject("Word.Application")

Set oDoc = oWord.Documents.Open("C:/Temp/test.docx", Visible:=True)

Dim singleLine
Dim lineText As String

'need to rewrite this section
For Each singleLine In oDoc.Paragraphs

   lineText = singleLine.Range.Text
   Debug.Print lineText    

Next singleLine


End Sub

Word file contents cut and pasted onto Stack Overflow:


Company: Aladin Carpets

Product: Magic Carpets


Company: Aerials Seashells

Product: Seashells


The way the current script runs can be seen below in the VBA Debugger Output

enter image description here

Is there an efficient way to do this? A way to delimit the lines or section splitters in the word document in order to parse the individual companies?

Upvotes: 1

Views: 298

Answers (2)

Sgdva
Sgdva

Reputation: 2800

Solution:
If the output is as stated -I copied your data but I get different results-, this should work, if not, just adjust the element being saved in the array:

Public Sub ParseCompanies()

Dim Products_Array() As String 'stores individual company fields
Dim Companies_Array() As String 'array for all companies
Dim CounterElements As Long: CounterElements = 1
Dim CounterParagraphs As Long

Dim oWord As Object, oDoc As Object
Set oWord = CreateObject("Word.Application")
On Error GoTo Err01ParseCompanies
Set oDoc = oWord.Documents.Open("C:\Users\lz630z\Desktop\Company.docx", visible:=True)

Dim singleLine
Dim lineText As String

'need to rewrite this section
For CounterParagraphs = 1 To oDoc.Paragraphs.Count
   If InStr(oDoc.Paragraphs(CounterParagraphs).Range.Text, "Company") Then ReDim Preserve Companies_Array(CounterElements): Companies_Array(CounterElements) = oDoc.Paragraphs(CounterParagraphs + 2)
   If InStr(oDoc.Paragraphs(CounterParagraphs).Range.Text, "Product") Then ReDim Preserve Products_Array(CounterElements): Products_Array(CounterElements) = oDoc.Paragraphs(CounterParagraphs + 2): CounterElements = CounterElements + 1

Next CounterParagraphs

If 1 = 2 Then ' 99. If error
Err01ParseCompanies:
MsgBox "Word Error", vbCritical
End If '99. If error
Set oDoc = Nothing
Set oWord = Nothing
End Sub

Summary of changes/suggestions
For each won't work here, since according to the screenshot is going to be 2 rows after it found the first result, it's better to have everything controlled in this scenario and save the elements in the array accordingly, changed a For/To approach (I assumed you arrays meant to be as defined now). Whenever you are referring to one, for the size the other will be accordingly.
IG: Companies_Array(1) will be Aladin Carpets and Products_Array(1) will be Magic Carpets

Upvotes: 2

ARich
ARich

Reputation: 3279

If your word document items are truly paragraph delimited, you could use the Split method to fill your array and then loop through it to manipulate the data. For example, this just fills the array and prints the elements to the immediate window:

Public Sub ParseCompanies()
    Dim wordList() As String
    Dim i As Long
    Dim oWord As Word.Application
    Dim oDoc As Word.Document

    Set oWord = CreateObject("Word.Application")
    Set oDoc = oWord.Documents.Open("C:\Users\test\Desktop\Company.docx", Visible:=False)

    wordList = Split(oDoc.Content.Text, vbCr) 'split using carriage return (paragraphs)

    For i = 0 To UBound(wordList, 1)
        Debug.Print wordList(i)
    Next i

    oWord.Quit
End Sub

I can't speak to the performance of this method on a large file, so it may require testing before this can be considered a viable option.

Upvotes: 0

Related Questions