Mrbobdou
Mrbobdou

Reputation: 13

Copying table data from Word into Excel

It's been years since I've done VB and I'm having trouble figuring out why I'm not capturing the correct content.

I want to loop through a Word document and parse out all the tables that hold requirements, so they can be tracked in Excel.

I have a Header 1 style, followed by paragraphs and tables.

For each Header 1 style found, I want to copy the text within the Heading 1, as well as any tables with the word "Requirement" in it. The Heading 1 text should be the first column for each row in the table.

The current issues I'm having:

I included some screenshots for reference.

Word

enter image description here

Excel

enter image description here

Sub CopyAllRequirementTablesToExcel()
    Dim tbl As Table
    Dim cell As cell
    Dim found As Boolean
    Dim xlApp As Object
    Dim xlBook As Object
    Dim xlSheet As Object
    Dim i As Integer
    Dim j As Integer
    Dim row As Integer
    Dim hasVerticallyMergedCells As Boolean
    Dim startRow As Integer
    Dim endRow As Integer
    Dim headingText As String
    Dim rng As Range
    Dim para As Paragraph
    Dim foundHeading1 As Boolean
    Dim paraIndex As Integer
    
    ' Create a new instance of Excel if not already running
    On Error Resume Next
    Set xlApp = GetObject(, "Excel.Application")
    If Err.Number <> 0 Then
        Set xlApp = CreateObject("Excel.Application")
    End If
    On Error GoTo 0
    
    ' Reference the first workbook and sheet
    Set xlBook = xlApp.Workbooks.Add
    Set xlSheet = xlBook.Sheets(1)
    
    ' Initialize the row to start pasting in Excel
    row = 1
    
    ' Loop through each table in the Word document
    For Each tbl In ActiveDocument.Tables
        found = False
        hasVerticallyMergedCells = False
        foundHeading1 = False
        
        ' Set the index of the paragraph containing the table
        paraIndex = tbl.Range.Paragraphs.Count
        
        ' Loop backward through paragraphs until we find the first instance of the next Heading 1 style
        Do Until foundHeading1 Or paraIndex = 1
            If tbl.Range.Paragraphs(paraIndex).Style = "Heading 1" Then
                headingText = Trim(tbl.Range.Paragraphs(paraIndex).Range.Text)
                foundHeading1 = True
            Else
                paraIndex = paraIndex - 1
            End If
        Loop
        
        ' If a Heading 1 style is not found, set headingText to an empty string
        If Not foundHeading1 Then
            headingText = "No Heading 1 found"
        End If
        
        ' Debugging: Print out the Heading 1 text
        Debug.Print "Heading 1 Text: " & headingText
        
        ' Check if any cell in the table contains "Requirement"
        For Each cell In tbl.Range.Cells
            If InStr(1, cell.Range.Text, "Requirement", vbTextCompare) > 0 Then
                found = True
                Exit For
            End If
        Next cell
        
        ' If "Requirement" is found in the table, check for vertically merged cells
        If found Then
            If tbl.Columns.Count > 1 Then ' Check if the table has more than one column
                For i = 2 To tbl.Rows.Count ' Skip the first row
                    For j = 1 To tbl.Columns.Count
                        startRow = tbl.cell(i, j).Range.Information(wdStartOfRangeRowNumber)
                        endRow = tbl.cell(i, j).Range.Information(wdEndOfRangeRowNumber)
                        If startRow <> endRow Then
                            hasVerticallyMergedCells = True
                            Exit For
                        End If
                    Next j
                    If hasVerticallyMergedCells Then Exit For
                Next i
            End If
            
            ' Skip the table if it has vertically merged cells
            If Not hasVerticallyMergedCells Then
                ' Insert the heading text as the first column
                xlSheet.Cells(row, 1).Value = headingText
                
                ' Copy the table to Excel, starting from the second column and second row
                For i = 2 To tbl.Rows.Count ' Skip the first row
                    For j = 1 To tbl.Columns.Count
                        ' Set the value of the cell in Excel to the formatted text of the cell in Word
                        xlSheet.Cells(row + i - 2, j).Value = Trim(tbl.cell(i, j).Range.Text)
                    Next j
                Next i
                
                ' Update the row to the next empty row
                row = row + tbl.Rows.Count - 1 ' Subtract 1 to remove the empty row between tables
            End If
        End If
    Next tbl
    
    ' Make Excel visible
    xlApp.Visible = True
    
    ' Clean up
    Set xlSheet = Nothing
    Set xlBook = Nothing
    Set xlApp = Nothing
    
    ' Notify if no table with 'Requirement' heading is found
    'If Not found Then
        'MsgBox "No table with 'Requirement' heading found.", vbInformation
    'End If
End Sub 

Update 5/23/24 Thanks everyone for your suggestions, I started over and got much closer.

I converted the values to ASCII and found that it's actually Chr(13), so I finally fixed the odd characters by replacing that with vbCrLf and it works. I also had to remove the last vbCr, as well as set WrapText = True.

The issue I'm running into now, is that Range.Text ignores Word's auto generated number formatted values. So if I have a list 1.1.1, 1.1.2, 1.1.3, it simply copies nothing over.

In the code below, I added "If y = 2 Then" in hopes to use the Range.ListFormat.ListString to grab the formatted value, but it's still showing up empty.

I also believe I may not be incrementing the rows/columns appropriately.

Minor tweaks to Word to show multi line fix: https://i.imgur.com/n53svUI.png Updated output of Excel: https://i.imgur.com/oI7aA4U.png I tried uploading/pasting the docm file but received an error when pasting the URL. (sorry, new here)

            Option Explicit
        Sub CopySDDFromWordToExcel()
            Dim oTab As Table, colHead As New Collection, sHead As String
            Dim i As Long, r As Range
            Dim row As Integer
            Dim column As Integer
            Dim x As Integer
            Dim y As Integer
            Dim xlApp As Object
            Dim xlBook As Object
            Dim xlSheet As Object

            ' Create a new instance of Excel if not already running
            On Error Resume Next
            Set xlApp = GetObject(, "Excel.Application")
            If Err.Number <> 0 Then
                Set xlApp = CreateObject("Excel.Application")
            End If
            On Error GoTo 0
            
            row = 1
            column = 1
            
            ' Reference the first workbook and sheet
            xlApp.Visible = True
            Set xlBook = xlApp.Workbooks.Add
            Set xlSheet = xlBook.Sheets(1)
            
            Set r = ActiveDocument.Content
            ' Use Find instead of looping through all paragraphs to find all Heading 1 instances
            With r.Find
                .ClearFormatting
                .Style = wdStyleHeading1
                .Forward = True
                .Wrap = wdFindStop
                Do While .Execute
                    colHead.Add .Parent.Duplicate
                    r.Collapse Direction:=wdCollapseEnd
                Loop
            End With
            If colHead.Count = 0 Then
                MsgBox "Can't find Heading 1"
                Exit Sub
            End If
            For i = 1 To colHead.Count
                sHead = colHead(i).text
                If i = colHead.Count Then
                    colHead(i).End = ThisDocument.Range.End
                Else
                    colHead(i).End = colHead(i + 1).Start - 1
                End If
                If colHead(i).Tables.Count > 0 Then
                    Debug.Print "-----"
                    Debug.Print sHead
                    Debug.Print "-- Table: ", i, "---"
                    
                    For Each oTab In colHead(i).Tables
                        With oTab.Range.Find
                            .ClearFormatting
                            .text = "Requirement"
                            If .Execute Then
                                Debug.Print "-----"
                                Debug.Print "Found table"
                                Debug.Print "Start: "; .Parent.Start
                                Debug.Print "End: "; .Parent.End
                                
                                ' Copy the table to Excel, starting from the second column and second row
                                For x = 2 To oTab.Rows.Count ' Loop through all rows, skipping the 1st row
                                    'xlSheet.Cells(row, 1).Value = sHead ' Insert heading text as the first column
                                        For y = 1 To oTab.Columns.Count ' Loop through all columns
                                            ' Insert heading text as the first column
                                            If y = 1 Then
                                                xlSheet.Cells(row + x - 2, y).Value = sHead ' Insert heading text as the first column
                                            End If
                                            ' Insert requirements # ID as the second column
                                            If y = 2 Then
                                                xlSheet.Cells(row + x - 2, y + 1).Value = oTab.cell(x, y).Range.ListFormat.ListString
                                            End If
                                            ' Insert remaining cells
                                            If y > 2 Then
                                                xlSheet.Cells(row + x - 2, y + 1).Value = Replace(RemoveLastCarriageReturn(Trim(oTab.cell(x, y).Range.text)), Chr(13), vbCrLf)
                                                xlSheet.Cells(row + x - 2, y + 1).WrapText = True
                                            End If
                                            
                                            'xlSheet.Cells(row + x - 2, y + 1).WrapText = True
                                            'ConverToASCII (xlSheet.Cells(row + x - 2, y + 1).Value)
                                        Next y
                                        row = row + 1
                                Next x
                                
                                ' Increment row counter
                                'row = row + 1 ' Increment by the number of rows in the table plus one for the next row
                                ' column = column + oTab.Columns.Count + 1
                            End If
                        End With
                    Next
                End If
            Next

            ' Make Excel visible
            xlApp.Visible = True
            
            ' Clean up
            Set xlSheet = Nothing
            Set xlBook = Nothing
            Set xlApp = Nothing

        End Sub

        Function RemoveLastCarriageReturn(ByVal inputString As String) As String
            Dim lastCRPosition As Integer
            lastCRPosition = InStrRev(inputString, vbCr)
            
            If lastCRPosition > 0 Then
                RemoveLastCarriageReturn = Left(inputString, lastCRPosition - 1)
            Else
                RemoveLastCarriageReturn = inputString
            End If
        End Function

        Function ConverToASCII(inputString As String) As String
            Dim asciiValues As String
            Dim i As Integer
            
            ' Initialize the string to store ASCII values
            asciiValues = ""
            
            ' Loop through each character in the input string
            For i = 1 To Len(inputString)
                ' Retrieve the ASCII value of the character and append it to the result string
                asciiValues = asciiValues & Asc(Mid(inputString, i, 1)) & " "
            Next i
            
            ' Return the string containing ASCII values
            ' ConverToASCII = asciiValues
            
            Debug.Print "The ASCII values of '" & inputString & "' are: " & asciiValues
        End Function

Upvotes: 0

Views: 112

Answers (2)

taller
taller

Reputation: 18803

  • Below code shows how to use Find method to location Heading 1 style and check if requirement in the table.
  • It's more efficient than looping through paragraphs in doc and cells in table.
  • btw, cell.Range.Text ends with vbCr + Chr(7). You need code to replace it.
Option Explicit
Sub demo()
    Dim oTab As Table, colHead As New Collection, sHead As String
    Dim i As Long, r As Range
    Set r = ActiveDocument.Content
    With r.Find
        .ClearFormatting
        .Style = wdStyleHeading1
        .Forward = True
        .Wrap = wdFindStop
        Do While .Execute
            colHead.Add .Parent.Duplicate
            r.Collapse Direction:=wdCollapseEnd
        Loop
    End With
    If colHead.Count = 0 Then
        MsgBox "Can't find Heading 1"
        Exit Sub
    End If
    For i = 1 To colHead.Count
        sHead = colHead(i).Text
        If i = colHead.Count Then
            colHead(i).End = ThisDocument.Range.End
        Else
            colHead(i).End = colHead(i + 1).Start - 1
        End If
        If colHead(i).Tables.Count > 0 Then
            Debug.Print "-----"
            Debug.Print sHead
            For Each oTab In colHead(i).Tables
                With oTab.Range.Find
                    .ClearFormatting
                    .Text = "requirement"
                    .MatchCase = False
                    If .Execute Then
                        Debug.Print "Found table"
                        Debug.Print "Start: "; .Parent.Start
                        Debug.Print "End: "; .Parent.End
                    End If
                End With
            Next
        End If
    Next
End Sub

Output:

-----
Heading 1 Text1

Found table
Start:  66 
End:  77 
-----
Heading 1 Text3

Found table
Start:  209 
End:  220 

Sample Doc:

enter image description here


Update:

  • Your code is close to finish. Two UDFs are not necessary.
Sub CopySDDFromWordToExcel2()
    Dim oTab As Table, colHead As New Collection, sHead As String
    Dim i As Long, r As Range, sTxt As String
    Dim iRow As Long, iColCnt As Long
    Dim iColumn As Long
    Dim x As Long
    Dim y As Long
    Dim xlApp As Object
    Dim xlBook As Object
    Dim xlSheet As Object
    Application.ScreenUpdating = False
    ' Create a new instance of Excel if not already running
    On Error Resume Next
    Set xlApp = GetObject(, "Excel.Application")
    If Err.Number <> 0 Then
        Set xlApp = CreateObject("Excel.Application")
    End If
    On Error GoTo 0
    iRow = 1
    iColumn = 1
    ' Reference the first workbook and sheet
    xlApp.Visible = True
    Set xlBook = xlApp.Workbooks.Add
    Set xlSheet = xlBook.Sheets(1)
    Set r = ActiveDocument.Content
    ' Use Find instead of looping through all paragraphs to find all Heading 1 instances
    With r.Find
        .ClearFormatting
        .Style = wdStyleHeading1
        .Forward = True
        .Wrap = wdFindStop
        Do While .Execute
            colHead.Add .Parent.Duplicate
            r.Collapse Direction:=wdCollapseEnd
        Loop
    End With
    If colHead.Count = 0 Then
        MsgBox "Can't find Heading 1"
        Exit Sub
    End If
    For i = 1 To colHead.Count
        sHead = colHead(i).Text
        If i = colHead.Count Then
            colHead(i).End = ThisDocument.Range.End
        Else
            colHead(i).End = colHead(i + 1).Start - 1
        End If
        If colHead(i).Tables.Count > 0 Then
            Debug.Print "-----"
            Debug.Print sHead
            Debug.Print "-- Table: ", i, "---"
            For Each oTab In colHead(i).Tables
                With oTab.Range.Find
                    .ClearFormatting
                    .Text = "Requirement"
                    If .Execute Then
                        Debug.Print "-----"
                        Debug.Print "Found table"
                        Debug.Print "Start: "; .Parent.Start
                        Debug.Print "End: "; .Parent.End
                        iColCnt = oTab.Columns.Count
                        ' Copy the table to Excel, starting from the second iColumn and second iRow
                        For x = 2 To oTab.Rows.Count ' Loop through all rows, skipping the 1st iRow
                                xlSheet.Cells(iRow, 1).Value = sHead ' Insert heading text as the first iColumn
                                For y = 1 To iColCnt ' Loop through all columns
                                    ' Insert heading text as the first iColumn
                                    sTxt = oTab.cell(x, y).Range.Text
                                    sTxt = Left(sTxt, Len(sTxt) - 2)
                                    If Len(sTxt) = 0 Then
                                        sTxt = oTab.cell(x, y).Range.ListFormat.ListString
                                    Else
                                        sTxt = Replace(sTxt, vbCr, Chr(10))
                                    End If
                                    xlSheet.Cells(iRow, y + 1).Value = sTxt
                                    If y > 2 Then
                                        xlSheet.Cells(iRow, y + 1).WrapText = True
                                    End If
                                Next y
                                iRow = iRow + 1
                        Next x
                    End If
                End With
            Next
        End If
    Next
    ' Make Excel visible
    xlApp.Visible = True
    Application.ScreenUpdating = True
    ' Clean up
    Set xlSheet = Nothing
    Set xlBook = Nothing
    Set xlApp = Nothing
End Sub

Upvotes: 0

Dick Kusleika
Dick Kusleika

Reputation: 33165

I echo @taller's recommendation of using Find. But if you wanted to loop through paragraphs, you need to loop through ActiveDocument.Paragraphs, not tbl.Paragraphs. Your code only looks inside the table and never finds anything with Header 1 style. You could change it to

    ' Set the index of the paragraph containing the table
    paraIndex = ActiveDocument.Range(0, tbl.Range.Start).Paragraphs.Count
    
    ' Loop backward through paragraphs until we find the first instance of the next Heading 1 style
    Do Until foundHeading1 Or paraIndex = 0
        Debug.Print paraIndex, Left$(ActiveDocument.Paragraphs(paraIndex).Range.Text, 50)
        If ActiveDocument.Paragraphs(paraIndex).Style = "Heading 1" Then
            headingText = Trim(tbl.Range.Paragraphs(paraIndex).Range.Text)
            foundHeading1 = True
        Else
            paraIndex = paraIndex - 1
        End If
    Loop

That finds the count of paragraphs from the start of the document to the start of the table, then loops backward through them. Again, just for information. Find is the better method.

Upvotes: 0

Related Questions