BigBoyB
BigBoyB

Reputation: 261

VBA mapping/transferring sentences in Word to columns in Excel

I'm trying to transfer info from the following format in Word to Excel columns "a", "b", "c", "d" while ignoring the number in front which is the index of the entry (21 in this case)

enter image description here

So far this is what I got but it's only for the bold text on the top left but I don't know how to get the other substrings. Any help with this will be appreciated.

Sub TheBoldAndTheExcelful()
  Dim docCur As Document
  Dim snt As Range
  Dim i As Integer
  'Requires a reference to the 'Microsoft Excel XX.0 Object Library'
  Dim appXL As Excel.Application, xlWB As Excel.Workbook, xlWS As Excel.Worksheet

  'This assumes excel is currently closed
  Set appXL = CreateObject("Excel.Application")
  appXL.Visible = True
  Set xlWB = appXL.Workbooks.Add
  Set xlWS = xlWB.Worksheets(1)

  On Error GoTo ErrHandler
  Application.ScreenUpdating = False

  Set docCur = ActiveDocument

  For Each snt In docCur.Sentences
    If snt.Bold = True Then
      i = i + 1
      xlWS.Cells(i, 1).Value = snt.Text
    End If
  Next snt

ExitHandler:
  Application.ScreenUpdating = True
  Set snt = Nothing
  Exit Sub

ErrHandler:
  MsgBox Err.Description, vbExclamation
  Resume ExitHandler
End Sub

Upvotes: 0

Views: 111

Answers (1)

Tigregalis
Tigregalis

Reputation: 666

In your example,

  For Each snt In docCur.Sentences
    If snt.Bold = True Then
      i = i + 1
      xlWS.Cells(i, 1).Value = snt.Text
    End If
  Next snt

Let's rewrite that first

  For Each snt In docCur.Sentences
    If snt.Bold = True Then
      i = i + 1
      xlWS.Cells(i, COLUMN_A).Value = snt.Text
    End If
  Next snt

You are only including the bold sentence (If snt.Bold = True), and writing to COLUMN_A alone.

What you want is the bold sentence and the three sentences that follow after it, and you want to write to four columns.

So change this section to:

'  Dim j As Long ' - make sure to have already declared this, or just uncomment this line

  For j = 1 to docCur.Sentences.Count ' perhaps docCur.Paragraphs instead?
    If docCur.Sentences(j).Bold = True Then
      i = i + 1
      ' used 1+n and j+n for ease of understanding, but you can make these constant with a real solution; or you could even put this in another loop if you wanted, e.g. For n = 0 to 3, ...
      xlWS.Cells(i, 1+0).Value = docCur.Sentences(j+0).Text
      xlWS.Cells(i, 1+1).Value = docCur.Sentences(j+1).Text
      xlWS.Cells(i, 1+2).Value = docCur.Sentences(j+2).Text
      xlWS.Cells(i, 1+3).Value = docCur.Sentences(j+3).Text
    End If
  Next j

Or, to maximise performance:

'  Dim j As Long ' - make sure to have already declared this, or just uncomment this line

  With docCur.Sentences ' perhaps docCur.Paragraphs instead?
    For j = 1 To .Count
      If .Item(j).Bold = True Then
        i = i + 1
        xlWS.Cells(i, 1).Resize(, 4).Value = Array(.Item(j + 0).Text, .Item(j + 1).Text, .Item(j + 2).Text, .Item(j + 3).Text)
      End If
    Next j
  End With

Based on comments, changes:

  1. Problem: "Also some sentences that I have go a little on the second line so technically there would be 5 sentences total since formatting. Any way to concatenate the two lines which actually should represent the same sentence?":
    Solution: Concatenate with &:
    Example:
    Fourth item of Array(...) changes
    from .Item(j + 3).Text
    to .Item(j + 3).Text & .Item(j + 4).Text)

  2. Problem: "Instead when creating the last column, everything ends in some funny looking crosses (like an Egyptian Ankh). Any idea how to remove those?":
    Solution: Either remove the last character in the problem sentence using Left(string, Len(string)-1), or use Replace(string, [problem character], "")
    Example:
    Problem item (presuming sentence 4) in Array(...) changes
    from .Item(j + 3).Text
    to Left(.Item(j + 3).Text, Len(.Item(j + 3).Text) - 1)

Updated:

'  Dim j As Long ' - make sure to have already declared this, or just uncomment this line

  With docCur.Sentences ' perhaps docCur.Paragraphs instead?
    For j = 1 To .Count
      If .Item(j).Bold = True Then
        i = i + 1
        xlWS.Cells(i, 1).Resize(, 4).Value = Array(.Item(j + 0).Text, .Item(j + 1).Text, .Item(j + 2).Text, Left(.Item(j + 3).Text, Len(.Item(j + 3).Text) - 1) & .Item(j + 4).Text)
      End If
    Next j
  End With

If this isn't a complete fix, please provide a sample file.

Upvotes: 1

Related Questions