michal roesler
michal roesler

Reputation: 499

How extract email address from Word document using vba

My goal is to extract all email addresses from the Word.ActiveDocument and put them into one single cell in the Excel Sheet.

The code is run from Excel VBA editor. It needs to search for email addresses, extract them from the document and fill the Excel cell Activesheet.Range("C31"). Only one cell is available, no matter how many email addresses have been found.

The addresses found need to be delimited using ", " the coma and the space.

I'm trying to do this by finding @ in the document and then building up the range forward and backwards to have all the email address in the range variable. Building the address to the right was quite easy using rng.MoveEndUntil Cset:="," because in my document there is always a coma after the email address.

But how to get the missing left side of the email address into the range variable?? I've used rng.MoveStart Unit:=wdWord, Count:=-1 but what if the email will be [email protected] or [email protected] It will not work.

This is what I have now.

Sub FindEmail035()         '[0-9;A-z;,._-]{1;}\@[0-9;A-z;._-]{1;}
                           '[0-9;A-z;,._-]{1;}\@[0-9;A-z;._-]{1;}
Dim WordApp As Word.Application
Dim WordDoc As Word.Document
Dim ExcelApp As Excel.Application

Dim rng As Word.Range
Dim emailAdr As String
Dim ws As Worksheet

Set WordApp = GetObject(, "Word.Application")
Set ExcelApp = GetObject(, "Excel.Application")
Set WordDoc = WordApp.ActiveDocument
Set rng = WordApp.ActiveDocument.Content
Set ws = ExcelApp.ActiveSheet

ExcelApp.Application.Visible = True

    With rng.Find
        .Text = "@"
        .Wrap = wdFindAsk
        .Forward = True
        .MatchWildcards = False
        .Execute

        Debug.Print rng.Text
        If .Found = True Then
            'rng.Expand (wdWord)
            'Debug.Print rng.Text
            rng.MoveStart Unit:=wdWord, Count:=-1
            Debug.Print rng.Text
            rng.MoveEndUntil Cset:=","
            'rng.MoveEnd Unit:=wdWord, Count:=1
            'rng.MoveEndUntil Cset:=" ", Count:=wdBackward
        End If
   End With     'how to create loop that will extract all the email addresses in the document??
   ws.Range("C31").Value = rng

End Sub

What loop should I use to get the number of mails present in the document and later build up the ranges with email addresses inside?

This is the place in the document where the mail addresses reside.

enter image description here

Upvotes: 0

Views: 1132

Answers (2)

macropod
macropod

Reputation: 13515

Assuming the email addresses are plain text, you can use Word VBA code like:

Sub Demo()
Dim StrOut As String
With ActiveDocument.Range
  With .Find
    .ClearFormatting
    .Replacement.ClearFormatting
    .Text = "<[0-9A-ÿ.\-]{1,}\@[0-9A-ÿ\-.]{1,}([^13 -/\:-\@\\-`\{-¿])"
    .Replacement.Text = ""
    .Forward = True
    .Format = False
    .Wrap = wdFindStop
    .MatchWildcards = True
    .Execute
  End With
  Do While .Find.Found
    StrOut = StrOut & Trim(.Text) & " "
    .Collapse wdCollapseEnd
    .Find.Execute
  Loop
End With
StrOut = Replace(Trim(StrOut), " ", "; ")
MsgBox StrOut
End Sub

Upvotes: 0

jclasley
jclasley

Reputation: 678

You're on the right track. The easiest thing here is to move the start of the range with .MoveStartUntil Cset:=" " Count:=wdBackward so that you move back through the range until you hit the space before the email address. That is of course assuming consistent formatting and no arbitrary spaces.

I would also just search through the ActiveDocument.Content and then Set rng every time .Found = True because you don't want it overriding your range (which it does when searching a range). Or Dim a new range srchRng or something and then set that to the found results.

 With rng.Find
        .Text = "@"
        .Wrap = wdFindAsk
        .Forward = True
        .MatchWildcards = False
        .Execute

        Debug.Print rng.Text
        If .Found = True Then
            rng.MoveStartUntil Cset:=" ", Count:=wdBackward
            rng.MoveEndUntil Cset:=","
        End If

Upvotes: 2

Related Questions