Reputation: 499
It's a follow up for this question and a print screen embedded there with document view still applies. The code is run from Excel VBA editor.
There is unknown number of email addresses in the Word document and:
I need to extract all of them,
concatenate into one string that contains all of the email addresses, separated using ", "
and fill the string into Excel cell Activesheet.Range("C31")
Currently I have a code that finds the @
sign and builds email address around that. This is how it looks like:
Sub FindEmail036() '[0-9;A-z;,._-]{1;}\@[0-9;A-z;._-]{1;}
'[0-9;A-z;,._-]{1;}\@[0-9;A-z;._-]{1;}
Dim WordApp As Word.Application
Dim WordDoc As Word.Document
Dim ExcelApp As Excel.Application
Dim rng As Word.Range
Dim emailAdr As String
Dim ws As Worksheet
Dim iCount As Integer
Set WordApp = GetObject(, "Word.Application")
Set ExcelApp = GetObject(, "Excel.Application")
Set WordDoc = WordApp.ActiveDocument
Set rng = WordApp.ActiveDocument.Content
Set ws = ExcelApp.ActiveSheet
ExcelApp.Application.Visible = True
With rng.Find
.Text = "@"
.Wrap = wdFindContinue
.Forward = True
.MatchWildcards = False
.Execute
Debug.Print rng.Text
If .Found = True Then
rng.MoveStartUntil Cset:=" ", Count:=wdBackward
Debug.Print rng.Text
rng.MoveEndUntil Cset:=","
Debug.Print rng.Text
'rng.MoveEndUntil Cset:=" ", Count:=wdBackward
End If
End With 'how to create loop that will extract all the email addresses in the document??
ws.Range("C31").Value = rng
End Sub
This code only extracts the first email address and is not looking for next email addresses. I know this because I'm debugging using F8 and Immediate window and I can see, this code is just finishing search after it finds @
and constructs first complete email address.
I guess some loop is necessary but I don't know how to do write it.
I've also found this source but I don't understand much from it. https://wordmvp.com/FAQs/MacrosVBA/NoTimesTextInDoc.htm
Upvotes: 0
Views: 312
Reputation: 13515
I already effectively answered this in your other thread:
Sub Demo()
Dim wdApp As Word.Application, StrOut As String
Set wdApp = GetObject(, "Word.Application")
With wdApp.ActiveDocument.Range
With .Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "<[0-9A-ÿ.\-]{1,}\@[0-9A-ÿ\-.]{1,}([^13 -/\:-\@\\-`\{-¿])"
.Replacement.Text = ""
.Forward = True
.Format = False
.Wrap = wdFindStop
.MatchWildcards = True
.Execute
End With
Do While .Find.Found
StrOut = StrOut & Trim(.Text) & " "
.Collapse wdCollapseEnd
.Find.Execute
Loop
End With
StrOut = Replace(Trim(StrOut), " ", ", ")
ActiveSheet.Range("C31").Value = StrOut
End Sub
Note how little differs between this code and the code I posted in your other thread.
Upvotes: 1
Reputation: 4355
Other responders have identified the cause of your problem so I won't reiterate that. However, your requirement is a common pattern in VBA/Word, namely find something then do something as a consequence of the find (other than a replace). I generally wrap this pattern in a function or sub depending upon what action is required once the find text has been found..
If you haven't used a scripting.dictionary before than I would use early binding (as in the code below) so that you get access to intellisense for the methods and properties. This means using Tools.Reference to add the Microsoft Scripting.Runtime library to the VBIDE.
You'll see that we recalculate the end of the document each time we run through the While loop. This is good practise because we don't know in advance the impact that the find actions will have on the length of the document.
The DoEvents in the While loop ensures that you can quickly break out of the loop if things go wrong.
The function below uses a Word wildcard search to search for email addresses. The find is precise so there is no need to adjust the ends of the found range to get only the email address.
If the action in the found do loop was complicated then I would break this out to a separate function passing the found range to the function as .Duplicate. In this particular case that would also mean that I would move the scripting dictionary from a local variable to a module scope variable
Public Function GetEmailAddressesAsString(ByVal ipDoc As Word.Document) As String
Const EmailAddress As String = "<[0-9A-Za-z._]{1,}\@[0-9A-Za-z.\_]{1,}>"
With ipDoc.StoryRanges(wdMainTextStory)
With .Find
.ClearFormatting
.Wrap = wdFindStop
.MatchWildcards = True
.text = EmailAddress
End With
Dim myAddresses As Scripting.Dictionary
Set myAddresses = New Scripting.Dictionary
Do While .Find.Execute
DoEvents
myAddresses.Add myAddresses.Count, .text
.MoveStart Count:=.Characters.Count + 1
.End = ipDoc.StoryRanges(wdMainTextStory).End
Loop
End With
GetEmailAddressesAsString = Join(myAddresses.Items, ",")
End Function
Upvotes: 1
Reputation: 7567
I recommend using regular expressions.
Check Reference: Microsoft VBscript Regular Expressions X.X
Sub FindEmail()
Dim WordApp As Word.Application
Dim WordDoc As Word.Document
Dim ExcelApp As Excel.Application
Dim StrInput As String, sPattern As String
Dim oEmail As MatchCollection
Dim Ws As Worksheet
Dim vR()
Dim n As Long, i As Long
Set WordApp = GetObject(, "Word.Application")
Set ExcelApp = GetObject(, "Excel.Application")
Set WordDoc = WordApp.ActiveDocument
StrInput = WordDoc.Content
Set Ws = ExcelApp.ActiveSheet
sPattern = "([A-z0-9.]{1,})(@)([A-z0-9]{0,})(.)([A-z0-9]{1,})"
Set oEmail = GetRegEx(StrInput, sPattern)
For i = 0 To oEmail.Count - 1
n = n + 1
ReDim Preserve vR(1 To n)
vR(n) = oEmail.Item(i)
Next
'Ws.Range("c31").Resize(n) = WorksheetFunction.Transpose(vR)
Ws.Range("c31") = Join(vR, ", ") '<~~ single string
End Sub
Function GetRegEx(StrInput As String, strPattern As String) As Object
Dim RegEx As New RegExp
Set RegEx = New RegExp
With RegEx
.Global = True
.IgnoreCase = False
.MultiLine = True
.Pattern = strPattern
End With
If RegEx.Test(StrInput) Then
Set GetRegEx = RegEx.Execute(StrInput)
End If
End Function
Your word document has multiple lines, so I set mutiline = true in the regex setting. The regular expression therefore stores all of its contents in matchcollection. Put this stored item in a dynamic array and do the next thing. You can store an array in multiple cells, or create a single character using the join function.
Upvotes: 1
Reputation: 678
It's finishing because the way Range.Find
works is that it sets the range equal to what it finds. So it finds the @
, sets the range equal to it, and now there are no more @
in the range. You need another range to manipulate, because manipulating your search range will only screw up your results.
You can loop with a Do While .Found = True
(my preferred method).
Make sure that you set .Wrap = wdFindStop
or you will have an infinite loop.
I'd put the results in a dictionary.
Dim eAddresses As Object: Set eAddresses = CreateObject("Scripting.Dictionary")
Dim rng As Range
Set rng = ActiveDocument.Content
Dim srchRng As Range
Dim addressNum As Long
addressNum = 1
With rng.Find
.Text = "@"
.Wrap = wdFindStop
.Forward = True
.MatchWildcards = False
.Execute
Debug.Print rng.Text
Do While .Found
Set srchRng = rng.Duplicate
srchRng.MoveStartUntil Cset:=" ", Count:=wdBackward
Debug.Print srchRng.Text
srchRng.MoveEndUntil Cset:=","
If Not eAddresses.Exists(srchRng.Text) Then
eAddresses.Add srchRng.Text, addressNum
addressNum = addressNum + 1
End If
.Execute
Loop
End With
End Sub
As a side note, when you push these to production, I'd definitely pull out all the Debug.Print
statements. It makes for a cluttered immediate window, especially if you plan on printing useful metrics and/or errors to the immediate window (which I recommend).
Upvotes: 1