Learner04390
Learner04390

Reputation: 17

Copying Text String From Website Into Excel VBA

I need to have VBA look into the HTML of a website, find a certain string within the text, and place into an Excel cell a value containing that string, and X characters to the left of that string, let's say 20 for the sake of example.

For example, if I need to find the string "elit" in a site containing the following string:

Lorem ipsum dolor sit amet, consectetur adipiscing elit.

The code would need to return a value of "sectetur adipiscing elit" to a specified cell. That is, the string itself, and 20 characters to the left of the string.

Here's what I've come up with so far (I know .select is not best practice but it works for me):

Sub String_Checker()
Sheets("Sheet1").Range("a2").Select
Dim IE As Object
Do Until IsEmpty(ActiveCell)
    Set IE = CreateObject("internetexplorer.Application")
    IE.Visible = True
    IE.navigate "https://website.com"
     Do Until (IE.readyState = 4 And Not IE.Busy)
    
          DoEvents
Loop
Set objDoc = IE.document
strMyPage = objDoc.body.innerHTML
Dim s As String: s = ActiveCell.Offset(0, 1).Value
ActiveCell.Offset(0, 2).Value = Left(strMyPage, 20)

IE.Quit
ActiveCell.Offset(1, 0).Select
Loop
End Sub

That gives me the last 20 characters of the HTML, but I need to get the code to start "looking" at the specified string, which would consistently be ActiveCell.Offset(0,1).Value in the Excel. Any help would be appreciated. Thanks!

Upvotes: 0

Views: 353

Answers (1)

VBasic2008
VBasic2008

Reputation: 54807

String From innerHTML

  • This is a bad solution from top to bottom, but my investigation led to it and hopefully it should do the trick.

The Code

Option Explicit

Sub String_Checker()

' I only ran this from VBE. Sometimes the following error would occur:
' Run-time error '2125463506 (8150002e)':
' The text associated with this error code could not be found.
' I don't know why.
    
    Dim IE As Object
    Set IE = CreateObject("InternetExplorer.Application")
    'IE.Visible = True
    IE.navigate "https://www.wikipedia.com"
    Do Until (IE.readyState = 4 And Not IE.Busy)
        DoEvents
    Loop
    Dim objdoc As Object
    Set objdoc = IE.document
    Dim strMyPage As String
    strMyPage = objdoc.body.innerHTML
    IE.Quit
    
    Const pLeft As Long = 20
    
    Dim wb As Workbook
    Set wb = ThisWorkbook ' The workbook containing this code.
    Dim ws As Worksheet
    Set ws = wb.Worksheets("Sheet1")
    
    Dim cel As Range
    Set cel = ws.Range("A2")
    Dim s As String
    Dim pStart As Long
    Dim pLen As Long
    
    Do Until IsEmpty(cel)
        s = cel.Offset(0, 1).Value
        pStart = InStr(1, strMyPage, s, vbTextCompare) - pLeft
        If pStart > 0 Then
        ' The string ('s') was found.
            pLen = InStr(1, strMyPage, s, vbTextCompare) + Len(s) - pStart
            s = Mid(strMyPage, pStart, pLen)
            On Error Resume Next
            ' Here I would receive the following error:
            ' Run-time error '1004': Application-defined or object-defined error
            ' It would occur when the first character would be "=".
            cel.Offset(0, 2).Value = s
            If Err.Number <> 0 Then
                cel.Offset(0, 2).Value = "'" & s ' Maybe this can always be used.
            End If
            On Error GoTo 0
        Else
        ' The string ('s') was NOT found.
        End If
        Set cel = cel.Offset(1)
    Loop

End Sub

Upvotes: 2

Related Questions