10101
10101

Reputation: 2402

Loop through website links and get PDF's to my computer

This topic is related to Loop through links and download PDF's

I am trying to convert my current VBA code into VBScript. I have already understood that I have to remove the variable types (As ... part of Dim statements) and use CreatObject to get those objects but otherwise everything should port as-is. DoEvents will also have to be replaced with something like Wscript.sleep.

I came up with some problems. Currently while running VBS file I am getting an error saying "Object required: 'MSHTML'". Pointing to line 65, where I have Set hDoc = MSHTML.HTMLDocument. I have tried to search on Google but got nothing helpful for this one.

How I should proceed with this one?

DownloadFiles("https://www.nordicwater.com/products/waste-water/")

Sub DownloadFiles(p_sURL)
    Set xHttp = CreateObject("Microsoft.XMLHTTP")
    Dim xHttp 
    Dim hDoc
    Dim Anchors 
    Dim Anchor 
    Dim sPath
    Dim wholeURL

    Dim internet
    Dim internetdata
    Dim internetlink
    Dim internetinnerlink 
    Dim arrLinks 
    Dim sLink 
    Dim iLinkCount 
    Dim iCounter 
    Dim sLinks

    Set internet = CreateObject("InternetExplorer.Application")
    internet.Visible = False
    internet.navigate (p_sURL)

        Do Until internet.ReadyState = 4
        Wscript.Sleep 100
        Loop

        Set internetdata = internet.document
        Set internetlink = internetdata.getElementsByTagName("a")

        i = 1

        For Each internetinnerlink In internetlink
            If Left(internetinnerlink, 36) = "https://www.nordicwater.com/product/" Then

                If sLinks <> "" Then sLinks = sLinks & vbCrLf
                sLinks = sLinks & internetinnerlink.href
                i = i + 1

            Else
            End If

    Next

    wholeURL = "https://www.nordicwater.com/"
    sPath = "C:\temp\"

    arrLinks = Split(sLinks, vbCrLf)
    iLinkCount = UBound(arrLinks) + 1

    For iCounter = 1 To iLinkCount
    sLink = arrLinks(iCounter - 1)
        'Get the directory listing
        xHttp.Open "GET", sLink
        xHttp.send

        'Wait for the page to load
        Do Until xHttp.ReadyState = 4
        Wscript.Sleep 100
        Loop

        'Put the page in an HTML document
        Set hDoc = MSHTML.HTMLDocument
        hDoc.body.innerHTML = xHttp.responseText

        'Loop through the hyperlinks on the directory listing
        Set Anchors = hDoc.getElementsByTagName("a")

        For Each Anchor In Anchors

            'test the pathname to see if it matches your pattern
            If Anchor.pathname Like "*.pdf" Then

                xHttp.Open "GET", wholeURL & Anchor.pathname, False
                xHttp.send

                With CreateObject("Adodb.Stream")
                    .Type = 1
                    .Open
                    .write xHttp.responseBody
                    .SaveToFile sPath & getName(wholeURL & Anchor.pathname), 2 '//overwrite
                End With

            End If

        Next

    Next

End Sub

Function:

Function getName(pf)
    getName = Split(pf, "/")(UBound(Split(pf, "/")))
End Function

Upvotes: 1

Views: 132

Answers (1)

&#201;tienne Laneville
&#201;tienne Laneville

Reputation: 5031

Instead of Set hDoc = MSHTML.HTMLDocument, use:

Set hDoc = CreateObject("htmlfile")

In VBA/VB6 you can specify variable and object types but not with VBScript. You have to use CreateObject (or GetObject: GetObject function) to instantiate objects like MSHTML.HTMLDocument, Microsoft.XMLHTTP, InternetExplorer.Application, etc instead of declaring those using Dim objIE As InternetExplorer.Application for example.

Another change:

If Anchor.pathname Like "*.pdf" Then

can be written using StrComp function:

If StrComp(Right(Anchor.pathname, 4), ".pdf", vbTextCompare) = 0 Then

or using InStr function:

If InStr(Anchor.pathname, ".pdf") > 0 Then

Also, at the beginning of your sub, you do the following:

Set xHttp = CreateObject("Microsoft.XMLHTTP")
Dim xHttp 

You should declare your variables before assigning them values or objects. In VBScript this is very relaxed, your code will work because VBScript will create undefined variables for you but it's good practice to Dim your variables before using them.

Except for Wscript.sleep commands, your VBScript code will work in VB6/VBA so you can debug your script in VB6 or VBA apps (like Excel).

Upvotes: 3

Related Questions