user45663
user45663

Reputation: 1

Importing web based data using vba

The website I am using is www.msci.com. The website makes use of several forms which can be altered. Each combination of choices creates a certain matrix of values (the desired data). I want to create the same variables in my excel sheet, such that I just have to fill in some codes in my sheet its column B to create a data table, which should be placed somewhere else in the same sheet.

I think I have to incorporate some html code in my macro to fill in these scroll down menus (forms). I discovered each of the choices underlying a form their html ID, and tried to incorporate those in my code. I think the code partly works, however changing the date from the calendar template surely is not working. My code so far :

Sub getMSCIdata()

Dim mktval As String
Dim curr As String
Dim indlvl As String
Dim calendarinput As String

curr = Range("$B$3")
mktval = Range("$B$2")
indlvl = Range("$B$4")
calendarinput = Range("$B$5")   

With ActiveSheet.QueryTables.Add(Connection:= _

    "URL;http://www.mscibarra.com/webapp/indexperf/pages/IEIPerformanceRegional.jsf?scope=0&mktval&size=36&style=C&calendarinput&curr&indlvl&lang=en" _
    , Destination:=Range("$A$10"))
    .Name = _
    "IEIPerformanceRegional.jsf?scope=0&mktval&size=36&style=C&calendarinput&curr&indlvl&lang=en"
    .FieldNames = True
    .RowNumbers = False
    .FillAdjacentFormulas = False
    .PreserveFormatting = True
    .RefreshOnFileOpen = False
    .BackgroundQuery = True
    .RefreshStyle = xlInsertDeleteCells
    .SavePassword = False
    .SaveData = True
    .AdjustColumnWidth = True
    .RefreshPeriod = 0
    .WebSelectionType = xlSpecifiedTables
    .WebFormatting = xlWebFormattingNone
    .WebTables = """templateForm:tableResult0"""
    .WebPreFormattedTextToColumns = True
    .WebConsecutiveDelimitersAsOne = True
    .WebSingleBlockTextImport = False
    .WebDisableDateRecognition = True
    .WebDisableRedirections = False
    .Refresh BackgroundQuery:=False     

End With  
End Sub

Upvotes: 0

Views: 1266

Answers (1)

Tom Robinson
Tom Robinson

Reputation: 1920

If the web page is too complex for QueryTable to extract the data, you can write your own VBA to extract the data directly from the HTML document.

Use Tools/References to add the reference "Microsoft HTML Object Library".

Create a UserForm containing a large WebBrowserControl named WB.

Add this code to the form:

    Private Sub UserForm_Initialize()
    WB.navigate ("http://www.mscibarra.com/webapp/indexperf/pages/IEIPerformanceRegional.jsf?scope=0&mktval&size=36&style=C&calendarinput&curr&indlvl&lang=en")
    End Sub

    Private Sub WB_DocumentComplete(ByVal pDisp As Object, URL As Variant)
    Dim tBody As HTMLBody, row As HTMLTableRow
    Set tBody = WB.document.getElementById("templateForm:tableResult0:tbody_element")
    If tBody Is Nothing Then Stop
    For Each row In tBody.rows
        Debug.Print CellText(row, 1), CellText(row, 2), CellText(row, 0)
    Next
    End Sub

    ' returns an empty string instead of an error
    Private Function CellText(row As HTMLTableRow, ByVal cellIndex As Long) As String
    Dim Cell As HTMLTableCell
    On Error Resume Next
    Set Cell = row.Cells.Item(cellIndex)
    CellText = Trim(Cell.innerText)
    End Function

Show the form. The web page should load in a few seconds. The DocumentComplete event will run the code to extract and print columns Index Code, Last and MSCI Index. The debug window should show:

    990300        1,811.383     EAFE
    991100        1,785.575     EAFE + CANADA
    144097        1,372.105     EAFE ex ISRAEL
    991600        2,034.280     EAFE ex UK
    991300        1,487.429     EASEA INDEX (EAFE ex JAPAN)
    106400        182.491       EMU
    106507        169.293       EMU ex GERMANY
    990600        399.741       EU
    106569        1,076.915     EURO
    990500        1,641.595     EUROPE
    144115        1,422.575     EUROPE & MIDDLE EAST
    106331        189.663       EUROPE ex EMU
    995200        1,445.779     EUROPE ex SWITZERLAND
    991700        1,854.892     EUROPE ex UK
    990900        2,915.545     FAR EAST
    113647        1,529.146     G7 INDEX
    991200        1,740.757     KOKUSAI INDEX (WORLD ex JP)
    990700        6,054.493     NORDIC COUNTRIES
    990200        2,113.327     NORTH AMERICA
    990800        2,351.421     PACIFIC
    991400        1,288.304     PACIFIC ex JAPAN
    106570        1,163.646     PAN-EURO
    990100        1,721.971     WORLD
    701609        1,859.470     WORLD WITH USA GROSS
    996200        1,744.360     WORLD ex AUSTRALIA
    701610        1,844.715     WORLD ex AUSTRALIA WITH USA GROSS
    106330        213.390       WORLD ex EMU
    106332        1,745.644     WORLD ex EUROPE
    144079        1,637.763     WORLD ex ISRAEL
    991500        1,754.637     WORLD ex UK
    991000        1,820.809     WORLD ex USA

It is now straightforward to put those values into a worksheet.

This technique can be extended to websites protected by HTTP Auth and websites requiring a login which sets a cookie.

You aren't limited to just extracting data from the web page. You can use VBA to fill in form elements and click Submit buttons.

Add a cmdNextPage command button button to the form, and add this code:

    Private Sub cmdNextPage_Click()
    Dim theForm As HTMLFormElement, el As HTMLObjectElement
    Set theForm = WB.document.forms("templateForm")
    With theForm.elements
        .Item("templateForm:_id78").value = "2115"    ' set [Market] to "Frontier Markets (FM)"
        .Item("templateForm:_id88").value = "Dec 1, 2014"       ' set [As of]
    End With
    theForm.submit
    End Sub

When testing this, I find it does not work. The form elements get updated, but the Submit doesn't do anything. There is something else going on in that web page that I haven't tracked down. You won't be able to use DocumentComplete to detect when the page updates, because it is using AJAX to update the results table. If you use Fiddler to see what is going over the wire, you may be able to duplicate the AJAX requests in your code. I'm sorry I don't have time to troubleshoot this any further.

Upvotes: 1

Related Questions