sam
sam

Reputation: 33

Get web data based on link in excel cell

I'd like to create an Excel sheet, where in one column there is a link to a website (https://grieferwert.com/?product=beacon-3 for example) and in the next column, Excel should automatically import the Price History Chart Data. This should be possible for multiple links in different rows from websites with the same structure (https://grieferwert.com/?product=sand-dk for example). How can I make the web query to be based on the link in the adjacent cell?

The table on this screenshot is what the "add data from web" does, on the left the link to the website.

I've tried to do it manually, so for every link I inserted a data query manually. This is possible but takes a long time for all the entries I'd like to include.

No code was written manually here, all board functions of the newest Excel distribution. Nevertheless I could extract the following from the advanced power query editor:

let
Quelle = Web.Page(Web.Contents("https://grieferwert.com/?product=sand-dk")),
Data0 = Quelle{0}[Data],
#"Geänderter Typ" = Table.TransformColumnTypes(Data0,{{"Type", type text}, {"Price", type text}, {"When", type date}})
in
#"Geänderter Typ"

I expect the data table to change according to the change in the link cell; so when I change the link (interchange the two mentioned above), the data should change accordingly after refreshing the data. In the end, I'd like to do some simple calculations with the numbers from the table elsewhere.

Upvotes: 1

Views: 9472

Answers (2)

QHarr
QHarr

Reputation: 84465

Put your urls in column A starting in A1, then every 7th row, then run the code below (you can attach this code to a form control button so as to run by a button push)

Open the VBE with Alt+F11. Right click in project explorer pane and add a standard module then add code below into standard module.

Whilst VBE is open: You need to go VBE > Tools > References > add a reference to Microsoft HTML Object Library

The code uses XHR to retrieve the webpage HTML then matches the table using its class attribute

Set hTable = html.querySelector(".product_pane")

querySelector applies the css class selector to the HTMLDocument, held in variable html, to retrieve the match.

I then use the clipboard to copy paste the table to the worksheet.


VBA:

Option Explicit
Public Sub GetTables()
    Dim urls(), i As Long, html As HTMLDocument, hTable As Object
    Dim ws As Worksheet, clipboard As Object, lastRow As Long

    Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")  
    Set ws = ThisWorkbook.Worksheets("Sheet1")
    ws.Columns("C:E").ClearContents
    lastRow = ws.Cells(ws.rows.Count, "A").End(xlUp).Row
    urls = Application.Transpose(ws.Range("A1:A" & lastRow).Value)
    Set html = New HTMLDocument

    With CreateObject("MSXML2.XMLHTTP")
        For i = LBound(urls) To UBound(urls) Step 6
            .Open "GET", urls(i), False
            .send
            html.body.innerHTML = .responseText
            Set hTable = html.querySelector(".product_pane")
            clipboard.SetText hTable.outerHTML '5 rows per table
            clipboard.PutInClipboard
            ws.Range("C" & i).PasteSpecial
        Next
    End With
End Sub

Example layout in sheet 1:

*the values in columns C:E are returned after running the code

Upvotes: 2

Stavros Jon
Stavros Jon

Reputation: 1697

The following is for addressing this part of your question:

Excel should automatically import the Price History Chart Data

The page you are poking offers a very convenient way to download the data you are interested in. You can do it by sending an XML HTTP request (XHR) to the corresponding server.

You can easily see that, by inspecting the network traffic in your browser's developer tools. There, you will see a bunch of requests being sent when you load the page. Most of them have to do with requesting style sheets (css), or png and gif files. Others call scripts that need to be executed (js). The one you need is the XHR request which gets the data in JSON format as a response.

enter image description here

Taking a closer look at the request you can see that in consists of the URL, the headers and a body. The body contains the parameters of the request. In this case you only need a product code.

enter image description here

Given the above, you can structure the request in VBA and send it to the server to get the data of interest. Changing the product code parameter will get you the corresponding data for each product.

To do that you need to import a JSON parser to your project. Following the instructions will do the trick.

Then you need a couple of references (VB editor>Tools>References):

  1. Microsoft WinHTTP Services version 5.1 (to create and manipulate HTTP requests)
  2. Microsoft scriptin runtime (needed by JSON parser)

Then the code would look like that:

Option Explicit

Sub downloadPriceData()
Dim req As New WinHttpRequest
Dim URL As String, reqBody As String, productCode As String
Dim respTxt As String
Dim respJSON As Object
URL = "https://grieferwert.com/wp-admin/admin-ajax.php"
productCode = getProductCode("sand-dk") 'Changing the product name will get the corresponding data.

'productCode = "200" 'Hard coded version. Each product has its own code. You can see the code by inspectig the request's body as shown in the screenshots

reqBody = "action=wooPriceHistoryAjax&subaction=getGraph&product=" & productCode & "&period=1"
With req
    .Open "POST", URL, False
    .setRequestHeader "Content-Type", "application/x-www-form-urlencoded; charset=UTF-8" 'This is the only header that's absolutely essential to the request
    .send reqBody
    respTxt = .responseText
End With
Set respJSON = JsonConverter.ParseJson(respTxt)
Debug.Print respJSON("status")
Debug.Print respJSON("json")("rows")(1)("price")("value") 'This prints the first data point. The structure of the JSON can be seen in the screenshot.

End Sub


Public Function getProductCode(productName As String) As String 'This functions finds the product code, given the name of the product
Dim req As New WinHttpRequest
Dim doc As New HTMLDocument
Dim div As HTMLDivElement
Dim URL As String

URL = "https://grieferwert.com/?product=" & productName
With req
    .Open "GET", URL, False
    .send
    doc.body.innerHTML = .responseText
End With
Set div = doc.getElementById("ph_chart_container")
getProductCode = div.Attributes("data-product").Value
End Function

The above code for demonstration purposes will only print one data point in your immediate window. Storing the product names in a worksheet and looping through them instead of looping through links will get you the data for all the products you're interested in.

This should get you kickstarted to work through getting the rest of the data.

EDIT

Edited the initial post to include an automated way to get the product code, given the product's name.

Upvotes: 1

Related Questions