Reputation: 21119
I am trying to retrieve regular (126,37€) and reduced (101,10€) price information from this website.
Simplified HTML code looks like this:
<div class="vw-productFeatures ">
<ul class="feature-list -price-container">
<li class="feature -price">
<span class="value">126,37</span>
</li>
</ul>
<ul class="feature-list vw-productVoucher">
<li class="voucher-information">Mit Code
<span class="voucher-reduced-price">101,10</span>
</li>
</ul>
</div>
So, I basically go step by step (div class -> ul class -> li class -> span class) and get the innerText at the end.
I am able to get the regular price; however, innerText of the reduced-price span comes back empty.
This is the code I am working with:
Function getHTMLelemFromCol(HTMLColIn As MSHTML.IHTMLElementCollection, tagNameIn As String, classNameIn As String) As MSHTML.IHTMLElement
Dim HTMLitem As MSHTML.IHTMLElement
For Each HTMLitem In HTMLColIn
If (HTMLitem.tagName = tagNameIn) Then
If (HTMLitem.className = classNameIn) Then
Set getHTMLelemFromCol = HTMLitem
Exit For
End If
End If
Next HTMLitem
End Function
Function getPrice(webSite As String, divClass As String, ulClass As String, liClass As String, spanClass As String) As String
Dim XMLPage As New msxml2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument
Dim HTMLitem As MSHTML.IHTMLElement
Dim HTMLObjCol As MSHTML.IHTMLElementCollection
XMLPage.Open "GET", webSite, False
XMLPage.send
HTMLDoc.body.innerHTML = XMLPage.responseText
Set HTMLObjCol = HTMLDoc.getElementsByClassName(divClass)
Set HTMLitem = getHTMLelemFromCol(HTMLObjCol, "DIV", divClass) ' Find the div class we are interested in first
Set HTMLitem = getHTMLelemFromCol(HTMLitem.Children, "UL", ulClass) ' Find the ul class we are interested in
Set HTMLitem = getHTMLelemFromCol(HTMLitem.Children, "LI", liClass) ' Find the li class we are interested in
Set HTMLitem = getHTMLelemFromCol(HTMLitem.Children, "SPAN", spanClass) ' Find the span class we are interested in
getPrice = HTMLitem.innerText
End Function
Sub Run()
Dim webSite As String, divClass As String, ulClass As String, liClass As String, spanClass As String, regularPrice As String, reducedPrice As String
webSite = "https://www.rakuten.de/produkt/msi-b450-tomahawk-max-atx-mainboard-4x-ddr4-max-64gb-1x-dvi-d-1x-hdmi-14-1x-usb-c-31-2843843890"
divClass = "vw-productFeatures "
' Get the regular price
ulClass = "feature-list -price-container"
liClass = "feature -price"
spanClass = "value"
regularPrice = getPrice(webSite, divClass, ulClass, liClass, spanClass)
' Get the reduced price
ulClass = "feature-list vw-productVoucher -hide"
liClass = "voucher-information"
spanClass = "voucher-reduced-price"
reducedPrice = getPrice(webSite, divClass, ulClass, liClass, spanClass)
Debug.Print "Regular price: " & regularPrice
Debug.Print "Reduced price: " & reducedPrice
End Sub
The output I am getting:
Regular price: 126,37
Reduced price:
Debugger shows that it is able to find the correct span class, but it does not have any attribute (including innerText) that has the price information.
How can I get the reduced price information?
Upvotes: 1
Views: 656
Reputation: 5677
When much of a page's content depends on API calls, browser automation is often the easier route.
It is not ideal for performance, but it is quick to get working and does the job in a pinch. The alternative is to monitor the web traffic between your browser and the server and see whether you can emulate the requests that produce the reduced price. That would run faster, but it may take some time to work out how the requests fit together.
Each approach has trade-offs. Below is some Internet Explorer automation code that works for me and retrieves the data I believe you are after.
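#If VBA7 Then
Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long) ' required on 64-bit Office
#Else
Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
#End If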
Sub GetReducedPrice()
Dim text As String
With CreateObject("internetexplorer.application")
.navigate "https://www.rakuten.de/produkt/msi-b450-tomahawk-max-atx-mainboard-4x-ddr4-max-64gb-1x-dvi-d-1x-hdmi-14-1x-usb-c-31-2843843890"
Do While .Busy Or .readyState <> 4: DoEvents: Loop ' keep waiting until the page is idle AND fully loaded
Sleep 1000 ' wait a little bit too
text = .document.querySelector(".voucher-reduced-price").innerText
.Quit
End With
Debug.Print "the reduced price is: " & text
End Sub
Result is:
the reduced price is: 101,10
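The fixed one-second Sleep above is a guess at how long the page's scripts need. A minimal sketch of an alternative, assuming the price is filled in by script shortly after load, is to poll for the element's text with a timeout. WaitForText is a hypothetical helper name, not part of any library, and the 10-second default is arbitrary.

```vba
' Hypothetical helper: polls the page until the selector yields non-empty text
' or the timeout elapses. Returns "" on timeout.
Function WaitForText(ie As Object, selector As String, _
                     Optional timeoutSeconds As Long = 10) As String
    Dim deadline As Date
    Dim node As Object
    Dim found As String

    deadline = DateAdd("s", timeoutSeconds, Now)
    Do
        DoEvents
        Set node = Nothing
        On Error Resume Next              ' the document may not be available yet
        Set node = ie.document.querySelector(selector)
        On Error GoTo 0
        If Not node Is Nothing Then
            found = Trim$(node.innerText & "")   ' & "" guards against Null
            If Len(found) > 0 Then
                WaitForText = found
                Exit Function
            End If
        End If
    Loop While Now < deadline
End Function
```

To use it, hold the InternetExplorer object in a variable (say ie) rather than a With block, and call WaitForText(ie, ".voucher-reduced-price") in place of the Sleep line and the direct querySelector call.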
Upvotes: 1
Reputation: 12255
There is no -hide class on the reduced-price list:
ulClass = "feature-list vw-productVoucher"
You can use simple selectors with querySelector to get both prices, instead of walking the tree with unnecessary iterations:
regularPrice = HTMLDoc.querySelector(".-price .value").innerText
reducedPrice = HTMLDoc.querySelector(".voucher-reduced-price").innerText
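For context, here is a minimal sketch of how those two lines could slot into the question's XMLHTTP setup. It assumes the same references as the question (Microsoft XML, v6.0 and Microsoft HTML Object Library). TextOrEmpty and PrintPrices are hypothetical names introduced for illustration. The Nothing check matters because querySelector returns Nothing when no element matches, and a plain GET does not run the page's scripts, so the reduced price may still come back empty.

```vba
' Hypothetical helper: innerText of the first match, or "" if nothing matches.
Function TextOrEmpty(doc As MSHTML.HTMLDocument, selector As String) As String
    Dim node As Object
    Set node = doc.querySelector(selector)
    If Not node Is Nothing Then TextOrEmpty = Trim$(node.innerText & "")
End Function

Sub PrintPrices()
    Dim XMLPage As New MSXML2.XMLHTTP60
    Dim HTMLDoc As New MSHTML.HTMLDocument

    ' Synchronous GET, as in the question
    XMLPage.Open "GET", "https://www.rakuten.de/produkt/msi-b450-tomahawk-max-atx-mainboard-4x-ddr4-max-64gb-1x-dvi-d-1x-hdmi-14-1x-usb-c-31-2843843890", False
    XMLPage.send
    HTMLDoc.body.innerHTML = XMLPage.responseText

    ' One selector per price replaces the div -> ul -> li -> span walk
    Debug.Print "Regular price: " & TextOrEmpty(HTMLDoc, ".-price .value")
    Debug.Print "Reduced price: " & TextOrEmpty(HTMLDoc, ".voucher-reduced-price")
End Sub
```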
Update:
The voucher is defined in https://tags.tiqcdn.com/utag/rakuten/main/prod/utag.js and is calculated based on product_shop_id and dates.
Upvotes: 0