Reputation: 21
I have a question on how to scrape data from this web page:
It appears to be held in the iframe and a bunch of javascripting is present on the screen.
When ever I try and collect the elements in the span or div or tr tags which are held under the iframe I dont seem to be able to collect the data that is within.
What I am targeting is the innertext which is held inside the elements of class="pane-legend-item-value pane-legend-line main".
Obviously the innertext will change depending where the cursor is on the screen at that particular time. So what I have tried to do is set up an IE which already has the page loaded and the cursor in the correct location, at the end of the graph (to give me the last data point) you can then move the cursor off the screen, I have then written some simple code to grab that IE window and then tried GetElements, it is at this point that I cannot obtain any data.
This is my code so far, its very rough as I've been attempting to edit as I read more options but have not had any wins :( ... Any ideas or help would be much appreciated! (screen shot is also at the bottom)
Sub InvestingCom()
Dim IE As InternetExplorer
Dim htmldoc As MSHTML.IHTMLDocument 'Document object
Dim eleColth As MSHTML.IHTMLElementCollection 'Element collection for th tags
Dim eleColtr As MSHTML.IHTMLElementCollection 'Element collection for tr tags
Dim eleColtd As MSHTML.IHTMLElementCollection 'Element collection for td tags
Dim eleRow As MSHTML.IHTMLElement 'Row elements
Dim eleCol As MSHTML.IHTMLElement 'Column elements
Dim elehr As MSHTML.IHTMLElement 'Header Element
Dim iframeDoc As MSHTML.HTMLDocument
Dim frame As HTMLIFrame
Dim ieURL As String 'URL
'Take Control of Open IE
marker = 0
Set objShell = CreateObject("Shell.Application")
IE_count = objShell.Windows.Count
For x = 0 To (IE_count - 1)
On Error Resume Next
my_url = objShell.Windows(x).document.Location
my_title = objShell.Windows(x).document.Title
If my_title Like "*" & "*" Then 'compare to find if the desired web page is already open
Set IE = objShell.Windows(x)
marker = 1
Exit For
Else
End If
Next
'Extract data
Set htmldoc = IE.document 'Document webpage
' I have tried span, tr, td etc tags and various other options
' I have never actually tried collecting an HTMLFrame but googled it however was unsuccessful
End Sub
Upvotes: 0
Views: 1787
Reputation: 22440
It was really difficult for me to deal with two nested iframes
from that page to collect the desired content. But anyways, I've fixed it finally. Run the below code and get the content you have requested for:
Sub forexpros()
Dim IE As New InternetExplorer, html As HTMLDocument
Dim frm As Object, frmano As Object, post As Object
With IE
.Visible = True
.navigate "http://tvc4.forexpros.com/init.php?family_prefix=tvc4&carrier=64694b96ed4909e815f1d10605ae4e83&time=1513525898&domain_ID=70&lang_ID=70&timezone_ID=31&pair_ID=171&interval=86400&refresh=4&session=session&client=1&user=200743128&width=650&height=750&init_page=instrument&m_pids=&watchlist=&site=https://au.investing.com&version=1.11.2"
Do Until .readyState = READYSTATE_COMPLETE: Loop
Application.Wait (Now + TimeValue("0:00:05"))
Set frm = .document.getElementsByClassName("abs") ''this is the first iframe
.navigate frm(0).src
Do Until .readyState = READYSTATE_COMPLETE: Loop
Application.Wait (Now + TimeValue("0:00:05"))
Set html = .document
End With
Set frmano = html.getElementsByTagName("iframe")(0).contentWindow.document ''this is the second iframe
For Each post In frmano.getElementsByClassName("pane-legend-item-value pane-legend-line main")
Debug.Print post.innerText
Next post
IE.Quit
End Sub
Upvotes: 2