Reputation: 150
I use the code below to read and extract data from websites, but with one specific URL (http://www.iamf.ir) there is a problem:
Dim HTML_Content As HTMLDocument
Dim dados As Object
' Create the HTMLDocument object (requires a reference to the Microsoft HTML Object Library)
Set HTML_Content = New HTMLDocument
' Load the web page content into the HTMLDocument object
With CreateObject("msxml2.xmlhttp")
    .Open "GET", "http://www.iamf.ir", False
    .send
    HTML_Content.body.innerHTML = .responseText
    Debug.Print .responseText ' this is OK
    Debug.Print HTML_Content.body.innerHTML ' this shows nothing (the problem is here)
End With
Upvotes: 1
Views: 473
Reputation: 12685
This should be the answer to your question, though I don't think it really solves your problem.
The XMLHTTP request you send to this website returns a document with an empty body, as you can see from the output of the line Debug.Print .responseText:
<HTML>
<HEAD>
<TITLE>امین آشنا ایرانیان</TITLE>
<META NAME="Keywords" CONTENT="">
<META HTTP-EQUIV="Refresh" CONTENT="0;URL=http://www.iafi.ir">
<META NAME="Description" CONTENT="">
</HEAD>
<BODY> <-- body is empty
</BODY>
</HTML>
This is why, when you print the .body.innerHTML of your HTML_Content document, you get an empty string.
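Note that the response above contains a META refresh tag pointing to http://www.iafi.ir, which a browser follows automatically but an XMLHTTP request does not. A minimal sketch of following it by hand could look like the code below. The names GetRefreshUrl, HttpGet and FetchFollowingMetaRefresh are hypothetical helpers, not part of any library, and I am assuming (without having verified it) that the target page returns its content as plain HTML.

```vba
' Sketch only: follow a META refresh redirect manually.
' Requires a reference to the Microsoft HTML Object Library (as in the question).

' Hypothetical helper: synchronous GET, returns the response text.
Function HttpGet(ByVal url As String) As String
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", url, False
        .send
        HttpGet = .responseText
    End With
End Function

' Hypothetical helper: returns the URL found in a
' <META HTTP-EQUIV="Refresh" CONTENT="n;URL=..."> tag, or "" if there is none.
Function GetRefreshUrl(ByVal html As String) As String
    Dim re As Object, matches As Object
    Set re = CreateObject("VBScript.RegExp")
    re.IgnoreCase = True
    re.Pattern = "http-equiv=[""']?refresh[""']?[^>]*url=([^""'>\s]+)"
    Set matches = re.Execute(html)
    If matches.Count > 0 Then GetRefreshUrl = matches(0).SubMatches(0)
End Function

Sub FetchFollowingMetaRefresh()
    Dim HTML_Content As HTMLDocument
    Dim html As String, target As String

    html = HttpGet("http://www.iamf.ir")
    target = GetRefreshUrl(html)
    ' Only one redirect is followed here; the target page may redirect again.
    If Len(target) > 0 Then html = HttpGet(target)

    Set HTML_Content = New HTMLDocument
    HTML_Content.body.innerHTML = html
    Debug.Print HTML_Content.body.innerHTML
End Sub
```

If the target page builds its content with JavaScript, this will still print an incomplete or empty body.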
Some websites are built so that only full execution of the page (including JavaScript, which does not run when you perform an XMLHTTP request) renders what you see in your browser. In your specific case, you might need to get the information with a slower but more robust scraping approach based on an invisible browser. You can check out this answer I wrote some time ago to get an idea.
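As a rough illustration of the invisible-browser idea, here is a sketch that assumes Internet Explorer automation through the InternetExplorer.Application COM object. That choice is my assumption for this example and may not be available on recent Windows versions; the procedure names FetchWithHiddenBrowser and WaitForPage and the two-second pause are hypothetical.

```vba
' Sketch only: load the page in a hidden Internet Explorer instance,
' so that redirects and scripts run as they would in a browser.

' Hypothetical helper: block until the browser reports the page as loaded.
Sub WaitForPage(ByVal ie As Object)
    Do While ie.Busy Or ie.readyState <> 4   ' 4 = READYSTATE_COMPLETE
        DoEvents
    Loop
End Sub

Sub FetchWithHiddenBrowser()
    Dim ie As Object
    Dim startTime As Single

    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = False
    ie.navigate "http://www.iamf.ir"
    WaitForPage ie

    ' The first page only contains a META refresh, so give the browser
    ' a moment to start the redirect, then wait for the new page.
    ' The 2-second pause is an arbitrary guess, not a guaranteed delay.
    startTime = Timer
    Do While Timer - startTime < 2
        DoEvents
    Loop
    WaitForPage ie

    Debug.Print ie.document.body.innerHTML

    ie.Quit
    Set ie = Nothing
End Sub
```

A fixed pause is fragile; checking ie.LocationURL against the expected address would be a more reliable way to know the redirect has completed.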
Upvotes: 1