Reputation: 36
I'm working on a Google search from Excel VBA. The text I want to extract is inside a span tag:
<div class="f kv_Swb" style="white-space:nowrap">
...
<span class="st">
<span class="f">no relevant text</span>
this is the text it matters, it has a keyword i need
</span>
</div>
There are many nested div tags.
The text is a string inside an element with class st, but outside an element with class f. I used a VBA script like this:
Dim IE As Object
Dim doc As Object
Dim elementA As Object
Dim elementB As Object
Dim TagA As Object
Dim TagB As Object
Set IE = CreateObject("InternetExplorer.Application")
IE.Navigate "http://www.unsuspectwebpage.com/about"
' Wait for the page to finish loading before reading the document
Do Until IE.ReadyState = 4
    DoEvents
Loop
Set doc = IE.Document
Set TagA = doc.getElementsByTagName("span")
For Each elementA In TagA
    Set TagB = doc.getElementsByClassName("st")
    For Each elementB In TagB
        ' ws is the target worksheet, set elsewhere
        ws.Range("A1") = ws.Range("A1") & elementB.innerText
    Next elementB
Next elementA
How can I get the text which is within class st but outside class f?
Upvotes: 1
Views: 471
Reputation: 22440
Not a very efficient approach, but it should fetch the desired content:
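' Requires a reference to Microsoft HTML Object Library
Dim elem As Object, HTML As New HTMLDocument
' HTML must already hold the page markup before this loop runs
For Each elem In HTML.getElementsByClassName("st")
    ' Split on the inner span's text and keep the part that follows it
    Debug.Print Split(elem.innerText, elem.getElementsByTagName("span")(0).innerText)(1)
Next elem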
Output:
this is the text it matters, it has a keyword i need
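As an alternative to splitting on the inner span's text, the direct text of each st element can be collected by walking its child nodes and keeping only the text nodes. This is a minimal sketch, not part of the original answer. It assumes a reference to Microsoft HTML Object Library is set, and it loads the sample markup from the question into the document by hand; the procedure name PrintDirectText is made up for illustration.

```vba
' Sketch only: assumes a reference to Microsoft HTML Object Library.
Sub PrintDirectText()
    Dim html As New HTMLDocument
    Dim elem As Object, node As Object
    Dim result As String
    Dim i As Long

    ' Sample markup copied from the question (assumption: the real page has the same shape)
    html.body.innerHTML = "<div class=""f kv_Swb""><span class=""st"">" & _
        "<span class=""f"">no relevant text</span>" & _
        "this is the text it matters, it has a keyword i need</span></div>"

    For Each elem In html.getElementsByClassName("st")
        result = ""
        ' Visit only the direct children of the st element
        For i = 0 To elem.childNodes.Length - 1
            Set node = elem.childNodes.Item(i)
            ' nodeType 3 is a text node; nested elements such as span.f are skipped
            If node.nodeType = 3 Then result = result & node.nodeValue
        Next i
        Debug.Print Trim$(result)
    Next elem
End Sub
```

Unlike the Split approach, this does not depend on the inner span's text being unique within the outer element, and it does not fail when an st element contains no nested span.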
Upvotes: 3