Reputation: 1383
Let's say I have a page as follows, saved at c:\temp\html_page.html:
<html>
<head>
<link rel="stylesheet" href="styles.css">
</head>
<body>
<div id="xxx1">
<img src="test.png">
</div>
</body>
</html>
I'd like to programatically adjust the src attribute of the img, based on Excel data & VBA. Basically a way to find the div with Xpath, and adjust the (single) img tag that is contained in it.
I found an example for manipulating XML with VBA through the XML library here, but I've been crunching my head around making this work with the HTML object library; can't find any examples and/or documentation.
Dim XDoc As Object, root As Object
Set XDoc = CreateObject("MSXML2.DOMDocument")
XDoc.async = False: XDoc.validateOnParse = False
If XDoc.Load(html_path) Then
Debug.Print "Document loaded"
Else
Dim strErrText As String
Dim xPE As MSXML2.IXMLDOMParseError
' Obtain the ParseError object
Set xPE = XDoc.parseError
With xPE
strErrText = "Your XML Document failed to load" & _
"due the following error." & vbCrLf & _
"Error #: " & .ErrorCode & ": " & xPE.reason & _
"Line #: " & .Line & vbCrLf & _
"Line Position: " & .linepos & vbCrLf & _
"Position In File: " & .filepos & vbCrLf & _
"Source Text: " & .srcText & vbCrLf & _
"Document URL: " & .URL
End With
MsgBox strErrText, vbExclamation
All I want to do is:
'...
Set outer_div = XDoc.SelectFirstNode("//div[id='xxx1'")
... edit the img attribute
But I can't load the HTML page, because it's not proper XML (img tag isn't closed).
Any help is greatly appreciated. Oh, and I can't use other languages such as Python, bummer.
Upvotes: 4
Views: 6549
Reputation: 14053
For this purpose you can use doc.querySelector("div[id='xxx1'] img")
. To change src
attribute use img.setAttribute "src", "new.png"
. HTH
Option Explicit
' Add reference to Microsoft Internet Controls (SHDocVw)
' Add reference to Microsoft HTML Object Library
Sub Demo()
Dim ie As SHDocVw.InternetExplorer
Dim doc As MSHTML.HTMLDocument
Dim url As String
url = "file:///C:/Temp/StackOverflow/html/html_page.html"
Set ie = New SHDocVw.InternetExplorer
ie.Visible = True
ie.navigate url
While ie.Busy Or ie.readyState <> READYSTATE_COMPLETE: DoEvents: Wend
Set doc = ie.document
Dim img As HTMLImg
Set img = doc.querySelector("div[id='xxx1'] img")
If Not img Is Nothing Then
img.setAttribute "src", "new.png"
End If
ie.Quit
End Sub
Upvotes: 1
Reputation: 4363
This isn't quite what you want, but it may be close enough. Rather than using the XML
library, use the HTML
library:
Sub changeImg()
Dim dom As Object
Dim img As Object
Dim src As String
Set dom = CreateObject("htmlFile")
Open "C:\temp\test.html" For Input As #1
src = Input$(LOF(1), 1)
Close #1
dom.body.innerHTML = src
Set img = dom.getelementsbytagname("img")(0)
img.src = "..."
Open "C:\temp\test.html" For Output As #1
Print #1, dom.DocumentElement.outerHTML
Close #1
End Sub
The issue is that the resulting file will have Head
nodes added and the tag names will be in upper case. If you can live with this, the solution will work for you.
As an aside if you want to do things a bit more in depth, with better selectors consider early binding. The HTML interface exposed is different than the interface when late bound and supports more properties - you'll want to add a reference to the HTML Object Library
:
Sub changeImg()
Dim dom As HTMLDocument
Dim img As Object
Dim src As String
Set dom = CreateObject("htmlFile")
Open "C:\temp\test.html" For Input As #1
src = Input$(LOF(1), 1)
Close #1
dom.body.innerHTML = src
Set img = dom.getelementsbytagname("img")(0)
img.src = "..."
Open "C:\temp\test.html" For Output As #1
Print #1, dom.DocumentElement.outerHTML
Close #1
End Sub
Upvotes: 4