Reputation: 1091
I use an automation script that tests a browser-based application. I'd like to save the visible text of each page I load as a text file. This needs to work for the currently open browser window. I've come across some solutions that use InternetExplorer.Application, but these won't work for me because it has to be the page that is currently open.
Ideally, I'd like to achieve this using VBScript. Any ideas how to do this?
Upvotes: 0
Views: 6229
Reputation: 200213
You can attach to an already running IE instance like this (the loop stops at the first Internet Explorer window it finds):
Set app = CreateObject("Shell.Application")
For Each window In app.Windows()
  If InStr(1, window.FullName, "iexplore", vbTextCompare) > 0 Then
    Set ie = window
    Exit For
  End If
Next
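If several IE windows are open, the first match may not be the page under test. The following is a minimal sketch, not a definitive implementation: FindIEWindow is a hypothetical helper name, and "example.com" is a placeholder for part of the URL you expect. It matches on the window's LocationURL and then waits until the page has finished loading before the text is read.

```vbscript
' Hypothetical helper: return the first IE window whose URL contains urlPart,
' or Nothing if no such window is open.
Function FindIEWindow(urlPart)
  Dim app, window
  Set FindIEWindow = Nothing
  Set app = CreateObject("Shell.Application")
  For Each window In app.Windows()
    If InStr(1, window.FullName, "iexplore", vbTextCompare) > 0 Then
      If InStr(1, window.LocationURL, urlPart, vbTextCompare) > 0 Then
        Set FindIEWindow = window
        Exit Function
      End If
    End If
  Next
End Function

Set ie = FindIEWindow("example.com")   ' placeholder URL fragment (assumption)
If ie Is Nothing Then
  WScript.Echo "No matching Internet Explorer window found."
Else
  ' Wait until the page has finished loading (READYSTATE_COMPLETE = 4).
  Do While ie.Busy Or ie.ReadyState <> 4
    WScript.Sleep 100
  Loop
  WScript.Echo "Attached to: " & ie.LocationURL
End If
```

Matching on the URL is only one option. Comparing ie.LocationName (the page title) would work in the same way.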
Then save the document body text like this:
Set fso = CreateObject("Scripting.FileSystemObject")
Set f = fso.OpenTextFile("output.txt", 2, True)
f.Write ie.document.body.innerText
f.Close
If the page contains non-ASCII characters, you may need to create the output file with Unicode (UTF-16) encoding:
Set f = fso.OpenTextFile("output.txt", 2, True, -1)
or save it as UTF-8:
Set stream = CreateObject("ADODB.Stream")
stream.Open
stream.Type = 2 'text
stream.Position = 0
stream.Charset = "utf-8"
stream.WriteText ie.document.body.innerText
stream.SaveToFile "output.txt", 2
stream.Close
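Since the question asks for one text file per page, a fixed name like "output.txt" would be overwritten on every save. As a rough sketch, the file name could be derived from the page title instead. SafeFileName is a hypothetical helper, not part of any library. It replaces the characters that Windows does not allow in file names with underscores.

```vbscript
' Hypothetical helper: make a page title safe to use as a Windows file name
' by replacing the characters \ / : * ? " < > | with underscores.
Function SafeFileName(title)
  Dim re
  Set re = New RegExp
  re.Pattern = "[\\/:*?""<>|]"
  re.Global = True
  SafeFileName = re.Replace(title, "_")
End Function

WScript.Echo SafeFileName("Login: Step 1/2") & ".txt"   ' prints Login_ Step 1_2.txt

' With an attached IE window this could be used as (assumes ie is already set):
'   path = SafeFileName(ie.LocationName) & ".txt"
```

Two pages with the same title would still map to the same file, so appending a counter or timestamp may be needed in practice.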
Edit: Something like this may help get rid of script code in the document body:
Set re = New RegExp
re.Pattern = "<script[\s\S]*?</script>"
re.IgnoreCase = True
re.Global = True
ie.document.body.innerHTML = re.Replace(ie.document.body.innerHTML, "")
WScript.Echo ie.document.body.innerText
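To see what the pattern removes without touching a live page, it can be tried on a plain string first. This is only an illustrative sketch. The sample and cleaned variables and the sample markup are made up for the demonstration.

```vbscript
Set re = New RegExp
re.Pattern = "<script[\s\S]*?</script>"   ' non-greedy match, also spans line breaks
re.IgnoreCase = True                       ' matches <SCRIPT> as well as <script>
re.Global = True                           ' replace every occurrence, not just the first

' Made-up sample markup for the demonstration.
sample = "<p>Hello</p><SCRIPT type=""text/javascript"">var x = 1;</SCRIPT><p>World</p>"
cleaned = re.Replace(sample, "")
WScript.Echo cleaned   ' prints <p>Hello</p><p>World</p>
```

Note that assigning the cleaned markup back to innerHTML changes the page in the browser, which may matter if the automation script continues to interact with that page afterwards.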
Upvotes: 6