Reputation: 21
Diclosure: I'm just an office clerk and very-very new to VBA and HTML. Hopefully you will be patient with me. I would really appreciate any guidance and help. Hopefully I'm formatting it correctly.
I spent whole day yesterday trying to import information from intranets web-page to automate routing copying and pasting. This will really help in the long run.
Since power-query doesn't seem to see the table I need, I figured the only option is using VBA. MsServer tool grabs page perfectly, but to my disappointment page came out with error, since it needs to authorize before access it first.
I figured with the use of IE it should work, since IE have login information in cookies.
Here where I got so far.
Sub ExtractFromEndeca() Dim ie As InternetExplorer Dim html As
IHTMLDocument Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = False
ie.Navigate "intranet address"
While ie.Busy
DoEvents Wend While ie.ReadyState < 4
DoEvents Wend
Set Doc = CreateObject("htmlfile")
Set Doc = ie.document
Set Data = Doc.getElementById("findSimilarOptions2")
Sheet1.Cells(1, 1) = Data
ie.Quit Set ie = Nothing
ThisWorkbook.Sheets(1).Cells(1, 1) = Data
End Sub
Result is [object] in Cell A1 and that's it and I can't understand if I got past login or not.
Here is a page fragment I'm trying to grab. Ideally this data will be output as a table.
<td valign="top" id="findSimilarOptions2">
<div class="subtitle">Part Attributes</div>
<input type="checkbox" id="n_200012" value="-19192896" NAME="n_200012">
<b>
ASSY TYPE</b> >
Component<br>
<input type="checkbox" id="n_200013" value="-18148519" NAME="n_200013">
<b>
PARAMETER I NEED(1)</b> >
VALUE I NEED(1)<br>
<input type="checkbox" id="n_200006" value="-20823731" NAME="n_200006">
<b>
PARAMETER I NEED(2)</b> >
VALUE I NEED(2)<br>
<input type="checkbox" id="n_200006" value="-20823618" NAME="n_200006">
<b>
PARAMETER I NEED(3)</b> >
VALUE I NEED(3)<br>
<input type="checkbox" id="n_200006" value="-20823586" NAME="n_200006">
<b>
PARAMETER I NEED(4)</b> >
VALUE I NEED(4)<br>
...
Upvotes: 2
Views: 374
Reputation: 2267
Please read my comments in the following code:
'Use the following line in every module head
'It forces you to define all variables
Option Explicit
Sub ExtractFromEndeca()
Dim ie As InternetExplorer
Dim doc As IHTMLDocument 'You don't use html in your code, but doc
Dim data As HTMLHtmlElement 'You should define all variables
Dim singleData As HTMLHtmlElement 'New variable
Dim row As Long 'New variable
row = 1 'First row for output in Excel table
'Set ie = CreateObject("InternetExplorer.Application") 'This could be problematic on the intranet due to security guidelines
Set ie = GetObject("new:{D5E8041D-920F-45e9-B8FB-B1DEB82C6E5E}") 'Try this instead to initialize the IE
ie.Visible = True 'This property should be True while development
ie.Navigate "intranet address"
'While ie.Busy: DoEvents: Wend 'You don't need this line
While ie.ReadyState <> 4: DoEvents: Wend
'Set Doc = CreateObject("htmlfile") 'You don't need this line
Set doc = ie.document
Set data = doc.getElementById("findSimilarOptions2").getElementsByTagName("input")
'Data is only a reference to an object
'You want the text information which lies in the value attributes of each input tag
For Each singleData In data
Sheet1.Cells(row, 1) = data.Value
row = row + 1
Next singleData
'Clean up
'(Automatic after development has finished)
'ie.Quit
'Set ie = Nothing
End Sub
Upvotes: 3