wayfarer

Reputation: 790

vb.net get style of HTML element

I'm parsing some HTML to translate it into an Open XML .xlsx file, and I haven't been able to extract the style attribute of an element. I could brute-force this with a custom parser, but I was hoping to use mshtml as much as possible. The source HTML may have some non-standard formatting. Here are the details:

(below: input, code, and debug output)

input string:

<div id="GLGV" class="GLVG1">
<div class="GLGVOuterRow" ID="GLGV_PRTS_0" style="height:20px;">
<span id="ExtID01_0000" title="Note - N0001" class="ExtID01Label">N0001</span>
<span id="Note01" class="Note01" style="display:inline-block;width:70px;">Area Name</span>
<span id="Main01" class="MainTextAll" style="display:inline-block;height:16px;width:250px;">My new area</span>
<span id="OTLID_0" class="GRPL_Hidden">8270</span>
<span id="OTLParID_0" class="GRPL_Hidden">8269</span>
<span id="PrtTyp_0" class="GRPL_Hidden">NOTE</span>
<span class="FloatClear"></span>
</div>

ASP.NET (VB.NET) code:

Public Sub TestSample()

    Dim wrkListString As String = C.AC("List")

    Dim wrkDocument As IHTMLDocument2 = New HTMLDocumentClass()
    wrkDocument.write(wrkListString)
    wrkDocument.close()

    Dim wrkAllElements As IHTMLElementCollection = wrkDocument.body.all

    Dim ws As String = ""
    Dim wrkType As String = ""
    Dim wrkStyle As String = ""
    Dim wrkId As String = ""
    Dim wrkClass As String = ""

    For Each wrkElem In wrkAllElements

        wrkType = wrkElem.GetType().ToString
        wrkId = wrkElem.id
        wrkClass = wrkElem.className
        wrkStyle = wrkElem.Style.ToString

        ws = wrkType & " , " & wrkId & " , " & wrkClass & " , " & wrkStyle & " , "

        Debug.Print(ws)
    Next

End Sub

Debug output:

mshtml.HTMLDivElementClass , GLGV , GLVG1 , System.__ComObject , 
mshtml.HTMLDivElementClass , GLGV_PRTS_0 , GLGVOuterRow , System.__ComObject , 
mshtml.HTMLSpanElementClass , ExtID01_0000 , ExtID01Label , System.__ComObject , 
mshtml.HTMLSpanElementClass , Note01 , Note01 , System.__ComObject , 
mshtml.HTMLSpanElementClass , Main01 , MainTextAll , System.__ComObject , 
mshtml.HTMLSpanElementClass , OTLID_0 , GRPL_Hidden , System.__ComObject , 
mshtml.HTMLSpanElementClass , OTLParID_0 , GRPL_Hidden , System.__ComObject , 
mshtml.HTMLSpanElementClass , PrtTyp_0 , GRPL_Hidden , System.__ComObject , 
mshtml.HTMLSpanElementClass ,  , FloatClear , System.__ComObject , 

I don't see the detailed style for the span with id="Main01", only "System.__ComObject".

Any help with how to get the detailed inline style string would be appreciated. Thanks!

Upvotes: 2

Views: 2703

Answers (1)

mga911

Reputation: 1546

The Style property of wrkElem returns an IHTMLStyle object. That is a COM object, so calling ToString on it only gives you the wrapper's type name, System.__ComObject. To retrieve the style's text, use the cssText property of the IHTMLStyle object instead.

To apply this, change this line:

wrkStyle = wrkElem.Style.ToString

To this:

wrkStyle = wrkElem.style.cssText
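For reference, here is a minimal sketch of the whole loop with that change applied. It assumes the project references Microsoft.mshtml (as the question's code already does); the module name StyleDump, the method name DumpInlineStyles, and the "(none)" placeholder are hypothetical and only there for illustration. Elements without a style attribute are expected to give an empty cssText, and the engine may normalize the text (casing, spacing), so it might not match the source markup character for character.

```vbnet
Imports System.Diagnostics
Imports mshtml

Module StyleDump   ' hypothetical name, for illustration only

    ' Prints id, class and inline style text for every element under <body>.
    ' Assumes a project reference to Microsoft.mshtml (Windows only).
    Public Sub DumpInlineStyles(html As String)
        Dim doc As IHTMLDocument2 = CType(New HTMLDocumentClass(), IHTMLDocument2)
        doc.write(html)
        doc.close()

        ' Typing the loop variable as IHTMLElement gives early-bound access
        ' to id, className and style instead of late-bound Object calls.
        For Each elem As IHTMLElement In CType(doc.body.all, IHTMLElementCollection)
            ' cssText holds the inline style text; treat Nothing/empty as "no style".
            Dim styleText As String = elem.style.cssText
            If String.IsNullOrEmpty(styleText) Then styleText = "(none)"

            Debug.Print(elem.id & " , " & elem.className & " , " & styleText)
        Next
    End Sub

End Module
```

Calling DumpInlineStyles(C.AC("List")) from TestSample should then print the style text in place of System.__ComObject. If you only need single values such as the width, IHTMLStyle also exposes them as individual properties (for example elem.style.width), which may save you from splitting cssText yourself.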

Upvotes: 1
