rorfun

Reputation: 96

How to remove duplicate attributes from an img tag using a regular expression?

I have a web form that uses an outdated text editor. Whenever a user inserts an image in design view, the editor generates two sets of width/height attributes on the img tag in HTML view. When this HTML is posted to the server, it causes problems on other pages where System.Xml.XPath.XPathDocument is used.

For various reasons I cannot replace the text editor with a modern one, so I want to remove the extra width and height attributes from the posted img tags on the server side.

The following is an example of the HTML posted to the server:

<table align="left">
    <thead>
    </thead>
    <tbody>
        <tr>
            <td style="width: 250px; vertical-align: top; text-align: left;">
            <p>&nbsp;<img width="263" height="175" alt="" style="height: 122px; width: 184px;" src="/Portals/0/Resources/Site/Signature/1.PNG" width="263" height="175" /></p>
            </td>
            <td style="width: 50px;">&nbsp;</td>
            <td style="width: 250px; vertical-align: top; text-align: left;">
            <p>&nbsp;</p>
            <p><img width="168" height="66" alt="" style="height: 79px; width: 170px;" src="/Portals/0/Resources/Site/Signature/2.jpg" width="168" height="66" /></p>
            </td>
            <td style="width: 50px;">&nbsp;</td>
            <td style="width: 250px; vertical-align: top; text-align: left;">
            <p>&nbsp;</p>
            <p><img width="217" height="93" alt="" src="/Portals/0/Resources/Site/Signature/3.png" width="217" height="93" /></p>
            </td>
        </tr>
    </tbody>
</table>

Is there an efficient way to remove them using a VB.NET regular expression? Or is there a better way to handle this?

Upvotes: 0

Views: 376

Answers (1)

rorfun

Reputation: 96

Use Html Agility Pack to remove the duplicated attributes (VB.NET):

    ' At the top of the file:
    Imports System.IO
    Imports HtmlAgilityPack

    ' txtPosted holds the HTML posted from the editor.
    Dim txtTemp As String = ""
    Try
        Dim doc As New HtmlDocument()
        doc.LoadHtml(txtPosted)

        For Each imgNode As HtmlNode In doc.DocumentNode.Descendants("img")
            ' Get the value of the width/height attribute ("" when it is absent).
            ' GetAttributeValue is used instead of IIf, because IIf evaluates both
            ' branches and throws when the attribute does not exist.
            Dim txtTempWidth As String = imgNode.GetAttributeValue("width", "")
            Dim txtTempHeight As String = imgNode.GetAttributeValue("height", "")

            If Not String.IsNullOrEmpty(txtTempWidth) Then
                ' Remove all "width" attributes
                While imgNode.Attributes.Contains("width")
                    imgNode.Attributes.Remove("width")
                End While
                ' Add one "width" attribute
                imgNode.Attributes.Add("width", txtTempWidth)
            End If

            If Not String.IsNullOrEmpty(txtTempHeight) Then
                ' Remove all "height" attributes
                While imgNode.Attributes.Contains("height")
                    imgNode.Attributes.Remove("height")
                End While
                ' Add one "height" attribute
                imgNode.Attributes.Add("height", txtTempHeight)
            End If
        Next

        ' Close the img tag when the document is written out
        If HtmlNode.ElementsFlags.ContainsKey("img") Then
            HtmlNode.ElementsFlags("img") = HtmlElementFlag.Closed
        Else
            HtmlNode.ElementsFlags.Add("img", HtmlElementFlag.Closed)
        End If

        Using writer As New StringWriter()
            doc.Save(writer)
            txtTemp = writer.ToString()
        End Using
    Catch ex As Exception
        ' Log the error; the posted HTML is used unchanged below.
        Exceptions.LogException(ex)
        txtTemp = ""
    End Try

    ' Final result: the cleaned HTML, or the original posting if cleaning failed
    Dim txtFinal As String = If(String.IsNullOrEmpty(txtTemp), txtPosted, txtTemp)
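
If adding a parser dependency is not an option, the regular-expression route asked about in the question could look roughly like the sketch below. It is only a sketch under some assumptions: attribute values are double-quoted, no attribute value contains a ">" character, and the first width/height on each img tag is the one to keep. The module and function names (ImgAttributeCleaner, RemoveDuplicateSizeAttributes) are made up for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module ImgAttributeCleaner
    ' Matches a whole <img ...> tag (assumes no ">" inside attribute values).
    Private ReadOnly ImgTag As New Regex("<img\b[^>]*>", RegexOptions.IgnoreCase)

    ' Matches a double-quoted width="..." or height="..." attribute together with
    ' its leading whitespace. The leading \s+ and the "=" keep it from matching
    ' "width: 184px" inside a style attribute.
    Private ReadOnly SizeAttr As New Regex("\s+(width|height)\s*=\s*""[^""]*""", RegexOptions.IgnoreCase)

    ' Hypothetical helper: keeps the first width/height of every img tag
    ' and drops any later repeats.
    Public Function RemoveDuplicateSizeAttributes(html As String) As String
        Return ImgTag.Replace(html,
            Function(tag As Match) As String
                ' Attribute names already seen in this tag.
                Dim seen As New HashSet(Of String)(StringComparer.OrdinalIgnoreCase)
                Return SizeAttr.Replace(tag.Value,
                    Function(attr As Match) As String
                        ' HashSet.Add returns False for a name that was seen before.
                        Return If(seen.Add(attr.Groups(1).Value), attr.Value, "")
                    End Function)
            End Function)
    End Function
End Module
```

It would be called as `Dim cleaned As String = RemoveDuplicateSizeAttributes(txtPosted)`. Because regular expressions do not understand HTML structure, unquoted or single-quoted attributes slip through unchanged, so the parser-based version above is likely the more robust choice.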

Upvotes: 1
