itsraja

Reputation: 1736

vbscript create-convert xml with special characters

I'm creating an XML file from a .vbs script, with node values like the following:

  <car>David's</car>
  <company>Mannar & Co.</company>

While parsing this XML, I run into problems with & and similar characters.

I want to replace all XML special characters with their encoded equivalents (with a function or something similar), so that I get the original content back when parsing.

Thank you.

Upvotes: 3

Views: 8623

Answers (3)

peter

Reputation: 42207

Based on the OP's comment, here is a version I wrote myself, since I couldn't find a reliable one. I think it covers the special characters as well as the extended characters 160 to 255.

Function HTML_Encode(byVal string)
  Dim tmp, i 
  tmp = string

  tmp = Replace(tmp, chr(38), "&amp;") ' Must be the first replacement

  For i = 160 to 255
    tmp = Replace(tmp, chr(i), "&#" & i & ";")
  Next
  tmp = Replace(tmp, chr(34), "&quot;")
  tmp = Replace(tmp, chr(39), "&apos;")
  tmp = Replace(tmp, chr(60), "&lt;")
  tmp = Replace(tmp, chr(62), "&gt;")
  tmp = Replace(tmp, chr(32), "&nbsp;") ' Note: &nbsp; is an HTML entity, not one of the five predefined XML entities
  HTML_Encode = tmp
End Function

Function HTML_Decode(byVal encodedstring)
  Dim tmp, i
  tmp = encodedstring
  tmp = Replace(tmp, "&quot;", chr(34))
  tmp = Replace(tmp, "&apos;", chr(39))
  tmp = Replace(tmp, "&lt;"  , chr(60))
  tmp = Replace(tmp, "&gt;"  , chr(62))
  tmp = Replace(tmp, "&nbsp;", chr(32))
  For i = 160 to 255
    tmp = Replace(tmp, "&#" & i & ";", chr(i))
  Next
  tmp = Replace(tmp, "&amp;" , chr(38)) ' Must be the last replacement, mirroring HTML_Encode
  HTML_Decode = tmp
End Function

str = "This !@#± is a & test!"
wscript.echo HTML_Encode(str) '=> This&nbsp;!@#&#177;&nbsp;is&nbsp;a&nbsp;&amp;&nbsp;test!
wscript.echo HTML_Decode(HTML_Encode(str)) '=> This !@#± is a & test!
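
Since the question is about XML rather than HTML, a narrower variant may be enough. The sketch below escapes only the five predefined XML entities and leaves spaces alone, because an XML parser will normally reject &nbsp; unless a DTD declares it. The name XML_Encode is hypothetical and does not come from any library.

```vbscript
' Hypothetical helper: escapes only the five predefined XML entities.
Function XML_Encode(ByVal s)
  Dim tmp
  tmp = Replace(s, "&", "&amp;")      ' must come first
  tmp = Replace(tmp, "<", "&lt;")
  tmp = Replace(tmp, ">", "&gt;")
  tmp = Replace(tmp, """", "&quot;")  ' """" is a string holding one double quote
  tmp = Replace(tmp, "'", "&apos;")
  XML_Encode = tmp
End Function

WScript.Echo "<car>" & XML_Encode("David's") & "</car>"              ' <car>David&apos;s</car>
WScript.Echo "<company>" & XML_Encode("Mannar & Co.") & "</company>" ' <company>Mannar &amp; Co.</company>
```

A conforming XML parser turns these entities back into the original characters, so no matching decode function should be needed on the reading side.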

Upvotes: 0

Rotem Varon

Reputation: 1647

This is an old post, but I am replying in the hope that it saves someone some grief.

I was working on an issue where a vendor complained that in some cases not all special characters were being escaped in the XML. I was surprised to see that the developer had used their own logic (a function) rather than functionality offered by the framework, since escaping sounds like a very common task. The following is the function before the fix:

Function HTML_Encode(byVal string)
  Dim tmp, i 
  tmp = string
  For i = 160 to 255
    tmp = Replace(tmp, chr(i), "&#" & i & ";")
  Next
  tmp = Replace(tmp, chr(34), "&quot;")
  tmp = Replace(tmp, chr(39), "&apos;")
  tmp = Replace(tmp, chr(60), "&lt;")
  tmp = Replace(tmp, chr(62), "&gt;")
  tmp = Replace(tmp, chr(38), "&amp;") ' <- the problem: this line should be the first replacement
  tmp = Replace(tmp, chr(32), "&nbsp;")
  HTML_Encode = tmp
End Function

Funnily enough, it looks exactly like one of the answers to this post (probably copied from here :-).

I traced the problem to the order in which the special characters are replaced. Replacing the ampersand (&) MUST be the first replacement, because the later replacements (such as &quot;) inject ampersands of their own, which would in turn be replaced by &amp;. For example, take the string: We <3 SO. Ignoring the space replacement, the original function above escapes it to: We &amp;lt;3 SO. The correct escaping is: We &lt;3 SO.
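
The ordering effect described above can be reproduced with a reduced sketch that handles only & and <. The two function names are hypothetical and exist only to contrast the two orders.

```vbscript
' Hypothetical helpers that differ only in where "&" is replaced.
Function EncodeAmpLast(ByVal s)
  s = Replace(s, "<", "&lt;")
  s = Replace(s, "&", "&amp;")   ' also hits the "&" that "&lt;" just introduced
  EncodeAmpLast = s
End Function

Function EncodeAmpFirst(ByVal s)
  s = Replace(s, "&", "&amp;")   ' only ampersands from the input exist at this point
  s = Replace(s, "<", "&lt;")
  EncodeAmpFirst = s
End Function

WScript.Echo EncodeAmpLast("We <3 SO")    ' We &amp;lt;3 SO
WScript.Echo EncodeAmpFirst("We <3 SO")   ' We &lt;3 SO
```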

So the revised function can be:

  Function HTML_Encode(byVal string)
    Dim tmp, i
    tmp = string

    tmp = Replace(tmp, chr(38), "&amp;") ' Must be the first replacement (Thanks Aaron)

    For i = 160 to 255
      tmp = Replace(tmp, chr(i), "&#" & i & ";")
    Next

    tmp = Replace(tmp, chr(34), "&quot;")
    tmp = Replace(tmp, chr(39), "&apos;")
    tmp = Replace(tmp, chr(60), "&lt;")
    tmp = Replace(tmp, chr(62), "&gt;")
    tmp = Replace(tmp, chr(32), "&nbsp;")
    HTML_Encode = tmp
  End Function

For completeness, the predefined entities in XML are &amp;, &lt;, &gt;, &quot; and &apos;.
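
As an alternative to hand-written replacement, the escaping can be left to an XML library. The following is a minimal sketch that assumes the MSXML2.DOMDocument COM class is available on the machine. The root element name "cars" is made up for illustration.

```vbscript
' Sketch: let MSXML escape the text instead of doing string replacement.
' Assumes the MSXML2.DOMDocument COM class is registered on this machine.
Dim doc, root, node
Set doc = CreateObject("MSXML2.DOMDocument")

Set root = doc.createElement("cars")   ' hypothetical root element name
doc.appendChild root

Set node = doc.createElement("company")
node.text = "Mannar & Co."             ' raw text, no manual escaping
root.appendChild node

WScript.Echo doc.xml                   ' the ampersand is serialized as &amp;
WScript.Echo node.text                 ' reading the node returns the raw text again
```

With this approach the document is built and read through the DOM, so the order-of-replacement problem does not arise.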

Upvotes: 8

peter

Reputation: 42207

My keys weren't cold yet when I found another one. I'm posting it as a separate answer because the output is slightly different, so you can choose whichever suits you best. I removed the original answer to avoid confusion.

Function Escape(s) 
  Dim scr
  Set scr = CreateObject("MSScriptControl.ScriptControl")
  scr.Language = "VBScript"
  scr.Reset
  Escape = scr.Eval("escape(""" & Replace(s, """", """""") & """)") ' double embedded quotes so they survive Eval
End Function

Function Unescape(s)
  Dim scr
  Set scr = CreateObject("MSScriptControl.ScriptControl")
  scr.Language = "VBScript"
  scr.Reset
  Unescape = scr.Eval("unescape(""" & Replace(s, """", """""") & """)") ' double embedded quotes so they survive Eval
End Function

wscript.echo Escape("This !@#± is a & test!") '=> This%20%21@%23%B1%20is%20a%20%26%20test%21
wscript.echo Unescape(Escape("This !@#± is a & test!")) '=> This !@#± is a & test!

Upvotes: 0
