Reputation: 1736
I'm creating an XML file from a .vbs script, with node values like the following:
<company>Mannar & Co.</company>
When parsing this XML, I run into problems with & and similar characters.
I want to convert all XML special characters to their encoded equivalents (with a function or something similar) so that parsing gives back the original content.
Thanking you.
Upvotes: 3
Views: 8623
Reputation: 42207
Based on the OP's comment, here is a version I wrote myself, since I couldn't find a reliable one. I think it covers all the relevant ASCII and extended-ASCII characters.
Function HTML_Encode(byVal string)
    Dim tmp, i
    tmp = string
    tmp = Replace(tmp, chr(38), "&amp;") ' Must be the first replacement
    For i = 160 to 255
        tmp = Replace(tmp, chr(i), "&#" & i & ";")
    Next
    tmp = Replace(tmp, chr(34), "&quot;")
    tmp = Replace(tmp, chr(39), "&apos;")
    tmp = Replace(tmp, chr(60), "&lt;")
    tmp = Replace(tmp, chr(62), "&gt;")
    tmp = Replace(tmp, chr(32), "&nbsp;") ' HTML-only entity; not predefined in XML
    HTML_Encode = tmp
End Function
Function HTML_Decode(byVal encodedstring)
    Dim tmp, i
    tmp = encodedstring
    tmp = Replace(tmp, "&quot;", chr(34))
    tmp = Replace(tmp, "&apos;", chr(39))
    tmp = Replace(tmp, "&lt;", chr(60))
    tmp = Replace(tmp, "&gt;", chr(62))
    tmp = Replace(tmp, "&nbsp;", chr(32))
    For i = 160 to 255
        tmp = Replace(tmp, "&#" & i & ";", chr(i))
    Next
    tmp = Replace(tmp, "&amp;", chr(38)) ' Must be the last replacement
    HTML_Decode = tmp
End Function
str = "This !@#± is a & test!"
wscript.echo HTML_Encode(str) '=> This&nbsp;!@#&#177;&nbsp;is&nbsp;a&nbsp;&amp;&nbsp;test!
wscript.echo HTML_Decode(HTML_Encode(str)) '=> This !@#± is a & test!
Upvotes: 0
Reputation: 1647
This is an old post, but I am replying in the hope that it will save someone some grief.
I was working on an issue where a vendor complained that, in some cases, not all special characters were being escaped in the XML. I was surprised to see that the developer had used their own logic (a function) rather than functionality offered by the framework, since escaping sounds like a very common task. The following is the function before the fix:
Function HTML_Encode(byVal string)
    Dim tmp, i
    tmp = string
    For i = 160 to 255
        tmp = Replace(tmp, chr(i), "&#" & i & ";")
    Next
    tmp = Replace(tmp, chr(34), "&quot;")
    tmp = Replace(tmp, chr(39), "&apos;")
    tmp = Replace(tmp, chr(60), "&lt;")
    tmp = Replace(tmp, chr(62), "&gt;")
    tmp = Replace(tmp, chr(38), "&amp;") ' <- the problem: this line should be the first replacement
    tmp = Replace(tmp, chr(32), "&nbsp;")
    HTML_Encode = tmp
End Function
Funnily enough, it looks exactly like one of the answers to this post (probably copied from here :-).
I traced the problem to the order in which the special characters are replaced. Replacing the ampersand (&) MUST be the first replacement, because the other replacements (like &quot;) inject ampersands, which would in turn be replaced by &amp;. For example, take the following string: We <3 SO. The original function above escapes it to: We &amp;lt;3 SO. The correct escaping is: We &lt;3 SO
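The ordering effect can be reproduced in isolation. The following is only a minimal sketch: EscapeAmpLast and EscapeAmpFirst are hypothetical helper names made up for this illustration, and they handle just & and < to keep the comparison short.

```vbscript
' Hypothetical helpers for illustration only; they are not the function from this post.

' Ampersand replaced last: the & injected by "&lt;" gets escaped a second time.
Function EscapeAmpLast(s)
    Dim tmp
    tmp = Replace(s, "<", "&lt;")
    tmp = Replace(tmp, "&", "&amp;")
    EscapeAmpLast = tmp
End Function

' Ampersand replaced first: only ampersands from the input are touched.
Function EscapeAmpFirst(s)
    Dim tmp
    tmp = Replace(s, "&", "&amp;")
    tmp = Replace(tmp, "<", "&lt;")
    EscapeAmpFirst = tmp
End Function

WScript.Echo EscapeAmpLast("We <3 SO")   ' We &amp;lt;3 SO
WScript.Echo EscapeAmpFirst("We <3 SO")  ' We &lt;3 SO
```

Run it with cscript to get console output instead of message boxes.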
So the revised function can be:
Function HTML_Encode(byVal string)
    Dim tmp, i
    tmp = string
    tmp = Replace(tmp, chr(38), "&amp;") ' <- Must be the first replacement (Thanks Aaron)
    For i = 160 to 255
        tmp = Replace(tmp, chr(i), "&#" & i & ";")
    Next
    tmp = Replace(tmp, chr(34), "&quot;")
    tmp = Replace(tmp, chr(39), "&apos;")
    tmp = Replace(tmp, chr(60), "&lt;")
    tmp = Replace(tmp, chr(62), "&gt;")
    tmp = Replace(tmp, chr(32), "&nbsp;")
    HTML_Encode = tmp
End Function
For completeness, see the list of predefined entities in XML (&amp;, &lt;, &gt;, &quot; and &apos;).
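As for letting the framework do the escaping, one option is to build the node through the MSXML DOM rather than concatenating strings. This is only a sketch, not what the code above does: it assumes the MSXML2.DOMDocument ProgID is registered on the machine, and it reuses the company element from the question.

```vbscript
' Sketch: let the MSXML serializer escape the text (assumes MSXML is installed).
Dim doc, node
Set doc = CreateObject("MSXML2.DOMDocument")
Set node = doc.createElement("company")
node.text = "Mannar & Co."              ' raw text, no manual encoding
doc.appendChild node

WScript.Echo node.xml                   ' <company>Mannar &amp; Co.</company>
WScript.Echo doc.documentElement.text   ' Mannar & Co.
```

With this approach the parser returns the original content from the text property, so no hand-written decode step is needed.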
Upvotes: 8
Reputation: 42207
My keys weren't cold yet when I found another one. I'm giving this as a separate answer because the output is slightly different, so you can choose whichever suits you best. I removed the original answer to avoid confusion.
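' Note: escape/unescape produce %XX percent-encoding, not XML entities.
Function Escape(s)
    Dim scr
    Set scr = CreateObject("MSScriptControl.ScriptControl")
    scr.Language = "VBScript"
    ' Double any embedded quotes so the evaluated string literal stays valid
    Escape = scr.Eval("escape(""" & Replace(s, """", """""") & """)")
End Function
Function Unescape(s)
    Dim scr
    Set scr = CreateObject("MSScriptControl.ScriptControl")
    scr.Language = "VBScript"
    ' Double any embedded quotes so the evaluated string literal stays valid
    Unescape = scr.Eval("unescape(""" & Replace(s, """", """""") & """)")
End Function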
wscript.echo Escape("This !@#± is a & test!") '=> This%20%21@%23%B1%20is%20a%20%26%20test%21
wscript.echo Unescape(Escape("This !@#± is a & test!")) '=> This !@#± is a & test!
Upvotes: 0