Hamamelis

Reputation: 2115

How to encode special characters in XML

My string XML contains a whole series of special characters:

&egrave;
&rsquo;
&rsquo;
&rsquo;
&ldquo;
&rdquo;
&rsquo
&agrave;
&agrave;

I need to replace these special characters before inserting the string into the DB. I tried System.Net.WebUtility.HtmlEncode without success. Can you help me?

string sql = "insert into rss (title, description, link, pubdate) values (?,?,?, " +
             " STR_TO_DATE(?, '%a, %d %b %Y %H:%i:%s GMT'));";

OdbcCommand command;
OdbcDataAdapter adpter = new OdbcDataAdapter();
connection.Open();
command = new OdbcCommand(sql, connection);
command.Parameters.AddWithValue("param1", System.Net.WebUtility.HtmlEncode(xmlTitle.InnerText.ToString()));
command.Parameters.AddWithValue("param2", System.Net.WebUtility.HtmlEncode(xmlDescription.InnerText.ToString()));
command.Parameters.AddWithValue("param3", System.Net.WebUtility.HtmlEncode(xmlLink.InnerText.ToString()));
command.Parameters.AddWithValue("param4", System.Net.WebUtility.HtmlEncode(xmlPubDate.InnerText.ToString()));
adpter.InsertCommand = command;
adpter.InsertCommand.ExecuteNonQuery();
connection.Close();

Upvotes: 21

Views: 87197

Answers (9)

Anton Lasevich

Reputation: 196

Quick and dirty:

using System.Xml.Linq; // needed for XElement

public static class StringExtensions
{
    public static string EscapeForXml(this string value)
    {
        if (string.IsNullOrEmpty(value))
        {
            return value;
        }
        return new XElement("temp", value).LastNode.ToString();
    }
}

Upvotes: 0

ogggre

Reputation: 2266

A ready-to-use XML escape function for .NET 5+:

[return: NotNullIfNotNull(nameof(s))]
static string? XmlEscape(string? s)
{
    if (string.IsNullOrEmpty(s))
        return s;

    var node = new XElement("X") { Value = s };
    return node.ToString()[3..^4];
}

Usage example:

Console.WriteLine(XmlEscape("Hello < & >"));

The produced output:

Hello &lt; &amp; &gt;

Upvotes: 2

Pankaj Bajad

Reputation: 31

You can use System.Xml.Linq.XElement to encode special characters in XML.

Like this:

var val = "test&<";
var node = new XElement("Node");
node.Value = val ?? node.Value;
Console.WriteLine(node.ToString());

Output:

<Node>test&amp;&lt;</Node>

Upvotes: 3

Solarev Sergey

Reputation: 323

Simple code:

    public static string ToXmlStr(string value) => String.IsNullOrEmpty(value) ? "" : value.Replace("&", "&amp;").Replace("'", "&apos;").Replace("\"", "&quot;").Replace(">", "&gt;").Replace("<", "&lt;");

    public static string FromXmlStr(string xmlStr) => String.IsNullOrEmpty(xmlStr) ? "" : xmlStr.Replace("&apos;", "'").Replace("&quot;", "\"").Replace("&gt;", ">").Replace("&lt;", "<").Replace("&amp;", "&");

    public static string ToMultilineXmlStr(string value) => String.IsNullOrEmpty(value) ? "" :
        value.Replace("\r", "").Split('\n').Aggregate(new StringBuilder(), (s, n) => s.Append("<p>").Append(ToXmlStr(n)).Append("</p>\n")).ToString();

Please note: for multiline values in XML you usually need to wrap each line in a <p> tag. So "<'&A'>\n<'&B'>" => "<p>&lt;&apos;&amp;A&apos;&gt;</p>\n<p>&lt;&apos;&amp;B&apos;&gt;</p>\n"

Upvotes: 0

Nathanael Istre

Reputation: 125

There are three other ways to do this besides what you tried:

  1. Use string.Replace() 5 times
  2. Use System.Web.HttpUtility.HtmlEncode()
  3. System.Xml.XmlTextWriter

I could explain each case but I found this link to be mightily useful.

Upvotes: 4

Dmytro Khmara

Reputation: 1220

You can use a native .NET method for escaping special characters in text. Sure, there's only like 5 special characters, and 5 Replace() calls would probably do the trick, but I'm sure there's got to be something built-in.

Example of converting "&" to "&amp;"

Much to my relief, I discovered a native method hidden away in the bowels of the SecurityElement class. Yes, that's right: SecurityElement.Escape(string s) will escape your string and make it XML safe.

This is important because data copied or written to InfoPath text fields must first be escaped, so that a literal "&" becomes the entity "&amp;".

Invalid XML characters and their replacements:

"<" to "&lt;"
">" to "&gt;"
"\"" to "&quot;"
"'" to "&apos;"
"&" to "&amp;"

Namespace is "System.Security". Refer : http://msdn2.microsoft.com/en-us/library/system.security.securityelement.escape(VS.80).aspx
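
For illustration, here is a minimal sketch of calling SecurityElement.Escape. The class name XmlEscapeDemo and the sample string are made up for this example; only SecurityElement.Escape itself comes from the framework.

```csharp
using System;
using System.Security;

public static class XmlEscapeDemo
{
    // Hypothetical thin wrapper; the real work is done by SecurityElement.Escape,
    // which replaces <, >, ", ' and & with their XML entities.
    public static string Escape(string text) => SecurityElement.Escape(text);

    public static void Main()
    {
        string raw = "Tom & Jerry's <\"show\">";
        // prints: Tom &amp; Jerry&apos;s &lt;&quot;show&quot;&gt;
        Console.WriteLine(Escape(raw));
    }
}
```

Note that this only handles the five characters listed above; it does not turn characters such as "è" into named entities.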

The other option is to write custom code:

public static string EscapeXml( this string s )
{
  string toxml = s;
  if ( !string.IsNullOrEmpty( toxml ) )
  {
    // replace literal values with entities
    toxml = toxml.Replace( "&", "&amp;" );
    toxml = toxml.Replace( "'", "&apos;" );
    toxml = toxml.Replace( "\"", "&quot;" );
    toxml = toxml.Replace( ">", "&gt;" );
    toxml = toxml.Replace( "<", "&lt;" );
  }
  return toxml;
}

public static string UnescapeXml( this string s )
{
  string unxml = s;
  if ( !string.IsNullOrEmpty( unxml ) )
  {
    // replace entities with literal values
    unxml = unxml.Replace( "&apos;", "'" );
    unxml = unxml.Replace( "&quot;", "\"" );
    unxml = unxml.Replace( "&gt;", ">" );
    unxml = unxml.Replace( "&lt;", "<" );
    unxml = unxml.Replace( "&amp;", "&" );
  }
  return unxml;
}

Upvotes: 26

Mohammad J Qureshi

Reputation: 21

The statement toxml = toxml.Replace( "&", "&amp;" ); has to be done first.

Otherwise, if it is called last, it replaces the "&" inside the entities produced by the earlier calls (for example "&apos;" or "&quot;") with "&amp;", so they end up double-escaped.
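
A small sketch of why the order matters. The class and method names (EscapeOrderDemo, AmpersandFirst, AmpersandLast) are invented for this example, and only two of the five replacements are shown to keep it short.

```csharp
using System;

public static class EscapeOrderDemo
{
    // Correct order: "&" is escaped before any entity is introduced.
    public static string AmpersandFirst(string s) =>
        s.Replace("&", "&amp;").Replace("<", "&lt;");

    // Wrong order: the "&" of the freshly created "&lt;" gets escaped again.
    public static string AmpersandLast(string s) =>
        s.Replace("<", "&lt;").Replace("&", "&amp;");

    public static void Main()
    {
        Console.WriteLine(AmpersandFirst("a < b")); // prints: a &lt; b
        Console.WriteLine(AmpersandLast("a < b"));  // prints: a &amp;lt; b
    }
}
```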

Upvotes: 1

Dallas

Reputation: 504

You can use HttpUtility.HtmlDecode or, with .NET 4.0+, WebUtility.HtmlDecode.

Upvotes: 20

BRAHIM Kamel

Reputation: 13755

Instead of System.Net.WebUtility.HtmlEncode you have to use System.Net.WebUtility.HtmlDecode
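
As a rough sketch of what that looks like, assuming the feed text contains HTML named entities like the ones in the question (the class name EntityDecodeDemo and the sample title are made up for illustration):

```csharp
using System;
using System.Net;

public static class EntityDecodeDemo
{
    // Hypothetical helper: turns entities such as &egrave; or &rsquo;
    // back into the characters they stand for.
    public static string Decode(string text) => WebUtility.HtmlDecode(text);

    public static void Main()
    {
        string title = "Caff&egrave; &ldquo;all&rsquo;italiana&rdquo;";
        string decoded = Decode(title);

        // decoded now holds real characters (è, curly quotes, apostrophe)
        // instead of entity text.
        Console.WriteLine(decoded);
    }
}
```

In the question's code the decoded value would then go straight into command.Parameters.AddWithValue(...); since the query is parameterized, no further escaping should be needed for the insert.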

Upvotes: 6
