Reputation: 7504
Other ASCII codes are doing the same thing.
Just to give you some background, these codes are part of the HTML that I'm reading from WordPress blog posts. I'm porting them over to BlogEngine.NET using a little C# WinForm app I wrote. Do I need to do some kind of conversion as I port them over to BlogEngine.NET (as XML files)?
It'd sure be nice if they just displayed properly without any intervention on my part.
Here's a code fragment from one of the WordPress source pages:
<link rel="alternate" type="application/rss+xml" title="INRIX® Traffic » Taking the &#8220;E&#8221; out of your &#8220;ETA&#8221; Comments Feed" href="http://www.inrixtraffic.com/blog/2012/taking-the-e-out-of-your-eta/feed/" />
Here's the corresponding chunk of XML that's in the XML file I output during the conversion:
<title>Taking the &amp;#8220;E&amp;#8221; out of your &amp;#8220;ETA&amp;#8221;</title>
UPDATE.
Tried this, but still no dice.
writer.WriteElementString("title", string.Format("<![CDATA[{0}]]>", post.Title));
...outputs this:
<title>&lt;![CDATA[Taking the &amp;#8220;E&amp;#8221; out of your &amp;#8220;ETA&amp;#8221;]]&gt;</title>
Upvotes: 1
Views: 11532
Reputation: 801
Since the data you are getting from WordPress is already HTML-encoded, you can decode it to a regular string first and then let the XmlWriter encode it properly for XML.
string input = "Taking the &#8220;E&#8221; out of your &#8220;ETA&#8221;";
string decoded = System.Net.WebUtility.HtmlDecode(input);
// decoded = Taking the “E” out of your “ETA”
This may not be very efficient, but since this sounds like a one-time conversion, I don't think it will be an issue.
A similar question was asked here: How can I decode HTML characters in C#?
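To show how the decode step fits together with the writer, here is a minimal sketch. The class and method names (TitleExport, WriteTitle) are made up for illustration, and it writes to a StringBuilder rather than a file so the result is easy to inspect; your converter presumably writes to a file instead.

```csharp
using System;
using System.Net;
using System.Text;
using System.Xml;

// Hypothetical helper, not part of the original converter.
public static class TitleExport
{
    // Decodes HTML character references, then lets XmlWriter
    // apply whatever escaping XML itself requires.
    public static string WriteTitle(string htmlEncodedTitle)
    {
        // "&#8220;" becomes the actual left double quotation mark, and so on.
        string decoded = WebUtility.HtmlDecode(htmlEncodedTitle);

        var sb = new StringBuilder();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            // XmlWriter escapes &, < and > here; the curly quotes need no escaping.
            writer.WriteElementString("title", decoded);
        }
        return sb.ToString();
    }
}
```

Calling TitleExport.WriteTitle with the encoded title from the question should give a title element that contains the real curly quotes, which then display correctly as long as the XML file is saved in an encoding that can represent them, such as UTF-8.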
Upvotes: 3
Reputation: 1199
Any chance CDATA tags solve the issue? Just make sure the text is correct in the source XML file. You don't need the ampersand magic (in the source) if you use CDATA tags.
<some_tag><![CDATA[Taking the “ out of your ...]]></some_tag>
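If you go the CDATA route from C#, note that passing a "<![CDATA[...]]>" string to WriteElementString gets escaped like any other text. A sketch of the alternative is below, using XmlWriter.WriteCData; the class and method names are hypothetical. It assumes the title already contains the real characters, because character references such as &#8220; are not expanded inside a CDATA section.

```csharp
using System;
using System.Text;
using System.Xml;

// Hypothetical helper for illustration only.
public static class CDataExport
{
    // Writes <title> with its text wrapped in a CDATA section.
    public static string WriteTitleAsCData(string plainTitle)
    {
        var sb = new StringBuilder();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            writer.WriteStartElement("title");
            // WriteCData emits the <![CDATA[ ... ]]> markers itself,
            // so the text is passed without them.
            writer.WriteCData(plainTitle);
            writer.WriteEndElement();
        }
        return sb.ToString();
    }
}
```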
Upvotes: 0
Reputation: 8640
As I pointed out in my comment above: Your problem is that your &#8220; gets encoded into &amp;#8220;. When you output this in the browser, it displays as the literal text &#8220; instead of the quotation mark.
I don't know how your porting works, but to fix this issue, you need to make sure that the & in the character codes doesn't get encoded to &amp;.
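If the port uses an XmlWriter, one way to keep the & from being encoded again is XmlWriter.WriteRaw, which writes the string without escaping it. This is only a sketch with made-up names, and it assumes the title contains nothing but text and numeric character references; named HTML entities such as &raquo; are not defined in XML and would make the file malformed if passed through raw.

```csharp
using System;
using System.Text;
using System.Xml;

// Hypothetical helper for illustration only.
public static class RawExport
{
    // Writes the already-escaped title into <title> without escaping it again.
    public static string WriteTitleRaw(string alreadyEscapedTitle)
    {
        var sb = new StringBuilder();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            writer.WriteStartElement("title");
            // WriteRaw performs no escaping and no validation of the string,
            // so the caller is responsible for it being valid XML content.
            writer.WriteRaw(alreadyEscapedTitle);
            writer.WriteEndElement();
        }
        return sb.ToString();
    }
}
```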
Upvotes: 0