mberube.Net

Reputation: 2170

How to decode an HTML-encoded character embedded in a JSON string

I have a small question about decoding special characters in a JSON result (in my case \x27, but it could be any valid HTML-encoded character). If the result doesn't contain any escaped characters, it works well; if it does, I get an "Unrecognized escape sequence" exception. I tried calling HttpUtility.HtmlDecode on the JSON string before deserializing it with JavaScriptSerializer, but that doesn't work: the character stays in its encoded form.

Here's a code snippet:

public IEnumerable<QuoteInfo> ParseJson(string json)
{
    System.Web.Script.Serialization.JavaScriptSerializer jss = new System.Web.Script.Serialization.JavaScriptSerializer();
    List<QuoteInfo> result = jss.Deserialize<List<QuoteInfo>>(System.Web.HttpUtility.HtmlDecode(json));
    return result;
}

I tried to use RegisterConverters to HtmlDecode every string encountered during deserialization, but I can't figure out how to use it properly.

How can I solve that problem?

As back2dos nicely explained, this problem wasn't related to HtmlDecode at all but to a malformed JSON string.

Upvotes: 1

Views: 13230

Answers (2)

back2dos

Reputation: 15623

ok, i have very superficial knowledge about C#, and none about the .NET API, but intuitively HtmlDecode should decode HTML entities (please excuse me if i'm wrong on that one) ... encoding is quite a b*tch, i know, so i will try to clearly explain the differences between what you have, what you tried, and what should work ...

the correct HTML entity would be &#x27; and not \x27 ... \x27 is a hexadecimal escape-sequence, as accepted by some lenient JSON decoders and many programming languages, but it is completely unrelated to HTML ...
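To make the difference concrete, here is a minimal sketch of what an HTML decoder does with each form. It uses System.Net.WebUtility.HtmlDecode instead of the System.Web.HttpUtility.HtmlDecode call from the question, only so that it runs without a reference to System.Web. The class name HtmlDecodeDemo and the sample strings are made up for illustration.

```csharp
using System;
using System.Net;

public static class HtmlDecodeDemo
{
    public static void Run()
    {
        // An HTML numeric character reference is decoded to an apostrophe.
        Console.WriteLine(WebUtility.HtmlDecode("O&#x27;Brien"));  // prints O'Brien

        // A backslash escape is not HTML, so the decoder returns it unchanged.
        Console.WriteLine(WebUtility.HtmlDecode(@"O\x27Brien"));   // prints O\x27Brien
    }
}
```

This is why running HtmlDecode over the JSON text before deserializing changes nothing: the input contains no HTML entities for it to decode.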

and also, it has nothing to do with JSON, which is the problem ... the JSON spec for strings does not allow hexadecimal escape-sequences (\xNN), only Unicode escape-sequences (\uNNNN), which is why the escape sequence is unrecognized and why using \u0027 instead should work ... now you could blindly replace \x with \u00 (this should work on input that is otherwise valid JSON, although in theory a string that contains an escaped backslash followed by x, i.e. \\x, would get damaged, but who cares ... :D)
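A slightly more careful version of that replacement can be sketched as follows. Instead of a blind string replace, it walks the text one backslash escape at a time, so an escaped backslash followed by x (\\x27) is left alone. The class name JsonEscapeFixer and the method name FixHexEscapes are hypothetical helpers, not part of any .NET API, and the sketch assumes the input is valid JSON apart from the \xNN escapes.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: rewrites non-standard \xNN escapes as \u00NN.
public static class JsonEscapeFixer
{
    // Matches a backslash plus either "x" and two hex digits (captured in
    // group 1) or any single character. Matching every escape, not only \x,
    // means "\\" is consumed as a pair and never mistaken for the start of \x.
    private static readonly Regex Escape =
        new Regex(@"\\(?:x([0-9A-Fa-f]{2})|.)", RegexOptions.Singleline);

    public static string FixHexEscapes(string json)
    {
        return Escape.Replace(json, m =>
            m.Groups[1].Success
                ? @"\u00" + m.Groups[1].Value  // \x27 becomes \u0027
                : m.Value);                    // any other escape is kept as-is
    }
}
```

In the ParseJson method from the question, the HtmlDecode call would be replaced by JsonEscapeFixer.FixHexEscapes(json) before the string is handed to jss.Deserialize.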

but personally, i think that if you have access to the source that produces this string, you should modify it so that it outputs valid JSON that matches the spec ...

greetz

back2dos

Upvotes: 4

Luke Schafer

Reputation: 9265

I'm not sure I understand the requirements, but you could try looking at System.Security.SecurityElement.Escape (that's what I'm using, I'm guessing that there's an unescape but don't have time now to check the api, have to go to a meeting)

Good luck

Upvotes: 0
