Arghya C

Reputation: 10068

Simple way to un-escape an escaped, unprintable encoded string returned by an MVC application

I have checked other similar questions on SO. They either propose using WebUtility.HtmlDecode(), replace the encoded parts character by character, or assume some known regex pattern. None of them answers this specific query.

I have a C# console application that posts some data to an MVC application. The message returned by the service is then written to a simple text file. When I write it to the file, the text looks like this:

"Something didn\u0027t work right while processing this request! \r\nSee detailed logs \u003e d:\\Sandboxes\\UGBNC\\Stage\\Logs\\ArgLog2087129002.log"

What I want is to convert escape sequences like \u0027, \r\n, \\ etc. back to the characters they represent, so that the text file is formatted properly (with real new lines, tabs, etc.). I don't know in advance which characters might appear, so I cannot handle them with string replace or regex replace; I need a generic solution.

The MVC service returns the data as JSON with Content-Type: application/json; charset=utf-8, and my client code is this:

try
{
    var request = WebRequest.Create(uri);
    //configure request details
    using (var response = (HttpWebResponse)request.GetResponse())
    using (var sr = new StreamReader(response.GetResponseStream()))
    {
        var message = sr.ReadToEnd();
        //process message
    }
}
catch (WebException wex) when (wex.Response != null)
{
    //error responses surface as WebException; the body is read from wex.Response
    using (var stream = wex.Response.GetResponseStream())
    using (var reader = new StreamReader(stream))
    {
        var message = reader.ReadToEnd(); //this is the encoded string
        File.AppendAllText("SomeTextFile.txt", message);
    }
}

What is the best/simplest way to do this?

Note: I don't want to replace them character by character, I want a generic solution.

Upvotes: 1

Views: 4241

Answers (1)

Arghya C

Reputation: 10068

From the links in the comments, I got a working solution, thanks to this post. In short, this works for now:

var unescapedString = System.Text.RegularExpressions.Regex.Unescape(escapedString);

Longer version: a few more details for those who might face similar issues.

This is a typical sample of the strings that I was trying to make sane (readable and printable):

"Something didn\u0027t work right while processing this request! \r\nSee detailed logs \u003e d:\\Sandboxes\\UGBNC\\Stage\\Logs\\ArgLog2087129002.log"

(1) Though the string came from a web response, it was not HTML but JSON. So the HTML decode methods, such as the newer WebUtility.HtmlDecode(str) or the older System.Web HttpUtility.HtmlDecode(str), did not work.

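(2) Sequences like \u0027 are Unicode escape sequences (this one is for the apostrophe '), but trying System.Text.Encoding.Unicode yielded no good result. (Maybe I missed the trick!) As far as I can tell, that class converts between bytes and text; it does not interpret escape sequences that are already spelled out as text.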

(3) Basically, what I needed was to convert sequences like \u0027, \r\n, \\ to the characters they stand for. For that, the System.Text.RegularExpressions.Regex.Unescape() method worked fine on my strings. It converts the escaped characters in the string to their unescaped form.

Note: Anyone using this method should read the MSDN documentation first. The method targets regular-expression escape syntax, so it has limitations: it is not perfect and might give wrong results in some scenarios.
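For illustration, here is a minimal sketch of the Regex.Unescape approach wrapped in a helper. The class and method names (EscapeHelper, Unescape) are made up for this example, and the sample text is a shortened version of the string above.

```csharp
using System.Text.RegularExpressions;

public static class EscapeHelper
{
    // Hypothetical helper: converts escape sequences such as \u0027, \r\n and \\
    // in the raw response text to the characters they represent.
    // Regex.Unescape throws ArgumentException when it meets an escape sequence
    // it does not recognize (for example a lone \S), so callers may want to
    // catch that and fall back to the raw text.
    public static string Unescape(string escaped)
    {
        return Regex.Unescape(escaped);
    }
}
```

Calling EscapeHelper.Unescape(message) just before File.AppendAllText would write the readable text. Note that the double quotes surrounding the response body are left in place, because Regex.Unescape knows nothing about JSON string delimiters.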

Check this & this for better solutions.
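Since the service returns JSON, one such alternative is to let a JSON parser decode the string. The sketch below is not necessarily what the linked posts suggest. It assumes the whole response body is a single JSON string literal (including its surrounding double quotes) and that System.Text.Json is available; on older frameworks a library such as Json.NET would play the same role. The names JsonStringDecoder and Decode are made up for this example.

```csharp
using System.Text.Json;

public static class JsonStringDecoder
{
    // Hypothetical helper: parses a JSON string literal such as
    // "didn\u0027t work \r\n..." and returns the decoded text.
    // Throws JsonException if the body is not valid JSON.
    public static string Decode(string body)
    {
        return JsonSerializer.Deserialize<string>(body);
    }
}
```

Unlike Regex.Unescape, this follows JSON escaping rules and also strips the surrounding quotes, so JsonStringDecoder.Decode(message) could be written to the file directly.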

Upvotes: 3
