Chad Schultz

Reputation: 7860

Strange characters in HTTP GET response

I'm trying to figure out why special characters in a JSON feed (which looks completely fine when viewed in a browser) break when used in my Android code. Characters with accent marks, ellipsis characters, curly quote characters and so on are replaced by other characters. Perhaps the text is being translated from UTF-8 down to ASCII? I'm not sure. I'm using a GET request to pull JSON data from a server, parsing it, storing it in a database, then passing it through Html.fromHtml() and placing the result in a TextView.

Upvotes: 0

Views: 1822

Answers (1)

Chad Schultz

Reputation: 7860

After much experimentation, I narrowed down the possibilities until I discovered that the problem is with the Ignition HTTP libraries (https://github.com/kaeppler/ignition), specifically with ignitedHttpResponse.getResponseBodyAsString().

Although that's a handy shortcut, that one line results in the broken characters. Instead, I now use:

InputStream contentStream = ignitedHttpResponse.getResponseBody();
String content = Util.inputStreamToString(contentStream);


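import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

public static String inputStreamToString(InputStream is) throws IOException {
    StringBuilder total = new StringBuilder();

    // Wrap a BufferedReader around the InputStream. The charset is named
    // explicitly (assuming the feed is UTF-8) instead of relying on the
    // platform default, which happens to be UTF-8 on Android.
    BufferedReader rd = new BufferedReader(new InputStreamReader(is, "UTF-8"));
    try {
        // Read the response until the end; readLine() drops line terminators
        String line;
        while ((line = rd.readLine()) != null) {
            total.append(line);
        }
    } finally {
        rd.close(); // also closes the underlying InputStream
    }

    // Return the full string
    return total.toString();
}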
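The helper above always decodes with one fixed charset. If the server does declare a charset in its Content-Type header, it may be safer to honor it and fall back to UTF-8 only when none is given. The following is a minimal sketch of that idea in plain Java; the class ContentTypeCharset and the method fromContentType are hypothetical names made up for illustration and are not part of Ignition or Android.

```java
import java.nio.charset.Charset;
import java.util.Locale;

public class ContentTypeCharset {

    // Hypothetical helper: returns the charset named in a Content-Type header
    // value, or the supplied fallback if the header is missing, has no
    // charset parameter, or names a charset this JVM does not support.
    static Charset fromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        // Parameters are separated by ';', e.g. "application/json; charset=UTF-8"
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Charset utf8 = Charset.forName("UTF-8");
        // Header declares a charset: use it
        System.out.println(fromContentType("application/json; charset=ISO-8859-1", utf8));
        // Header has no charset parameter: fall back to UTF-8
        System.out.println(fromContentType("application/json", utf8));
    }
}
```

The resulting Charset could then be passed to the InputStreamReader constructor in place of a hard-coded name.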

Edit: Adding more detail

Here is a minimum test case to reproduce the issue.

@Override
protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.test);

    activity = this;

    instance = this;

    String url = SaveConstants.URL;
    IgnitedHttpRequest request = new IgnitedHttp(activity).get(url);
    InputStream contentStream = null;
    try {
        IgnitedHttpResponse response = request.send();

        String badContent = response.getResponseBodyAsString();
        int start = badContent.indexOf("is Texas");
        Log.e(TAG, "bad content: " + badContent.substring(start, start + 10));
        contentStream = response.getResponseBody();
        String goodContent = Util.inputStreamToString(contentStream);
        start = goodContent.indexOf("is Texas");
        Log.e(TAG, "good content: " + goodContent.substring(start, start + 10));
    } catch (IOException ioe) {
        Log.e(TAG, "error", ioe);
    }
}

In the log:

bad content: is Texasâ
good content: is Texas’

Update: either I'm crazy, or the problem only occurs in the clients' production feed, not their development feed, although the contents look identical when viewed in a browser, showing "Texas’". So perhaps some wonky server configuration is required to cause this issue. Still, the fix when it occurs is as I outlined: I do not recommend using response.getResponseBodyAsString().
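One plausible explanation for the log output, offered as an assumption and not confirmed against the Ignition source or the server's headers: the feed's bytes are UTF-8, but the production server may not declare a charset, and the string shortcut may then decode them as ISO-8859-1 (the default that Apache HttpClient's EntityUtils.toString() uses when the entity names no charset). The "â" in the bad output matches that pattern, since the curly quote U+2019 is encoded in UTF-8 as the bytes E2 80 99, and E2 read as ISO-8859-1 is "â". The sketch below reproduces the effect in plain Java; CharsetDemo and decode are hypothetical names used only for this illustration.

```java
import java.nio.charset.Charset;

public class CharsetDemo {

    // Decode raw bytes using the named charset.
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // "Texas" followed by U+2019 (right single quotation mark), as UTF-8 bytes
        byte[] raw = "Texas\u2019".getBytes(Charset.forName("UTF-8"));

        String wrong = decode(raw, "ISO-8859-1"); // each byte becomes one char
        String right = decode(raw, "UTF-8");      // three bytes become one char

        System.out.println(raw.length);     // 8 (5 ASCII bytes + 3 bytes for U+2019)
        System.out.println(wrong.length()); // 8
        System.out.println(right.length()); // 6
        System.out.println(Integer.toHexString(wrong.charAt(5))); // e2
        System.out.println(Integer.toHexString(right.charAt(5))); // 2019
    }
}
```

The two characters after "â" in the wrongly decoded string are the control characters U+0080 and U+0099, which typically do not render, so only the "â" is visible in the log.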

Upvotes: 1
