Roman Nazarevych

Reputation: 7703

How to parse JSON which has escaped quotes with GSON

I have the following JSON: [{\"X\":24.0124010872935,\"Y\":49.7740722529036,\"Code\":\"0320\",\"Name\": .....]

I tried to parse it like this:

Gson gson = new Gson();
gson.fromJson(response.body(), RouteModel[].class);

And I got this exception:

Caused by: com.google.gson.stream.MalformedJsonException: Expected name at line 1 column 3 path $[0].

EDIT: So far the best solution has been to add the compile 'org.apache.commons:commons-lang3:3.5' dependency and use gson.fromJson(StringEscapeUtils.unescapeJson(response.body()), RouteModel[].class)

Alternatively, simply call replace("\\\"","\"") on the response body before parsing.
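The replace-based workaround can be tried in isolation, without Gson. The following is a minimal sketch; the class name QuoteUnescapeDemo and the method unescapeQuotes are made up for illustration, and it assumes the payload contains no escape sequences other than \" (a value holding a literal backslash would need a full JSON unescaper such as the StringEscapeUtils.unescapeJson call mentioned above).

```java
final class QuoteUnescapeDemo {

    // Replaces every backslash-quote pair (\") with a plain double quote (").
    // Assumption: the payload contains no other escape sequences.
    static String unescapeQuotes(final String raw) {
        return raw.replace("\\\"", "\"");
    }

    public static void main(final String[] args) {
        // The Java literal below is the text [{\"X\":24.0,\"Code\":\"0320\"}]
        final String raw = "[{\\\"X\\\":24.0,\\\"Code\\\":\\\"0320\\\"}]";
        System.out.println(raw);
        // The unescaped text is [{"X":24.0,"Code":"0320"}]
        System.out.println(unescapeQuotes(raw));
    }
}
```

The unescaped string can then be passed to gson.fromJson as usual.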

Upvotes: 4

Views: 6681

Answers (2)

Lyubomyr Shaydariv

Reputation: 21105

Oh, welcome to the gorgeous world of the SimpleRide API. :D I had a happy-fun-coding time when I first tried to solve that problem a year and a half ago, before launching my Android app. I suspect that those guys return such a string only so that the front end can pass it to JSON.parse. Therefore the easiest (but not the most efficient) way is to parse the responses as strings to "normalize" them and then parse the normalized JSON documents.

In order to parse your JSON (see the comments below), you need to present your JSON input stream as a JSON string literal input stream. This can be done easily by concatenating input streams.

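import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.util.Enumeration;
import java.util.Iterator;

import static java.util.Arrays.asList;

final class FixedInputStreams {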

    private static final byte[] e1DoubleQuoteArray = "\"".getBytes();

    private FixedInputStreams() {
    }

    static InputStream fixInputStream(final InputStream inputStream) {
        return concatInputStreams(
                new ByteArrayInputStream(e1DoubleQuoteArray),
                inputStream,
                new ByteArrayInputStream(e1DoubleQuoteArray)
        );
    }

    private static InputStream concatInputStreams(final InputStream... inputStreams) {
        return concatInputStreams(asList(inputStreams).iterator());
    }

    // Iterator and not an iterable by design
    private static InputStream concatInputStreams(final Iterator<? extends InputStream> inputStreamsIterator) {
        return new SequenceInputStream(asEnumeration(inputStreamsIterator));
    }

    private static <T> Enumeration<T> asEnumeration(final Iterator<T> iterator) {
        return new Enumeration<T>() {
            @Override
            public boolean hasMoreElements() {
                return iterator.hasNext();
            }

            @Override
            public T nextElement() {
                return iterator.next();
            }
        };
    }

}

All this class does is wrap such a malformed input stream in double quotes so that it looks like a JSON string literal. Thus, after fixInputStream, your JSON becomes a legal JSON string literal (the original response first, the wrapped result second):

[{\"X\":24.0124010872935,\"Y\":49.7740722529036,\"Code\":\"0320\",\"Name\": .....]

"[{\"X\":24.0124010872935,\"Y\":49.7740722529036,\"Code\":\"0320\",\"Name\": .....]"
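The effect of the wrapping can be checked without Gson. Below is a compact sketch of the same idea using Collections.enumeration instead of a hand-written Enumeration; the names QuoteWrapDemo, wrapInQuotes and readAll are hypothetical helpers for this illustration, and UTF-8 is assumed as the response encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class QuoteWrapDemo {

    // Surrounds the given stream with a leading and a trailing double quote.
    static InputStream wrapInQuotes(final InputStream in) {
        final byte[] quote = "\"".getBytes(StandardCharsets.UTF_8);
        final List<InputStream> parts = Arrays.asList(
                new ByteArrayInputStream(quote), in, new ByteArrayInputStream(quote));
        return new SequenceInputStream(Collections.enumeration(parts));
    }

    // Drains the stream into a string (UTF-8 assumed); for demonstration only.
    static String readAll(final InputStream in) {
        try {
            final ByteArrayOutputStream out = new ByteArrayOutputStream();
            final byte[] buffer = new byte[256];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (final IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(final String[] args) {
        // The Java literal below is the text [{\"X\":24.0}]
        final String raw = "[{\\\"X\\\":24.0}]";
        final InputStream in = new ByteArrayInputStream(raw.getBytes(StandardCharsets.UTF_8));
        // The wrapped text is "[{\"X\":24.0}]", including the surrounding quotes
        System.out.println(readAll(wrapInQuotes(in)));
    }
}
```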

Now you have to parse this string literal to extract the normalized JSON. StringWrapperTypeAdapterFactory is a synthetic Gson type adapter factory whose only responsibility is to parse the JSON string literal and then parse its content into well-formed DTOs.

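import com.google.gson.Gson;
import com.google.gson.TypeAdapter;
import com.google.gson.TypeAdapterFactory;
import com.google.gson.reflect.TypeToken;
import com.google.gson.stream.JsonReader;
import com.google.gson.stream.JsonWriter;

final class StringWrapperTypeAdapterFactory
        implements TypeAdapterFactory {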

    private final Gson realGson;

    private StringWrapperTypeAdapterFactory(final Gson realGson) {
        this.realGson = realGson;
    }

    static TypeAdapterFactory getStringWrapperTypeAdapterFactory(final Gson realGson) {
        return new StringWrapperTypeAdapterFactory(realGson);
    }

    @Override
    public <T> TypeAdapter<T> create(final Gson gson, final TypeToken<T> typeToken) {
        return new TypeAdapter<T>() {
            @Override
            public void write(final JsonWriter out, final T value) {
                throw new UnsupportedOperationException();
            }

            @Override
            public T read(final JsonReader in) {
                final String jsonDocument = realGson.fromJson(in, String.class);
                return realGson.fromJson(jsonDocument, typeToken.getType());
            }
        };
    }

}

So the idea is to turn the JSON string literal (first line) into the JSON document it contains (second line):

"[{\"X\":24.0124010872935,\"Y\":49.7740722529036,\"Code\":\"0320\",\"Name\": .....]"

[{"X":24.0124010872935,"Y":49.7740722529036,"Code":"0320","Name": .....]

A sample DTO class similar to what I have in my application source code:

final class NamedPoint {

    @SerializedName("X")
    final double longitude = Double.valueOf(0); // disabling primitives inlining

    @SerializedName("Y")
    final double latitude = Double.valueOf(0);

    @SerializedName("Code")
    final String code = null;

    @Override
    public String toString() {
        return '<' + code + "=(" + latitude + ',' + longitude + ")>";
        //                   ^__ See? Even this string is aware of the issue
    }

}

Finally, the general configuration and workflow look as follows:

static final Type namedPointListType = new TypeToken<List<NamedPoint>>() {
}.getType();

static final Gson realGson = new GsonBuilder()
        // ... your Gson configuration here ...
        .create();

static final Gson stringWrapperGson = new GsonBuilder()
        .registerTypeAdapterFactory(getStringWrapperTypeAdapterFactory(realGson))
        .create();
// or `new ByteArrayInputStream(jsonSource.getBytes())` to test quickly
final InputStream malformedInputStream = ...;
try ( final InputStream fixedInputStream = fixInputStream(malformedInputStream);
        final Reader jsonReader = new BufferedReader(new InputStreamReader(fixedInputStream))) {
    final List<NamedPoint> namedPoints = stringWrapperGson.fromJson(jsonReader, namedPointListType);
    out.println(namedPoints);
}

The output:

[<0320=(49.7740722529036,24.0124010872935)>]

A couple of comments regarding the SimpleRide API:

  • I'm not sure whether you still need to "fix" input streams by concatenating the double quotes, because the API now seems to wrap the response in quotes itself (due to JSON.parse?). You can check this easily with something like wget http://82.207.107.126:13541/SimpleRide/LAD/SM.WebApi/api/Schedule/?routeId=713032&code=0298. Maybe a certain Content-Type can adjust the response format?
  • Since StringWrapperTypeAdapterFactory creates an intermediate string that is parsed in a further step, it may be inefficient in terms of memory. To overcome this and reduce memory consumption during parsing, you could write a custom InputStream or Reader that is JSON-aware and strips the escape characters itself, so you wouldn't need StringWrapperTypeAdapterFactory or intermediate strings at all.

Edit:

As said above, a streaming style is better for such parsing because it avoids unnecessary intermediate objects. Although an InputStream is not a very appropriate place to process character data and a Reader fits the task better, a simple JSON-unescaping InputStream is easier to implement:

final class StringWrapperInputStream
        extends InputStream {

    private final InputStream inputStream;

    private State state = State.PRE_INIT;

    private StringWrapperInputStream(final InputStream inputStream) {
        this.inputStream = inputStream;
    }

    static InputStream getStringWrapperInputStream(final InputStream inputStream) {
        return new StringWrapperInputStream(inputStream);
    }

    @Override
    public int read()
            throws IOException {
        for ( ; ; ) {
            switch ( state ) {
            case PRE_INIT:
                final int chPreInit = inputStream.read();
                if ( chPreInit == -1 ) {
                    return -1;
                }
                if ( Character.isWhitespace(chPreInit) ) {
                    continue;
                }
                if ( chPreInit == '\"' ) {
                    state = State.IN_PROGRESS;
                } else {
                    throw new IllegalArgumentException("char=" + chPreInit);
                }
                continue;
            case IN_PROGRESS:
                final int chInProgress1 = inputStream.read();
                if ( chInProgress1 == -1 ) {
                    return -1;
                }
                if ( chInProgress1 == '\"' ) {
                    state = State.DONE;
                    continue;
                }
                if ( chInProgress1 != '\\' ) {
                    return chInProgress1;
                }
                final int chInProgress2 = inputStream.read();
                if ( chInProgress2 == -1 ) {
                    return -1;
                }
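                // Unescape \", \\ and \/ instead of silently dropping the escaped character
                if ( chInProgress2 == '\"' || chInProgress2 == '\\' || chInProgress2 == '/' ) {
                    return chInProgress2;
                }
                throw new IOException("Unsupported escape sequence: \\" + (char) chInProgress2);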
            case DONE:
                return -1;
            default:
                throw new AssertionError(state);
            }
        }
    }

    enum State {

        PRE_INIT,
        IN_PROGRESS,
        DONE

    }

}
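Since a Reader fits character data better, here is a rough Reader-based sketch of the same unwrapping idea. The class name JsonStringLiteralReader and the helper decode are invented for this illustration and are not part of Gson or of the original application. It assumes the input is a single JSON string literal and decodes the standard JSON escapes (\", \\, \/, \b, \f, \n, \r, \t and \uXXXX). Because it pulls one character at a time, the wrapped reader should be buffered.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class JsonStringLiteralReader extends Reader {

    private final Reader in;
    private boolean started;
    private boolean done;

    JsonStringLiteralReader(final Reader in) {
        this.in = in;
    }

    // Returns the next decoded character, or -1 after the closing quote or at end of input.
    private int readDecoded() throws IOException {
        if (done) {
            return -1;
        }
        if (!started) {
            int first;
            do {
                first = in.read();
            } while (first != -1 && Character.isWhitespace(first));
            if (first == -1) {
                done = true;
                return -1;
            }
            if (first != '"') {
                throw new IOException("Expected an opening quote, got char=" + first);
            }
            started = true;
        }
        final int ch = in.read();
        if (ch == -1 || ch == '"') {
            done = true;
            return -1;
        }
        if (ch != '\\') {
            return ch;
        }
        final int escaped = in.read();
        switch (escaped) {
        case '"': case '\\': case '/': return escaped;
        case 'b': return '\b';
        case 'f': return '\f';
        case 'n': return '\n';
        case 'r': return '\r';
        case 't': return '\t';
        case 'u':
            final char[] hex = new char[4];
            for (int i = 0; i < hex.length; i++) {
                final int h = in.read();
                if (h == -1) {
                    throw new IOException("Truncated \\u escape");
                }
                hex[i] = (char) h;
            }
            return Integer.parseInt(new String(hex), 16);
        default:
            throw new IOException("Unsupported escape, char=" + escaped);
        }
    }

    @Override
    public int read(final char[] buffer, final int offset, final int length) throws IOException {
        if (length == 0) {
            return 0;
        }
        int count = 0;
        while (count < length) {
            final int ch = readDecoded();
            if (ch == -1) {
                break;
            }
            buffer[offset + count++] = (char) ch;
        }
        return count == 0 ? -1 : count;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    // Convenience helper for quick experiments: decodes a whole literal held in memory.
    static String decode(final String literal) {
        try (Reader reader = new JsonStringLiteralReader(new StringReader(literal))) {
            final StringBuilder result = new StringBuilder();
            for (int ch = reader.read(); ch != -1; ch = reader.read()) {
                result.append((char) ch);
            }
            return result.toString();
        } catch (final IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}
```

Such a reader could be handed directly to realGson.fromJson(reader, namedPointListType), which would make both the quote concatenation and the intermediate string unnecessary.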

Upvotes: 1

Jordi Castilla

Reputation: 26961

Using disableHtmlEscaping should solve the problem without ugly workarounds. I also used setPrettyPrinting to get nicer output:

Gson gson = new GsonBuilder().setPrettyPrinting().disableHtmlEscaping().create();
gson.fromJson(response.body(), RouteModel[].class);

Upvotes: 2
