otus

Reputation: 5732

How to parse "streamed" json objects with json4s?

I have a streaming source that produces many JSON objects without separators (or with only whitespace in between). If I pass that to the json4s parse function, it only produces an AST for the first object.

As a workaround, I could parse it manually and either turn it into a JSON array by adding brackets and commas as appropriate or chunk it and call parse on each chunk.
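A minimal sketch of the chunking variant, using only the standard library. It assumes the whole input is already available as a String and that every top-level value is an object or an array; the names JsonChunker and splitTopLevel are made up for illustration and are not part of json4s.

```scala
object JsonChunker {
  /** Splits text holding concatenated top-level JSON objects/arrays
    * into one string per value. Nesting depth and string/escape state
    * are tracked so that braces inside string literals are ignored.
    * Assumption: top-level scalars (numbers, strings, ...) do not occur. */
  def splitTopLevel(text: String): Vector[String] = {
    val chunks = Vector.newBuilder[String]
    var depth = 0         // current nesting level of {} and []
    var inString = false  // true while inside a JSON string literal
    var escaped = false   // true right after a backslash inside a string
    var start = -1        // index where the current top-level value began
    var i = 0
    while (i < text.length) {
      val c = text.charAt(i)
      if (inString) {
        if (escaped) escaped = false
        else if (c == '\\') escaped = true
        else if (c == '"') inString = false
      } else c match {
        case '"' => inString = true
        case '{' | '[' =>
          if (depth == 0) start = i
          depth += 1
        case '}' | ']' if depth > 0 =>
          depth -= 1
          if (depth == 0) {
            // a complete top-level value ends here
            chunks += text.substring(start, i + 1)
            start = -1
          }
        case _ => () // whitespace and everything else is skipped
      }
      i += 1
    }
    chunks.result()
  }
}
```

Each returned chunk could then be handed to parse one at a time. The sketch does not validate the JSON, and for a truly unbounded stream the same state machine would have to run incrementally over the incoming characters instead of over a String.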

However, this is a rather common format, so I'm sure the problem is already solved. I just cannot find the API for it in the json4s documentation.

Upvotes: 2

Views: 710

Answers (1)

Andriy Plokhotnyuk

Reputation: 7989

If you are reading from an InputStream, wrap it in a BufferedInputStream and use mark(), read() and reset() calls to skip the whitespace between parse() calls:

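import java.io.{BufferedInputStream, FileInputStream}

val in = new BufferedInputStream(new FileInputStream("/tmp/your.json"))
try {
  var continue = true
  in.mark(1) // so that the first reset() returns to the start of the stream
  do {
    in.reset() // step back to the first byte of the next JSON value

    // <-- call parse here; it must consume exactly one JSON value from `in`

    // skip whitespace, or exit if EOF is found
    var b = 0
    do {
      in.mark(1) // remember the position before each byte that is read
      b = in.read()
      if (b < 0) continue = false
    } while (Character.isWhitespace(b))
  } while (continue)
} finally in.close()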

EDIT: Today I released version 0.11.0 of jsoniter-scala, which adds the ability to parse streaming JSON values or JSON arrays without needing to hold all values in memory.

Upvotes: 1
