Parsing json without using the \n character

Question

I am currently implementing an Akka Streams application in Scala which reads in a zipped file containing tweets formatted as below (using JSON):

{"created_at": "Mon Nov 04 14:37:29 +0000 2019", ... }
{"created_at": "Mon Nov 04 14:37:29 +0000 2019", ... }

I have already succeeded in reading and uncompressing the file, but I'm now trying to split the stream into chunks in such a way that each chunk contains the representation of one tweet, which corresponds to one line in the code snippet above.

I have tried using the following as a flow to achieve this:

Framing.delimiter(ByteString("\n"), 50000)

The problem, however, is that the JSON contains an attribute "full_text", representing the content of the tweet. This text can itself contain \n characters, so the above code snippet does not work: it also splits at those characters inside the text. Example below.

{"created_at": "Mon Nov 04 14:37:29 +0000 2019", "full_text": "I love to eat \n CHEESE!!", ... }

Does anyone know a good solution to this issue?

aventurin · Accepted Answer

It seems that Akka's JSON Framing is made for this purpose:

https://doc.akka.io/docs/alpakka/current/data-transformations/json.html
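A minimal sketch of how this could look, assuming that "JSON Framing" here refers to `JsonFraming.objectScanner` from core Akka Streams (`akka.stream.scaladsl`) and that Akka 2.6 or later is on the classpath. The object name, the `frameTweets` helper and the sample tweets are made up for illustration; the 50000 limit is simply carried over from the question. The scanner frames by counting braces rather than by looking for a delimiter, so an escaped \n inside "full_text" should not end a frame.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{JsonFraming, Sink, Source}
import akka.util.ByteString

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object TweetFramingSketch {

  // Hypothetical helper: turns a stream of raw bytes into one String per JSON object.
  // JsonFraming.objectScanner counts braces, so it does not depend on line breaks.
  def frameTweets(bytes: Source[ByteString, _])(implicit system: ActorSystem): Future[Seq[String]] =
    bytes
      .via(JsonFraming.objectScanner(50000)) // maximum object length, taken from the question
      .map(_.utf8String)
      .runWith(Sink.seq)

  def main(args: Array[String]): Unit = {
    // Assumes Akka 2.6+, where an implicit ActorSystem also provides the materializer.
    implicit val system: ActorSystem = ActorSystem("tweet-framing-sketch")

    // Made-up sample input: two tweets, the first with an escaped \n in "full_text".
    val input = ByteString(
      """{"created_at": "Mon Nov 04 14:37:29 +0000 2019", "full_text": "I love to eat \n CHEESE!!"}
        |{"created_at": "Mon Nov 04 14:37:29 +0000 2019", "full_text": "second tweet"}
        |""".stripMargin)

    try Await.result(frameTweets(Source.single(input)), 5.seconds).foreach(println)
    finally system.terminate()
  }
}
```

In the real application the `Source.single(input)` part would be replaced by the existing source that reads and uncompresses the file. Objects longer than the configured maximum fail the stream, so the limit may need to be raised for large tweets.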
