Astarno

Reputation: 446

Parsing JSON without using the \n character

I am currently implementing an Akka Streams application in Scala that reads a zipped file containing tweets, formatted as JSON with one tweet per line, as shown below:

{"created_at": "Mon Nov 04 14:37:29 +0000 2019", ... }
{"created_at": "Mon Nov 04 14:37:29 +0000 2019", ... }

I have already succeeded in reading and uncompressing the file, but I'm now trying to split the stream into chunks so that each chunk contains exactly one tweet, which corresponds to one line in the snippet above.

I have tried using the following flow to achieve this:

Framing.delimiter(ByteString("\n"), 50000)

The problem is that the JSON contains an attribute "full_text", which holds the content of the tweet. This text can itself contain \n characters, so the flow above also splits at those characters and breaks a single tweet into several chunks. Example below:

{"created_at": "Mon Nov 04 14:37:29 +0000 2019", "full_text": "I love to eat \n CHEESE!!", ... }

Does anyone know a good solution to this issue?

Upvotes: 1

Views: 175

Answers (1)

aventurin

Reputation: 2203

It seems that Akka's JSON Framing is made for this purpose:

https://doc.akka.io/docs/alpakka/current/data-transformations/json.html
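
A minimal sketch of how this could look, assuming Akka 2.6 or later (where an implicit ActorSystem supplies the materializer). It uses `JsonFraming.objectScanner` from akka-stream itself, which is not necessarily the API described on the linked page. The object name `TweetFraming`, the `frame` helper, and the sample input are made up for illustration. The scanner tracks braces and string quotes instead of looking for a delimiter, so a line break inside "full_text" should not end the frame.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{JsonFraming, Sink, Source}
import akka.util.ByteString

import scala.concurrent.Await
import scala.concurrent.duration._

object TweetFraming {

  // Hypothetical helper: frames a stream of arbitrary byte chunks into one
  // string per top-level JSON object. 50000 is the maximum object length,
  // reused from the Framing.delimiter attempt in the question.
  def frame(chunks: List[ByteString])(implicit system: ActorSystem): Seq[String] =
    Await.result(
      Source(chunks)
        .via(JsonFraming.objectScanner(50000))
        .map(_.utf8String)
        .runWith(Sink.seq),
      5.seconds
    )

  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem = ActorSystem("tweet-framing")

    // Stand-in for the decompressed file: two tweets, the first with a raw
    // line break inside "full_text", cut into arbitrary 16-byte chunks.
    val raw =
      "{\"created_at\": \"Mon Nov 04 14:37:29 +0000 2019\", \"full_text\": \"I love to eat \n CHEESE!!\"}\n" +
        "{\"created_at\": \"Mon Nov 04 14:37:29 +0000 2019\", \"full_text\": \"second\"}\n"

    frame(ByteString(raw).grouped(16).toList).foreach(println)
    system.terminate()
  }
}
```

In the real application the `Source(chunks)` stand-in would be replaced by the existing decompressed `ByteString` stream, with `JsonFraming.objectScanner(50000)` taking the place of `Framing.delimiter(ByteString("\n"), 50000)`.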

Upvotes: 6
