Amrish khan Sheik

Reputation: 101

Is there a way to convert a JSON object to a JSONL file?

I have an array of objects. I need to convert it to .jsonl format and send it as a response from a Node.js Lambda function. I have been trying to convert it to a string and add '\n' to create new lines, but that didn't work.

Upvotes: 8

Views: 6479

Answers (2)

imsheth

Reputation: 61

Approaches to resolve the issue of converting larger amounts of data from .json to .jsonl:

  1. A monkey-patching trial before implementing @user120242's answer, which failed due to the presence of `{`, `}`, `[`, `]` in the data (see the sketch after this list for a concrete failure case):

    // strips only the first '[' and ']', and breaks as soon as a value itself
    // contains '{', '}', '[' or ']' (also: String.prototype.replaceAll needs Node.js 15+)
    const sampleData = [{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' }]
    
    console.log(JSON.stringify(sampleData).replace('[', '').replace(']', '').replaceAll('},{', '}\n{'));

  2. @user120242's answer works for smaller data and is indeed a clean solution (I wanted a solution free from external libraries or packages as far as possible). It worked for me up to ~100 MB of array-of-objects data; beyond that it fails. (My solution was running in Node.js v14.1.0, executed by Docker version 20.10.5, build 55c4c88, using DockerOperator in Airflow v2.0.1. It handled up to ~100 MB of array-of-objects data and failed miserably for data in the range of ~750 MB with this issue: JSON.stringify throws RangeError: Invalid string length for huge objects.)

  3. A trial of a solution similar to https://dev.to/madhunimmo/json-stringify-rangeerror-invalid-string-length-3977 for converting .json to .jsonl didn't work either, failing with the same issue as above: JSON.stringify throws RangeError: Invalid string length for huge objects.

  4. Implementing for...of from @Bergi's answer (Using async/await with a forEach loop) worked out with great performance, as shown in the code below. (My implementation was running in Node.js v14.1.0, executed by Docker version 20.10.5, build 55c4c88, using DockerOperator in Airflow v2.0.1, and handled up to ~750 MB of array-of-objects data.)

const fsPromises = require('fs').promises;

const writeToFile = async () => {
    const dataArray = [{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' }];
    // Stringify and append one object at a time instead of stringifying the
    // whole array at once, avoiding the RangeError on huge strings
    for (const dataObject of dataArray) {
        await fsPromises.appendFile("out.jsonl", JSON.stringify(dataObject) + "\n");
    }
};

writeToFile();
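
For illustration (hypothetical data, not from the question), a minimal sketch of why the monkey patch in 1. fails once a value itself contains braces or brackets:

// A value that itself contains '{', '}', '[' and ']'
const trickyData = [{ note: 'contains {braces} and [brackets]' },{ jsonlines: 'is awesome' }];

// .replace(']', '') strips the first ']' it finds, which here is the one inside
// the string value, so the value is silently corrupted and a stray ']' is left
// dangling after the last line, which is then no longer valid JSON
console.log(JSON.stringify(trickyData).replace('[', '').replace(']', '').replaceAll('},{', '}\n{'));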

P.S.: You'll face a Node.js process out-of-memory error with larger data (typically >100 MB) if you haven't already provided memory above the default to Node.js v14.1.0. The following worked for usage inside a Dockerfile (replace 6144 with the amount of memory in MB which you want to allocate):

CMD node --max-old-space-size=6144 app.js
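
For context, a minimal hypothetical Dockerfile around that CMD (the image tag, working directory, and file names are assumptions, not from the original setup):

FROM node:14.1.0
WORKDIR /app
COPY . .
# Raise the V8 old-space limit to 6144 MB before running the app
CMD node --max-old-space-size=6144 app.js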

Upvotes: -1

user120242

Reputation: 15268

Simple code to generate jsonlines. jsonlines is really just a bunch of JSON objects, each stringified onto one line and concatenated with newlines between them. That's it.
The other issue you will need to deal with is Unicode escaping, so when you write to a file, you must use UTF-8 encoding.

repl.it demo using jsonlines npm library: https://repl.it/repls/AngelicGratefulMoto

Simple plain JS demo:

const data = [{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' },{ jsonlines: 'is awesome' }];

// One JSON document per line, joined with newlines
console.log(
    data.map(x => JSON.stringify(x)).join('\n')
);
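
To cover the encoding point above, a minimal sketch (the file name out.jsonl is an assumption) that writes the same newline-joined output to disk with explicit UTF-8 encoding:

const fs = require('fs');

const data = [{ jsonlines: 'is awesome' },{ unicode: 'héllo ✓' }];

// One stringified object per line, terminated by a trailing newline;
// passing 'utf8' ensures non-ASCII characters are encoded correctly
fs.writeFileSync('out.jsonl', data.map(x => JSON.stringify(x)).join('\n') + '\n', 'utf8');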

Upvotes: 16
