Reputation: 6757
So basically I have written a program which generates test data for MongoDB in Node.
For that, the program reads a schema file and generates a specified amount of test data from it. The problem is that this data can eventually become quite big (think of creating 1M users with all the properties they need and 20M chat messages with userFrom and userTo), and the program has to keep all of it in RAM to modify/transform/map it and afterwards save it to a file.
The program works like that:
- Generate the collections and documents according to the schema file
- Resolve every field with a referenceTo to a random object with a matching referenceKey
- Transform everything into a string[] of MongoDB insert statements
- Save that string[] in a file.
This is the structure of the generated test data:
export interface IGeneratedCollection {
dbName: string, // Name of the database
collectionName: string, // Name of the collection
documents: IGeneratedDocument[] // One collection has many documents
}
export interface IGeneratedDocument {
documentFields: IGeneratedField [] // One document has many fields (which are recursive, because of nested documents)
}
export interface IGeneratedField {
fieldName: string, // Name of the property
fieldValue: any, // Value of the property (Can also be IGeneratedField, IGeneratedField[], ...)
fieldNeedsQuotations?: boolean, // If the Value needs to be saved with " ... "
fieldIsObject?: boolean, // If the Value is an object (stored as IGeneratedField[]) (to handle it differently when transforming to MongoDB inserts)
fieldIsJsonObject?: boolean, // If the Value is a plain JSON object
fieldIsArray?: boolean, // If the Value is an array of objects (stored as array of IGeneratedField[])
referenceKey?: number, // Field flagged to be a key
referenceTo?: number // Value gets set to a random object with matching referenceKey
}
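For illustration, a single chat message document expressed in this structure could look roughly like this (the concrete values are made up for this sketch):

// Hypothetical example of one "messages" document in the structure above.
// referenceTo: 1 means "replace this value with a random object that was
// flagged with referenceKey: 1" (e.g. a user id).
const exampleMessage: IGeneratedDocument = {
    documentFields: [
        { fieldName: "message", fieldValue: "Hello!", fieldNeedsQuotations: true },
        { fieldName: "userFrom", fieldValue: null, referenceTo: 1 },
        { fieldName: "userTo", fieldValue: null, referenceTo: 1 }
    ]
};
// And a user's _id field could be flagged as the key being referenced:
const userIdField: IGeneratedField = { fieldName: "_id", fieldValue: "user123", fieldNeedsQuotations: true, referenceKey: 1 };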
So in the example with 1M Users and 20M messages it would look like this:
- 1x IGeneratedCollection (collectionName = "users") with 1M IGeneratedDocument (10 IGeneratedField each)
- 1x IGeneratedCollection (collectionName = "messages") with 20M IGeneratedDocument (message, userFrom, userTo)
which would result in 190M instances of IGeneratedField (1x1Mx10 + 1x20Mx3x = 190M).
This is obviously a lot for the RAM to handle, as it all needs to be stored at the same time.
Temporary Solution
It now works like that:
- Generate 500 documents (rows in SQL) at a time
- JSON.stringify those 500 documents and put them in a SQLite table with the schema (dbName STRING, collectionName STRING, value JSON)
- Remove those 500 documents from JS and let the garbage collector do its thing
- Repeat until all data is generated and in the SQLite table
- Take one of the rows (each containing 500 documents) at a time, apply JSON.parse and search for keys in them
- Repeat until all data is queried and all keys retrieved
- Take one of the rows at a time, apply JSON.parse and search for key references in them
- Apply JSON.stringify and update the row if necessary (if key references were found and resolved)
- Repeat until all data is queried and all keys are resolved
- Take one of the rows at a time, apply JSON.parse and transform the documents to valid SQL/MongoDB inserts
- Add the insert (string) to a SQLite table with the schema (singleInsert STRING)
- Remove the old and now unused row from the SQLite table
- Write all inserts to a file (if run from the command line) or return a dataHandle to query the data in the SQLite table (if run from another Node app)
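For illustration, a minimal sketch of the first step of that pipeline (generate a batch, JSON.stringify it, store it in SQLite), assuming the better-sqlite3 package and a hypothetical generateDocuments() helper standing in for the real schema-driven generator:

// Sketch only: batch generation with SQLite as the overflow store.
// better-sqlite3 is an assumption; generateDocuments() is a hypothetical
// stand-in for the real generator that builds IGeneratedDocument objects.
import Database from "better-sqlite3";

declare function generateDocuments(count: number): IGeneratedDocument[]; // hypothetical

const db = new Database("testdata.sqlite");
db.exec("CREATE TABLE IF NOT EXISTS batches (dbName TEXT, collectionName TEXT, value TEXT)");
const insertBatch = db.prepare("INSERT INTO batches (dbName, collectionName, value) VALUES (?, ?, ?)");

const BATCH_SIZE = 500;
const totalDocuments = 1000000; // e.g. the 1M users

for (let generated = 0; generated < totalDocuments; generated += BATCH_SIZE) {
    const docs = generateDocuments(BATCH_SIZE);
    // Store the whole batch as one JSON string so the objects can leave the JS heap...
    insertBatch.run("testdb", "users", JSON.stringify(docs));
    // ...and let the garbage collector reclaim `docs` after this iteration.
}

With this shape, only one batch of 500 documents is alive in the JS heap at a time; everything else sits in the SQLite file.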
This solution does handle the problem with RAM, because SQLite automatically swaps to the hard drive when the RAM is full.
BUT as you can see, there are a lot of JSON.parse and JSON.stringify calls involved, which slows down the whole process drastically.
What I have thought:
- Maybe I should modify IGeneratedField to use only shortened property names (fieldName -> fn, fieldValue -> fv, fieldIsObject -> fio, fieldIsArray -> fia, ...). This would make the needed storage in the SQLite table smaller, BUT it would also make the code harder to read (see the sketch after this list).
- Use a document-oriented database (but I have not really found a suitable one) to handle the JSON data better.
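As an illustration of the first idea, here is a small sketch of renaming the properties only for storage; the short-name map and the renameKeys() helper are made up for this example, not existing code:

// Sketch: map long property names to short ones right before JSON.stringify
// and back after JSON.parse, so the TypeScript code keeps the readable names.
const LONG_TO_SHORT: Record<string, string> = {
    fieldName: "fn",
    fieldValue: "fv",
    fieldNeedsQuotations: "fnq",
    fieldIsObject: "fio",
    fieldIsJsonObject: "fijo",
    fieldIsArray: "fia",
    referenceKey: "rk",
    referenceTo: "rt"
};
const SHORT_TO_LONG = Object.fromEntries(
    Object.entries(LONG_TO_SHORT).map(([long, short]) => [short, long])
);

// Recursively rename object keys; also handles nested fields and arrays.
function renameKeys(value: any, map: Record<string, string>): any {
    if (Array.isArray(value)) return value.map(v => renameKeys(v, map));
    if (value !== null && typeof value === "object") {
        const renamed: any = {};
        for (const [key, val] of Object.entries(value)) {
            renamed[map[key] ?? key] = renameKeys(val, map);
        }
        return renamed;
    }
    return value;
}

declare const doc: IGeneratedDocument; // some generated document to store
const compact = JSON.stringify(renameKeys(doc, LONG_TO_SHORT));  // before writing to SQLite
const restored = renameKeys(JSON.parse(compact), SHORT_TO_LONG); // after reading it back

This shrinks the JSON stored in SQLite while keeping the readable names in the code, but it does not reduce the number of JSON.parse/JSON.stringify calls.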
Is there any better solution to handle big objects like this in Node?
Is my temporary solution OK? What is bad about it? Can it be changed to perform better?
Upvotes: 6
Views: 4040
Reputation: 953
Conceptually, generate the items as a stream.
You don't need all 1M users in the db at once. You could add 10k at a time.
For the messages, randomly sample 2n users from the db and have them send messages to each other. Repeat until satisfied.
Example:
// Assume Users and Messages are both db collections.
// Assume functions generateUser() and generateMessage(u1, u2) exist.
const _ = require('lodash'); // lodash is needed for _.range and _.chunk

const desiredUsers = 10000;
const desiredMessages = 5000000;
const blockSize = 1000;

(async () => {
  // Insert the users in blocks of `blockSize`.
  for (const i of _.range(desiredUsers / blockSize)) {
    const users = _.range(blockSize).map(generateUser);
    await Users.insertMany(users);
  }
  // For each block of messages, sample 2 * blockSize random users
  // and pair them up as sender and receiver.
  for (const i of _.range(desiredMessages / blockSize)) {
    const users = await Users.aggregate([{ $sample: { size: 2 * blockSize } }]).toArray();
    const messages = _.chunk(users, 2).map((usr) => generateMessage(usr[0], usr[1]));
    await Messages.insertMany(messages);
  }
})();
Depending on how you tweak the stream, you get a different distribution. This is a uniform distribution. You can get a more long-tailed distribution by interleaving the users and messages; you might want that for message boards, for example.
It went to 200MB after I switched the blockSize to 1000.
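For the interleaving variant mentioned above, a rough sketch of one way to do it (same assumptions as the example: Users and Messages are db collections, generateUser() and generateMessage() exist; the round sizes are arbitrary):

// Sketch only (not the answer's code): interleave user creation and message
// creation, so users created early accumulate more messages over time, giving
// a long-tailed distribution.
import _ from "lodash";
declare const Users: any;     // db collection, as in the example above
declare const Messages: any;  // db collection, as in the example above
declare function generateUser(): any;
declare function generateMessage(u1: any, u2: any): any;

(async () => {
    const rounds = 100;
    const usersPerRound = 100;     // grow the user pool a little each round
    const messagesPerRound = 1000; // and send a block of messages each round
    for (const round of _.range(rounds)) {
        await Users.insertMany(_.range(usersPerRound).map(generateUser));
        // Sample senders/receivers from everyone created so far; the earliest
        // rounds have a small pool, so those users end up in many more messages.
        const sampled = await Users.aggregate([{ $sample: { size: 2 * messagesPerRound } }]).toArray();
        const messages = _.chunk(sampled, 2).map((pair: any[]) => generateMessage(pair[0], pair[1]));
        await Messages.insertMany(messages);
    }
})();

Because users from the first rounds can be sampled in every later round, they accumulate more messages than users created late, which produces the long tail.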
Upvotes: 2