Reputation: 61
I am fetching a large JSON file (200 MB) and I want to render the data as it streams in. The problem I am having is that after I decode the streamed chunks and try to parse them, a syntax error is thrown in the console: Unexpected end of JSON input.
What I want to do is parse the returned chunks and do something with that data as soon as I get it. However, because the ReadableStream delivers chunks that are sliced at unpredictable boundaries, I cannot call JSON.parse() on the returned values. What sort of data massaging needs to be done to make this happen? Is there a better approach?
Here is my code:
const decoder = new TextDecoder('utf-8')
fetch("../files/response.json")
  .then(response => {
    const reader = response.body.getReader()
    new ReadableStream({
      start(controller) {
        function enqueueValues() {
          reader.read()
            .then(({ done, value }) => {
              if (done) {
                controller.close() // stream is complete
                return
              }
              var decodedValue = decoder.decode(value) // one chunk of invalid json data in string format
              console.log(JSON.parse(decodedValue)) // json syntax error
              // do something with the json value here
              controller.enqueue(value)
              enqueueValues() // run until all data has been streamed
            })
        }
        enqueueValues()
      }
    })
  })
Upvotes: 5
Views: 2866
Reputation: 7558
What you need is a JSON parser that supports streaming. There are many libraries out there; one of them is json-stream-es, which I maintain.
From the description you gave, it sounds like the data arrives in JSONL format, meaning that it is not a single JSON document but multiple ones, usually delimited by newlines.
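For illustration (these records are made up, not taken from the question), a JSONL stream carries one complete JSON document per line:

{ "id": 1, "name": "first record" }
{ "id": 2, "name": "second record" }
{ "id": 3, "name": "third record" }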
You can parse such a stream with json-stream-es like this:
import { parseJsonStream, streamToIterable } from "json-stream-es";

const response = await fetch("../files/response.json");
const values = response.body
  .pipeThrough(new TextDecoderStream())
  .pipeThrough(parseJsonStream(undefined, { multi: true }));
for await (const decodedValue of streamToIterable(values)) {
  console.log(decodedValue);
  // do something with the json value here
}
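If you would rather not add a dependency and the input really is newline-delimited (that is an assumption, not something confirmed in the question), a minimal sketch using only standard web streams is to buffer the decoded text and parse it line by line:

const response = await fetch("../files/response.json");
const reader = response.body
  .pipeThrough(new TextDecoderStream())
  .getReader();

let buffer = "";
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += value;
  const lines = buffer.split("\n");
  buffer = lines.pop(); // keep the last, possibly incomplete line for the next read
  for (const line of lines) {
    if (line.trim() === "") continue;
    const decodedValue = JSON.parse(line); // each line is a complete document
    console.log(decodedValue);
    // do something with the json value here
  }
}
if (buffer.trim() !== "") {
  console.log(JSON.parse(buffer)); // last document, if the stream did not end with a newline
}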
Upvotes: 1
Reputation: 7242
I think the only way to achieve this is to send valid JSON (an object or an array) in each chunk.
Here is a sample express.js handler:
app.get("/stream", (req, res) => {
  let i = 0;
  const interval = setInterval(() => {
    i += 1;
    res.write(JSON.stringify([{ message: `Chunk ${i}` }]));
  }, 500);
  setTimeout(() => {
    clearInterval(interval);
    res.end(() => {
      console.log("End");
    });
  }, 5000);
});
The downside of this is that the final JSON (all chunks concatenated into one string) is not valid, e.g. [{"message":"Chunk 1"}][{"message":"Chunk 2"}] cannot be parsed as one document. But holding a 200 MB object in the browser's memory is not good either.
UPDATE: I was trying to solve a similar issue in my project and found a workaround. The server wraps all chunks in a JSON array: it writes an opening bracket first, appends a comma after each data chunk, and writes the closing bracket at the end. Then on the client I ignore chunks that are equal to [ or ] and cut the trailing comma from each data chunk.
Server:
app.get("/stream", (req, res) => {
  let i = 0,
    chunkString;
  res.write("["); // <<---- OPENING bracket
  const interval = setInterval(() => {
    i += 1;
    chunkString = JSON.stringify({ message: `Chunk ${i}` });
    res.write(`${chunkString},`); // <<----- Note the trailing comma after each data chunk
  }, 500);
  setTimeout(() => {
    clearInterval(interval);
    res.end("]", () => { // <<---- CLOSING bracket
      console.log("End");
    });
  }, 5000);
});
Client:
const decoder = new TextDecoder("utf-8");

const handleJsonChunk = (jsonChunk) => {
  console.log("Received Json Chunk: ", jsonChunk);
};

const main = async () => {
  const response = await fetch("http://localhost:3000/stream");
  const reader = response.body.getReader();
  const skipValues = ["[", "]"];
  const work = (reader) => {
    reader.read().then(({ done, value }) => {
      if (!done) {
        let stringValue = decoder.decode(value);
        const skip = skipValues.indexOf(stringValue) >= 0;
        if (skip) return work(reader);
        if (stringValue.endsWith(","))
          stringValue = stringValue.substr(0, stringValue.length - 1);
        try {
          const jsonValue = JSON.parse(stringValue);
          handleJsonChunk(jsonValue);
        } catch (error) {
          console.log(`Failed to parse chunk. Error: ${error}`);
        }
        work(reader);
      }
    });
  };
  work(reader);
};

main();
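One caveat, offered as a hedged note rather than as part of the answer above: reader.read() is not guaranteed to return exactly one res.write() payload, so over a real network the skip/trim logic can receive merged or partial objects. A defensive sketch for the same wire format buffers the decoded text and emits each complete top-level { ... } object by tracking brace depth (the drainObjects helper below is hypothetical, not from any library):

const decoder = new TextDecoder("utf-8");

const handleJsonChunk = (jsonChunk) => {
  console.log("Received Json Chunk: ", jsonChunk);
};

// Parses every complete top-level {...} object found in `buffer`
// (skipping the surrounding "[", "]" and "," separators) and
// returns the incomplete tail that still needs more data.
const drainObjects = (buffer, onObject) => {
  let depth = 0, start = -1, inString = false, escaped = false;
  for (let i = 0; i < buffer.length; i++) {
    const ch = buffer[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
    } else if (ch === '"') {
      inString = true;
    } else if (ch === "{") {
      if (depth === 0) start = i;
      depth += 1;
    } else if (ch === "}") {
      depth -= 1;
      if (depth === 0) {
        onObject(JSON.parse(buffer.slice(start, i + 1)));
        start = -1;
      }
    }
  }
  return start === -1 ? "" : buffer.slice(start);
};

const main = async () => {
  const response = await fetch("http://localhost:3000/stream");
  const reader = response.body.getReader();
  let buffer = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    buffer = drainObjects(buffer, handleJsonChunk);
  }
};

main();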
Upvotes: 1