Reputation: 74
So, here is my problem: I have a large text file (around 150 MB in size) with hundreds of thousands of lines. I need to read the contents of the file, parse it so that each line is wrapped in appropriate HTML tags, and write the result into a document opened with window.document.open().
My code works for files up to about 50 MB in size:
var rawFile = new XMLHttpRequest();
rawFile.open("GET", file, true);
rawFile.onreadystatechange = function () {
    if (rawFile.readyState === 4) {
        if (rawFile.status === 200 || rawFile.status === 0) {
            var allText = rawFile.responseText;
            var contents = allText.split("\n");
            var w = window.open();
            w.document.open();
            for (var i = 0; i < contents.length; i++) {
                // logic so that str = appropriate tags + contents[i]
                w.document.write(str);
            }
            w.document.close();
        }
    }
};
rawFile.send();
The code works. The logic works. But if the file size is greater than 100 MB or so, Chrome crashes. I think reading the file in chunks and then writing each chunk to the window.document.open() document would solve this problem for me.
Any advice on how I could go about accomplishing this would be much appreciated. Thank you :)
(Please ignore any errors in the code posted above; my actual code is very large, so I wrote a miniature version of it.)
Upvotes: 4
Views: 3996
Reputation: 5425
Your approach will cripple the browser because you are processing the entire response at once. A better approach would be to break the process down so that you are working on smaller chunks, or alternatively to stream the file through your processing.
Using the Fetch API rather than XMLHttpRequest will get you access to the streaming data. The big advantage of using the stream is that you aren't hogging the browser's memory when you're processing the content.
The following code outlines how to use streams to perform the task:
var file_url = 'URL_TO_FILE';

// @link https://developer.mozilla.org/en-US/docs/Web/API/Request/Request
var myRequest = new Request(file_url);

// fetch returns a promise
fetch(myRequest)
    .then(function (response) {
        // could be compared against `received` to drive a progress indicator
        var contentLength = response.headers.get('Content-Length');

        // response.body is a readable stream
        // @link https://learn.microsoft.com/en-us/microsoft-edge/dev-guide/performance/streams-api
        var myReader = response.body.getReader();

        // the reader result will need to be decoded to text
        // @link https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder/TextDecoder
        var decoder = new TextDecoder();

        // decoded text is buffered here until it can be processed
        var buffer = '';

        // the number of bytes received so far
        var received = 0;

        // read() returns a promise
        return myReader.read().then(function processResult(result) {
            // the result object contains two properties:
            //   done  - true if the stream is finished
            //   value - a Uint8Array containing the data
            if (result.done) {
                // flush any text still held back by the decoder
                buffer += decoder.decode();
                /* process the remaining buffer string */
                return;
            }

            // update the total number of bytes received
            received += result.value.length;

            // result.value is a Uint8Array so it will need to be decoded;
            // buffer the decoded text before processing it
            buffer += decoder.decode(result.value, {stream: true});

            /* process the buffer string */

            // read the next piece of the stream and process the result
            return myReader.read().then(processResult);
        });
    });
I didn't implement the code for processing the buffer, but the algorithm would be as follows (a sketch follows the list):
If the buffer contains a newline character:
    Split the buffer into an array of lines.
    If there is still more data to read:
        Save the last array item because it may be an incomplete line
        (do this by setting the content of the buffer to that of the last array item).
    Process each line in the array.
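For illustration, here is a minimal sketch of that algorithm in JavaScript. The names processBuffer and processLine are hypothetical, the `buffer` variable is the one declared in the code above, and processLine assumes a window w opened as in the question's code:

// hypothetical helper implementing the algorithm above;
// assumes `buffer` is in scope, as declared in the streaming code
function processBuffer(done) {
    var lines = buffer.split('\n');
    if (!done) {
        // the last item may be an incomplete line, so put it back
        // in the buffer and process only the complete lines
        buffer = lines.pop();
    } else {
        buffer = '';
        // drop the trailing empty string left by a final newline
        if (lines.length && lines[lines.length - 1] === '') {
            lines.pop();
        }
    }
    for (var i = 0; i < lines.length; i++) {
        processLine(lines[i]);
    }
}

// hypothetical per-line handler: wrap the line in the appropriate
// tags and write it into the opened window, as in the question
function processLine(line) {
    w.document.write('<p>' + line + '</p>');
}

You would call processBuffer(false) at the /* process the buffer string */ point after each decoded chunk, and processBuffer(true) when the stream reports it is done.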
A quick look at Can I Use tells me that this won't work in IE because the Fetch API wasn't implemented before the Edge browser. However, there's no need to despair, because as always some kind soul has implemented a polyfill for non-supporting browsers.
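If you do need to support those browsers, a simple feature check (just a sketch; the fallback strategy is up to you) lets you drop back to the XMLHttpRequest approach:

// check for fetch and stream support before using the code above
if (window.fetch && window.ReadableStream) {
    // use the streaming fetch approach
} else {
    // load the polyfill or fall back to XMLHttpRequest
}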
Upvotes: 5