Reputation: 111
I have several CSV files that I'd like to use in a JavaScript-based front-end application. Most of them are stored in the cloud. Since some of the CSV files can be very large (several gigabytes), I first considered using parquetjs for compression and transferring them to the front-end as small Parquet files (our CSV files are highly redundant, and Parquet lets us achieve high compression ratios, e.g. a 1.6 GB CSV file compresses down to a 7 MB Parquet file).
For performance reasons, I intended to use the streaming capabilities of parquetjs to extract the CSV data "on the fly", but that feature doesn't seem very mature yet. Is there another solution that would give me fast, streamed CSV decompression? Are there zip-based JavaScript packages that would do the trick? Transferring and reading the big CSV files directly doesn't seem like an optimal solution to me.
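For context, the kind of row-by-row reading I had in mind follows the parquetjs reader API, roughly like this (a simplified sketch; the file name is a placeholder, and this reader interface runs in Node rather than in the browser):

```js
const parquet = require('parquetjs');

async function readRows() {
  // Open the Parquet file and iterate over its records one at a time,
  // so the whole decompressed dataset never has to sit in memory.
  const reader = await parquet.ParquetReader.openFile('data.parquet');
  const cursor = reader.getCursor();

  let record = null;
  while ((record = await cursor.next())) {
    // process one record here
    console.log(record);
  }
  await reader.close();
}
```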
Upvotes: 0
Views: 2268
Reputation: 712
If I understand the problem correctly, and depending on what you want to do on the client, you could potentially use the Fetch API and consume the response.body stream to process the data while it's still downloading.
There's a post by Jake Archibald that touches on how to read a stream of CSV data, which could be useful.
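To illustrate the idea (a minimal sketch, not the exact code from that post): fetch the CSV and read response.body through a TextDecoderStream, handling rows as chunks arrive. If the server sends the file with gzip Content-Encoding, the browser decompresses it transparently before it reaches this stream. The `streamCsv` and `handleRow` names are placeholders, and the naive comma split doesn't handle quoted fields.

```js
async function streamCsv(url, handleRow) {
  const response = await fetch(url);

  // response.body is a ReadableStream of bytes; TextDecoderStream turns the
  // chunks into strings as they arrive, so the whole file never sits in memory.
  const reader = response.body
    .pipeThrough(new TextDecoderStream())
    .getReader();

  let buffered = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffered += value;
    const lines = buffered.split('\n');
    buffered = lines.pop(); // keep the last, possibly incomplete line
    for (const line of lines) handleRow(line.split(','));
  }
  if (buffered) handleRow(buffered.split(','));
}

// Usage: streamCsv('https://example.com/data.csv', row => console.log(row));
```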
Upvotes: 0