Reputation: 2588
Are there any differences between createReadStream from fs and Readable from stream?
test.csv:

```
name,color
audi,green
bmw,red
mercedes,silver
```
fs.createReadStream:

```javascript
import { createReadStream } from 'fs'
import parser from 'csv-parser'

;(async () => {
  createReadStream('test.csv')
    .pipe(parser())
    .on('data', (data) => console.log(data))
})()
```
Readable:

```javascript
import { readFile } from 'fs/promises'
import { Readable } from 'stream'
import parser from 'csv-parser'

;(async () => {
  const csv = await readFile('test.csv')
  Readable.from(csv)
    .pipe(parser())
    .on('data', (data) => console.log(data))
})()
```
Both seem to do the same thing, but which one is more efficient for a large volume of data?
Upvotes: 3
Views: 1775
Reputation: 19957
The main point of using the stream API is to optimize memory usage. If you just do const csv = await readFile('test.csv'), then the content of the whole file has already been read into memory.
The way you use Readable.from only wraps that csv content in a readable stream interface. The memory is already allocated, so it does not improve performance at all.
On the other hand, createReadStream('test.csv') creates a "real" readable stream. It is driven by downstream consumers and only reads small chunks of data on demand.
So definitely use fs.createReadStream.
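The on-demand chunking described above can be sketched with low-level synchronous reads, which is the same idea createReadStream applies internally (its default highWaterMark is 64 KiB). The demo.csv file name and the 16-byte chunk size are illustrative choices, not part of the question's code:

```javascript
import { writeFileSync, openSync, readSync, closeSync, unlinkSync } from 'fs'

// Create a small demo file (hypothetical name, just for this sketch).
writeFileSync('demo.csv', 'name,color\naudi,green\nbmw,red\nmercedes,silver\n')

// Read it 16 bytes at a time: only one small buffer is ever in memory,
// no matter how large the file grows. createReadStream works the same
// way internally, just with a 64 KiB buffer by default.
const fd = openSync('demo.csv', 'r')
const buf = Buffer.alloc(16)
let chunks = 0
let bytes = 0
let n
while ((n = readSync(fd, buf, 0, buf.length, null)) > 0) {
  chunks += 1
  bytes += n
}
closeSync(fd)
unlinkSync('demo.csv')

console.log(`read ${bytes} bytes in ${chunks} chunks`)
// → read 46 bytes in 3 chunks
```

With readFile, all 46 bytes would sit in one buffer before any processing starts; with chunked reads, peak memory is bounded by the chunk size regardless of file size.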
Upvotes: 1
Reputation: 13652
fs.readFile/fsPromises.readFile read the entire contents of the file at once, whereas createReadStream only reads chunks at a time. Using readFile with Readable.from may provide the same API, but by reading the entire contents into memory, you lose the benefits of streaming.
If you're reading heavy files, I would recommend createReadStream over readFile, as you can start processing and parsing the data without having to wait for the entire file to be read, and you don't even need to read the entire file into memory (unless your use case requires it, that is). If your files are small, it probably doesn't matter (but do profile it).
Upvotes: 2