Reputation: 1112
How to write a single file while reading from multiple input streams of the exact same file from different locations with Node.js?
I want more download performance. Let's say we have 2 locations for the same file, and each can only deliver 10 MB downstream. I want to download one part from the first location and another part from the second in parallel, to get 20 MB in total.
So both streams need to be joined somehow, and both streams need to know the range they are downloading.
I have 2 examples:
var http = require('http')
var fs = require('fs')
// will write to disk __dirname/file1.zip
function writeFile(fileStream){
//...
}
// This example assumes downloading from 2 HTTP locations
http.get('http://location1/file1.zip', function (res) { writeFile(res) })
http.get('http://location2/file1.zip', function (res) { writeFile(res) })
var fs = require('fs')
// will write to disk __dirname/file1.zip
function writeFile(fileStream){
//...
}
// This example reads the same file from 2 different disks
writeFile(fs.createReadStream('/mount/volume1/file1.zip'))
writeFile(fs.createReadStream('/mount/volume2/file1.zip'))
The read stream needs to check whether a given content range has already been written before it reads the next chunk from each file, and maybe each stream should start reading at a random location in the file.
If the total content length of the file is X, we divide it into smaller chunks and create a map where each entry has a fixed content length, so we know which parts we already have and which parts are still being downloaded.
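The chunk map idea could be sketched roughly like this. The helper names (`createChunkMap`, `claimNextChunk`, `markDone`, `isComplete`) and the three states are my own assumptions for illustration, not an existing API, and the sketch leaves out retries and error handling.

```javascript
// Hypothetical helper names; a sketch of the chunk map idea, not a finished downloader.

// Split a file of `totalLength` bytes into fixed-size entries.
// `end` is inclusive, matching HTTP Range and fs.createReadStream semantics.
function createChunkMap(totalLength, chunkLength) {
  const chunks = [];
  for (let start = 0; start < totalLength; start += chunkLength) {
    const end = Math.min(start + chunkLength, totalLength) - 1;
    chunks.push({ start, end, state: 'pending' });
  }
  return chunks;
}

// Hand the next unclaimed chunk to a source so no range is read twice.
function claimNextChunk(chunks, source) {
  const chunk = chunks.find((c) => c.state === 'pending');
  if (!chunk) return null;
  chunk.state = 'downloading';
  chunk.source = source;
  return chunk;
}

// Record that a chunk has been written to the target file.
function markDone(chunk) {
  chunk.state = 'done';
}

// The file is complete once every entry has been written.
function isComplete(chunks) {
  return chunks.every((c) => c.state === 'done');
}

const map = createChunkMap(250, 100);
const a = claimNextChunk(map, 'location1'); // first fixed-size entry
const b = claimNextChunk(map, 'location2'); // second fixed-size entry
```

Each source asks for the next pending entry when it finishes its current one, so a faster location naturally ends up delivering more of the file.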
We can try a simple, optimistic read in fixed-size chunks:
const fs = require('fs')
const SIZE = 64 // read in 64-byte chunks
function readParallel(filepath, callback){
  fs.open(filepath, 'r', function(err, fd) {
    if (err) return callback(err)
    fs.fstat(fd, function(err, stats) {
      if (err) return callback(err)
      const fileSize = stats.size
      const buffers = []
      let bytesRead = 0
      while (bytesRead < fileSize) {
        const length = Math.min(SIZE, fileSize - bytesRead)
        const buffer = Buffer.alloc(length)
        // offset 0 is the position inside `buffer`; bytesRead is the position inside the file
        const read = fs.readSync(fd, buffer, 0, length, bytesRead)
        if (read === 0) break
        buffers.push(buffer.subarray(0, read))
        bytesRead += read
      }
      fs.close(fd, function() { callback(null, Buffer.concat(buffers)) })
    })
  })
}
// At the end: Buffer.concat(buffers) holds the file content
fs.createReadStream() has a start option you can pass to specify the position where reading begins:
let f = fs.createReadStream("myfile.txt", {start: 1000});
You could also open a normal file descriptor with fs.open(), then fs.read() one byte from a position right before where you want the stream to be positioned, using the position argument to fs.read(), and then pass that file descriptor into fs.createReadStream() as an option. The stream will then start with that file descriptor and position (though obviously the start option to fs.createReadStream() is a bit simpler).
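Applied to the question, the start option (together with end) could be used roughly like this to fill one target file from two copies in parallel. This is only a sketch: `copyRange` and `mergeFromTwoCopies` are hypothetical names, and it assumes both copies are byte-identical.

```javascript
const fs = require('fs');
const { pipeline } = require('stream');

// Copy bytes [start, end] (both inclusive) from `source` into `target`
// at the same offset. `target` must already exist, because the 'r+'
// flag opens it for modification instead of truncating it.
function copyRange(source, target, start, end) {
  return new Promise((resolve, reject) => {
    pipeline(
      fs.createReadStream(source, { start, end }),
      fs.createWriteStream(target, { flags: 'r+', start }),
      (err) => (err ? reject(err) : resolve())
    );
  });
}

// Create an empty target of the right size, then fill both halves in parallel.
async function mergeFromTwoCopies(copy1, copy2, target) {
  const size = fs.statSync(copy1).size;
  fs.closeSync(fs.openSync(target, 'w'));
  fs.truncateSync(target, size);
  const middle = Math.floor(size / 2);
  const tasks = [];
  // Skip empty ranges, since a read stream with start > end is rejected.
  if (middle > 0) tasks.push(copyRange(copy1, target, 0, middle - 1));
  if (size > middle) tasks.push(copyRange(copy2, target, middle, size - 1));
  await Promise.all(tasks);
}
```

Something like `mergeFromTwoCopies('/mount/volume1/file1.zip', '/mount/volume2/file1.zip', __dirname + '/file1.zip')` would then read the first half from one disk and the second half from the other.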
Upvotes: 1
Views: 1681
Reputation: 1112
The answer is advisory locking; it is as simple as the way torrent clients do it.
To get a file from multiple sources, a JS implementation would look like the following, if we assume all sources serve the same file. I put no error handling in here.
const queue = [];
const sources = ['https://example.com/file', 'https://example1.com/file'];
// HEAD request to learn the total size (top-level await: run as an ES module or inside an async function)
const fileSize = await fetch(sources[0], { method: 'HEAD' })
  .then(({ headers }) => Number(headers.get('Content-Length')));
const targetBuffer = new Uint8Array(fileSize);
const charset = 'x-user-defined';
// Maps to the Unicode Private Use Area so you can get bytes as chars
const binaryRawEnablingHeader = `text/plain; charset=${charset}`;
const requestDefaults = {
  headers: {
    'Content-Type': binaryRawEnablingHeader,
    'Range': 'bytes=2-5,10-13'
  }
};
const downloadPlan = []; /* some logic that puts the bytes into targetBuffer (work in progress) */
// use response.text() and then convert that to bytes via
// the Unicode Private Use Area 0xF700-0xF7FF.
// Note: this charset trick comes from XMLHttpRequest; with fetch, response.arrayBuffer() gives the raw bytes directly.
const convertToAbyte = (chars) =>
  Array.from({ length: chars.length },
    (_abyte, offset) =>
      chars.charCodeAt(offset) & 0xff);
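The missing download logic might look roughly like the sketch below. It is not a tested downloader: the function names are hypothetical, it assumes Node 18+ (global fetch), it assumes every source honors single-range requests with a 206 response, and it reads raw bytes with response.arrayBuffer() instead of converting text.

```javascript
// Hypothetical helpers; assumes Node 18+ (global fetch) and servers that
// answer single-range requests with "206 Partial Content".

// Build the Range header value for an inclusive byte range.
function rangeHeader(start, end) {
  return `bytes=${start}-${end}`;
}

// Copy downloaded bytes into the shared target at their offset.
function placeChunk(targetBuffer, start, bytes) {
  targetBuffer.set(bytes, start);
}

// Fetch one inclusive byte range from one source as raw bytes.
async function downloadRange(url, start, end) {
  const response = await fetch(url, { headers: { Range: rangeHeader(start, end) } });
  if (response.status !== 206) {
    throw new Error(`Expected 206 from ${url}, got ${response.status}`);
  }
  return new Uint8Array(await response.arrayBuffer());
}

// Split the file evenly across the sources and download all parts in parallel.
async function downloadFromSources(sources, fileSize) {
  const targetBuffer = new Uint8Array(fileSize);
  const partSize = Math.ceil(fileSize / sources.length);
  await Promise.all(sources.map(async (url, index) => {
    const start = index * partSize;
    const end = Math.min(start + partSize, fileSize) - 1;
    if (start > end) return; // more sources than bytes
    placeChunk(targetBuffer, start, await downloadRange(url, start, end));
  }));
  return targetBuffer;
}
```

Because every part is written at its own offset, the order in which the sources finish does not matter. An even split is the simplest plan; a per-chunk queue would let a faster source take over more of the file.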
Upvotes: 0
Reputation: 938
Using csv-parse with csv-stringify from the CSV Project:
const fs = require('fs');
const parse = require('csv-parse');
const stringify = require('csv-stringify');
const stringifier = stringify();
const writeFile = fs.createWriteStream('out.csv');
stringifier.pipe(writeFile);
// Keep the stringifier open when file1 ends, then append file2 so rows are not interleaved
const parser1 = fs.createReadStream('file1.csv').pipe(parse());
parser1.pipe(stringifier, { end: false });
parser1.on('end', () => fs.createReadStream('file2.csv').pipe(parse()).pipe(stringifier));
Here I parse each file separately (using a different parse stream for each source), then pipe both to the same stringify stream which concatenates them, then write to the destination.
Upvotes: 0