Reputation: 1469
I have the URL of a possibly large (100+ MB) file. How do I save it to a local directory using fetch?
I looked around but there don't seem to be a lot of resources/tutorials on how to do this.
Upvotes: 105
Views: 108402
Reputation: 8151
This is now easy using modern Node.js APIs (Node 18+, where fetch is built in). The code below does not read the entire file into memory at once, so it works with huge files and keeps memory use low.
import { writeFile } from 'node:fs/promises'
import { Readable } from 'node:stream'
const response = await fetch('https://example.com/example.mp4')
const stream = Readable.fromWeb(response.body)
await writeFile('example.mp4', stream)
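One thing the snippet above does not do is check the HTTP status, so a 404 error page would be saved as if it were the file. Here is a minimal sketch of the same streaming approach with that check added. The helper name saveResponse is my own, not part of any API, and it is written with require so it runs as a plain script.

```javascript
const { writeFile } = require('node:fs/promises');
const { Readable } = require('node:stream');

// Hypothetical helper (the name is an assumption, not a Node.js API):
// streams a fetch Response body to disk, failing early on HTTP errors.
async function saveResponse(response, outputPath) {
  if (!response.ok) {
    throw new Error(`Download failed: HTTP ${response.status}`);
  }
  if (!response.body) {
    throw new Error('Response has no body to save');
  }
  // Readable.fromWeb converts the web stream into a Node.js Readable;
  // writeFile then consumes it chunk by chunk.
  await writeFile(outputPath, Readable.fromWeb(response.body));
}
```

Usage would look like `await saveResponse(await fetch(url), 'example.mp4')` inside an async function.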
Upvotes: 21
Reputation: 6927
If you want to avoid explicitly making a Promise like in the other very fine answer, and are ok with building a buffer of the entire 100+ MB file, then you could do something simpler:
const fetch = require('node-fetch');
const {writeFile} = require('fs/promises');
function downloadFile(url, outputPath) {
return fetch(url)
.then(x => x.arrayBuffer())
.then(x => writeFile(outputPath, Buffer.from(x)));
}
But the other answer will be more memory-efficient since it's piping the received data stream directly into a file without accumulating all of it in a Buffer.
Upvotes: 28
Reputation: 1669
Updated solution on Node 18:
const fs = require("fs");
const { mkdir } = require("fs/promises");
const { Readable } = require('stream');
const { finished } = require('stream/promises');
const path = require("path");
const downloadFile = (async (url, fileName) => {
const res = await fetch(url);
if (!fs.existsSync("downloads")) await mkdir("downloads"); //Optional if you already have downloads directory
const destination = path.resolve("./downloads", fileName);
const fileStream = fs.createWriteStream(destination, { flags: 'wx' });
await finished(Readable.fromWeb(res.body).pipe(fileStream));
});
await downloadFile("<url_to_fetch>", "<fileName>")
Old answer (works up to Node 16):
Using the Fetch API via node-fetch, you can write a function that downloads from a URL like this.
You will need node-fetch@2.
Install it with: npm i node-fetch@2
const fetch = require("node-fetch");
const fs = require("fs");
const downloadFile = (async (url, path) => {
const res = await fetch(url);
const fileStream = fs.createWriteStream(path);
await new Promise((resolve, reject) => {
res.body.pipe(fileStream);
res.body.on("error", reject);
fileStream.on("finish", resolve);
});
});
Upvotes: 137
Reputation: 961
This got the job done for me on Node 18 and presumably works on Node 16 as well. Its only dependencies are fs and node-fetch (it probably works with other fetch libraries too).
const fs = require('fs');
const fetch = require("node-fetch");
async function downloadImage(imageUrl){
//imageUrl: https://example.com/uploads/image.jpg
const fileName = imageUrl.split('/').pop(); //image.jpg
const res = await fetch(imageUrl); //fetch the full URL, not just the file name
const fileStream = fs.createWriteStream(`./folder/${fileName}`);
await new Promise((resolve, reject) => {
res.body.pipe(fileStream);
res.body.on("error", reject);
fileStream.on("finish", resolve);
});
};
The previous top answer by @code_wrangler was split into a Node 16 and a Node 18 solution (this one is like the Node 16 solution). On Node 18, the Node 18 solution created a 0-byte file for me and cost me some time.
Upvotes: 0
Reputation: 848
Older answers here involve node-fetch, but since Node.js v18.x this can be done with no extra dependencies.
The body of a fetch response is a web stream. It can be converted to a Node.js readable stream using Readable.fromWeb, which can then be piped into a write stream created by fs.createWriteStream. If desired, the resulting stream can then be turned into a Promise using the promise version of stream.finished.
const fs = require('fs');
const { Readable } = require('stream');
const { finished } = require('stream/promises');
const stream = fs.createWriteStream('output.txt');
const { body } = await fetch('https://example.com');
await finished(Readable.fromWeb(body).pipe(stream));
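To try the three steps end to end without depending on an external URL, here is a rough sketch that downloads from a throwaway local server. The names downloadTo and demo, the 1 MiB payload, and the temp-directory layout are all assumptions made up for this illustration.

```javascript
const fs = require('node:fs');
const http = require('node:http');
const os = require('node:os');
const path = require('node:path');
const { Readable } = require('node:stream');
const { finished } = require('node:stream/promises');

// Hypothetical helper: web stream -> Node.js stream -> file write stream.
async function downloadTo(url, destination) {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  const fileStream = fs.createWriteStream(destination);
  await finished(Readable.fromWeb(res.body).pipe(fileStream));
}

// Serves 1 MiB from a local server, downloads it, and returns the saved size.
async function demo() {
  const payload = 'x'.repeat(1024 * 1024);
  const server = http.createServer((req, res) => res.end(payload));
  await new Promise((resolve) => server.listen(0, '127.0.0.1', resolve));
  const { port } = server.address();
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'dl-'));
  const destination = path.join(dir, 'out.bin');
  try {
    await downloadTo(`http://127.0.0.1:${port}/`, destination);
    return fs.statSync(destination).size;
  } finally {
    server.close();
    // Drop lingering keep-alive sockets so the process can exit promptly.
    if (server.closeAllConnections) server.closeAllConnections();
  }
}

demo().then((bytes) => console.log(`saved ${bytes} bytes`));
```

In real use only downloadTo matters; the local server is there just to make the sketch self-contained.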
Upvotes: 50
Reputation: 31
import { existsSync } from "fs";
import { mkdir, writeFile } from "fs/promises";
import { join } from "path";
export const download = async (url: string, ...folders: string[]) => {
const fileName = url.split("/").pop();
const path = join("./downloads", ...folders);
if (!existsSync(path)) await mkdir(path, { recursive: true }); // recursive: also creates missing parent folders
const filePath = join(path, fileName);
const response = await fetch(url);
const blob = await response.blob();
// const bos = Buffer.from(await blob.arrayBuffer())
const bos = blob.stream();
await writeFile(filePath, bos);
return { path, fileName, filePath };
};
// call like that ↓
await download("file-url", "subfolder-1", "subfolder-2", ...)
Upvotes: 3
Reputation: 327
If you don't need to deal with 301/302 responses (when things have been moved), you can actually just do it in one line with the Node.js native libraries http and/or https.
You can run this example one-liner in the node shell. It just uses the https module to download a gzipped tarball of some source code to the directory where you started the node shell. (You start a node shell by typing node at the command line of an OS where Node.js is installed.)
require('https').get("https://codeload.github.com/angstyloop/js-utils/tar.gz/refs/heads/develop", it => it.pipe(require('fs').createWriteStream("develop.tar.gz")));
If you don't need/want HTTPS use this instead:
require('http').get("http://codeload.github.com/angstyloop/js-utils/tar.gz/refs/heads/develop", it => it.pipe(require('fs').createWriteStream("develop.tar.gz")));
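If you do need to handle 301/302 responses, one possible approach is to follow the Location header yourself. This is only a sketch under my own assumptions: the function name downloadFollowingRedirects and the limit of 5 redirects are made up for illustration, and it treats any final status other than 200 as a failure.

```javascript
const fs = require('node:fs');
const http = require('node:http');
const https = require('node:https');
const { pipeline } = require('node:stream/promises');

// Hypothetical helper: like the one-liners above, but follows up to
// maxRedirects redirects before saving the body to destination.
function downloadFollowingRedirects(url, destination, maxRedirects = 5) {
  return new Promise((resolve, reject) => {
    // Pick the module that matches the URL scheme.
    const client = new URL(url).protocol === 'https:' ? https : http;
    client.get(url, (res) => {
      const { statusCode, headers } = res;
      if ([301, 302, 303, 307, 308].includes(statusCode) && headers.location) {
        res.resume(); // discard the redirect body so the socket is released
        if (maxRedirects <= 0) {
          reject(new Error('Too many redirects'));
          return;
        }
        // Location may be relative, so resolve it against the current URL.
        const next = new URL(headers.location, url).toString();
        resolve(downloadFollowingRedirects(next, destination, maxRedirects - 1));
        return;
      }
      if (statusCode !== 200) {
        res.resume();
        reject(new Error(`Unexpected status code: ${statusCode}`));
        return;
      }
      // Stream the body to disk; pipeline reports errors from either side.
      pipeline(res, fs.createWriteStream(destination)).then(resolve, reject);
    }).on('error', reject);
  });
}
```

Call it as `downloadFollowingRedirects(url, "develop.tar.gz")` and await or `.then` the returned promise.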
Upvotes: 1
Reputation: 533
I was looking for a similar use case: I wanted to fetch a bunch of API endpoints and save the JSON responses to static files, so I came up with my own solution. Hope it helps.
const fetch = require('node-fetch'),
fs = require('fs'),
VERSIOINS_FILE_PATH = './static/data/versions.json',
endpoints = [
{
name: 'example1',
type: 'exampleType1',
url: 'https://example.com/api/url/1',
filePath: './static/data/exampleResult1.json',
updateFrequency: 7 // days
},
{
name: 'example2',
type: 'exampleType1',
url: 'https://example.com/api/url/2',
filePath: './static/data/exampleResult2.json',
updateFrequency: 7
},
{
name: 'example3',
type: 'exampleType2',
url: 'https://example.com/api/url/3',
filePath: './static/data/exampleResult3.json',
updateFrequency: 30
},
{
name: 'example4',
type: 'exampleType2',
url: 'https://example.com/api/url/4',
filePath: './static/data/exampleResult4.json',
updateFrequency: 30
},
],
checkOrCreateFolder = () => {
var dir = './static/data/';
if (!fs.existsSync(dir)) {
fs.mkdirSync(dir);
}
},
syncStaticData = () => {
checkOrCreateFolder();
let fetchList = [],
versions = [];
endpoints.forEach(endpoint => {
if (requiresUpdate(endpoint)) {
console.log(`Updating ${endpoint.name} data... : `, endpoint.filePath);
fetchList.push(endpoint)
} else {
console.log(`Using cached ${endpoint.name} data... : `, endpoint.filePath);
let endpointVersion = JSON.parse(fs.readFileSync(endpoint.filePath, 'utf8')).lastUpdate;
versions.push({
name: endpoint.name + "Data",
version: endpointVersion
});
}
})
// writes the versions file; call it only once every version entry has been collected
const writeVersions = () => {
fs.writeFileSync(VERSIOINS_FILE_PATH, JSON.stringify(versions));
console.log('updated versions: ', VERSIOINS_FILE_PATH);
};
if (fetchList.length > 0) {
Promise.all(fetchList.map(endpoint => fetch(endpoint.url, { "method": "GET" })))
.then(responses => Promise.all(responses.map(response => response.json())))
.then(results => {
results.forEach((endpointData, index) => {
let endpoint = fetchList[index]
let processedData = processData(endpoint.type, endpointData.data)
let fileData = {
data: processedData,
lastUpdate: Date.now() // Unix timestamp in milliseconds
}
versions.push({
name: endpoint.name + "Data",
version: fileData.lastUpdate
})
fs.writeFileSync(endpoint.filePath, JSON.stringify(fileData));
console.log('updated data: ', endpoint.filePath);
})
})
.catch(err => console.log(err))
// the fetches are asynchronous, so write the versions file only after they settle
.finally(writeVersions);
} else {
writeVersions();
}
},
recursiveRemoveKey = (object, keyname) => {
object.forEach((item) => {
if (item.items) { //items is the nesting key, if it exists, recurse , change as required
recursiveRemoveKey(item.items, keyname)
}
delete item[keyname];
})
},
processData = (type, data) => {
//any thing you want to do with the data before it is written to the file
let processedData = type === 'exampleType1' ? processType1Data(data) : processType2Data(data); // match the type values used in endpoints
return processedData;
},
processType1Data = data => {
let fetchedData = [...data]
recursiveRemoveKey(fetchedData, 'count')
return fetchedData
},
processType2Data = data => {
let fetchedData = [...data]
recursiveRemoveKey(fetchedData, 'keywords')
return fetchedData
},
requiresUpdate = endpoint => {
if (fs.existsSync(endpoint.filePath)) {
let fileData = JSON.parse(fs.readFileSync(endpoint.filePath));
let lastUpdate = fileData.lastUpdate;
let now = new Date();
let diff = now - lastUpdate;
let diffDays = Math.ceil(diff / (1000 * 60 * 60 * 24));
if (diffDays >= endpoint.updateFrequency) {
return true;
} else {
return false;
}
}
return true
};
syncStaticData();
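A side note on the freshness check: requiresUpdate rounds the age up with Math.ceil, so an endpoint with updateFrequency 7 is refreshed once it is just over 6 days old. If that matters, a stricter check could compare milliseconds directly. The sketch below is only an illustration; isStale and MS_PER_DAY are names I made up, not part of the code above.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: true when there is no cached timestamp, or when the
// cached copy is at least maxAgeDays old. Timestamps are in milliseconds,
// as produced by Date.now().
function isStale(lastUpdate, maxAgeDays, now = Date.now()) {
  if (typeof lastUpdate !== 'number') return true;
  return now - lastUpdate >= maxAgeDays * MS_PER_DAY;
}
```

The optional now parameter is there so the check can be tested with fixed timestamps.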
Upvotes: 1
Reputation: 6068
const {createWriteStream} = require('fs');
const {pipeline} = require('stream/promises');
const fetch = require('node-fetch');
const downloadFile = async (url, path) => pipeline(
(await fetch(url)).body,
createWriteStream(path)
);
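pipeline rejects if either stream fails, but a half-written file can be left behind when that happens. A possible extension, sketched here under my own assumptions (the helper name saveStream is made up), deletes the partial file before rethrowing:

```javascript
const { createWriteStream } = require('node:fs');
const { rm } = require('node:fs/promises');
const { pipeline } = require('node:stream/promises');

// Hypothetical helper: writes any Node.js readable stream to a file and
// removes the partial file if the transfer fails midway.
async function saveStream(readable, outputPath) {
  try {
    await pipeline(readable, createWriteStream(outputPath));
  } catch (err) {
    // force: true means a missing file is not treated as an error.
    await rm(outputPath, { force: true });
    throw err;
  }
}
```

With node-fetch@2 you could pass `(await fetch(url)).body` directly. With the built-in fetch in Node 18+, the body is a web stream, so it would need wrapping with Readable.fromWeb first.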
Upvotes: 11