wowza_MAN

Reputation: 107

How To Chunk And Upload A Large File To Google Bucket

I am trying to upload large files to a Google bucket from Node.js. Uploading any file up to around the 200MB mark works perfectly fine. Anything greater than that returns an error:

Cannot create a string longer than 0x1fffffe8 characters
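
For context, that number seems to be Node's own cap on string length. A quick check (assuming a Node build where buffer.constants.MAX_STRING_LENGTH is 0x1fffffe8):

// Node cannot build a string longer than this many characters,
// which is the constant the error message is quoting.
const { constants } = require("buffer");
console.log(constants.MAX_STRING_LENGTH.toString(16)); // "1fffffe8" on this build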

Working with a file that big, I have found out that Node does have limits on how large a single blob/string can be. Here are the two code snippets that both throw the same error.

This one streams the upload:

let fileSize = file.size;

fs.createReadStream(file)
  .pipe(
    upload({
      bucket: BUCKET,
      file: file,
    })
  )
  .on("progress", (progress) => {
    console.log("Progress event:");
    console.log("\t bytes: ", progress.bytesWritten);
    const pct = Math.round((progress.bytesWritten / fileSize) * 100);
    console.log(`\t ${pct}%`);
  })
  .on("finish", (test) => {
    console.log(test);
    console.log("Upload complete!");
    resolve();
  })
  .on("error", (err) => {
    console.error("There was a problem uploading the file");
    reject(err);
  });

and this one is just a regular bucket upload:

await storage.bucket(BUCKET).upload(file.path, {
  destination: file.name,
});

I have come to the conclusion that the only solution is to chunk the file, upload it in chunks, and rejoin the chunks in the bucket. The problem is that I don't know how to do that, and I can't find any documentation on Google or GitHub for this case.

Upvotes: 1

Views: 2971

Answers (3)

Dawid Kisielewski

Reputation: 829

There is an option in the Google Storage API itself that does the upload in chunks, so there is no need to implement it yourself: just use the chunkSize option:

const options = {
  destination: destination,
  resumable: false,
  validation: 'crc32c',
  chunkSize: 100 * 2 ** 20, // 100 MiB
};

const [file] = await this.bucket.upload(filePath, options);
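
The same upload can also be driven through a write stream if you would rather pipe straight from disk. A minimal sketch, assuming the official @google-cloud/storage client and a Node version with stream/promises; bucketName, localPath and destination are placeholder names:

const fs = require("fs");
const { pipeline } = require("stream/promises");
const { Storage } = require("@google-cloud/storage");

const storage = new Storage();

// Pipe the file from disk into the bucket; the data is never held as one
// giant string in memory, so Node's string-length cap is never hit.
async function streamUpload(bucketName, localPath, destination) {
  await pipeline(
    fs.createReadStream(localPath),
    storage.bucket(bucketName).file(destination).createWriteStream({
      resumable: true,
      validation: "crc32c",
    })
  );
}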

Upvotes: 0

Dawid Kisielewski

Reputation: 829

I think an external user shouldn't have to worry about the upload file size when Google Storage supports files of up to 5TB. I have submitted an issue to the Google team: https://github.com/googleapis/nodejs-storage/issues/2167

Upvotes: 1

wowza_MAN

Reputation: 107

To resolve this issue I checked whether the file was larger than 200MB. If it was, I split it into (roughly) 200MB chunks, uploaded each chunk individually, and then joined the chunks in the bucket with bucket.combine().

A very important note is to add the timeout. By default Google has a 1 minute file upload timeout; I have set it to 60 minutes in the snippet below. It is a very hacky approach, I must admit.

// the "split-file" npm package provides splitFileBySize(path, maxChunkSize)
const splitFile = require("split-file");

if (file.size > 209715200) {
  await splitFile
    .splitFileBySize(file.path, 2e8) // split into ~200MB chunks on local disk
    .then(async (names) => {
      console.log(names);

      // upload every chunk individually, with a 60 minute timeout
      for (let i = 0; i < names.length; i++) {
        console.log("uploading " + names[i]);
        await storage
          .bucket(BUCKET)
          .upload(names[i], {
            destination: names[i],
            timeout: 3600000,
          })
          .catch((err) => {
            return { status: err };
          });
      }

      // stitch the uploaded chunks back together as the final object
      await storage
        .bucket(BUCKET)
        .combine(names, file.name)
        .catch((err) => {
          return { status: err };
        });

      // remove the chunk objects from the bucket
      for (let i = 0; i < names.length; i++) {
        console.log("deleting " + names[i]);
        await storage
          .bucket(BUCKET)
          .file(names[i])
          .delete()
          .then(() => {
            console.log(`Deleted ${names[i]}`);
          })
          .catch((err) => {
            return { status: err };
          });
      }

      console.log("done");
      return { status: "ok" };
    });
}
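
One thing the snippet above does not clean up is the local chunk files that split-file writes next to the original. A small sketch for that (names is the same array splitFileBySize returns):

const fsp = require("fs/promises");

// Remove the temporary chunk files that split-file left on local disk.
async function removeLocalChunks(names) {
  await Promise.all(names.map((name) => fsp.unlink(name)));
}

Also worth noting: bucket.combine() is backed by GCS object compose, which accepts at most 32 source objects per call, so with ~200MB chunks this covers files up to roughly 6GB before the chunks would need to be combined in stages.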

Upvotes: 2
