NeNaD

Reputation: 20304

Uploading PDF obtained from API directly to s3 in NodeJS

I am fetching a PDF report from a third-party API and want to upload it directly to S3. The upload succeeds, but when I open the PDF from S3, all pages are blank. Am I doing something wrong? My code is below.

var report = reportInfo.body;
const params = {
  Key: 'report.pdf',
  Body: report,
  Bucket: process.env.S3_BUCKET_NAME,
  ContentType: 'application/pdf',
};
s3.upload(params, (err, res) => {
  if (err) {
    console.log(err, 'err');
  }
  console.log(res, 'res');
});

I assign the API response to the report variable. Part of the response looks like this:

'%PDF-1.5\n%����\n1 0 obj<</Length 2872/Filter/FlateDecode>>stream\nx��\�n$�\r�����~�0� @�&1���>x�6�f����PU�U�a���mf���V�D�yHQ�փ��~$gF7�?���_/�����/��[�?��=�Ѓv��􃆨�F?u��ǿS3�d��k��:O�����X�k���2�k���\t˿?�������XY�Ի��Ti-3�y��y�u3�Q~���E?�g����߈f_I4��'>>>�$�&����e��G���0�1Go�@M��&�jҚ�YJ3�zmhz��0<�Q��n�۶�����i�\r5w�0�1���ѦO�5��SwM=�pm�����#f�>��q^g��j�J����}O�fi�xz&f�0�ǜ�^���yj���mm{�OM/B{z��%+��H�Ɣl4

I assumed this is the raw PDF and that I could upload it to S3 directly. Do I need to do something before uploading it? Why does it upload only blank pages?

Upvotes: 3

Views: 1335

Answers (2)

NeNaD

Reputation: 20304

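The PDF document should be fetched as an arraybuffer. Otherwise the HTTP client decodes the binary body as a text string, which corrupts the non-text bytes (the � characters in the question's output). That is consistent with a file that opens but shows only blank pages.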

// Assumes AWS SDK v2 (the s3.upload API used in the question).
const AWS = require('aws-sdk');
const axios = require('axios');

const s3 = new AWS.S3();

const fetchAndUploadPDF = async () => {
  try {
    // Request the raw bytes instead of a decoded string.
    const pdf_document = await axios({
      method: 'get',
      url: 'url_of_the_report',
      headers: {
        accept: 'application/pdf',
      },
      responseType: 'arraybuffer',
    });

    // In Node.js, pdf_document.data is a Buffer and can be uploaded as-is.
    await s3.upload({
      Key: 'file_name',
      Body: pdf_document.data,
      Bucket: 'bucket_name',
      ContentType: 'application/pdf',
      ACL: 'public-read',
    })
    .promise();

    console.log('PDF document successfully fetched and uploaded');
  } catch (error) {
    console.error('ERROR: ', error.stack);
    throw error;
  }
};
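
To see why the string form of the response is not safe to upload, here is a minimal sketch using only Node's built-in Buffer. The sample bytes are an illustrative assumption (a PDF header line followed by four non-ASCII marker bytes), not the actual report from the question.

```javascript
// Illustrative bytes: "%PDF-1.5\n%" followed by four bytes above 0x7F.
const original = Buffer.from([
  0x25, 0x50, 0x44, 0x46, 0x2d, 0x31, 0x2e, 0x35, 0x0a, 0x25,
  0xe2, 0xe3, 0xcf, 0xd3,
]);

// Simulate an HTTP client decoding the body as a UTF-8 string.
const asString = original.toString('utf8');

// Turning that string back into bytes does not restore the original data:
// each invalid byte was replaced with U+FFFD during decoding.
const roundTripped = Buffer.from(asString, 'utf8');

console.log(JSON.stringify(asString));
console.log('bytes preserved:', original.equals(roundTripped));
```

The ASCII part of the header survives, but the binary bytes do not, so the damage has already happened by the time the string reaches s3.upload.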

Upvotes: 2

Abdul Moeez

Reputation: 1401

It's all about encoding. In your scenario, the file is a multipart file (PDF) that is passed to AWS via a POST to the URL.

  • The server gets a file with this byte -> 0010 (this will not be interpreted correctly, because a standard byte has 8 bits)
  • So we encode it in base64 (the encoded result itself doesn't matter) and decode it to get a standard byte -> 0000 0010 (this is now a standard byte and is interpreted correctly by AWS)

For Node.js encoding or decoding, you should refer to this doc.
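
As a rough illustration of base64 encoding and decoding in Node.js, the sketch below uses the built-in Buffer API. The sample bytes are made up for the example; in practice they would be the PDF's raw bytes.

```javascript
// Hypothetical binary payload, including bytes that are not valid text.
const bytes = Buffer.from([0x25, 0x50, 0x44, 0x46, 0x00, 0xff, 0x02]);

// Encode to a base64 string, which is safe to carry through text-only channels.
const encoded = bytes.toString('base64');

// Decode back to the exact original bytes on the receiving side.
const decoded = Buffer.from(encoded, 'base64');

console.log(encoded);
console.log('bytes preserved:', decoded.equals(bytes));
```

Unlike decoding the binary data as text, the base64 round trip returns the original bytes unchanged.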

One more piece of configuration is needed: the API Gateway settings must be properly configured to support binary data types.

Path: AWS Console --> API Gateway --> Settings --> multipart/form-data


Upvotes: 0
