David Mckee
David Mckee

Reputation: 1180

How can I parse an Excel file using SheetJS from an external link (Amazon S3)

I am trying to parse an excel file that I already have the URL for. I keep getting different errors when trying to access the file so that it can be readable. Right now, here is my code:

  const input_file = doc.input_file;
  const extension = input_file.split('.').pop();



  let XMLHttpRequest = require("xmlhttprequest").XMLHttpRequest;
  let oReq = new XMLHttpRequest();
  oReq.open("GET", input_file, true);
  oReq.responseType = "arraybuffer";

  oReq.onload = function(e) {
    let arraybuffer = oReq.responseText;
    /* convert data to binary string */
    let data = new Uint8Array(arraybuffer);
    let arr = new Array();
    for(let i = 0; i != data.length; ++i) arr[i] = String.fromCharCode(data[i]);
    let bstr = arr.join("");

    /* Call XLSX */
    let workbook = XLSX.read(bstr, {type:"binary"});

    /* DO SOMETHING WITH workbook HERE */
    let firstSheet = workbook.SheetNames[0];
    let parsed = XLSX.utils.sheet_to_csv(firstSheet);
    console.log(parsed);
  }

  oReq.send();

The current error I am getting is: Error: Unsupported file NaN at the when I try to read the file at: let workbook = XLSX.read(bstr, {type:"binary"});

I'm unsure on the easiest way to read that external link. Any ideas? If it helps, I am using Meteor.

Upvotes: 9

Views: 6059

Answers (4)

David Mckee
David Mckee

Reputation: 1180

I ended up using a combination of a few of these answers. I want to post it here just in case it helps anyone else.

I started at using the Meteor HTTP package as mentioned by Achal.

meteor add http

I also added an additional package from the Meteor community that allowed for response type to be added in.

meteor add aldeed:http

Then, I used the following code to convert to binary and could proceed with reading the sheet:

HTTP.get(input_file, {responseType: 'arraybuffer'}, function(error, result) {
  let data = new Uint8Array(result.content);
  let arr = new Array();
  for(let i = 0; i != data.length; ++i) arr[i] = String.fromCharCode(data[i]);
  let bstr = arr.join("");

  let workbook = XLSX.read(bstr, {type:"binary"});
  var first_sheet_name = workbook.SheetNames[0];
  let sheet = workbook.Sheets[first_sheet_name];
  let parsed = XLSX.utils.sheet_to_json(sheet);
});

Upvotes: 0

Achal Agrawal
Achal Agrawal

Reputation: 161

A better idea might be to use the Meteor's HTTP package to get the file. The docs are here

Add the package using

meteor add http

And then use :

let result = HTTP.get(input_file,function (error,result){
//process result here
});

result.data will contain your Excel file which you can comfortably parse using SheetJS.

However, make sure that you have allowed Cross Origin on Amazon S3 or you'll receive an error of the form :

"No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'blah blah' is therefore not allowed access."

Upvotes: 2

Kevin
Kevin

Reputation: 2893

This is a tried-and-true answer.

There are two problems with your code:

  1. for binary files, it should be let arraybuffer = oReq.response;, not let arraybuffer = oReq.responseText;

  2. You should enabled Cross-Origin Resource Sharing on your Amazon S3 instance. Just follow the official tutorial here.

Here is a working codepen:

http://codepen.io/KevinWang15/pen/GZXJKj

Are you using nodeJS?

note: The above code just uses the web browser's (chrome) XMLHttpRequest, I'm noticing that you are using

XMLHttpRequest = require("xmlhttprequest").XMLHttpRequest

Are you using something like nodejs? (Sorry I'm not familiar with Meteor)

More specifically, are you using driverdan/node-XMLHttpRequest ?

I experimented with it and your code, and it led to exactly the same error message. I think it's because this XMLHttpRequest still has compatibility problem with oReq.response and oReq.responseText

If you are using nodeJS, I recommend another library: ykzts/node-xmlhttprequest

Install it with

npm i w3c-xmlhttprequest

Change your XMLHttpRequest with

let XMLHttpRequest = require('w3c-xmlhttprequest').XMLHttpRequest;

And it instantly solves the problem!

Upvotes: 4

SKR
SKR

Reputation: 180

XMLHttpRequest is restricted by the Same Origin Policy, meaning you can only access content directly from your own domain.

But you can create a service on your server which would load the sheet for you and pass it back to the client.

Here is a straight forward tutorial.

But please be aware that a general approach of loading third-party files can be a serious security issue. So if your URL to the sheet is constant you may consider only loading this specific link through a php script and not allowing any other URLs.

Upvotes: 1

Related Questions