Vincent Ryan
Vincent Ryan

Reputation: 115

Combining multiple xlsx files into a single google sheet for datastudio

I have a folder that will receive multiple xlsx files that will be uploaded via Google forms. There were will be new sheets added a couple of times a week and this data will need to be added.

I want convert all of these xlsx files into a single sheet that will feed a datastudio.

I had started working with this script:

    function myFunction() {
   //folder ID
   var folder = DriveApp.getFolderById("folder ID");
   
   var filesIterator = folder.getFiles();
   var file;
   var filetype;
   var ssID;
   var combinedData = [];
   var data;



   while(filesIterator.hasNext()){
     file = filesIterator.next();
     filetype = file.getMimeType();
     if (filetype === "application/vnd.google-apps.spreadsheet"){
       ssID = file.getId();
       data = getDataFromSpreadsheet(ssID)
       combinedData = combinedData.concat(data);
     }//if ends here
   }//while ends here
   Logger.log(combinedData.length);
 }




 function getDataFromSpreadsheet(ssID) {

   var ss = SpreadsheetApp.openById(ssID);
   var ws = ss.getSheets()[0];
   var data = ws.getRange("A:W" + ws.getLastRow()).getValues();
 return data;

 }

Unfortunately that array is returning 0! I think this maybe due to the xlsx issue.

Upvotes: 0

Views: 891

Answers (1)

Oriol Castander
Oriol Castander

Reputation: 648

1. Fetch the excel data

Unfortunately, Apps Script can not deal directly with excel values. You need to first convert those files into Google Sheets to access the data. This is fairly easy to do, and can be accomplished using the Drive API (you can check the documentation here) with the following two lines at the top of your code.

var filesToConvert = DriveApp.getFolderById(folderId).getFilesByType(MimeType.MICROSOFT_EXCEL);
while (filesToConvert.hasNext()){ Drive.Files.copy({mimeType: MimeType.GOOGLE_SHEETS, parents: [{id: folderId}]}, filesToConvert.next().getId());}

Please note that this duplicates the existing file by creating a Google Sheets copy of the excel but does not remove the excel file itself. Also note that you will need to activate the Drive API service.

2. Remove duplicates from combinedData

This is not as straightforward as removing duplicate from a regular array, as combinedData is an array of arrays. Nevertheless, it can be accomplished by creating an intermediate object that stores an stringified version of the row array as the key and the row array itself as the value:

var intermidiateStep = {};
combinedData.forEach(row => {intermidiateStep[row.join(":")] = row;})
var finalData = Object.keys(intermidiateStep).map(row=>intermidiateStep[row]);

Extra

I also found another mistake in your code. You should add a 1 (or whichever the first row that you want to read is) when declaring the range of the values to be read, so

var data = ws.getRange("A1:W"+ws.getLastRow()).getValues();

instead of:

var data = ws.getRange("A:W" + ws.getLastRow()).getValues();

As it currently is, Apps Script fails to understand the exact range you want to be read and just assumes that it is the whole page.

Upvotes: 2

Related Questions