Qilin Xue
Qilin Xue

Reputation: 15

How do I merge duplicate cells together with google app script?

I currently have a list with two columns. The first column is student name, and the second column is the number of points they have.

I imported this list from multiple spreadsheets so there were many duplicates on the names of the students. I am able to remove the duplicates, but I want to keep a tally on the total points they have. For example:

Amy 10 Bob 9 Carol 15 Amy 12

would turn into:

Amy 22 Bob 9 Carol 15

This is what I have so far:

  var target = SpreadsheetApp.getActiveSpreadsheet();
  var sheet = target.getSheetByName("Sheet2");
  var data = sheet.getRange("A2:B1000").getValues();
  var newData = new Array();
  var k = 0
  var finallist = []
  for(i in data){
    k++;
    var row = data[i];
    var duplicate = false;
    for(j in newData){
      if(row[0] == newData[j][0]){
        duplicate = true;
        var storedHour = sheet.getRange("B"+k).getValue();
        var position = finallist.indexOf(row[0]);
        var originalCell = sheet.getRange("B"+(position+1));
        var originalHour = originalCell.getValue();
        originalCell.setValue(originalHour + storedHour);
        sheet.getRange(k,2).setValue("")
        sheet.getRange(k,1).setValue("")
      }
    }
    if(!duplicate){
      newData.push(row);
      finallist.push(row[0])
    }
  }
}

The problem I'm having is that we have a really large data sample and I'm afraid it may run over Google's 5 minute maximum execution time. Is there another more efficient way to achieve my goal?

Upvotes: 1

Views: 1323

Answers (2)

ScampMichael
ScampMichael

Reputation: 3728

Sorting before comparing allows looking at the next item only instead of all items for each iteration. A spillover benefit is finallist result is alphabatized. Execution time reduction significant.

function sumDups() {
  var target = SpreadsheetApp.getActiveSpreadsheet();
  var sheet = target.getSheetByName("Sheet2");
  var data = sheet.getRange("A2:B" + sheet.getLastRow()).getValues().sort();
  var finallist = [];
  for(var i = 0; i<= data.length - 1; i++){
    var hours = data[i][1];
    while((i < data.length - 1) && (data[i][0] == data[i+1][0])) {
       hours += data[i+1][1];
       i++;
       };
    finallist.push([data[i][0], hours]);
  };
  Logger.log(finallist);
}

Edit: the simple data structure with the name being in the first column allows this to work. For anything more complex understanding and applying the methods shown in @Kos's answer is preferable

Upvotes: 0

user555121
user555121

Reputation:

Your code is running slow because Spreadsheets API methods (like getRange) are time consuming and much slower then other JavaScript code.

Here is optimized function with reduced number of such Spreadsheets API calls:

function calcNumbers()
{
  var target = SpreadsheetApp.getActiveSpreadsheet();
  var sheet = target.getSheetByName("Sheet2");
  var lastRow = sheet.getLastRow();
  var dataRange = sheet.getRange(2, 1, lastRow-1, 2);
  var data = dataRange.getValues();
  var pointsByName = {};

  for (var i = 0; i < data.length; i++)
  {
    var row = data[i];
    var curName = row[0];
    var curNumber = row[1];

    // empty name
    if (!curName.trim())
    {
      continue;
    }

    // if name found first time, save it to object
    if (!pointsByName[curName])
    {
      pointsByName[curName] = Number(curNumber);
    }
    // if duplicate, sum numbers
    else
    {
      pointsByName[curName] += curNumber;
    }
  }

  // prepare data for output
  var outputData = Object.keys(pointsByName).map(function(name){
    return [name, pointsByName[name]];
  });

  // clear old data
  dataRange.clearContent();

  // write calculated data
  var newDataRange = sheet.getRange(2, 1, outputData.length, 2);
  newDataRange.setValues(outputData);
}

Upvotes: 1

Related Questions