Richard Hostler
Richard Hostler

Reputation: 13

Defining a duplicate row by the first 20 characters of a cell

I am running a basic de-duplication script for a Google spreadsheet and have hit roadblock. I want to base whether or not a row is a duplicate on only the first 20 characters in the columns I have defined, not the entire contents of the column.

I ran multiple searches and haven't found anything even close. I keep coming up with ways to remove the first X characters, but that doesn't do me any good.

Any help is greatly appreciated.

function removeDuplicates() {
  var sheet = SpreadsheetApp.getActiveSheet();
  var data = sheet.getDataRange().getValues();
  var newData = new Array();
  for(i in data){
    var row = data[i];
    var duplicate = false;
    for(j in newData){
      if(row[0] == newData[j][0] && row[4] == newData[j][4]){
  duplicate = true;
}
    }
    if(!duplicate){
      newData.push(row);
    }
  }
  sheet.clearContents();
  sheet.getRange(1, 1, newData.length, newData[0].length).setValues(newData);
}

Upvotes: 1

Views: 106

Answers (2)

iJay
iJay

Reputation: 4283

You were nearly there, If you could compare first 20 characters of column B of your spreadsheet with saved array, you can find duplicates.

substring() is the function you need to use

Read more: https://www.w3schools.com/jsref/jsref_substring.asp

function removeDuplicates() 
{
  var sheet = SpreadsheetApp.getActiveSheet();
  var data = sheet.getDataRange().getValues();
  var newData = new Array();
  for(var i in data)
  {
    var row = data[i];
    var duplicate = false;
    for(var j in newData)
    {
    //row[1]=> column B value 
      if(row[1].substring(0,20) == newData[j][1].substring(0,20)){
        duplicate = true;
      }
    }
    if(!duplicate){
      newData.push(row);
    }
  }
  sheet.clearContents();
  sheet.getRange(1, 1, newData.length, newData[0].length).setValues(newData);
}

Upvotes: 1

Gerome ten Velde
Gerome ten Velde

Reputation: 1

First, you want an array with the first 20 characters of all the values in the specified column

var columnNumber = 2
var values = data.map(function(el){ 
    return /^.{0,20}/.exec(el[columnNumber + 1]).toString() 
})

Then, you want to go through the array and see if there is a duplicate

for (var row = 1; row < values.length; row++){
    var dupIndex = values.indexOf(values[row],row + 1)
    if (~dupIndex)
    {
        Browser.msgBox("Row " + (row + 1) + " and Row " + (dupIndex + 1) + " are identical")
        // Act on the duplicate rows here
    }
}

I hope this at least gives a good starting point, good luck!

Upvotes: 0

Related Questions