Mahesh Jandwani
Mahesh Jandwani

Reputation: 41

Remove duplicates based on one column and keep latest entry in google sheets

I was working on some automation and wanted to remove the duplicate rows in my google sheet by comparing it on basis of 3rd column. I found one code which is working flawlessly but it does not remove the old entry in sheet, it removes the latest one. I wanted to keep the latest one from the duplicates.

This is the code which I found for appscript by Cooper:

function removeDuplicates() {
  var sh=SpreadsheetApp.getActiveSheet();
  var dt=sh.getDataRange().getValues();
  var uA=[];
  var d=0;
  for(var i=0;i<dt.length;i++) {
    if(uA.indexOf(dt[i][2])==-1) {
      uA.push(dt[i][2]);
    }else{
      sh.deleteRow(i+1-d++);
    }
  }
}

Can anyone help me with the code which does the same work "removing duplicate rows (Keeps latest entry removes the old entry) based on column" ?

Upvotes: 2

Views: 2180

Answers (3)

Aamir
Aamir

Reputation: 11

I have modified retaining all other rows while eliminating duplicates from column A.

function removeDuplicates() {
 // Get the reference to the active sheet.
 var sh = SpreadsheetApp.getActiveSheet();
 // Get all the data from the sheet.
 var dt = sh.getRange("A:A").getValues(); // Only fetch data from Column A
 // Create an array to store unique values from Column A.
 var uA = [];
 // Loop through each cell in Column A.
 for (var i = 0; i < dt.length; i++) {
 // Check if the value in Column A is not in the unique array.
 if (uA.indexOf(dt[i][0]) == -1) {
 // If not found, add it to the unique array.
 uA.push(dt[i][0]);
 } else {
       // If found, clear the cell.
          sh.getRange(i + 1, 1).setValue('');
       }
 }
}

Upvotes: 1

Yuri Khristich
Yuri Khristich

Reputation: 14537

My suggestion:

function myFunction() {
  var sh = SpreadsheetApp.getActiveSheet();

  // get values of column C
  var col = sh.getDataRange().getValues().map(x => x[2]);

  // get indexes of duplicated values in the column
  var duplicates = col.map((x,i) => 
    col.slice(i+1).includes(x) ? i+1 : '').filter(String); 

  // remove rows by the indexes
  duplicates.reverse().forEach(x => sh.deleteRow(x));
}

Before:

enter image description here

After:

enter image description here

Update

If there will some glitches it makes sense to add the command flush() after every deleteRow(). The last line of the code should be like this:

  // remove rows by the indexes
  duplicates.reverse().forEach(x => { sh.deleteRow(x); SpreadsheetApp.flush() });

Upvotes: 2

Tanaike
Tanaike

Reputation: 201703

From I wanted to keep the latest one from the duplicates., when the latest one is the last row, in your script, how about the following modification?

Modified script:

function removeDuplicates() {
  var sh = SpreadsheetApp.getActiveSheet();
  var dt = sh.getDataRange().getValues();
  var uA = [];
  for (var i = dt.length - 1; i >= 0; i--) {
    if (uA.indexOf(dt[i][2]) == -1) {
      uA.push(dt[i][2]);
    } else {
      sh.deleteRow(i + 1);
    }
  }
}

Upvotes: 3

Related Questions