DanCue
DanCue

Reputation: 776

Remove Duplicates if Value in Separate Column is Different

I'm trying to come up with a list of people to send a letter to. I do not want to send them a second letter, but at the same time I don't want to delete them from this list if they are not truly duplicates.

Take this example:

Acct    RandomCol1  RandomCol2  RandomCol3  Address
00001   blahblah    blahblah    blahblah    123 address drive
12345   blahblah    blahblah    blahblah    345 address drive
00001   blahblah    blahblah    blahblah    123 address drive
67890   blahblah    blahblah    blahblah    567 address drive
12345   blahblah    blahblah    blahblah    890 address drive

Notice that Acct 00001 has two rows where the address matches. No need to send a letter to that address twice so we can delete one row. Acct 12345, however does not have a matching address. That should be kept. The end result should look like this:

Acct    RandomCol1  RandomCol2  RandomCol3  Address
00001   blahblah    blahblah    blahblah    123 address drive
12345   blahblah    blahblah    blahblah    345 address drive
67890   blahblah    blahblah    blahblah    567 address drive
12345   blahblah    blahblah    blahblah    890 address drive

I've seen plenty of posts with solutions to delete duplicates in an array based on one column but have not seen one to accomplish the above. Furthermore, a lot of the solutions I don't think would do well with 100K rows.

I've asked another question on here earlier that is similar in nature but I'm having trouble converting this into a solution for the above scenario where the columns are not next to each other. I really liked the performance of that solution, however I've been told it is frowned upon to ask for changes to that solution on that same question.

I've tried putting something together but I'll admit I have no experience with objects or reduce.

function removeDupsWDiffAddr(sh, colWDupValuesIndex, colToComp1Index){ //Master with v8 runtime
/**
 * @param {sheet}  sh - Sheet where duplicates will be removed
 * @param {number} colWDupValuesIndex - Col Index where duplicate values should be.
 * @param {number} colToComp1Index - Col Index where criteria should be met
**/
  let rg = sh.getRange(2, 1, sh.getLastRow() - 1, sh.getLastCol());
  let initDataSz = rg.getNumRows();
  let values = rg.getValues();
  let colWDupValuesArr = sh.getRange(2, colWDupValuesIndex, initDataSz, 1).getValues();
  let colToComp1Arr = sh.getRange(2, colToComp1Index, initDataSz, 1).getValues();
  rg.clearContent(); //Clear the sheet after grabbing everything you need from it

  let out = Object.entries(
    values.reduce((obj, [acctNum, address]) => {
      let compare = (obj[acctNum] = obj[acctNum] || Infinity);
      if (compare == address) {
        obj[acctNum] = address;
      }
      return obj;
    }, {})
  ).map(e => e.reverse());


  let diffAddrDupsCount = initDataSz - out.length;
  let destRange = sh.getRange(2, 1, out.length, out[0].length);
  destRange.setValues(out);

  return diffAddrDupsCount;
}

My guess is that this line values.reduce((obj, [acctNum, address]) would require me adding each column, but there is no guarantee that the column structure or number of columns stays the same, hence being able to provide a column index.

Additional Notes:

Upvotes: 1

Views: 99

Answers (1)

Tanaike
Tanaike

Reputation: 201338

  • You want to remove the duplicated rows by checking the column "D" and "L".
  • You want to overwrite the sheet with the new values.
  • You want to achieve this using Google Apps Script.

In this answer, I used removeDuplicates. I thought that when this method is used, it might become the simple script.

Sample script:

function myFunction() {
  const sheetName = "###";  // Please set the sheet name.

  SpreadsheetApp
    .getActiveSpreadsheet()
    .getSheetByName(sheetName)
    .getDataRange()
    .removeDuplicates([4, 12]);
}

Reference:

Upvotes: 2

Related Questions