Evan292
Evan292

Reputation: 13

Query 2 sets of Sheets data & identify discrepancies

I am trying to use Apps Script to query 2 datasets and compare certain columns across them. I am hoping to...

a) identify missing ID values;

b) reconcile differences in other fields, when the ID values match.

INPUT:
Spreadsheet with 2 tabs (tab1, tab2).
The key ID in each B column (Btab1, Btab2)

OUTPUT: tab that displays problem areas in the datasets.
First column is ID Key.
Second column explains the issue via text string

Again, the challenge here is that the values are not sorted the same, and there could be a slight difference in total # rows

function compare() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
   ss.insertSheet(1);
   ss.getActiveSheet().setName('output');
  var sheet1 = ss.getSheetByName('sheet1');
  var sheet2 = ss.getSheetByName('sheet2');
  var sheet_output = ss.getSheetByName('output');


  var range1 = sheet1.getRange(1,1,sheet1.getLastRow(),sheet1.getLastColumn()).getValues(); 
  var output1 = [];
  var a1;
  var b1;
  var h1;
  var i1;
  var j1;
  var m1;
  var o1;
  var p1;
  var an1;
  var ao1;
  var x;

  var range2 = sheet2.getRange(1,1,sheet2.getLastRow(),sheet2.getLastColumn()).getValues(); 
  var output2 = [];
  var a2;
  var b2;
  var c2;
  var d2;
  var e2;
  var f2;
  var g2;
  var h2;
  var y;

  /// can i do for(x in range1; y in range2) { all in one function??  If so, what is the proper syntax?

  for(x in range1, y in range2) {
    a1 = range1[x][0];
    b1 = range1[x][1];
    h1 = range1[x][7];
    i1 = range1[x][8];
    j1 = range1[x][9];
    m1 = range1[x][12];
    o1 = range1[x][14];
    p1 = range1[x][15];
    an1 = range1[x][39];
    ao1 = range1[x][40];

    a2 = range2[y][0];
    b2 = range2[y][1];
    c2 = range2[y][2];
    d2 = range2[y][3];
    e2 = range2[y][4];
    f2 = range2[y][5];
    g2 = range2[y][6];
    h2 = range2[y][7];

    if (

      (b1 != b2) ||
      (m1 != e2) // etc etc etc
       )
    {

  //push to output
  }}

Upvotes: 0

Views: 84

Answers (1)

ziganotschka
ziganotschka

Reputation: 26836

Whilst your syntax for(x in range1, y in range2) will not return an error, it won't give you the desired result neither if the rows are not in the same order

Reason:

During each iteration both x and y will change, e.g. if var range1 = [1,2,3] and var range2 = [4,5,6], your loop will iterate 3 times and the values in your sample loop iterations will be:

  1. iteration

range1[x] = 1 and range2[y] = 4

  1. iteration

range1[x] = 2 and range2[y] = 5

  1. iteration

range1[x] = 3 and range2[y] = 6

In this case you will not retrieve the combination

range1[x] = 1 and range2[y] = 4

or

range1[x] = 2 and range2[y] = 6

and so on.

Instead you need to use two nested for loops, which would iterate through all possible combinations of x and y:

 for(x in range1) {
    for(y in range2){
      ...
    } 
  }

Sidenote:

Even if your rows would be in the same order, you still need to be careful. Because for(x in range1) opposed to for(x = 0; x < range1.length; i++) gives you no control about in which folder the loop will iterate over the range.

Now to your query for duplicates

A possible way to implement the functionality in a not too complicated manner would be the following:

  • Define boolean variable and use it to check for each x either it has a duplicate
  • If a duplicate (for column B) is found - further criteria will be evaluated
  • If two rows match by all criteria, the inner loop will be exited with break and the function will jump to the next x
  • If rows with identical key IDs, but discrepancies in other columns are found - both rows will be pushed into sheet output for comparison purposes (this is easier to implement than specifying what exactly is discrepant)
  • After this the inner loop will also be exited
  • In oth cases above duplicate will be set to true
  • If a unique Id is found in sheet1 (duplicate = false) - it will be immediately pushed into output

Sample

function compare() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
   ss.insertSheet(1);
    ss.getActiveSheet().setName('output');
  var sheet1 = ss.getSheetByName('sheet1');
  var sheet2 = ss.getSheetByName('sheet2');
  var sheet_output = ss.getSheetByName('output');


  var range1 = sheet1.getRange(1,1,sheet1.getLastRow(),sheet1.getLastColumn()).getValues(); 
  var output1 = [];
  var b1;
  var m1;
  var p1;
  var an1;
  var x;

  var range2 = sheet2.getRange(1,1,sheet2.getLastRow(),sheet2.getLastColumn()).getValues(); 
  var output2 = [];
  var b2;
  var e2;
  var f2;
  var g2;
  var y;

  var array = [];
  for(x in range1) {
    var duplicate = false;
    for(y in range2){
      b1 = range1[x][1];
      m1 = range1[x][12];
      p1 = range1[x][15];
      an1 = range1[x][39];    

      b2 = range2[y][1];
      e2 = range2[y][4];
      f2 = range2[y][5];
      g2 = range2[y][6];

      if (        
        (b1 == b2)
      )
      {
        Logger.log("found");
        duplicate = true;
        if((m1 != e2)||
        (p1 != f2) ||
        (an1 != g2)){
          array.push(range1[x]);
          array.push(range2[y]);
        }
        break;          
      }
    } 
    if (duplicate == false){
      Logger.log("duplicate false");
      array.push(range1[x]);
    }
  }
  //push to output
  if(array[0]){
    sheet_output.getRange(sheet_output.getLastRow()+1, 1, array.length, array[0].length).setValues(array);
  }
}

Upvotes: 1

Related Questions