Tide Gu
Tide Gu

Reputation: 835

Find the difference between two Arrays

this does not look like a duplicate to existing intersection / union questions (I may wrong).

I have two ArrayLists, which contains a class that contains a value, e.g.,

class A {
    private int value;

    public int getValue() {
        return value;
    }
}

And my Lists are

ArrayList<A> first, second;

I want to find two indices from first and second, which first indices are points to items in first that the respective values only in first but not in second, and the other indices are indices in second that the respective values only in second but not in first.

For example (read from center rows, where has the data):

hit                  *   *
index of first   0   1   2   3
----------------------------------
first.value      1   3   5   7
second.value     1 2   4   6 7 8
----------------------------------
index of second  0 1   2   3 4 5
hit                *   *   *   *

I want two indices of 1 2 and 1 2 3 5.

Note:

  1. I prefer not to use 3rd party libraries
  2. Values are integers and sorted, and I the return must also in order.
  3. Lengths of first and second are uncertain - either one can be longer than the other.

Thanks.

PS: Anyone need more detail, this is a simplified version of my problem. first is actually a ResultSet of SQL query, which are all records on the server, second is a list of local record (or consider processed from another local database), what I want to do is to delete all records on the server (in first) that does not contained in the local database (second), and add those missing record to remote database. Thanks.


To @Tibrogargan

My target is to incrementally transfer my local database (sqlite) to remote database (MySQL). The local databases has a field called LocalId (no duplicate and not the primary key) and of course other contents (~10 fields), and apart from copy required fields (~5 fields) to the remote database, I also included a field to determine if the record is deleted from local database.

As other requirement, I have already read all local data into a ArrayList, sorted by LocalId on local database, and have access to all remote data via rs, sorted by LocalId on remote database.

My earlier version of code was to find the last LocalId on local database, and mark all records on remote database where their LocalId is greater than the max LocalId on local database. However, it was correct until I found I don't always delete the tail records on local database, so I found it is necessary to iterate all messages and compare the existence.

I don't know if there're any "better" solution but seems that if I load all local messages to remote, the performance will be horrible?

Oh, just need to mention, there are only tens of messages need to be deleted and added to remote, out from hundreds to 100+k records existing on both remote and local dataset. The logic of determine if a record on remote should be marked as deleted is rely on if the respective message (same LocalId) exists on local dataset.

Upvotes: 2

Views: 2224

Answers (1)

nhouser9
nhouser9

Reputation: 6780

Since the arrays are sorted, you can just loop through both of them, checking for duplicates as you go. Here is some pseudocode:

//loop over both arrays at once
int arrayAIndex = 0;
int arrayBIndex = 0;
while (arrayAIndex < arrayA.length && arrayBIndex < arrayB.length) {
  if (arrayA[arrayAIndex] == arrayB[arrayBIndex]) {
    arrayAIndex++;
    arrayBIndex++;
  } else if (arrayA[arrayAIndex] > arrayB[arrayBIndex]) {
    addToResultB(arrayBIndex);
    arrayBIndex++;
  } else if (arrayB[arrayBIndex] > arrayA[arrayAIndex]) {
    addToResultA(arrayAIndex);
    arrayAIndex++;
  }
}

//clean up any elements which were not looked at during the above
while (arrayAIndex < arrayA.length) {
  addToResultA(arrayAIndex);
  arrayAIndex++;
}
while (arrayBIndex < arrayB.length) {
  addToResultB(arrayBIndex);
  arrayBIndex++;
}

Upvotes: 4

Related Questions