KeV
KeV

Reputation: 2891

Merge arrays that diverged

I have an initial array s0 which i replicate twice, giving s1 and s2. Now s1 and s2 diverged due to a concurrent insertion.

I know that s1 was obtained by inserting a character into s0 (at a random position). Similarly, s2 is derived from s0 and a random insertion. However, i don't have access to s0 anymore.

How can i combine s1 and s2 as if both additions happened on s0 on the respective positions.

As an example:

s0 = [1, 4, 9, 11]
s1 = [1, 4, 11, 9, 11]
s2 = [1, 3, 4, 9, 11]
-----
s3 = merge(s1, s2) = [1, 3, 4, 11, 9, 11]

(If both s1 and s2 were obtained by inserting a character at the same index, i don't matter what their order will be in s3, both are valid)


Since i know that s1 and s2 have exactly 2 differences, i start by searching for the index of the first difference.

var i = 0;
while (s1[i] === s2[i]) i++;

Now i know that the first difference is at index i, but i'm unable to figure out if s1[i] is the character that was added (and hence, must be added to s2 at index i) or if s2[i] is the character that was added.

Upvotes: 1

Views: 127

Answers (2)

samanime
samanime

Reputation: 26617

One approach would be to go through them simultaneously. If they aren't the same value at a given index, do a look ahead on them both. If one has the same value in the look ahead as the other in the current position, it was probably an insertion on that one.

For example, looking through your examples above:

s1 = [1, 4, 11, 9, 11]
s2 = [1, 3, 4, 9, 11]

Index 0, both are the same. Index 1, they are different. Looking at index 2 of each, you see that s2 has the same value as the current index of s1, so s2 likely had an insertion, so add that before the value of s1. After that, line them back up again (so look at s2 one ahead`.

const s1 = [1, 4, 11, 9, 11];
const s2 = [1, 3, 4, 9, 11];
const s3 = [1, 3, 4, 11, 9, 11]; // for comparison

function merge(a, b) {
  let iA = 0;
  let iB = 0;
  let result = [];
  let abort = 0;
  
  while (abort < 100 && iA < a.length && iB < b.length) {
    if (a[iA] === b[iB]) {
      result.push(a[iA]);
      iA++;
      iB++;
    } else if (a[iA+1] === b[iB]) { // a had an insert
      result.push(a[iA]); 
      iA++;
    } else if (a[iA] === b[iB+1]) { // b had an insert
      result.push(b[iB]);
      iB++;
    } else {
      // extra stuff here
    }
    
    abort++; // this is just so no infinite loop in partially defined loop
  }
  
  // Catch values at the end
  if (iA < a.length) {
    result = result.concat(a.slice(iA));
  } else if (iB < b.length) {
    result = result.concat(b.slice(iB));
  }
    
  return result;
}

console.log(merge(s1, s2));

This code works for your particular example. Notice there is an extra else though. If you got there, it means an insertion happened in one, but either they both got an insertion at the same place, or one got multiple insertions. You'd have to loop through both and determine how to resolve, but the basic approach would be the same: loop through until you find where things line up again, and then take the ones that don't match before incrementing the iterators appropriately.


I just re-read your question a bit closer and realized that you know that only one operation is happening on each before the merge. That makes things quite a bit easier, since you don't have to loop for an arbitrary distance, you just have to go out two at most.

If you know they won't be in the same location, then you can just get rid of the else in my above code, because it'll never happen.

If they might be in the same position, since you know you only have two changes, if neither of the values match (and you know each only received one change), then both received a change at the same point. In that case, just take both. If they need a particular order in this case, you'll have to define some logic. In general though, probably doesn't matter, so just take either first.

function merge(a, b) {
  let iA = 0;
  let iB = 0;
  let result = [];
  
  while (iA < a.length && iB < b.length) {
    if (a[iA] === b[iB]) {
      result.push(a[iA]);
      iA++;
      iB++;
    } else if (a[iA+1] === b[iB]) { // a had an insert
      result.push(a[iA]); 
      iA++;
    } else if (a[iA] === b[iB+1]) { // b had an insert
      result.push(b[iB]);
      iB++;
    } else { // both received a change, take both
      result.push(a[iA]);
      result.push(b[iB]);
      iA++;
      iB++;
    }
  }
  
  // Catch values at the end
  if (iA < a.length) {
    result = result.concat(a.slice(iA));
  } else if (iB < b.length) {
    result = result.concat(b.slice(iB));
  }
    
  return result;
}

console.log(merge([1, 4, 11, 9, 11], [1, 3, 4, 9, 11])); // different places
console.log(merge([1, 3, 4, 9, 11], [1, 2, 4, 9, 11])); // same places
console.log(merge([1, 4, 9, 11, 13], [1, 3, 4, 9, 11]));

Upvotes: 1

trincot
trincot

Reputation: 350821

I would suggest this function. After finding the first index with a difference, it looks ahead to see whether the next values are consistent with either cause (i.e. an insertion in the first or second array respectively). This may not be evident from the next character, so a loop is needed to get clarity on that.

Once it is clear in which array the insertion took place, all information is there to build the final array:

function merge(s1, s2) {
    let i, j;
    for (i = 0; i < s1.length; i++) {
        if (s1[i] !== s2[i]) break;
    }
    for (j = i; j < s1.length; j++) {
        if (s1[j+1] !== s2[j]) return s2.slice(0, i + 1).concat(s1.slice(i));
        if (s1[j] !== s2[j+1]) return s1.slice(0, i + 1).concat(s2.slice(i));
    }
    // If we get here, both mutations were the "same"
    return s1[i].slice(); // ignore one.
}

const s1 = [1, 4, 11, 9, 11],
      s2 = [1, 3, 4, 9, 11];
console.log(merge(s1, s2));

Solution Not Always Unique

Sometimes there is more than one possible value for s3. For instance, if s1=[1, 2] and s2=[2, 1], there is no way to be sure whether s0=[1] or s0=[2].

If you want to detect such situations and get both solutions, then let the function return an array of solutions:

function merge(s1, s2) {
    let i, j, solutions = [];
    for (i = 0; i < s1.length; i++) {
        if (s1[i] !== s2[i]) break;
    }
    for (j = i; j < s1.length; j++) {
        if (s1[j+1] !== s2[j]) solutions.push(s2.slice(0, i + 1).concat(s1.slice(i)));
        if (s1[j] !== s2[j+1]) solutions.push(s1.slice(0, i + 1).concat(s2.slice(i)));
        if (solutions.length) return solutions;
    }
    // If we get here, both mutations were the "same"
    return [s1[i].slice()]; // ignore one.
}
const s1 = [1, 5, 4], 
      s2 = [1, 3, 4];
console.log(merge(s1, s2)); // two solutions

Upvotes: 3

Related Questions