Richmond Watkins
Richmond Watkins

Reputation: 1370

How to remove both instances of duplicated objects in an array

I have an array of objects. I am trying to find duplicated objects and then remove both instances of that object.

Right now I am using this method:

function checkForDups(data){
    for(var i = 0; i < data.length; i++){
      for(var j = i+1; j < data.length; j++){
        if(data[j].number === data[i].number){
          data.splice(j,1);
          data.splice(i,1);
        }
      }
    }
    return data;
  }

I believe problem is that it only checks for duplicates that have an index that greater than the current position it is checking. This means objects that are "behind" it in the array are not checked for duplication. Currently I run the array through this function a few times to get the desired results. However, this is obviously extremely inefficient. How could I achieve my desired results more efficiently?

Upvotes: 2

Views: 85

Answers (3)

enrico.bacis
enrico.bacis

Reputation: 31484

You could use underscore.js:

function checkForDups(data) {
  return (
    _.map(
      _.filter(
        _.pairs(
          _.countBy(data)
        ), function(v) {return v[1] == 1}
      ), function(v) {return ~~v[0]}
    )
  )
}

Example

>> console.log(checkForDups([0,1,2,3,0,1,0,4]))
[2, 3, 4]

You can try it here.

Upvotes: 0

nrabinowitz
nrabinowitz

Reputation: 55688

Here's an alternate implementation. This uses only two passes through the array, and removes all dupes in cases > 2: http://jsfiddle.net/nrabinowitz/1pdr780j/

function removeDupes(arr, test) {
    test = test || function(a, b) { return a === b; };

    function find(cache, element) {
        return cache.some(test.bind(null, element));
    }

    var seen = [];
    var dupes = [];
    var len = arr.length;
    var x;
    var current;

    // First pass - find dupes
    for (x = 0; x < len; x++) {
        current = arr[x];
        if (find(seen, current)) {
            dupes.push(current);
        } else {
            seen.push(current);
        }
    }

    // Second pass: remove dupes. Reverse iteration saves headaches here
    for (x = len - 1; x >= 0; x--) {
        current = arr[x];
        if (find(dupes, current)) {
            arr.splice(x, 1);
        }
    }
}

This is an updated version that takes an optional test function to determine equality for dupe purposes; for the OP's case, the call would be

removeDupes(arr, function(a, b) {
    return a.number == b.number;
});

Note that this assumes support for ES5 methods - Array#some, Function#bind. If you need to support older browsers, an ES5 shim or the Underscore library would fit the bill.

Upvotes: 0

Bergi
Bergi

Reputation: 664346

I believe problem is that it only checks for duplicates that have an index that greater than the current position it is checking. This means objects that are "behind" it in the array are not checked for duplication.

No, that's a simply optimisation, made possible by the symmetry of the equality relation. By searching ahead and removing all duplicates in front of you, any current item can't be a duplicate of a previous one or it would have been already eliminated.

However, there are some things you did not take care of:

  • When splicing (removing) an item from the array, all subsequent ones are moved, and the array changes its length. To really examine all items in the array, you need decrease (or: not increase) the counter variable when removing, so that the new item which is now in the same spot as the removed one gets visited (the spot from which you just removed needs to get revisited).
  • You probably want to break the inner loop after having found a duplicate, otherwise you compare and remove totally different items.
  • You have not made clear what the algorithm should do when there are more than 2 duplicate items of the same sort in the array. Leave one when their number is odd? Thanks for your comment.
    To remove all existing duplicates, you will need to continue the search, but must not remove the ith element immediately or you won't have anything to compare to furtheron - or you might even remove the ith item multiple times (see #2).

So this modification should fit:

function removeAllDups(data) {
    // leaves only items in the array that appeared a single time
    // removes everything whose .number can be found multiple times
    for (var i = 0; i < data.length; i++) {
        var found = false,
            num = data[i].number;
        for (var j = i+1; j < data.length; j++) {
            if (data[j].number === num) {
                found = true;
                data.splice(j--, 1);
            }
        }
        if (found) {
            data.splice(i--, 1);
        }
    }
    return data;
}

Upvotes: 1

Related Questions