TheStranger
TheStranger

Reputation: 1597

How can I get a list containing the differences between two sets?

Let's say I have two sets:

Set<String> set1 = new HashSet<>(Arrays.asList("a", "b", "c", "d", "e") );
Set<String> set2 = new HashSet<>(Arrays.asList("b", "c", "d", "e", "f") );

What is the easiest and best way, performance wise, to compare the two and get a List of the differences? Meaning I should get a list that contains "a" and "f". The tricky thing here is that the differences can occur in either of the list.

I could only make it work with for loops, but there has to be an easier way...

Upvotes: 2

Views: 587

Answers (5)

queeg
queeg

Reputation: 9463

I respond late but think the code is clearer to understand.

Here you can see how to both do union and intersection with Java Sets: https://stackoverflow.com/a/51113135/4222206

Coming back to your question for a difference in both sets:

  • create the union and the intersection
  • subtract the intersection from the union

In code that means:

  Set union = new Set(set1);
  union.addAll(set2);
  Set intersection = new Set(set1);
  intersection.retainAll(set2);
  
  Set result = new Set(union);
  result.removeAll(intersection);

Now in result you have everything that the two sets did not have in common and it should contain [a, f].

Upvotes: 0

The Blind Hawk
The Blind Hawk

Reputation: 1666

something like this?

    public static void main(String[] args) {
        
        Set<String> set1 = new HashSet<>(Arrays.asList("a", "b", "c", "d", "e") );
        Set<String> set2 = new HashSet<>(Arrays.asList("b", "c", "d", "e", "f") );
        //get intersection array
        Set<String> listSame =  new HashSet<>(set2);
        listSame.retainAll(set1);
        System.out.println(listSame);
        //get union array
        Set<String> listDiff =  new HashSet<>(set1);
        listDiff.addAll(set2);
        // get difference
        listDiff.removeAll(listSame);
        System.out.println(listDiff);
    }

The best appreach is probably ro use unions and intersections, then getting the difference.

Upvotes: 0

Georgii Lvov
Georgii Lvov

Reputation: 2743

Using streams it can be done with Collectors.groupingBy and Collectors.counting() to determine the count of occurrences and then filter any string that appears more than once:

List<String> differences = Stream.concat(set1.stream(), set2.stream())
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))  // Map<String, Long> -> key -> element, value -> count of occurrences
                .entrySet().stream()
                .filter(e -> e.getValue() == 1) // filter not unique elements from 2 sets
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());

Upvotes: 2

Thomas
Thomas

Reputation: 88747

One approach to do this is to create a set of one collection, iterate over the other and check if the element is contained while removing it:

Collection<String> c1 = Arrays.asList("a", "b", "c", "d", "e");
Collection<String> c2 = Arrays.asList("b", "c", "d", "e", "f");

Set<String> set = new HashSet<>(c1); //one loop over c1 to create the set - O(n)
List<String> differences = new LinkedList<>();
    
//one loop over c2 - O(m)
for( String e : c2 ) {
   boolean removed = set.remove(e); //should be O(1) due to hashset
   //if the element wasn't removed it was not in the set and thus not in c1
   if( !removed ) { 
       differences.add(e); //should be O(1) due to linked list
   }
}
    
//at this point set only contains elements not in c2 as those have been removed by the loop
differences.addAll(set); //at most O(n) if nothing got removed

As you can see, we have 2 O(n) and 1 O(m) operation so the total time complexity is O(n + m).

Upvotes: 0

Upendra Joshi
Upendra Joshi

Reputation: 19

enter image description here

Please follow the screenshot i have sent I think this is the easiest way using java 8 streams

Upvotes: -4

Related Questions