MikeB

Reputation: 788

Comparing sets in Java; operation does not work as expected in both directions

I have created two sets of different sizes with some elements in common, and I want to identify the elements in each set that are not in the other. The Set interface in Java seemed to have a suitable method, removeAll, so I did the following:

import java.util.*;

public class HelloWorld {

    @SuppressWarnings("unchecked")
    public static void main(String args[]) {

        // Create a new set
        Set<String> mySet1 = new HashSet<>();

        // Add elements
        mySet1.add("1");
        mySet1.add("2");
        mySet1.add("4");
        mySet1.add("5");
        mySet1.add("6");
        mySet1.add("7");

        // Print the elements of the Set1
        System.out.println("mySet1: " + mySet1);

        // Create a new set
        Set<String> mySet2 = new HashSet<>();

        // Add elements
        mySet2.add("1");
        mySet2.add("2");
        mySet2.add("3");
        mySet2.add("5");
        mySet2.add("6");
        mySet2.add("7");
        mySet2.add("8");

        System.out.println("mySet2: " + mySet2);

        // Compare the two sets
        System.out.println("MySet1 matches mySet2: " + mySet1.equals(mySet2));

        // Remove all elements of mySet2 from mySet1
        Set<String> deletions = mySet1;
        deletions.removeAll(mySet2);
        System.out.println("deletions: " + deletions);

        // Remove all elements of mySet1 from mySet2
        Set<String> updates = mySet2;
        updates.removeAll(mySet1);
        System.out.println("updates: " + updates);
    }
}

The result is:

mySet1: [1, 2, 4, 5, 6, 7]
mySet2: [1, 2, 3, 5, 6, 7, 8]
MySet1 matches mySet2: false
deletions: [4]
updates: [1, 2, 3, 5, 6, 7, 8]

Why isn't the result for 'updates' [3,8]?

Upvotes: 1

Views: 193

Answers (5)

Makoto

Reputation: 106440

The operation that you're performing here is called symmetric difference. You're looking for the union of elements that are in A - B and B - A (with set subtraction).

Your code mutates your original sets, because deletions and updates are just additional references to mySet1 and mySet2. It should be a simple matter of creating new Sets to ensure that the operations don't mutate anything.

// Remove all elements of mySet2 from mySet1
Set<String> deletions = new HashSet<>(mySet1);
deletions.removeAll(mySet2);
System.out.println("deletions: " + deletions);

// Remove all elements of mySet1 from mySet2
Set<String> updates = new HashSet<>(mySet2);
updates.removeAll(mySet1);
System.out.println("updates: " + updates);

Alternatively, Google Guava provides a Sets utility which has a symmetricDifference method:

// prints the numbers 4, 3, and 8 in no guaranteed order
System.out.println(Sets.symmetricDifference(mySet1, mySet2));
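If adding Guava is not an option, the same result can be sketched with the JDK alone by combining the two set subtractions described above. The class name SymmetricDifferenceDemo and the helper symmetricDifference are hypothetical names chosen for illustration, and a TreeSet is used only so that the printed order is predictable.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class SymmetricDifferenceDemo {

    // Hypothetical helper: returns a new sorted set holding the elements
    // that are in exactly one of a or b. Neither argument is modified.
    static <T extends Comparable<T>> Set<T> symmetricDifference(Set<T> a, Set<T> b) {
        Set<T> onlyInA = new TreeSet<>(a);   // copy of a
        onlyInA.removeAll(b);                // A - B
        Set<T> onlyInB = new TreeSet<>(b);   // copy of b
        onlyInB.removeAll(a);                // B - A
        onlyInA.addAll(onlyInB);             // (A - B) union (B - A)
        return onlyInA;
    }

    public static void main(String[] args) {
        Set<String> mySet1 = new HashSet<>(Arrays.asList("1", "2", "4", "5", "6", "7"));
        Set<String> mySet2 = new HashSet<>(Arrays.asList("1", "2", "3", "5", "6", "7", "8"));

        // prints [3, 4, 8] (sorted because the result is a TreeSet)
        System.out.println(symmetricDifference(mySet1, mySet2));
    }
}
```

Note that this merges both differences into one set; if you need deletions and updates separately, keep the two copies apart as in the snippet above.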

Upvotes: 2

Fayaz

Reputation: 471

With the following code:

Set<String> deletions = mySet1;

You have created one more reference to the same set. Both deletions and mySet1 are now two variables that point to the same set. With the following line, you have changed the contents of this set.

deletions.removeAll(mySet2);

Effectively, mySet1 now contains only the elements that deletions contains, since both of them are just references that point to the same set. That's why you got the unexpected result with the following code:

updates.removeAll(mySet1);

You need to create a separate copy of each set so that you get the results you expect. You can do this as follows:

Set<String> deletions = new HashSet<>(mySet1);

Upvotes: 0

Pallavi Sonal

Reputation: 3901

In the line Set<String> deletions = mySet1; you have assigned the mySet1 object to the deletions reference, so both deletions and mySet1 now point to the same object. Later you removed all elements of deletions that are contained in mySet2, which leaves deletions with just one element, 4. Since mySet1 and deletions point to the same object, mySet1 also has only that one element left. When you then try to remove all elements of mySet1 from mySet2, it only tries to remove the element 4 from mySet2, which mySet2 does not contain. Hence the output.
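The steps described above can be replayed with a few extra print statements to make the shared reference visible. This is only an illustrative sketch: the class name AliasTrace and the helper replayQuestion are hypothetical, and TreeSet replaces HashSet so that the printed order is predictable.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

public class AliasTrace {

    // Hypothetical helper: repeats the question's steps and returns
    // the set that "updates" ends up referring to.
    static Set<String> replayQuestion() {
        Set<String> mySet1 = new TreeSet<>(Arrays.asList("1", "2", "4", "5", "6", "7"));
        Set<String> mySet2 = new TreeSet<>(Arrays.asList("1", "2", "3", "5", "6", "7", "8"));

        Set<String> deletions = mySet1;   // second reference, not a copy
        System.out.println("same object? " + (deletions == mySet1));   // same object? true

        deletions.removeAll(mySet2);      // this also empties mySet1 down to "4"
        System.out.println("mySet1 after removeAll: " + mySet1);       // mySet1 after removeAll: [4]

        Set<String> updates = mySet2;     // again only a second reference
        updates.removeAll(mySet1);        // mySet1 is [4], so nothing is removed
        return updates;
    }

    public static void main(String[] args) {
        System.out.println("updates: " + replayQuestion());   // updates: [1, 2, 3, 5, 6, 7, 8]
    }
}
```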

Upvotes: 1

Eran

Reputation: 393851

deletions refers to the same Set as mySet1 due to the Set<String> deletions = mySet1 assignment. Therefore deletions.removeAll removed elements from the original Set mySet1, so the second removeAll received a Set that contains just "4".

You should create copies of the original Sets so that mySet1 and mySet2 are not mutated:

Set<String> deletions = new HashSet<>(mySet1); // create a copy of mySet1
deletions.removeAll(mySet2);
System.out.println("deletions: " + deletions);

// Remove all elements of mySet1 from mySet2
Set<String> updates = new HashSet<>(mySet2); // create a copy of mySet2
updates.removeAll(mySet1);
System.out.println("updates: " + updates);

Upvotes: 3

khelwood

Reputation: 59112

Set<String> deletions = mySet1;
deletions.removeAll(mySet2);

You have just removed all of mySet2 from mySet1. Assigning another variable to an object does not copy the object. You can copy it easily enough with HashSet's constructor:

Set<String> deletions = new HashSet<String>(mySet1);
deletions.removeAll(mySet2);
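If the copy-then-remove pattern is needed in more than one place, it could be wrapped in a small helper so that the originals are never touched by accident. The class name CopyThenRemove and the method difference are hypothetical names used for this sketch; they are not part of the JDK.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class CopyThenRemove {

    // Hypothetical helper: elements of a that are not in b.
    // Works on a copy, so a and b are left unchanged.
    static <T> Set<T> difference(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a);   // independent copy of a
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<String> mySet1 = new HashSet<>(Arrays.asList("1", "2", "4", "5", "6", "7"));
        Set<String> mySet2 = new HashSet<>(Arrays.asList("1", "2", "3", "5", "6", "7", "8"));

        Set<String> deletions = difference(mySet1, mySet2);   // contains only "4"
        Set<String> updates = difference(mySet2, mySet1);     // contains "3" and "8"

        System.out.println("deletions: " + deletions);
        System.out.println("updates: " + updates);            // HashSet order is not guaranteed
        System.out.println("mySet1 still has " + mySet1.size() + " elements");   // mySet1 still has 6 elements
    }
}
```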

Upvotes: 2
