Pentium10

Reputation: 208042

Efficient intersection of two List<String> in Java?

The question is simple:

I have two lists:

List<String> columnsOld = DBUtils.GetColumns(db, TableName);
List<String> columnsNew = DBUtils.GetColumns(db, TableName);

And I need to get the intersection of these. Is there a quick way to achieve this?

Upvotes: 83

Views: 121172

Answers (9)

Mišo Stankay

Reputation: 339

With the Java 8 Stream API (and Java 9's List.of()) you can do the following:

List<Integer> list1 = List.of(1, 1, 2, 2);
List<Integer> list2 = List.of(2, 2, 3, 3);

List<Integer> intersection = list1.stream()
    .filter(list2::contains)
    .distinct()
    .collect(Collectors.toList()); 
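
Note that list2::contains scans list2 once for every element of list1. A minimal sketch of the same pipeline with the lookup moved into a HashSet, assuming the elements implement equals and hashCode consistently (the class and method names are made up for illustration):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class StreamIntersection {
    // Returns the distinct elements of list1 that also occur in list2,
    // in the order they first appear in list1.
    static <T> List<T> intersect(List<T> list1, List<T> list2) {
        Set<T> lookup = new HashSet<>(list2); // built once; lookups are O(1) on average
        return list1.stream()
                .filter(lookup::contains)
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> list1 = Arrays.asList(1, 1, 2, 2);
        List<Integer> list2 = Arrays.asList(2, 2, 3, 3);
        System.out.println(intersect(list1, list2)); // [2]
    }
}
```

For short lists the difference will hardly be measurable; the set only pays off once list2 gets large.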

Upvotes: 0

user_3380739

Reputation: 1284

Use retainAll if you don't care about the number of occurrences; otherwise use N.intersection:

a = N.asList(12, 16, 16, 17, 19);
b = N.asList(16, 19, 107);
a.retainAll(b); // [16, 16, 19]
N.println(a);

a = N.asList(12, 16, 16, 17, 19);
b = N.asList(16, 19, 107);
a = N.intersect(a, b);
N.println(a); // [16, 19]

N is a utility class in abacus-common.
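
If you would rather not add a dependency, the occurrence-aware behaviour can be approximated with the plain JDK. This is only a sketch of the semantics shown in the two outputs above, not how abacus-common implements it; the class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MultisetIntersection {
    // Keeps each element of a at most as many times as it occurs in b.
    static <T> List<T> intersect(List<T> a, List<T> b) {
        Map<T, Integer> remaining = new HashMap<>();
        for (T item : b) {
            remaining.merge(item, 1, Integer::sum); // count occurrences in b
        }
        List<T> result = new ArrayList<>();
        for (T item : a) {
            Integer count = remaining.get(item);
            if (count != null && count > 0) {
                result.add(item);
                remaining.put(item, count - 1); // consume one occurrence
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> a = new ArrayList<>(Arrays.asList(12, 16, 16, 17, 19));
        List<Integer> b = Arrays.asList(16, 19, 107);

        System.out.println(intersect(a, b)); // [16, 19]

        a.retainAll(b); // keeps every element of a that b contains
        System.out.println(a); // [16, 16, 19]
    }
}
```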

Upvotes: 4

Serhii Shevchyk

Reputation: 39456

Using Google's Guava library:

Sets.intersection(Sets.newHashSet(setA), Sets.newHashSet(setB))

Note: This is much more efficient than naively doing the intersection with two lists: it's O(n+m), versus O(n×m) for the list version. With two million-item lists it's the difference between millions of operations and trillions of operations.
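
The same O(n+m) idea also works without Guava. A rough JDK-only sketch (names are illustrative); unlike Sets.intersection, which returns a view over the two sets, this builds a new set:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SetIntersection {
    // Returns the elements common to both lists, without duplicates,
    // in the order they first appear in the first list.
    static <T> Set<T> intersect(List<T> first, List<T> second) {
        Set<T> result = new LinkedHashSet<>(first); // one pass over first
        result.retainAll(new HashSet<>(second));    // one pass over second, then hash lookups
        return result;
    }

    public static void main(String[] args) {
        List<String> columnsOld = Arrays.asList("id", "name", "age");
        List<String> columnsNew = Arrays.asList("age", "id", "email");
        System.out.println(intersect(columnsOld, columnsNew)); // [id, age]
    }
}
```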

Upvotes: 30

Dheeraj Sachan

Reputation: 4133

Use org.apache.commons.collections4.ListUtils#intersection.

Upvotes: 1

Ravi Sanwal

Reputation: 634

If you put the second list into a set, say a HashSet, and then iterate over the first list, removing every element that is not present in the set, the first list will end up holding the intersection you need. This will be much faster than calling retainAll or contains against a list, because set lookups are O(1) on average. The emphasis here is on using a set instead of a list. firstList.retainAll(new HashSet<>(secondList)) will also work.
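
The iterate-and-remove step can be written with Collection.removeIf (Java 8+). A small sketch, assuming firstList is modifiable; the class and method names are only for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class InPlaceIntersection {
    // Removes from firstList every element that secondList does not contain.
    // firstList is modified in place; secondList is left untouched.
    static <T> void retainCommon(List<T> firstList, List<T> secondList) {
        Set<T> lookup = new HashSet<>(secondList);
        firstList.removeIf(item -> !lookup.contains(item));
    }

    public static void main(String[] args) {
        List<String> first = new ArrayList<>(Arrays.asList("id", "name", "age", "name"));
        List<String> second = Arrays.asList("name", "email");
        retainCommon(first, second);
        System.out.println(first); // [name, name]
    }
}
```

Duplicates in the first list survive here, just as they do with retainAll.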

Upvotes: 3

Deutro

Reputation: 3333

There is a nice way to do this with streams in one line of code, and it also works for two lists that are not of the same type, which is not possible with the containsAll method, afaik:

columnsOld.stream().filter(c -> columnsNew.contains(c)).collect(Collectors.toList());

An example for lists of different types: if there is a relation between foo and bar, and you can get a bar object from a foo, then you can modify your stream:

List<foo> fooList = new ArrayList<>(Arrays.asList(new foo(), new foo()));
List<bar> barList = new ArrayList<>(Arrays.asList(new bar(), new bar()));

fooList.stream().filter(f -> barList.contains(f.getBar())).collect(Collectors.toList());
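
For this to match anything, bar needs a meaningful equals (otherwise contains falls back to identity). A self-contained sketch with hypothetical Foo and Bar classes, where a Bar is compared by its name:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

class CrossTypeIntersection {
    // Hypothetical value type; equals/hashCode let List.contains match by name.
    static final class Bar {
        final String name;
        Bar(String name) { this.name = name; }
        @Override public boolean equals(Object o) {
            return o instanceof Bar && ((Bar) o).name.equals(name);
        }
        @Override public int hashCode() { return Objects.hash(name); }
    }

    // Hypothetical type that exposes its related Bar.
    static final class Foo {
        final int id;
        final Bar bar;
        Foo(int id, Bar bar) { this.id = id; this.bar = bar; }
        Bar getBar() { return bar; }
    }

    // Keeps the Foo objects whose Bar appears in barList.
    static List<Foo> matching(List<Foo> fooList, List<Bar> barList) {
        return fooList.stream()
                .filter(f -> barList.contains(f.getBar()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Foo> fooList = Arrays.asList(new Foo(1, new Bar("a")), new Foo(2, new Bar("b")));
        List<Bar> barList = Arrays.asList(new Bar("b"), new Bar("c"));
        for (Foo f : matching(fooList, barList)) {
            System.out.println(f.id); // 2
        }
    }
}
```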

Upvotes: 3

Roman

Reputation: 66226

You can use the retainAll method:

columnsOld.retainAll(columnsNew);
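
Keep in mind that retainAll modifies the list it is called on. If columnsOld is still needed afterwards, one option is to run it on a copy. A short sketch with made-up column names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RetainAllCopy {
    // Returns the elements of columnsOld that columnsNew also contains,
    // leaving both input lists unchanged.
    static List<String> intersection(List<String> columnsOld, List<String> columnsNew) {
        List<String> result = new ArrayList<>(columnsOld); // copy so columnsOld stays intact
        result.retainAll(columnsNew);
        return result;
    }

    public static void main(String[] args) {
        List<String> columnsOld = Arrays.asList("id", "name", "age");
        List<String> columnsNew = Arrays.asList("name", "age", "email");
        System.out.println(intersection(columnsOld, columnsNew)); // [name, age]
        System.out.println(columnsOld);                           // [id, name, age]
    }
}
```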

Upvotes: 131

Gigas

Reputation: 97

How about

private List<String> intersect(List<String> A, List<String> B) {
    List<String> rtnList = new LinkedList<>();
    for(String dto : A) {
        if(B.contains(dto)) {
            rtnList.add(dto);
        }
    }
    return rtnList;
}

Upvotes: 8

bjornhol

Reputation: 550

Since retainAll won't touch the argument collection, this would be faster:

List<String> columnsOld = DBUtils.GetColumns(db, TableName); 
List<String> columnsNew = DBUtils.GetColumns(db, TableName); 

for(int i = columnsNew.size() - 1; i > -1; --i){
    String str = columnsNew.get(i);
    if(!columnsOld.remove(str))
        columnsNew.remove(str);
}

The intersection will be the values left in columnsNew. Removing already-compared values from columnsOld reduces the number of comparisons needed. Note that this modifies both lists.

Upvotes: 20
