Reputation: 55
I have two lists, let's call them list A and list B. Both of these lists contain names and there are no duplicates (they are unique values). Every name in list B can be found in list A. I want to find which names are missing from list B in order to insert those missing names into a database. Basic example:
List<String> a = new ArrayList<>(Arrays.asList("name1", "name2", "name3", "name4","name5","name6"));
List<String> b = new ArrayList<>(Arrays.asList("name1", "name2", "name4", "name6"));
a.removeAll(b);
//iterate through and insert into my database here
From what I've searched removeAll()
seems to be a go-to answer. In my case, I am dealing with a wide range of possible quantities. It could be anywhere between 500 to 50,000 names. Will removeAll()
suffice for this? I've read that removeAll()
is O(n^2) which may not be a problem with very small quantities, but with larger quantities, it sounds like it could be. I'd imagine it also depends on the user's patience as to when it would be considered a problem? Ultimately I'm wondering if there is a better way to do this without adding a huge amount of complexity as I do appreciate simplicity (to a point).
Upvotes: 2
Views: 214
Reputation: 64845
50000 is a very small amount of data. Unless you're doing this repeatedly, anything reasonable would likely be good enough.
One way to implement this:
List<String> a = new ArrayList<>(Arrays.asList("name1", "name2", "name3", "name4","name5","name6"));
List<String> b = new HashSet<>(Arrays.asList("name1", "name2", "name4", "name6"));
for (String s : a) {
if (!b.contains(s)) {
insertToDb(s);
}
}
or using Stream API in Java 8:
List result = a.stream().filter(s -> !b.contains(s)).collect(Collectors.toList());
// alternatively: Set result = a.stream().filter(s -> !b.contains(s)).collect(Collectors.toSet());
Upvotes: 0
Reputation: 311458
If the only thing you're doing with these lists is inserting them into a database, you shouldn't really care about the order of the elements. You could use HashSet
s instead of ArrayList
s and get an O(n) performance instead of O(n2). As a side bonus, using a Set
will ensure the values in a
and b
are really unique.
Upvotes: 1