algorithmic

Reputation: 137

Iterate efficiently through two different Lists with the same type of object (Java 8)

I have two lists, each containing a large number (N) of objects of the same type:

List<Foo> objectsFromDB = {{MailId=100, Status=""}, {MailId=200, Status=""}, {MailId=300, Status=""} ... {MailId=N, Status=N}}

List<Foo> feedBackStatusFromCsvFiles = {{MailId=100, Status="OPENED"}, {MailId=200, Status="CLICKED"}, {MailId=300, Status="HARDBOUNCED"} ... {MailId=N, Status=N}}

Some background: objectsFromDB holds the rows of my database, retrieved by calling a Hibernate method.

feedBackStatusFromCsvFiles is produced by a CSV parser method that unmarshals the rows into Java objects.

My entity class Foo has all the setters and getters, so I know the basic idea is to use a for-each loop like this:

    for (Foo fooDB : objectsFromDB) {
        for (Foo fooStatus : feedBackStatusFromCsvFiles) {
            if (fooDB.getMailId().equals(fooStatus.getMailId())) {
                fooDB.setStatus(fooStatus.getStatus());
            }
        }
    }

As far as my modest junior-developer knowledge goes, I think doing it like this is very bad practice. Should I implement a Comparator and use it to iterate over my lists of objects? Should I also check for null cases?

Thanks to all of you for your answers!

Upvotes: 1

Views: 3491

Answers (4)

gfelisberto

Reputation: 1723

Assuming Java 8, and considering that feedBackStatusFromCsvFiles may contain more than one element with the same ID:

  1. Transform the DB list into a Map that uses the ID as key and a list of elements as value.
  2. Iterate over the CSV list and use the Map to find all matching elements.

The code would be:

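// Requires java.util.Collections, java.util.List, java.util.Map
// and java.util.stream.Collectors.
// Group the DB objects by mail ID so each lookup is a single map access.
final Map<String, List<Foo>> listMap =
        objectsFromDB.stream().collect(
                Collectors.groupingBy(item -> item.getMailId())
        );

// Copy each CSV status onto every DB object with the same mail ID.
for (final Foo feedBackStatus : feedBackStatusFromCsvFiles) {
    listMap.getOrDefault(feedBackStatus.getMailId(), Collections.emptyList())
            .forEach(item -> item.setStatus(feedBackStatus.getStatus()));
}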

Upvotes: 3

holi-java

Reputation: 30696

Your problem is merging each Foo's last status into the database objects, so you can do it in two steps, which makes the code clearer and more readable:

  1. Filter the Foos that need to be merged.
  2. Merge those Foos with their last status.

    // Only the last status per mail ID matters, so a simple Map is enough
    // and there is no need for groupingBy. Assumes a static import of
    // java.util.stream.Collectors.toMap and non-null statuses.
    Map<String, String> lastStatus = feedBackStatusFromCsvFiles.stream()
            .collect(toMap(Foo::getMailId, Foo::getStatus,
                           (previous, current) -> current));
    // Select the Foos from the database that need to be merged.
    Predicate<Foo> fooThatNeedMerge = it -> lastStatus.containsKey(it.getMailId());
    // Copy the last status from the CSV data onto a Foo.
    Consumer<Foo> mergingFoo = it -> it.setStatus(lastStatus.get(it.getMailId()));

    objectsFromDB.stream().filter(fooThatNeedMerge).forEach(mergingFoo);

Upvotes: 1

Michael Hibay

Reputation: 522

Use a Map from the collections framework to avoid the nested loops.

    List<Foo> aList = new ArrayList<>();
    List<Foo> bList = new ArrayList<>();
    for(int i = 0;i<5;i++){
        Foo foo = new Foo();
        foo.setId((long) i);
        foo.setValue("FooA"+String.valueOf(i));
        aList.add(foo);
        foo = new Foo();
        foo.setId((long) i);
        foo.setValue("FooB"+String.valueOf(i));
        bList.add(foo);
    }

    final Map<Long,Foo> bMap = bList.stream().collect(Collectors.toMap(Foo::getId, Function.identity()));

    aList.stream().forEach(it->{
        Foo bFoo = bMap.get(it.getId());
        if( bFoo != null){
            it.setValue(bFoo.getValue());
        }
    });

The only other solution would be to have the DTO layer return a map of MailId -> Foo objects; you could then stream the CSV list and simply look up the matching DB Foo object. Otherwise, the expense of sorting or iterating over both lists is not worth the trade-off in run time. This holds until the map demonstrably causes a memory constraint on the platform; until then, let the garbage collector do its job, and do yours as easily as possible.
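
A minimal sketch of that alternative, under stated assumptions: the reduced Foo class, the Long type of mailId, and the method names findAllByMailId and applyStatuses are hypothetical, and the data-layer method is faked with in-memory objects where a real one would query the database.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MapLookupSketch {

    // Hypothetical stand-in for the entity, reduced to the two fields used here.
    static class Foo {
        private final Long mailId;
        private String status;

        Foo(Long mailId, String status) {
            this.mailId = mailId;
            this.status = status;
        }

        Long getMailId() { return mailId; }
        String getStatus() { return status; }
        void setStatus(String status) { this.status = status; }
    }

    // Hypothetical data-layer method: returns the DB rows already keyed by mailId.
    // Faked here with three in-memory rows (100, 200, 300) with empty statuses.
    static Map<Long, Foo> findAllByMailId() {
        Map<Long, Foo> byMailId = new HashMap<>();
        for (long id = 100; id <= 300; id += 100) {
            byMailId.put(id, new Foo(id, ""));
        }
        return byMailId;
    }

    // Walks the CSV entries once and copies each status onto the matching DB object.
    static void applyStatuses(Map<Long, Foo> dbByMailId, List<Foo> feedBackStatusFromCsvFiles) {
        feedBackStatusFromCsvFiles.forEach(fooStatus -> {
            Foo fooDB = dbByMailId.get(fooStatus.getMailId());
            if (fooDB != null) { // CSV entries without a DB row are skipped
                fooDB.setStatus(fooStatus.getStatus());
            }
        });
    }

    public static void main(String[] args) {
        Map<Long, Foo> db = findAllByMailId();
        List<Foo> csv = new ArrayList<>();
        csv.add(new Foo(100L, "OPENED"));
        csv.add(new Foo(999L, "CLICKED")); // no matching DB row
        applyStatuses(db, csv);
        System.out.println(db.get(100L).getStatus()); // prints OPENED
    }
}
```

With the map built by the data layer, no intermediate map has to be created from a list, so the only pass left is the one over the CSV entries.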

Upvotes: 2

John Bollinger

Reputation: 180968

Given that your lists may contain tens of thousands of elements, you should be concerned that your simple nested-loop approach will be too slow. It will certainly perform far more comparisons than it needs to.
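
To put rough numbers on that, here is a small counting sketch. It assumes two lists of 10,000 ids each; the class and method names are hypothetical, and a hash operation does not cost exactly the same as one equals() call, so the counts only indicate the order of magnitude.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ComparisonCountSketch {

    // Counts the equals() calls made by the nested loops: one per (DB, CSV) pair.
    static long nestedLoopComparisons(List<Long> dbIds, List<Long> csvIds) {
        long comparisons = 0;
        for (Long dbId : dbIds) {
            for (Long csvId : csvIds) {
                comparisons++;
                dbId.equals(csvId);
            }
        }
        return comparisons;
    }

    // Counts the hash operations of an index-based approach:
    // one add() per DB id plus one contains() per CSV id.
    static long hashOperations(List<Long> dbIds, List<Long> csvIds) {
        long operations = 0;
        Set<Long> index = new HashSet<>();
        for (Long dbId : dbIds) {
            index.add(dbId);
            operations++;
        }
        for (Long csvId : csvIds) {
            index.contains(csvId);
            operations++;
        }
        return operations;
    }

    // Builds the ids 0 .. n-1.
    static List<Long> ids(int n) {
        List<Long> result = new ArrayList<>(n);
        for (long i = 0; i < n; i++) {
            result.add(i);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Long> ids = ids(10_000);
        System.out.println(nestedLoopComparisons(ids, ids)); // prints 100000000
        System.out.println(hashOperations(ids, ids));        // prints 20000
    }
}
```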

If memory is comparatively abundant, then the fastest suitable approach would probably be to form a Map from mailId to (list of) corresponding Foo from one of your lists, somewhat as @MichaelH suggested, and to use that to match mailIds. If mailId values are not certain to be unique in one or both lists, however, then you'll need something a bit different than Michael's specific approach. Even if mailIds are sure to be unique within both lists, it will be a bit more efficient to form only one map.

For the most general case, you might do something like this:

// The initial capacity is set (more than) large enough to avoid any rehashing
Map<Long, List<Foo>> dbMap = new HashMap<>(3 * objectsFromDB.size() / 2);

// Populate the map
// This could be done more efficiently if the objects were ordered by mailId,
// which perhaps the DB could be enlisted to ensure.
for (Foo foo : objectsFromDB) {
    Long mailId = foo.getMailId();
    List<Foo> foos = dbMap.get(mailId);

    if (foos == null) {
        foos = new ArrayList<>();
        dbMap.put(mailId, foos);
    }
    foos.add(foo);
}

// Use the map
for (Foo fooStatus: feedBackStatusFromCsvFiles) {
    List<Foo> dbFoos = dbMap.get(fooStatus.getMailId());

    if (dbFoos != null) {
        String status = fooStatus.getStatus();

        // Iterate over only the Foos that we already know have matching Ids
        for (Foo fooDB : dbFoos) {
            fooDB.setStatus(status);
        }
    }
}

On the other hand, if you are space-constrained, so that creating the map is not viable, yet it is acceptable to reorder your two lists, then you should still get a performance improvement by sorting both lists first. Presumably you would use Collections.sort() with an appropriate Comparator for this purpose. Then you would obtain an Iterator over each list, and use them to iterate cooperatively over the two lists. I present no code, but it would be reminiscent of the merge step of a merge sort (but the two lists are not actually merged; you only copy status information from one to the other). But this makes sense only if the mailIds from feedBackStatusFromCsvFiles are all distinct, for otherwise the expected result of the whole task is not well determined.
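
A minimal sketch of that sorted, merge-style traversal. The reduced Foo class, the Long type of mailId, and the method name copyStatuses are assumptions for illustration. It uses list indexes rather than Iterators, which presumes random-access lists such as ArrayList, and it requires non-null mailIds that are distinct within the CSV list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class SortedMergeSketch {

    // Hypothetical stand-in for the entity, reduced to the two fields used here.
    static class Foo {
        private final Long mailId;
        private String status;

        Foo(Long mailId, String status) {
            this.mailId = mailId;
            this.status = status;
        }

        Long getMailId() { return mailId; }
        String getStatus() { return status; }
        void setStatus(String status) { this.status = status; }
    }

    // Sorts both lists in place by mailId, then walks them in lockstep,
    // copying statuses without building any additional collection.
    static void copyStatuses(List<Foo> objectsFromDB, List<Foo> feedBackStatusFromCsvFiles) {
        Comparator<Foo> byMailId = Comparator.comparing(Foo::getMailId);
        Collections.sort(objectsFromDB, byMailId);
        Collections.sort(feedBackStatusFromCsvFiles, byMailId);

        int d = 0; // position in the DB list
        int c = 0; // position in the CSV list
        while (d < objectsFromDB.size() && c < feedBackStatusFromCsvFiles.size()) {
            Foo fooDB = objectsFromDB.get(d);
            Foo fooStatus = feedBackStatusFromCsvFiles.get(c);
            int cmp = fooDB.getMailId().compareTo(fooStatus.getMailId());
            if (cmp < 0) {
                d++; // no CSV entry for this DB object
            } else if (cmp > 0) {
                c++; // no DB object for this CSV entry
            } else {
                fooDB.setStatus(fooStatus.getStatus());
                d++; // stay on the CSV entry: later DB objects may share this mailId
            }
        }
    }

    public static void main(String[] args) {
        List<Foo> db = new ArrayList<>(Arrays.asList(
                new Foo(300L, ""), new Foo(100L, ""), new Foo(200L, "")));
        List<Foo> csv = new ArrayList<>(Arrays.asList(
                new Foo(200L, "CLICKED"), new Foo(100L, "OPENED")));
        copyStatuses(db, csv);
        for (Foo foo : db) {
            System.out.println(foo.getMailId() + " " + foo.getStatus());
        }
    }
}
```

The two sorts dominate the cost at O(N log N) each, and the traversal itself touches every element at most once.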

Upvotes: 1
