jacob

Reputation: 11

Join two lists of Java objects using streams

I have two lists of two different classes, where id and months are common to both:

public class NamePropeties{
    private String id;
    private Integer name;
    private Integer months;
}


public class NameEntries {
    private String id;
    private Integer retailId;
    private Integer months;
}

List<NamePropeties> namePropetiesList = new ArrayList<>();
List<NameEntries> nameEntriesList = new ArrayList<>();

Now I want to JOIN the two lists (as SQL does with JOIN ON months and id across two result sets) and return a new list containing the data where months and id are the same in both lists.

If I iterate over just one list and check each element against the other, I run into problems because the lists can differ in size.

I have tried several approaches; is there a way to do this with streams?

Upvotes: 1

Views: 1441

Answers (2)

Chop Suey

Reputation: 250

If performance and nesting (as discussed) are not too much of a concern, you could use something along the lines of a cross join with filtering:

Result holder class:

public class Tuple<A, B> {
    public final A a;
    public final B b;

    public Tuple(A a, B b) {
        this.a = a;
        this.b = b;
    }
}

Join with a predicate:

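import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public static <A, B> List<Tuple<A, B>> joinOn(
    List<A> l1,
    List<B> l2,
    Predicate<Tuple<A, B>> predicate) {
    return l1.stream()
        // pair every element of l1 with every element of l2 (cross join)
        .flatMap(a -> l2.stream().map(b -> new Tuple<>(a, b)))
        // keep only the pairs that satisfy the join condition
        .filter(predicate)
        .collect(Collectors.toList());
}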

Call it like this (the fields are private, so this assumes the classes expose getters):

List<Tuple<NamePropeties, NameEntries>> joined = joinOn(
    properties,
    names,
    t -> Objects.equals(t.a.getId(), t.b.getId())
        && Objects.equals(t.a.getMonths(), t.b.getMonths())
);
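To make the cross-join idea concrete, here is a minimal self-contained sketch with sample data. The record names (PropsRow, EntryRow, Pair) and the sample values are hypothetical stand-ins for the question's classes, and records require Java 16+. Unlike the method above, this variant takes a BiPredicate and filters before creating the pair, which avoids allocating a holder for every non-matching combination; it is still quadratic in the list sizes.

```java
import java.util.List;
import java.util.Objects;
import java.util.function.BiPredicate;
import java.util.stream.Collectors;

class CrossJoinDemo {
    // Hypothetical simplified stand-ins for NamePropeties and NameEntries.
    record PropsRow(String id, Integer months) {}
    record EntryRow(String id, Integer months) {}
    record Pair<A, B>(A a, B b) {}

    // Cross join with the join condition applied before the pair is created.
    static <A, B> List<Pair<A, B>> joinOn(List<A> l1, List<B> l2, BiPredicate<A, B> on) {
        return l1.stream()
            .flatMap(a -> l2.stream()
                .filter(b -> on.test(a, b))
                .map(b -> new Pair<>(a, b)))
            .collect(Collectors.toList());
    }

    static List<Pair<PropsRow, EntryRow>> demo() {
        List<PropsRow> props = List.of(
            new PropsRow("a", 1), new PropsRow("b", 2), new PropsRow("c", 3));
        List<EntryRow> entries = List.of(
            new EntryRow("a", 1), new EntryRow("b", 9), new EntryRow("c", 3));
        return joinOn(props, entries,
            (p, e) -> Objects.equals(p.id(), e.id())
                && Objects.equals(p.months(), e.months()));
    }

    public static void main(String[] args) {
        // ("a", 1) and ("c", 3) match on both columns; "b" differs in months.
        System.out.println(demo().size()); // prints 2
    }
}
```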

Upvotes: 1

fps

Reputation: 34460

The general idea has been sketched in the comments: iterate one list, create a map whose keys are the attributes you want to join by, then iterate the other list and check if there's an entry in the map. If there is, get the value from the map and create a new object from the value of the map and the actual element of the list.

It's better to build the map from the list with the larger number of distinct join keys. Why? Because a map lookup is O(1) on average, regardless of the size of the map. So, if you build the map from that list, the second list you iterate (the smaller one) triggers fewer lookups. Building the map still takes one pass over the larger list, so the total work is linear in the combined size of both lists.

Putting all this in code:

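import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

public static <B, S, J, R> List<R> join(
    List<B> bigger,
    List<S> smaller,
    Function<B, J> biggerKeyExtractor,
    Function<S, J> smallerKeyExtractor,
    BiFunction<B, S, R> joiner) {

    // Build a multimap: join key -> all elements of the bigger list with that key
    Map<J, List<B>> map = new LinkedHashMap<>();
    bigger.forEach(b ->
        map.computeIfAbsent(
                biggerKeyExtractor.apply(b),
                k -> new ArrayList<>())
            .add(b));

    // Probe the map with each element of the smaller list
    List<R> result = new ArrayList<>();
    smaller.forEach(s -> {
        J key = smallerKeyExtractor.apply(s);
        List<B> bs = map.get(key);
        if (bs != null) {
            bs.forEach(b -> {
                R r = joiner.apply(b, s);
                result.add(r);
            });
        }
    });

    return result;
}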

This is a generic method that joins bigger List<B> and smaller List<S> by J join keys (in your case, as the join key is a composite of String and Integer types, J will be List<Object>). It takes care of duplicates and returns a result List<R>. The method receives both lists, functions that will extract the join keys from each list and a joiner function that will create new result R elements from joined B and S elements.

Note that the map is actually a multimap. This is because there might be duplicates as per the biggerKeyExtractor join function. We use Map.computeIfAbsent to create this multimap.
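Since the question asks for a stream-based way, the same build-then-probe steps could also be sketched with streams, using Collectors.groupingBy to build the multimap instead of Map.computeIfAbsent. This is only an illustration under assumptions: the record types (PropsRow, EntryRow, Key, Joined) are hypothetical simplified stand-ins, a dedicated Key record replaces the List<Object> key, and records require Java 16+.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class HashJoinSketch {
    // Hypothetical simplified stand-ins for the question's classes.
    record PropsRow(String id, Integer months, Integer name) {}
    record EntryRow(String id, Integer months, Integer retailId) {}
    // Composite join key; records get equals and hashCode over all components.
    record Key(String id, Integer months) {}
    record Joined(PropsRow props, EntryRow entry) {}

    static List<Joined> join(List<PropsRow> props, List<EntryRow> entries) {
        // Build phase: group one list by join key (a multimap, as duplicates are possible).
        Map<Key, List<PropsRow>> index = props.stream()
            .collect(Collectors.groupingBy(p -> new Key(p.id(), p.months())));

        // Probe phase: one hash lookup per element of the other list.
        return entries.stream()
            .flatMap(e -> index.getOrDefault(new Key(e.id(), e.months()), List.of())
                .stream()
                .map(p -> new Joined(p, e)))
            .collect(Collectors.toList());
    }
}
```

The result follows the order of the probed list, and every duplicate on the build side yields its own joined element, as in the loop-based method.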

You should create a class like this to store joined results:

public class JoinedResult {

    private final NameProperties properties;
    private final NameEntries entries;

    public JoinedResult(NameProperties properties, NameEntries entries) {
        this.properties = properties;
        this.entries = entries;
    }

    // TODO getters
}

Or, if you are on Java 16+ (records were only a preview feature in Java 14 and 15), you might just use a record:

public record JoinedResult(NameProperties properties, NameEntries entries) { }

Or actually, any Pair class from out there will do, or you could even use Map.Entry.
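As a rough illustration of the Map.Entry option, a joiner could be written as below. The class and method names (EntryAsPair, pairJoiner) are hypothetical. AbstractMap.SimpleImmutableEntry is used because it accepts null components, whereas Map.entry (Java 9+) throws NullPointerException on a null key or value.

```java
import java.util.AbstractMap;
import java.util.Map;
import java.util.function.BiFunction;

class EntryAsPair {
    // Joiner that pairs two joined elements without a dedicated result class.
    // The "key" of the entry is the element from the first list, the "value" the other one.
    static <B, S> BiFunction<B, S, Map.Entry<B, S>> pairJoiner() {
        return (b, s) -> new AbstractMap.SimpleImmutableEntry<>(b, s);
    }
}
```

Such a function could be passed as the joiner argument of a generic join method, at the cost of less descriptive accessors (getKey and getValue) than a purpose-built result class.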

With the result class (or record) in place, you should call the join method this way:

long propertiesSize = namePropertiesList.stream()
    .map(p -> Arrays.asList(p.getMonths(), p.getId()))
    .distinct()
    .count();
long entriesSize = nameEntriesList.stream()
    .map(e -> Arrays.asList(e.getMonths(), e.getId()))
    .distinct()
    .count();

List<JoinedResult> result = propertiesSize > entriesSize ? 
    join(namePropertiesList, 
         nameEntriesList, 
         p -> Arrays.asList(p.getMonths(), p.getId()),
         e -> Arrays.asList(e.getMonths(), e.getId()),
         JoinedResult::new)                                    :
    join(nameEntriesList, 
         namePropertiesList, 
         e -> Arrays.asList(e.getMonths(), e.getId()),
         p -> Arrays.asList(p.getMonths(), p.getId()),
         (e, p) -> new JoinedResult(p, e));

The key is to use generics and call the join method with the right arguments (they are flipped, as per the join keys size comparison).

Note 1: we can use List<Object> as the key of the map, because all Java lists implement equals and hashCode consistently (thus they can safely be used as map keys)
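A small sketch of why list keys work: two distinct list instances with equal elements in the same order are equal and hash alike, so they address the same map entry. The class name and sample values are hypothetical. One caveat worth keeping in mind is that a list used as a key should not be modified afterwards, because its hash code would change.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ListKeyDemo {
    static Map<List<Object>, String> sample() {
        Map<List<Object>, String> map = new HashMap<>();
        map.put(Arrays.asList(3, "a"), "months=3, id=a");
        return map;
    }

    public static void main(String[] args) {
        Map<List<Object>, String> map = sample();
        // New list instance, equal elements in the same order: same key.
        System.out.println(map.get(Arrays.asList(3, "a"))); // prints months=3, id=a
        // Same elements in a different order: a different key.
        System.out.println(map.get(Arrays.asList("a", 3))); // prints null
    }
}
```

Because element order matters, both key extractors must list the components in the same order.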

Note 2: if you are on Java 9+, you can use List.of instead of Arrays.asList, provided none of the key components can be null (List.of throws NullPointerException on null elements)

Note 3: I haven't checked for null or invalid arguments

Note 4: there is room for improvement, e.g. key extractor functions could be memoized, join keys could be reused instead of being computed more than once, and the multimap could hold plain Object values for single elements and lists only for duplicates, etc.
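One difference between the two factories matters for join keys, since id and months are object types that may be null. A minimal sketch, with hypothetical class and method names:

```java
import java.util.Arrays;
import java.util.List;

class NullKeyDemo {
    // Arrays.asList wraps the given elements as they are, so null components are kept.
    static List<Object> withAsList(Integer months, String id) {
        return Arrays.asList(months, id);
    }

    // List.of (Java 9+) throws NullPointerException if any element is null.
    static List<Object> withListOf(Integer months, String id) {
        return List.of(months, id);
    }
}
```

For non-null components the two produce equal lists, so they are interchangeable as map keys; with a null id or months, only the Arrays.asList variant yields a key.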

Upvotes: 1
