user3663882
user3663882

Reputation: 7367

LEFT OUTER JOIN of two collections

The question is pretty much in the title. I'm looking for an algorithm more efficient than full search through the collections.

I have two collections:

List<Map<TypeId, Object> > col1;
List<Entity> col2;

Where

public enum TypeId{
    PLAYER,
    PARTNER,
    PLATFORM,
    AMOUNT
}

and

public class Entity{
    private int player_id;
    private int platform_id
    private BigDecimal amount;

    //GET, SET
}

The col1 collection which is of the type List<Map<TypeId, Object> > contains only PLAYER, PARTNER, PLATFORM TypeIds.

I need to write a method:

public List<Map<TypeId, Object> > merge(List<Map<TypeId, Object> > col1, List<Entity> col2){
    //Impl
}

Which is going to produce List<Map<TypeId, Object> > each the entry entry of the map contains additional key-value (AMOUNT, AMOUNT's value) where AMOUNT's value is the value of the amount field of the instance e of Entity if e.player_id = entry.get(PLAYER) && e.platform_id = entry.get(PLATFORM) and null otherwise.

In fact, the operation would be the same as

col1 LEFT OUTER JOIN 
col2 ON e.player_id = entry.get(PLAYER) && e.platform_id = entry.get(PLATFORM)

SAMPLE:

col1:
[{PLATFORM: 1, PARTNER: 1, PLAYER: 1},
 {PLATFORM: 1, PARTNER: 3, PLAYER: 1},
 {PLATFORM: 2, PARTNER: 1, PLAYER: 2}
 {PLATFORM: 3, PARTNER: 4, PLAYER: 5}]

col2:
[Entity(platform_id = 1, player_id = 1, amount = 100),
Entity(platform_id = 2, player_id = 2, amount = 200),
Entity(platform_id = 3, player_id = 4, amount = 300)]

result:
[{PLATFORM: 1, PARTNER: 1, PLAYER: 1, AMOUNT: 100},
 {PLATFORM: 1, PARTNER: 3, PLAYER: 1, AMOUNT: 100},
 {PLATFORM: 2, PARTNER: 1, PLAYER: 2, AMOUNT: 200},
 {PLATFORM: 3, PARTNER: 4, PLAYER: 5, AMOUNT: null}]

Upvotes: 8

Views: 8817

Answers (2)

Mick Mnemonic
Mick Mnemonic

Reputation: 7956

The Guava library provides functional idioms that are perfect for these sort of transformations. Here's an example implementation of your method using Guava that does not require changing the method signature:

public List<Map<TypeId, Object>> merge(List<Map<TypeId, Object>> col1,
        List<Entity> col2) {                

    // create a lookup table for getting the amounts
    // based on entities (entity keys)
    final Map<Entity, BigDecimal> entityLookupTable = Maps.toMap(col2,
             new Function<Entity, BigDecimal>() {
                 @Override
                 public BigDecimal apply(Entity entity) {
                     return entity.getAmount();
                 }
    });

    // transform the col1 list using a transform function 
    // that adds the AMOUNT fetched from the lookup table to each entry map
    return Lists.transform(col1, new Function<Map<TypeId, Object>,
                                             Map<TypeId, Object>>() {

        @Override
        public Map<TypeId, Object> apply(Map<TypeId, Object> typeToValueMap) {

                    Entity keyWrapper = new Entity(
                            new EntityKey(
                                (Integer) typeToValueMap.get(TypeId.PLAYER),
                                (Integer) typeToValueMap.get(TypeId.PLATFORM)),
                            null);

                    typeToValueMap.put(TypeId.AMOUNT,
                            entityLookupTable.get(keyWrapper));

                    return typeToValueMap;
                }
            });
}

What is required, however, is to create an EntityKey class that identifies an entity (analogous to primary key in the DB). This class can then be used for implementing equals (and hashCode) in Entity, enabling storing entities in a lookup map.

public class EntityKey {

    private int player_id;
    private int platform_id;

    public EntityKey(int player_id, int platform_id) {
        this.player_id = player_id;
        this.platform_id = platform_id;
    }

    /* Generated by Eclipse */
    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + platform_id;
        result = prime * result + player_id;
        return result;
    }

    /* Generated by Eclipse */
    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        EntityKey other = (EntityKey) obj;
        if (platform_id != other.platform_id)
            return false;
        if (player_id != other.player_id)
            return false;
        return true;
    }        
}

public class Entity {

    private EntityKey key;
    private BigDecimal amount;

    public Entity(EntityKey key, BigDecimal amount) {
        this.key = key;
        this.amount = amount;
    }      

    /* Generated by Eclipse */
    /* Simply delegates to EntityKey */
    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + ((key == null) ? 0 : key.hashCode());
        return result;
    }

    /* Generated by Eclipse */
    /* Simply delegates to EntityKey */
    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        Entity other = (Entity) obj;
        if (key == null) {
            if (other.key != null)
                return false;
        } else if (!key.equals(other.key))
            return false;
        return true;
    }

    /**
     * @return the amount
     */
    public BigDecimal getAmount() {
        return amount;
    }
}

Upvotes: 1

Tagir Valeev
Tagir Valeev

Reputation: 100279

It's easier to make changes in-place, modifying the col1 list instead of creating the new List. Here's Java-8 solution:

public List<Map<TypeId, Object> > merge(List<Map<TypeId, Object> > col1, List<Entity> col2){
    col1.forEach(map -> map.put(TypeId.AMOUNT, 
        col2.stream()
            .filter(e -> e.player_id == (int)map.get(TypeId.PLAYER) && 
                         e.platform_id == (int)map.get(TypeId.PLATFORM))
            .findFirst().map(e -> e.amount).orElse(null)
        ));
    return col1;
}

I suppose that changing the col1 in place is satisfactory in this case. Note that even if you store the result into a new list, it will be useless if you modify the existing maps. Thus to make the result totally independent from the col1, you will have to copy all the maps.

Also note that it's not very effective as for each col1 entry it traverses the col2, so the complexity is roughly col1.size()*col2.size(). It's probably better in your case to throw away an Entity class and create a new one which stores platformId and playerId only (with properly implemented equals and hashCode) and use it as map key:

public static class PlatformAndPlayer {
    private final int playerId, platformId;

    public PlatformAndPlayer(int playerId, int platformId) {
        this.playerId = playerId;
        this.platformId = platformId;
    }

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + platformId;
        result = prime * result + playerId;
        return result;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        PlatformAndPlayer other = (PlatformAndPlayer) obj;
        if (platformId != other.platformId)
            return false;
        if (playerId != other.playerId)
            return false;
        return true;
    }
}

This way instead of col2 list you will have a Map:

Map<PlatformAndPlayer, BigDecimal> col2 = new HashMap<>();
col2.put(new PlatformAndPlayer(1, 1), BigDecimal.valueOf(100));
col2.put(new PlatformAndPlayer(2, 2), BigDecimal.valueOf(200));
col2.put(new PlatformAndPlayer(3, 4), BigDecimal.valueOf(300));

Now your task can be solved easily and effectively (even with Java 5):

public static List<Map<TypeId, Object>> merge(
        List<Map<TypeId, Object>> col1,
        Map<PlatformAndPlayer, BigDecimal> col2) {
    for (Map<TypeId, Object> map : col1) {
        map.put(TypeId.AMOUNT, col2.get(new PlatformAndPlayer(
            (int) map.get(TypeId.PLAYER), (int) map.get(TypeId.PLATFORM))));
    }
    return col1;
}

Upvotes: 3

Related Questions