TestingTheTest

Reputation: 67

Groovy: Remove duplicates from a list of maps by multiple values

Given a list of maps like this:

def listOfMaps = [
    ["car": "A", "color": "A", "motor": "A", "anything": "meh"],
    ["car": "A", "color": "A", "motor": "A", "anything": "doesn't matter"],
    ["car": "A", "color": "A", "motor": "B", "anything": "Anything"],
    ["car": "A", "color": "B", "motor": "A", "anything": "Anything"]
]

How can I detect duplicates by car, color and motor? If more than one map has the same car, color and motor values, the check should return true. In this case it should return true, because the first and second maps have the same car, color and motor values. The values themselves can be anything, as long as they are equal.

Upvotes: 2

Views: 4734

Answers (3)

Szymon Stepniak

Reputation: 42184

Groovy has a handy Collection.unique(boolean, Closure) method. When the first argument is false, it returns a new list with the duplicates removed, using the comparator defined in the closure. In your case, the closure can compare the car field first, then color, and finally motor. Any element whose values for all three fields match those of an earlier element is filtered out.

Consider the following example:

def listOfMaps = [
    ["car": "A", "color": "A", "motor": "A", "anything": "meh"],
    ["car": "A", "color": "A", "motor": "A", "anything": "doesn't matter"],
    ["car": "A", "color": "A", "motor": "B", "anything": "Anything"],
    ["car": "A", "color": "B", "motor": "A", "anything": "Anything"]
]

// false parameter below means that the input list is not modified
def filtered = listOfMaps.unique(false) { a, b ->
    a.car <=> b.car ?:
        a.color <=> b.color ?:
        a.motor <=> b.motor
}

println filtered

boolean hasDuplicates = listOfMaps.size() > filtered.size()

assert hasDuplicates

Output:

[[car:A, color:A, motor:A, anything:meh], [car:A, color:A, motor:B, anything:Anything], [car:A, color:B, motor:A, anything:Anything]]
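If the set of fields varies, the same comparator idea can be wrapped in a small helper. This is only a sketch: the name hasDuplicatesBy and its parameters are made up for illustration, and it assumes the compared values are mutually comparable (strings here).

```groovy
// Hypothetical helper: true if at least two maps agree on all the given keys.
boolean hasDuplicatesBy(List<Map> maps, List<String> keys) {
    // unique(false, ...) leaves the input list untouched and returns a new one
    def distinct = maps.unique(false) { a, b ->
        // Compare key by key; the first non-zero result decides the ordering
        for (key in keys) {
            int result = a[key] <=> b[key]
            if (result != 0) {
                return result
            }
        }
        return 0 // all keys equal, so b is treated as a duplicate of a
    }
    distinct.size() < maps.size()
}

def listOfMaps = [
    ["car": "A", "color": "A", "motor": "A", "anything": "meh"],
    ["car": "A", "color": "A", "motor": "A", "anything": "doesn't matter"],
    ["car": "A", "color": "A", "motor": "B", "anything": "Anything"],
    ["car": "A", "color": "B", "motor": "A", "anything": "Anything"]
]

assert hasDuplicatesBy(listOfMaps, ['car', 'color', 'motor'])
```

As far as I know, unique compares each element against the ones already kept, so this is fine for small lists but may get slow on large ones.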

Upvotes: 6

Mafor

Reputation: 10681

You can group the maps by the relevant fields and then check whether at least one group has more than one element:

boolean result = listOfMaps
         .groupBy { [car: it.car, color: it.color, motor: it.motor] }
         .any { it.value.size() > 1 }
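For completeness, a self-contained variant of the same grouping approach, run against the sample data from the question. It uses subMap to build the group key instead of spelling out each field; the variable names keys, groups and duplicated are only illustrative.

```groovy
def listOfMaps = [
    ["car": "A", "color": "A", "motor": "A", "anything": "meh"],
    ["car": "A", "color": "A", "motor": "A", "anything": "doesn't matter"],
    ["car": "A", "color": "A", "motor": "B", "anything": "Anything"],
    ["car": "A", "color": "B", "motor": "A", "anything": "Anything"]
]

def keys = ['car', 'color', 'motor']

// Each group key is a map holding only the fields we care about
def groups = listOfMaps.groupBy { it.subMap(keys) }

// True if any group holds more than one map
boolean result = groups.any { it.value.size() > 1 }

// Keep only the groups that actually contain duplicates
def duplicated = groups.findAll { it.value.size() > 1 }

assert result
assert duplicated.size() == 1
```

Keeping the groups around, rather than only the boolean, also shows which maps collide.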

Upvotes: 2

i9or

Reputation: 2448

I'm not sure I have understood the question correctly, but I came up with the following code snippet:

def listOfMaps = [
    ["car": "A", "color": "A", "motor": "A", "anything": "meh"],
    ["car": "A", "color": "A", "motor": "A", "anything": "doesn't matter"],
    ["car": "A", "color": "A", "motor": "B", "anything": "Anything"],
    ["car": "A", "color": "B", "motor": "A", "anything": "Anything"]
]

static def findDuplicatesByKeys(List<Map<String, String>> maps, List<String> keys) {
    Map<String, List<Map<String, String>>> aggregationKeyToMaps = [:].withDefault { key -> []}

    maps.each { singleMap ->
        def aggregationKey = keys.collect { key -> singleMap[key] }.join('-')

        aggregationKeyToMaps.get(aggregationKey).add(singleMap)
    }

    aggregationKeyToMaps
}

findDuplicatesByKeys(listOfMaps, ['car', 'color', 'motor'])

Basically, it iterates over the list of maps and groups them by the values of the provided keys. The result is a map from each aggregation key to a list of maps, something like:

def aggregatedMaps = [
    "A-A-A": [
        ["car": "A", "color": "A", "motor": "A", "anything": "meh"],
        ["car": "A", "color": "A", "motor": "A", "anything": "doesn't matter"]
    ],
    "A-A-B": [
        ["car": "A", "color": "A", "motor": "B", "anything": "Anything"]
    ],
    "A-B-A": [
        ["car": "A", "color": "B", "motor": "A", "anything": "Anything"]
    ]
]

You can then take .values(), for example, apply whatever removals you need (you haven't specified which duplicate should be removed), and finally flatten the list. Hope that helps.
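A rough sketch of that last step, assuming you want to keep the first map of each group (the question does not say which one to keep). It rebuilds the same kind of aggregated map with groupBy so the snippet runs on its own; deduplicated and duplicates are illustrative names.

```groovy
def listOfMaps = [
    ["car": "A", "color": "A", "motor": "A", "anything": "meh"],
    ["car": "A", "color": "A", "motor": "A", "anything": "doesn't matter"],
    ["car": "A", "color": "A", "motor": "B", "anything": "Anything"],
    ["car": "A", "color": "B", "motor": "A", "anything": "Anything"]
]

def keys = ['car', 'color', 'motor']

// Same shape as aggregatedMaps above: "A-A-A" -> [map, map], ...
def aggregatedMaps = listOfMaps.groupBy { m -> keys.collect { k -> m[k] }.join('-') }

// Assumption: keep the first map of every group and drop the rest
def deduplicated = aggregatedMaps.values().collect { group -> group.first() }

// Alternatively, collect every map that belongs to a group with duplicates
def duplicates = aggregatedMaps.values().findAll { it.size() > 1 }.flatten()

assert deduplicated.size() == 3
assert duplicates.size() == 2
```

One caveat: a joined string key can collide if the values themselves contain the separator, so a list or map key may be safer for arbitrary data.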

Upvotes: 1
