OTUser

Reputation: 3848

Groovy: Optimize code to find duplicate elements

I have an invoiceList as shown below, which is a List<Map<String, String>>. I am trying to find out whether all the invoices have the same SENDER_COUNTRY and CLIENT_COUNTRY; if they do not, a message is added to a JSON array.

[
    [INVOICE_DATE:20150617, INVOICE_NUMBER:617151, SENDER_COUNTRY:USA, CLIENT_COUNTRY:USA],
    [INVOICE_DATE:20150617, INVOICE_NUMBER:617152, SENDER_COUNTRY:CAD, CLIENT_COUNTRY:MEX],
    [INVOICE_DATE:20150617, INVOICE_NUMBER:617153, SENDER_COUNTRY:CAD, CLIENT_COUNTRY:MEX]
]

JSONArray jsonArray = new JSONArray();
def senderCountry = invoiceList[0]['SENDER_COUNTRY']
def clientCountry = invoiceList[0]['CLIENT_COUNTRY']
invoiceList.each { it ->
    if (it['SENDER_COUNTRY'] != senderCountry)
        jsonArray.add((new JSONObject()).put("SENDER_COUNTRY", "Multiple sender Countries Associated"));
    if (it['CLIENT_COUNTRY'] != clientCountry)
        jsonArray.add((new JSONObject()).put("CLIENT_COUNTRY", "Multiple Client Countries Associated"));
}

I feel this code can be refactored/optimized into a better Groovy version. Can someone please help me with it?

Upvotes: 2

Views: 576

Answers (3)

Rao

Reputation: 21369

Here is another version to achieve the same.

def invoiceList = [
  [INVOICE_DATE:20150617, INVOICE_NUMBER:617151,SENDER_COUNTRY:'USA', CLIENT_COUNTRY:'USA'],
  [INVOICE_DATE:20150617, INVOICE_NUMBER:617152,SENDER_COUNTRY:'CAD', CLIENT_COUNTRY:'MEX'],
  [INVOICE_DATE:20150617, INVOICE_NUMBER:617153,SENDER_COUNTRY:'CAD', CLIENT_COUNTRY:'MEX']
]

def getFilteredList = { map ->
    map.collect{ k,v -> invoiceList.countBy{ it."$k" }.findAll{it.value > 1}.collectEntries{[it.key,v] } }
}

//You may change the description in the values of below map
def findEntries = [CLIENT_COUNTRY: 'Multiple Client Countries found', SENDER_COUNTRY: 'Multiple Sender Countries found']
println groovy.json.JsonOutput.toJson(getFilteredList(findEntries))

Output:

[{"MEX":"Multiple Client Countries found"},{"CAD":"Multiple Sender Countries found"}]
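To make the intermediate step easier to follow, here is a minimal sketch of what countBy and the `value > 1` filter return for the sample data. The variable names clientCounts and repeatedClients are illustrative only. Note that the filter keeps values that occur more than once; it does not by itself tell you whether more than one distinct value exists.

```groovy
def invoiceList = [
  [INVOICE_NUMBER: 617151, SENDER_COUNTRY: 'USA', CLIENT_COUNTRY: 'USA'],
  [INVOICE_NUMBER: 617152, SENDER_COUNTRY: 'CAD', CLIENT_COUNTRY: 'MEX'],
  [INVOICE_NUMBER: 617153, SENDER_COUNTRY: 'CAD', CLIENT_COUNTRY: 'MEX']
]

// countBy groups the invoices by the closure's result and counts each group
def clientCounts = invoiceList.countBy { it.CLIENT_COUNTRY }
assert clientCounts == [USA: 1, MEX: 2]

// Keeping only counts above 1 leaves the values that occur more than once
def repeatedClients = clientCounts.findAll { it.value > 1 }
assert repeatedClients == [MEX: 2]
```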

EDIT: The OP additionally asked that the result be empty when all client countries (or all sender countries) are the same.

Use the script below:

def invoiceList = [
  [INVOICE_DATE:20150617, INVOICE_NUMBER:617151,SENDER_COUNTRY:'CAD', CLIENT_COUNTRY:'USA'],
  [INVOICE_DATE:20150617, INVOICE_NUMBER:617152,SENDER_COUNTRY:'CAD', CLIENT_COUNTRY:'MEX'],
  [INVOICE_DATE:20150617, INVOICE_NUMBER:617153,SENDER_COUNTRY:'CAD', CLIENT_COUNTRY:'MEX']
]

def getFilteredList = { map->
    map.collect{ k,v -> invoiceList.countBy{ it."$k" }.findAll{it.value > 1  && (it.value != invoiceList.size())}.collectEntries{ [it.key,v] } }.findAll{it.size()>0}
}

//You may change the description in the values of the map below
def findEntries = [CLIENT_COUNTRY: 'Multiple Client Countries found', SENDER_COUNTRY: 'Multiple Sender Countries found']
println groovy.json.JsonOutput.toJson(getFilteredList(findEntries))

EDIT2: The OP further asked to change the output to:

[{"message": "Multiple clients, Multiple sender"}]

def invoiceList = [
  [INVOICE_DATE:20150617, INVOICE_NUMBER:617151,SENDER_COUNTRY:'CAD1', CLIENT_COUNTRY:'USA'],
  [INVOICE_DATE:20150617, INVOICE_NUMBER:617152,SENDER_COUNTRY:'CAD', CLIENT_COUNTRY:'MEX'],
  [INVOICE_DATE:20150617, INVOICE_NUMBER:617153,SENDER_COUNTRY:'CAD', CLIENT_COUNTRY:'MEX']
]

def getFilteredList = { map->
    def result = map.collect{ k,v -> invoiceList.countBy{ it."$k" }.findAll{it.value > 1  && (it.value != invoiceList.size())}.collect{ v } }.findAll{it.size()>0}
    result ? [[message : result.flatten().join(',') ]] : []
}

//You may change the description in the values of the map below
def findEntries = [CLIENT_COUNTRY: 'Multiple Client Countries found', SENDER_COUNTRY: 'Multiple Sender Countries found']
println groovy.json.JsonOutput.toJson(getFilteredList(findEntries))
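One edge case worth checking: when every invoice has a different value for a key, each count is 1, so the `value > 1` filter finds nothing. The following is a minimal sketch of a variant that counts distinct values instead, assuming the goal is "report a key whenever its values are not all identical". The helper name describeMixedKeys and the sample list allDistinct are hypothetical and not part of the answer above.

```groovy
import groovy.json.JsonOutput

// Hypothetical helper: reports every key whose values are not all identical
def describeMixedKeys = { List<Map> invoices, Map<String, String> descriptions ->
    def hits = descriptions.findAll { key, text ->
        // More than one distinct value means the invoices disagree on this key
        invoices.collect { it[key] }.toSet().size() > 1
    }.values()
    hits ? [[message: hits.join(',')]] : []
}

def descriptions = [CLIENT_COUNTRY: 'Multiple Client Countries found',
                    SENDER_COUNTRY: 'Multiple Sender Countries found']

// Two different sender countries, each occurring exactly once
def allDistinct = [[SENDER_COUNTRY: 'USA', CLIENT_COUNTRY: 'MEX'],
                   [SENDER_COUNTRY: 'CAD', CLIENT_COUNTRY: 'MEX']]
println JsonOutput.toJson(describeMixedKeys(allDistinct, descriptions))
```

With this variant a single invoice, or a list in which all invoices agree, yields an empty list.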

Upvotes: 1

sebnukem

Reputation: 8323

What about this (note that my answer does not improve on performance):

def invoiceList = [
  [INVOICE_DATE:20150617, INVOICE_NUMBER:617151,SENDER_COUNTRY:'USA', CLIENT_COUNTRY:'USA'],
  [INVOICE_DATE:20150617, INVOICE_NUMBER:617152,SENDER_COUNTRY:'CAD', CLIENT_COUNTRY:'MEX'],
  [INVOICE_DATE:20150617, INVOICE_NUMBER:617153,SENDER_COUNTRY:'CAD', CLIENT_COUNTRY:'MEX']
]

def jsonarray = []
// Use size() with parentheses: on a Map, .size is a lookup of the key 'size'
if (invoiceList.countBy{ it.CLIENT_COUNTRY }.size() > 1)
     jsonarray << [CLIENT_COUNTRY: "Multiple client countries associated"]
if (invoiceList.countBy{ it.SENDER_COUNTRY }.size() > 1)
     jsonarray << [SENDER_COUNTRY: "Multiple sender countries associated"]

groovy.json.JsonOutput.toJson(jsonarray)
// Result: [{"CLIENT_COUNTRY":"Multiple client countries associated"},{"SENDER_COUNTRY":"Multiple sender countries associated"}] 

Upvotes: 2

Raphael

Reputation: 1800

If you can add a new library to your project, you could use GPars:

@Grab(group='org.codehaus.gpars', module='gpars', version='1.0.0') 
import static groovyx.gpars.GParsPool.withPool

def invoiceList = [
    [INVOICE_DATE:20150617, INVOICE_NUMBER:617151, SENDER_COUNTRY:'USA', CLIENT_COUNTRY:'USA'],
    [INVOICE_DATE:20150617, INVOICE_NUMBER:617152, SENDER_COUNTRY:'CAD', CLIENT_COUNTRY:'MEX'],
    [INVOICE_DATE:20150617, INVOICE_NUMBER:617153, SENDER_COUNTRY:'CAD', CLIENT_COUNTRY:'MEX']
]

// Worker threads append concurrently, so the list must be thread-safe
def jsonArray = Collections.synchronizedList([])
def senderCountry = invoiceList[0]['SENDER_COUNTRY']
def clientCountry = invoiceList[0]['CLIENT_COUNTRY']

withPool(4) {
    invoiceList.eachParallel {
        // Plain maps are used instead of JSONObject so the snippet needs no JSON library
        if (it['SENDER_COUNTRY'] != senderCountry)
            jsonArray << [SENDER_COUNTRY: "Multiple sender Countries Associated"]
        if (it['CLIENT_COUNTRY'] != clientCountry)
            jsonArray << [CLIENT_COUNTRY: "Multiple Client Countries Associated"]
    }
}

This creates a thread pool with 4 workers that scan invoiceList in parallel. The order of the entries in jsonArray is not guaranteed.
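If adding a dependency is not an option, a similar parallel scan can be sketched with the JDK's parallel streams. This is a different technique from GPars, shown only as an assumption-laden alternative: it assumes Java 8 or later and a Groovy version that coerces closures to functional interfaces. The closure name hasSeveral is hypothetical.

```groovy
def invoiceList = [
    [INVOICE_NUMBER: 617151, SENDER_COUNTRY: 'USA', CLIENT_COUNTRY: 'USA'],
    [INVOICE_NUMBER: 617152, SENDER_COUNTRY: 'CAD', CLIENT_COUNTRY: 'MEX'],
    [INVOICE_NUMBER: 617153, SENDER_COUNTRY: 'CAD', CLIENT_COUNTRY: 'MEX']
]

// Counts distinct values for a key with a parallel stream;
// the workers share no mutable state, so no synchronization is needed
def hasSeveral = { String key ->
    invoiceList.parallelStream().map { it[key] }.distinct().count() > 1
}

def messages = []
if (hasSeveral('SENDER_COUNTRY')) messages << [SENDER_COUNTRY: 'Multiple sender countries associated']
if (hasSeveral('CLIENT_COUNTRY')) messages << [CLIENT_COUNTRY: 'Multiple client countries associated']
println groovy.json.JsonOutput.toJson(messages)
```

For a list of only a few invoices, the cost of parallel execution will likely outweigh any gain, so a sequential check is usually the simpler choice.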

Upvotes: 1
