LuCaZ

Reputation: 27

Groovy: GroupBy and get object with max date

How can I build a new list from a list of objects, grouped by the field ts and keeping only the object with the latest startDate in each group?

def list = [
  new Timeserie(ts:11, startDate:new Date().parse('yyyy-MM-dd HH:mm:ss', '2018-02-12 20:04:36')),
  new Timeserie(ts:11, startDate:new Date().parse('yyyy-MM-dd HH:mm:ss', '2018-02-12 20:14:36')),
  new Timeserie(ts:12, startDate:new Date().parse('yyyy-MM-dd HH:mm:ss', '2018-02-12 20:24:36')),
  new Timeserie(ts:12, startDate:new Date().parse('yyyy-MM-dd HH:mm:ss', '2018-02-12 20:34:36')),
]

list.each { println it }
def byTs = list.groupBy { tss -> tss.ts }
println "byTs Size: " + byTs.size()

Expected results:

[new Timeserie(ts:11, startDate:new Date().parse('yyyy-MM-dd HH:mm:ss', '2018-02-12 20:14:36')),
 new Timeserie(ts:12, startDate:new Date().parse('yyyy-MM-dd HH:mm:ss', '2018-02-12 20:34:36'))]

Upvotes: 2

Views: 2357

Answers (2)

injecteer

Reputation: 20699

I would use withDefault() to do the trick:

def list = [
  [ts:11, startDate:new Date().parse('yyyy-MM-dd HH:mm:ss', '2018-02-12 20:04:36')],
  [ts:11, startDate:new Date().parse('yyyy-MM-dd HH:mm:ss', '2018-02-12 20:14:36')],
  [ts:12, startDate:new Date().parse('yyyy-MM-dd HH:mm:ss', '2018-02-12 20:24:36')],
  [ts:12, startDate:new Date().parse('yyyy-MM-dd HH:mm:ss', '2018-02-12 20:34:36')],
]

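// Fold the list into a map keyed by ts; withDefault supplies a placeholder entry for unseen keys
def res = list.inject( [:].withDefault{ [ ts:null, startDate:new Date( 0 ) ] } ){ acc, item ->
  acc[ item.ts ].ts = item.ts
  // keep the later of the current startDate and the one stored so far
  acc[ item.ts ].startDate = new Date( Math.max( item.startDate.time, acc[ item.ts ].startDate.time ) )
  acc
}.values()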

assert '[[ts:11, startDate:Mon Feb 12 20:14:36 UTC 2018], [ts:12, startDate:Mon Feb 12 20:34:36 UTC 2018]]' == res.toString()

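Note: I replaced the Timeserie class with a Map for simplicity's sake. The assertion compares the Date.toString() output, so it only passes when the JVM's default time zone is UTC.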
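
To make the role of withDefault() in the snippet above more concrete, here is a minimal sketch of how it behaves on its own. The variable names latest and entry are made up for illustration and do not come from the answer.

```groovy
// withDefault wraps a map so that looking up a missing key runs the closure,
// stores the returned value under that key, and returns it.
def latest = [:].withDefault { [ts: null, startDate: new Date(0)] }

assert latest.isEmpty()

// The first access to key 11 creates and stores the placeholder entry.
def entry = latest[11]
assert entry.ts == null
assert entry.startDate.time == 0L
assert latest.size() == 1

// Later lookups return the same stored map, so it can be updated in place.
latest[11].ts = 11
assert latest[11].ts == 11
assert latest[11].is(entry)
```

This is why the inject closure can assign to the entry for a ts value without first checking whether that key exists.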

Upvotes: 0

Szymon Stepniak

Reputation: 42184

You can chain three operations to get the expected result:

  • groupBy { it.ts } creates a Map<Integer, List<Timeserie>> where each key is a ts value and each value is the list of time series with that ts
  • collectEntries { [(it.key): it.value.max { it.startDate }] } converts the Map<Integer, List<Timeserie>> into a Map<Integer, Timeserie>, keeping for each key the time series with the latest startDate
  • values() returns the Collection<Timeserie> held in that Map<Integer, Timeserie>

A full example looks like this:

def list = [
  new Timeserie(ts:11, startDate:new Date().parse('yyyy-MM-dd HH:mm:ss', '2018-02-12 20:04:36')),
  new Timeserie(ts:11, startDate:new Date().parse('yyyy-MM-dd HH:mm:ss', '2018-02-12 20:14:36')),
  new Timeserie(ts:12, startDate:new Date().parse('yyyy-MM-dd HH:mm:ss', '2018-02-12 20:24:36')),
  new Timeserie(ts:12, startDate:new Date().parse('yyyy-MM-dd HH:mm:ss', '2018-02-12 20:34:36')),
]

def result = list.groupBy { it.ts }
  .collectEntries { [(it.key): it.value.max { it.startDate }] }
  .values()

println result

Output:

[Timeserie(11, Mon Feb 12 20:14:36 CET 2018), Timeserie(12, Mon Feb 12 20:34:36 CET 2018)]
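
The question does not show the Timeserie class, so the example above cannot be run as posted. The sketch below is one possible self-contained version. The class definition is an assumption (the @Canonical annotation is a guess that happens to produce toString() output in the same shape as shown above), and SimpleDateFormat is swapped in for Date.parse only so the sketch does not depend on the groovy-dateutil module.

```groovy
import groovy.transform.Canonical
import java.text.SimpleDateFormat

// Hypothetical stand-in for the Timeserie class, which the question does not show.
@Canonical
class Timeserie {
    int ts
    Date startDate
}

def fmt = new SimpleDateFormat('yyyy-MM-dd HH:mm:ss')

def list = [
    new Timeserie(ts: 11, startDate: fmt.parse('2018-02-12 20:04:36')),
    new Timeserie(ts: 11, startDate: fmt.parse('2018-02-12 20:14:36')),
    new Timeserie(ts: 12, startDate: fmt.parse('2018-02-12 20:24:36')),
    new Timeserie(ts: 12, startDate: fmt.parse('2018-02-12 20:34:36')),
]

// Group by ts, keep the element with the latest startDate per group, drop the keys.
def result = list.groupBy { it.ts }
    .collectEntries { key, group -> [(key): group.max { it.startDate }] }
    .values()

// Compare Date values rather than their string form, so the checks
// do not depend on the JVM's default time zone.
assert result.size() == 2
assert result*.ts == [11, 12]
assert result*.startDate == [fmt.parse('2018-02-12 20:14:36'), fmt.parse('2018-02-12 20:34:36')]
```

Here collectEntries takes the two-parameter closure form (key, group), which behaves the same as the it.key / it.value form used above.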

Upvotes: 1
