lalgudisethu
lalgudisethu

Reputation: 11

How can i get the last record in each group that are grouped by more than one columns

SELECT symbol,
       weekno,
       max(trade_date) as trade_date,
       max(high) as high,
       min(low) as low,
       close  -- <----
FROM company_data
WHERE symbol = '$company_name'
  AND weekno <= '$startweek' 
GROUP by symbol, weekno
ORDER BY weekno DESC
LIMIT 2

In the above sample the code which I have highlighted must fetch the latest close field in each grouped items

Thanks

Upvotes: 0

Views: 138

Answers (2)

Ed Bighands
Ed Bighands

Reputation: 169

Welcome to SO. Off the top of my head, you are using an aggregate query (using group by and then aggregate functions). But what you seek is not something that comes naturally as a art of group by, rather is more of windowing/partitioning(over symbol/companyname) in nature. Thus you would need to augment the query statement with joins/unions and use mysql CTE or analytical functions for fetching the latest close field value or you can do as already suggested by @Caius Jard.

To give you the pointers, think of rank/rownum over partition by clause with ordering.

Upvotes: 0

Caius Jard
Caius Jard

Reputation: 74605

What you need to appreciate is that grouping things automatically throws away any notion of "the row the value comes from" so you can't say you want some other value from the row that has the highest whatver because that notion of "the same row" isn't there any more:

Day,      LowTemp, HighTemp
Mon 1st,  5,       10
Mon 8th,  6,       12
Mon 15th, 4,       11

The MIN lowtemp here comes from Mon 15th, the MAX hightemp comes from Mon 8th. There isn't a concept of saying "find the lowest lowtemp and give me the day from that row" because when considering MIN(lowtemp) the database just sees { 4, 5, 6 }, the lowest number is 4, the end - if doesn't retain any knowledge that "and the 4 comes from the row of Mon15th with hightemp of 11" as part of this operation

As such, if you want that data back you have to query it again; you have to ask "what is the MIN lowtemp, ok it's 4, now what are all the rows that have that value of 4, ok, which of those one or more do we want to use?"


In your example "latest" is defined by something we can't see, like some date or some incremental counting number for trade on the day. Perhaps it's trade_date? You're going to have to take your results that the query gives you, so like

Symbol, Week#, max_trade_date,              max_high, min_low
MSFT,       1, 2000-01-06 23:59:59.999,     99999,    99998

And you're going to have to join it to a query that gets you the close for that latest thing, so perhaps like:

SELECT * FROM
(
 SELECT symbol,
   weekno,
   max(trade_date) as trade_date,
   max(high) as high,
   min(low) as low
 FROM company_data
 WHERE symbol = '$company_name'
   AND weekno <= '$startweek' 
 GROUP by symbol, weekno
 ORDER BY weekno DESC
 LIMIT 2
) x
INNER JOIN company_data y
ON x.symbol = y.symbol AND x.weekno = y.weekno and x.trade_date = y.trade_date
                                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                                      "the latest"

Be prepared for there being more than one row; you have to decide what to do in those cases, and you need to make it so that there is only one row. It might not be possible if you've got something as coarse as "the day". It might not even be possible with something as fine as microseconds. You might have to reach a point wheer you make some arbitrary decision as to which of two trades executed simultaneously, is the "later" one. Incrementing number would help here; you'd select your MAX(incrementing_number) per symbol-week and then join it in to find the close for it


In MySQL 8 you can use analytics to make some parts of your life easier but ultimately it all comes back to this notion that when you create a GROUP, you throw some information away to make the group, and you throw other info away to do the MAX/MIN/AVG etc - you can't simultaneously keep and throw the info away so you can say "MAX x oh, and also this other value from the same row" you can only get the MAX, and then join it back to the detail data again, on the max, to satisfy that "oh and the other data from the same row" part

Upvotes: 1

Related Questions