Alan
Alan

Reputation: 2086

Select most recent record grouped by 3 columns

I am trying to return the price of the most recent record grouped by ItemNum and FeeSched, Customer can be eliminated. I am having trouble understanding how I can do that reasonably.

The issue is that I am joining about 5 tables containing hundreds of thousands of rows to end up with this result set. The initial query takes about a minute to run, and there has been some trouble with timeout errors in the past. Since this will run on a client's workstation, it may run even slower, and I have no access to modify server settings to increase memory / timeouts.

Here is my data:

Customer        Price            ItemNum           FeeSched             Date
   5            70.75             01202               12             12-06-2017
   5            70.80             01202               12             06-07-2016
   5            70.80             01202               12             07-21-2017
   5            70.80             01202               12             10-26-2016
   5            82.63             02144               61             12-06-2017
   5            84.46             02144               61             06-07-2016
   5            84.46             02144               61             07-21-2017
   5            84.46             02144               61             10-26-2016

I don't have access to create temporary tables, or views and there is no such thing as a @variable in C-tree, but in most ways it acts like MySql. I wanted to use something like GROUP BY ItemNum, FeeSched and select MAX(Date). The issue is that unless I put Price into the GROUP BY I get an error.

I could run the query again only selecting ItemNum, FeeSched, Date and then doing an INNER JOIN, but with the query taking a minute to run each time, it seems there is a better way that maybe I don't know.

Here is my query I am running, it isn't really that complicated of a query other than the amount of data it is processing. Final results are about 50,000 rows. I can't share much about the database structure as it is covered under an NDA.

SELECT DISTINCT 
CustomerNum,
paid as Price,
ItemNum,
n.pdate as newest
from admin.fullproclog as f
    INNER JOIN ( 
        SELECT 
               id,
               itemId,
               MAX(TO_CHAR(pdate, 'MM-DD-YYYY')) as pdate
        from admin.fullproclog 
        WHERE pdate > timestampadd(sql_tsi_year, -3, NOW())
        group by id, itemId
    ) as n ON n.id = f.id AND n.itemId = f.itemId AND n.pdate = f.pdate

    LEFT join (SELECT itemId AS linkid, ItemNum FROM   admin.itemlist) AS codes ON codes.linkid = f.itemId AND ItemNum >0
    INNER join (SELECT DISTINCT parent_id, 
                   MAX(ins1.feesched) as CustomerNum
    FROM   admin.customers AS p 
          left join admin.feeschedule AS ins1 
                 ON ins1.feescheduleid = p.primfeescheduleid 
          left join admin.group AS c1 
                 ON c1.insid = ins1.feesched 
    WHERE status =1
                 GROUP BY parent_id) 
      AS ip ON ip.parent_id = f.parent_id 

WHERE CustomerNum >0 AND ItemNum >0
UNION ALL

SELECT DISTINCT 
CustomerNum,
secpaid as Price,
ItemNum,
n.pdate as newest
from admin.fullproclog as f
    INNER JOIN ( 
        SELECT 
               id,
               itemId,
               MAX(TO_CHAR(pdate, 'MM-DD-YYYY')) as pdate
        from admin.fullproclog 
        WHERE pdate > timestampadd(sql_tsi_year, -3, NOW())
        group by id, itemId
    ) as n ON n.id = f.id AND n.itemId = f.itemId AND n.pdate = f.pdate

    LEFT join (SELECT itemId AS linkid, ItemNum FROM   admin.itemlist) AS codes ON codes.linkid = f.itemId AND ItemNum >0
    INNER join (SELECT DISTINCT parent_id, 
                   MAX(ins1.feesched) as CustomerNum
    FROM   admin.customers AS p 
          left join admin.feeschedule AS ins1 
                 ON ins1.feescheduleid = p.secfeescheduleid 
          left join admin.group AS c1 
                 ON c1.insid = ins1.feesched 
    WHERE status =1
                 GROUP BY parent_id) 
      AS ip ON ip.parent_id = f.parent_id 

WHERE CustomerNum >0  AND ItemNum >0 

Upvotes: 1

Views: 93

Answers (2)

walter
walter

Reputation: 1239

I feel it quite simple when I'd read the first three paragraphs, but I get a little confused when I've read the whole question.

Whatever you have done to get the data posted above, once you've got the data like that it's easy to retrive "the most recent record grouped by ItemNum and FeeSched".

How to:

  • Firstly, sort the whole result set by Date DESC.
  • Secondly, select fields you need from the sorted result set and group by ItemNum, FeeSched without any aggregation methods.

So, the query might be something like this:

SELECT t.Price, t.ItemNum, t.FeeSched, t.Date 
FROM (SELECT * FROM table ORDER BY Date DESC) AS t 
GROUP BY t.ItemNum, t.FeeSched;

How it works:

When your data is grouped and you select rows without aggregation methods, it will only return you the first row of each group. As you have sorted all rows before grouping, so the first row would exactly be "the most recent record".

Contact me if you got any problems or errors with this approach.

Upvotes: 2

Parveen Singla
Parveen Singla

Reputation: 48

You can also try like this:

Select Price, ItemNum, FeeSched, Date from table where Date IN (Select MAX(Date) from table group by ItemNum, FeeSched,Customer);

Internal sql query return maximum date group by ItemNum and FeeSched and IN statement fetch only the records with maximum date.

Upvotes: 0

Related Questions