sqlsql-server-2008t-sqlgreatest-n-per-group

Reputation: 1131

Select a Column in SQL not in Group By

I have been trying to find some info on how to select a non-aggregate column that is not contained in the Group By statement in SQL, but nothing I've found so far seems to answer my question. I have a table with three columns that I want from it. One is a create date, one is a ID that groups the records by a particular Claim ID, and the final is the PK. I want to find the record that has the max creation date in each group of claim IDs. I am selecting the MAX(creation date), and Claim ID (cpe.fmgcms_cpeclaimid), and grouping by the Claim ID. But I need the PK from these records (cpe.fmgcms_claimid), and if I try to add it to my select clause, I get an error. And I can't add it to my group by clause because then it will throw off my intended grouping. Does anyone know any workarounds for this? Here is a sample of my code:

Select MAX(cpe.createdon) As MaxDate, cpe.fmgcms_cpeclaimid 
from Filteredfmgcms_claimpaymentestimate cpe
where cpe.createdon < 'reportstartdate'
group by cpe.fmgcms_cpeclaimid

This is the result I'd like to get:

Select MAX(cpe.createdon) As MaxDate, cpe.fmgcms_cpeclaimid, cpe.fmgcms_claimid 
from Filteredfmgcms_claimpaymentestimate cpe
where cpe.createdon < 'reportstartdate'
group by cpe.fmgcms_cpeclaimid

Upvotes: 97

Answers (8)

JimB2

Reputation: 463

It's also to possible to add info to the results simply by a select subquery. It looks a bit self-referential and you should consider the lookup costs on big tables but it works.

Example:

` select Key = s.SPACEKEY,
-- this would error, not grouped
-- Space = s.SPACENAME,
-- this is ok, just a look up
Space = (select sx.SPACENAME from SPACES sx where sx.SPACEKEY = s.SPACEKEY),
nUpdates = count(um.username),
User = um.username
from
user_mapping um
join
CONTENT cx on um.user_key = cx.LASTMODIFIER
join
SPACES s on cx.SPACEID = s.SPACEID
group by
SPACEKEY, um.username
order by
SPACEKEY, nUpdates desc

Upvotes: 0

AgungCode.Com

Reputation: 787

You can use as below,

Select X.a, X.b, Y.c from (
                Select X.a as a, sum (b) as sum_b from name_table X
                group by X.a)X
left join from name_table Y on Y.a = X.a

Example;

CREATE TABLE #products (
    product_name VARCHAR(MAX),
    code varchar(3),
    list_price [numeric](8, 2) NOT NULL
);

INSERT INTO #products VALUES ('paku', 'ACE', 2000)
INSERT INTO #products VALUES ('paku', 'ACE', 2000)
INSERT INTO #products VALUES ('Dinding', 'ADE', 2000)
INSERT INTO #products VALUES ('Kaca', 'AKB', 2000)
INSERT INTO #products VALUES ('paku', 'ACE', 2000)

--SELECT * FROM #products 
SELECT distinct x.code, x.SUM_PRICE, product_name FROM (SELECT code, SUM(list_price) as SUM_PRICE From #products 
               group by code)x
left join #products y on y.code=x.code

DROP TABLE #products

Upvotes: -1

Osher Zakin

Reputation: 71

Thing I like to do is to wrap addition columns in aggregate function, like max(). It works very good when you don't expect duplicate values.

Select MAX(cpe.createdon) As MaxDate, cpe.fmgcms_cpeclaimid, MAX(cpe.fmgcms_claimid) As fmgcms_claimid 
from Filteredfmgcms_claimpaymentestimate cpe
where cpe.createdon < 'reportstartdate'
group by cpe.fmgcms_cpeclaimid

Upvotes: 7

Taryn

Reputation: 247870

You can join the table on itself to get the PK:

Select cpe1.PK, cpe2.MaxDate, cpe1.fmgcms_cpeclaimid 
from Filteredfmgcms_claimpaymentestimate cpe1
INNER JOIN
(
    select MAX(createdon) As MaxDate, fmgcms_cpeclaimid 
    from Filteredfmgcms_claimpaymentestimate
    group by fmgcms_cpeclaimid
) cpe2
    on cpe1.fmgcms_cpeclaimid = cpe2.fmgcms_cpeclaimid
    and cpe1.createdon = cpe2.MaxDate
where cpe1.createdon < 'reportstartdate'

Upvotes: 9

Eyad Ebrahim

Reputation: 991

What you are asking, Sir, is as the answer of RedFilter. This answer as well helps in understanding why group by is somehow a simpler version or partition over: SQL Server: Difference between PARTITION BY and GROUP BY since it changes the way the returned value is calculated and therefore you could (somehow) return columns group by can not return.

Upvotes: -1

Nicholas Carey

Reputation: 74375

The columns in the result set of a select query with group by clause must be:

an expression used as one of the group by criteria , or ...
an aggregate function , or ...
a literal value

So, you can't do what you want to do in a single, simple query. The first thing to do is state your problem statement in a clear way, something like:

I want to find the individual claim row bearing the most recent creation date within each group in my claims table

Given

create table dbo.some_claims_table
(
  claim_id     int      not null ,
  group_id     int      not null ,
  date_created datetime not null ,

  constraint some_table_PK primary key ( claim_id                ) ,
  constraint some_table_AK01 unique    ( group_id , claim_id     ) ,
  constraint some_Table_AK02 unique    ( group_id , date_created ) ,

)

The first thing to do is identify the most recent creation date for each group:

select group_id ,
       date_created = max( date_created )
from dbo.claims_table
group by group_id

That gives you the selection criteria you need (1 row per group, with 2 columns: group_id and the highwater created date) to fullfill the 1st part of the requirement (selecting the individual row from each group. That needs to be a virtual table in your final select query:

select *
from dbo.claims_table t
join ( select group_id ,
       date_created = max( date_created )
       from dbo.claims_table
       group by group_id
      ) x on x.group_id     = t.group_id
         and x.date_created = t.date_created

If the table is not unique by date_created within group_id (AK02), you you can get duplicate rows for a given group.

Upvotes: 63

MatBailie

Reputation: 86798

The direct answer is that you can't. You must select either an aggregate or something that you are grouping by.

So, you need an alternative approach.

1). Take you current query and join the base data back on it

SELECT
  cpe.*
FROM
  Filteredfmgcms_claimpaymentestimate cpe
INNER JOIN
  (yourQuery) AS lookup
    ON  lookup.MaxData           = cpe.createdOn
    AND lookup.fmgcms_cpeclaimid = cpe.fmgcms_cpeclaimid

2). Use a CTE to do it all in one go...

WITH
  sequenced_data AS
(
  SELECT
    *,
    ROW_NUMBER() OVER (PARITION BY fmgcms_cpeclaimid ORDER BY CreatedOn DESC) AS sequence_id
  FROM
    Filteredfmgcms_claimpaymentestimate
  WHERE
    createdon < 'reportstartdate'
)
SELECT
  *
FROM
  sequenced_data
WHERE
  sequence_id = 1

NOTE: Using ROW_NUMBER() will ensure just one record per fmgcms_cpeclaimid. Even if multiple records are tied with the exact same createdon value. If you can have ties, and want all records with the same createdon value, use RANK() instead.

Upvotes: 11

D'Arcy Rittich

Reputation: 171579

You can do this with PARTITION and RANK:

select * from
(
    select MyPK, fmgcms_cpeclaimid, createdon,  
        Rank() over (Partition BY fmgcms_cpeclaimid order by createdon DESC) as Rank
    from Filteredfmgcms_claimpaymentestimate 
    where createdon < 'reportstartdate' 
) tmp
where Rank = 1

Upvotes: 33

Select a Column in SQL not in Group By

Answers (8)

Related Questions