Select a Column in SQL not in Group By

user1603734 · Aug 16, 2012 · Viewed 133.2k times

I have been trying to find some info on how to select a non-aggregate column that is not contained in the Group By clause in SQL, but nothing I've found so far seems to answer my question. I have a table from which I want three columns. One is a create date, one is an ID that groups the records by a particular Claim ID, and the last is the PK. I want to find the record that has the max creation date in each group of claim IDs. I am selecting the MAX(creation date) and the Claim ID (cpe.fmgcms_cpeclaimid), and grouping by the Claim ID. But I also need the PK from these records (cpe.fmgcms_claimid), and if I try to add it to my select clause, I get an error. I can't add it to my group by clause either, because that would throw off my intended grouping. Does anyone know any workarounds for this? Here is a sample of my code:

Select MAX(cpe.createdon) As MaxDate, cpe.fmgcms_cpeclaimid 
from Filteredfmgcms_claimpaymentestimate cpe
where cpe.createdon < 'reportstartdate'
group by cpe.fmgcms_cpeclaimid

This is the query I'd like to be able to run:

Select MAX(cpe.createdon) As MaxDate, cpe.fmgcms_cpeclaimid, cpe.fmgcms_claimid 
from Filteredfmgcms_claimpaymentestimate cpe
where cpe.createdon < 'reportstartdate'
group by cpe.fmgcms_cpeclaimid

Answer

Nicholas Carey · Aug 16, 2012

The columns in the result set of a select query with group by clause must be:

  • an expression used as one of the group by criteria , or ...
  • an aggregate function , or ...
  • a literal value
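
As a minimal sketch of that rule, the query below uses one select-list item of each permitted kind. It assumes a table dbo.claims_table with group_id, claim_id and date_created columns, like the one defined further down in this answer; the column aliases newest and source are made up for illustration.

```sql
select group_id ,                        -- an expression used in the group by
       newest = max( date_created ) ,    -- an aggregate function
       source = 'claims'                 -- a literal value
from dbo.claims_table
group by group_id

-- Adding a bare claim_id to the select list above would be rejected,
-- because claim_id is neither aggregated nor part of the group by.
```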

So, you can't do what you want to do in a single, simple query. The first thing to do is state your problem in a clear way, something like:

I want to find the individual claim row bearing the most recent creation date within each group in my claims table

Given

create table dbo.claims_table
(
  claim_id     int      not null ,
  group_id     int      not null ,
  date_created datetime not null ,

  constraint claims_table_PK   primary key ( claim_id                ) ,
  constraint claims_table_AK01 unique      ( group_id , claim_id     ) ,
  constraint claims_table_AK02 unique      ( group_id , date_created )
)

The next step is to identify the most recent creation date for each group:

select group_id ,
       date_created = max( date_created )
from dbo.claims_table
group by group_id

That gives you the selection criteria you need (one row per group, with two columns: group_id and the high-water created date) to fulfill the first part of the requirement: selecting the individual row from each group. That result needs to be a derived table (a virtual table) in your final select query:

select t.*
from dbo.claims_table t
join ( select group_id ,
              date_created = max( date_created )
       from dbo.claims_table
       group by group_id
     ) x on x.group_id     = t.group_id
        and x.date_created = t.date_created
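
Applied to the table in the question, the same pattern might look like the sketch below. It assumes createdon is unique within each fmgcms_cpeclaimid. The variable @reportstartdate and its value are hypothetical stand-ins for the question's 'reportstartdate' placeholder, and the declare-with-initializer syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical parameter standing in for the question's 'reportstartdate'.
declare @reportstartdate datetime = '20120801' ;  -- assumed value, illustration only

select cpe.fmgcms_claimid ,       -- the PK the question asks for
       cpe.fmgcms_cpeclaimid ,
       cpe.createdon
from Filteredfmgcms_claimpaymentestimate cpe
join ( select fmgcms_cpeclaimid ,
              MaxDate = max( createdon )
       from Filteredfmgcms_claimpaymentestimate
       where createdon < @reportstartdate   -- filter before taking the max
       group by fmgcms_cpeclaimid
     ) x on x.fmgcms_cpeclaimid = cpe.fmgcms_cpeclaimid
        and x.MaxDate           = cpe.createdon
```

The date filter sits inside the derived table so that the maximum is taken only over rows created before the report start date; the join then picks out the row carrying that date.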

If date_created is not unique within group_id (that is, without the AK02 constraint), you can get more than one row for a given group whenever several rows share that group's most recent date_created.
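
If ties like that are possible, one alternative to the join (a different technique, not part of the approach above) is to number the rows within each group and keep only the first. This is a sketch assuming SQL Server 2005 or later, where row_number() and common table expressions are available; breaking ties by the highest claim_id is an arbitrary choice made for illustration.

```sql
with ranked as
(
  select t.claim_id ,
         t.group_id ,
         t.date_created ,
         -- rn = 1 marks the newest row in each group; claim_id desc is an
         -- assumed tie-breaker when two rows share the same date_created
         rn = row_number() over ( partition by t.group_id
                                  order by t.date_created desc ,
                                           t.claim_id     desc )
  from dbo.claims_table t
)
select claim_id ,
       group_id ,
       date_created
from ranked
where rn = 1
```

Because claim_id is the primary key, the ordering within each group is total, so exactly one row per group comes back.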