SQL query using a case statement within the group by fields

saspert · May 20, 2011 · Viewed 33.4k times

I have a complex query that joins different tables to get the count. There are a few fields to group by. Now, I want to add an additional field which needs a case statement. And this field also has to be in the group by list. My query originally looks like this -

SELECT DMAGATR.WRK_LOC_LEVEL4 
     , DMBR.WRK_LOC_NM 
     , DMBR.RELCD 
     , COUNT(DISTINCT DMBR.DMBRKEY) AS ELIG_COUNT
      FROM DMBR 
INNER JOIN DCUST DCUST ON DMBR.DCUSTKEY = DCUST.DCUSTKEY
INNER JOIN DMAGATR DMAGATR ON DMBR.DMBRKEY = DMAGATR.DMBRKEY
 LEFT JOIN DMDYNATR DMDYNATR ON DMBR.DMBRKEY = DMDYNATR.DMBRKEY
     WHERE DMBR.C_TIMESSTAMP <= '12/31/2011'
       AND DMBR.RELCD IN ('0', '1') 
       AND DMBR.EE_STS IN ( 'A','L')
       AND (DMBR.DEL_DT IS NULL
        OR DMBR.DEL_DT > '12/31/2011')
       AND DCUST.PRCD = 'TAR'
  GROUP BY DMAGATR.WRK_LOC_LEVEL4, DMBR.WRK_LOC_NM, DMBR.RELCD

But the new field looks something like this -

(SELECT CASE
          WHEN (DMBR.WRK_LOC_NM = '6' AND DMBR.GDR = 'M' AND DMBR.RELCD IN ('0','1') 
            AND DMBR.EE_STS IN ('A','L')) THEN 'SEG 1'
          ELSE 'OTHER'
        END 
   FROM DMBR) as CMPN

I tried to add it in the select list but it did not work. Then I added it in two places - in the select and also in the group by list. That did not work either.

The errors I got were:

  1. ORA-00904 - CMPN not a valid column
  2. ORACLE prepare error: ORA-22818: subquery expressions not allowed here.

I did some research online and found examples that were close but not exactly identical to mine.

  - SQL GROUP BY CASE statement with aggregate function: not sure I understood the question here.
  - SQL query with count and case statement: this is quite different from my need.
  - http://jerrytech.blogspot.com/2008/04/can-you-group-by-case-statement-in-sql.html: this is close, but I don't need the insert statements. I tried this approach but it did not work for me.

Any suggestions would be appreciated.

Answer

DRapp · May 20, 2011

I think the problem is that you are defining a FIELD (i.e., a result column) for the query, just like the others: DMAGATR.WRK_LOC_LEVEL4, DMBR.WRK_LOC_NM, DMBR.RELCD, COUNT(DISTINCT DMBR.DMBRKEY...

When a SQL SELECT statement is used as a result COLUMN, it must return only a single row. Since your subquery is just "... FROM DMBR) as CMPN", it returns more than one row for the field, and no database knows how to guess which one you want.

What you are probably missing is a WHERE clause in that subquery, and possibly a GROUP BY if you are looking for a distinct value from within the DMBR table.

Fix that and it should get you MUCH further along. Not knowing the rest of data structure or relationships, I can't figure what your ultimate result is meant to be.


ADDITIONAL COMMENT...

Looking at the other answers provided, they suggest applying the CASE WHEN directly to whatever "DMBR" record you are currently on, which is the right idea but not quite enough to work. Because the CASE has two possible results, it also has to be part of the group by: with a count(DISTINCT), the group by must cover every non-aggregated column, and this case/when is one of them. So your ultimate result would have

Lvl, Work Loc, RelCD, Case/when, count(distinct)  where...
                        SEG 1     999
                        Other     999

Additionally, your CASE/WHEN had two conditions that exactly duplicate your WHERE clause, so I took them out of the CASE: any record failing them is already filtered out by the WHERE and would never reach the CASE anyway.

So, all that being said, I would write it as...

SELECT
      DMAGATR.WRK_LOC_LEVEL4,
      DMBR.WRK_LOC_NM,
      DMBR.RELCD,
      CASE WHEN (DMBR.WRK_LOC_NM = '6' 
             AND DMBR.GDR = 'M' ) 
           THEN 'SEG 1'
           ELSE 'OTHER'
           END as WhenStatus,
      COUNT (DISTINCT DMBR.DMBRKEY) AS ELIG_COUNT
   FROM
      DMBR 
         JOIN DCUST 
            ON  DMBR.DCUSTKEY = DCUST.DCUSTKEY
         JOIN DMAGATR
            ON  DMBR.DMBRKEY = DMAGATR.DMBRKEY
         LEFT JOIN DMDYNATR
            ON  DMBR.DMBRKEY = DMDYNATR.DMBRKEY
   WHERE
          DMBR.C_TIMESSTAMP <= '12/31/2011'
      AND DMBR.RELCD IN ('0','1') 
      AND DMBR.EE_STS IN ('A','L')
      AND DCUST.PRCD = 'TAR'
      AND (    DMBR.DEL_DT IS NULL
           OR  DMBR.DEL_DT > '12/31/2011')
   GROUP BY 
      DMAGATR.WRK_LOC_LEVEL4,
      DMBR.WRK_LOC_NM,
      DMBR.RELCD,
      CASE WHEN (DMBR.WRK_LOC_NM = '6' 
            AND DMBR.GDR = 'M' ) 
          THEN 'SEG 1'
          ELSE 'OTHER'
          END
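To see the CASE-in-GROUP-BY pattern in isolation, here is a minimal sketch using Python's built-in sqlite3 module. It is only an illustration under stated assumptions: the tiny DMBR table and its rows are invented, only three of the question's columns are kept, and it runs on SQLite rather than Oracle.

```python
import sqlite3

# Hypothetical miniature of the DMBR table: column names are borrowed from the
# question, the rows are invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DMBR (DMBRKEY INTEGER, WRK_LOC_NM TEXT, GDR TEXT)")
conn.executemany(
    "INSERT INTO DMBR VALUES (?, ?, ?)",
    [
        (1, "6", "M"),
        (2, "6", "M"),
        (2, "6", "M"),  # duplicate key, counted once by COUNT(DISTINCT ...)
        (3, "6", "F"),
        (4, "7", "M"),
    ],
)

# The CASE expression appears in the SELECT list and again, verbatim,
# in the GROUP BY, because it is a non-aggregated column.
rows = conn.execute(
    """
    SELECT WRK_LOC_NM,
           CASE WHEN (WRK_LOC_NM = '6' AND GDR = 'M')
                THEN 'SEG 1'
                ELSE 'OTHER'
           END AS WhenStatus,
           COUNT(DISTINCT DMBRKEY) AS ELIG_COUNT
      FROM DMBR
     GROUP BY WRK_LOC_NM,
           CASE WHEN (WRK_LOC_NM = '6' AND GDR = 'M')
                THEN 'SEG 1'
                ELSE 'OTHER'
           END
     ORDER BY WRK_LOC_NM, WhenStatus
    """
).fetchall()

for row in rows:
    print(row)
```

With this sample data, work location '6' splits into one 'SEG 1' row and one 'OTHER' row, which is the two-rows-per-group shape sketched in the result layout earlier.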

Finally, I've sometimes seen a group by choke on a complex column, such as a case/when. Some servers (MySQL and PostgreSQL, for example) allow ordinal references to column positions in the group by (and in the order by too); others, Oracle traditionally among them, accept ordinals only in the order by, so check what your engine supports. Since the query has 4 non-aggregate columns (all listed first), then the count of distinct, you MIGHT be able to get away with changing the GROUP BY clause to...

GROUP BY 1, 2, 3, 4

The numbers refer to the columns in the sequential order in which they appear at the START of the SQL-Select list.

--- CLARIFICATION about group by and case-sensitivity

First, on case-sensitivity: keywords are not case-sensitive in most engines, so CASE WHEN ... AND ... THEN ... ELSE ... END can be typed in either case; upper case just makes the keywords stand out.

As for the "group by" (this also works for the "order by"), the numbers are a shortcut to the ordinal columns in your query. Instead of explicitly listing the long column names and re-typing the entire CASE construct a second time, you just tell the engine which columns of the result set you want to group or order by. Look at the following (unrelated) query...

select
      lastname,
      firstname,
      sum( orderAmount ) TotalOrders
   from
      customerOrders
   group by
      lastname,
      firstname
   order by 
     TotalOrders DESC

and

select
      lastname,
      firstname,
      sum( orderAmount ) TotalOrders
   from
      customerOrders
   group by
      1,
      2
   order by 
      3 DESC

Each would produce the same results... The fictitious customerOrders table would be pre-aggregated by last name and first name and show the total per person (all assuming no duplicate names for this example, otherwise, I would have used a customer ID). Once that is done, the order by kicks in and will put in order of the most sales to a given customer in DESCENDING order at the top of the list.
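The equivalence of the two forms can be checked on an engine that accepts ordinals in the group by. The sketch below assumes SQLite (via Python's sqlite3 module), and the customerOrders rows are invented for the illustration; other engines may reject `GROUP BY 1, 2`.

```python
import sqlite3

# Fictitious customerOrders table with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customerOrders (lastname TEXT, firstname TEXT, orderAmount REAL)"
)
conn.executemany(
    "INSERT INTO customerOrders VALUES (?, ?, ?)",
    [
        ("Smith", "Ann", 100.0),
        ("Smith", "Ann", 50.0),
        ("Jones", "Bob", 75.0),
        ("Smith", "Carl", 200.0),
    ],
)

# Columns referenced by name.
by_name = conn.execute(
    """
    SELECT lastname, firstname, SUM(orderAmount) AS TotalOrders
      FROM customerOrders
     GROUP BY lastname, firstname
     ORDER BY TotalOrders DESC
    """
).fetchall()

# Same query, columns referenced by their position in the SELECT list.
by_position = conn.execute(
    """
    SELECT lastname, firstname, SUM(orderAmount) AS TotalOrders
      FROM customerOrders
     GROUP BY 1, 2
     ORDER BY 3 DESC
    """
).fetchall()

print(by_name == by_position)
for row in by_name:
    print(row)
```

The sample totals are all different, so the descending order is unambiguous and the two result lists can be compared directly.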

The numbers just represent the ordinal positions of the columns returned by the query, instead of typing the field names long-hand. This is mostly aimed at your "CASE/WHEN" issue: it saves you from retyping the expression in the group by, messing it up, and pulling your hair out figuring out why.
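If your engine does not accept ordinals in the group by, a different technique (not the one described above) also avoids retyping the CASE: compute it once in an inline view, give it an alias, and group by that alias in the outer query. The sketch below is an assumption-laden illustration: it uses SQLite through Python's sqlite3 module, a made-up three-column DMBR table, and a hypothetical inline-view alias `labelled`. It has not been run against Oracle.

```python
import sqlite3

# Hypothetical miniature of the DMBR table with invented rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DMBR (DMBRKEY INTEGER, WRK_LOC_NM TEXT, GDR TEXT)")
conn.executemany(
    "INSERT INTO DMBR VALUES (?, ?, ?)",
    [(1, "6", "M"), (2, "6", "M"), (3, "6", "F"), (4, "7", "M")],
)

# The CASE is written once, inside the inline view, and exposed as CMPN.
# The outer query can then group by CMPN like any ordinary column.
rows = conn.execute(
    """
    SELECT WRK_LOC_NM, CMPN, COUNT(DISTINCT DMBRKEY) AS ELIG_COUNT
      FROM (SELECT DMBRKEY,
                   WRK_LOC_NM,
                   CASE WHEN (WRK_LOC_NM = '6' AND GDR = 'M')
                        THEN 'SEG 1'
                        ELSE 'OTHER'
                   END AS CMPN
              FROM DMBR) labelled
     GROUP BY WRK_LOC_NM, CMPN
     ORDER BY WRK_LOC_NM, CMPN
    """
).fetchall()

for row in rows:
    print(row)
```

The trade-off is an extra level of nesting in exchange for a single copy of the CASE expression, so the select list and the group by cannot drift apart.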