SQL Concatenate multiple rows

user2888246 picture user2888246 · Oct 25, 2013 · Viewed 15.8k times · Source

I'm using Teradata, I have a table like this

ID       String
123      Jim
123      John
123      Jane
321      Jill
321      Janine
321      Johan

I want to query the table so I get

ID       String
123      Jim, John, Jane
321      Jill, Janine, Johan

I tried partition but there can be many names. How do I get this result. Even, to point me in the right direction would be great.

Answer

dnoeth picture dnoeth · Oct 26, 2013

Unfortunately there's no PIVOT in Teradata (only a TD_UNPIVOT in 14.10).

If you got luck there's an aggregate UDF at your site to do a group concat (probably low possibility).

Otherwise there are two options: recursion or aggregation.

If the maximum number of rows per id is known aggregation is normally faster. It's a lot of code, but most of it is based on cut&paste.

SELECT
  id,
     MAX(CASE WHEN rn = 1 THEN string END)
  || MAX(CASE WHEN rn = 2 THEN ',' || string ELSE '' END)
  || MAX(CASE WHEN rn = 3 THEN ',' || string ELSE '' END)
  || MAX(CASE WHEN rn = 4 THEN ',' || string ELSE '' END)
  || ... -- repeat up to the known maximum
FROM
 (
   SELECT
      id, string, 
      ROW_NUMBER() 
      OVER (PARTITION BY id
            ORDER BY string) AS rn
   FROM t
 ) AS dt
GROUP BY 1;

For large tables it's much more efficient when you materialize the result of the Derived Table in a Volatile Table first using the GROUP BY column as PI.

For recursion you should use a Volatile Table, too, as OLAP functions are not allowed in the recursive part. Using a view instead will repeatedly calculate the OLAP function and thus result in bad performance.

CREATE VOLATILE TABLE vt AS
 (
   SELECT
      id
      ,string
      ,ROW_NUMBER()
       OVER (PARTITION BY id
             ORDER BY string DESC) AS rn -- reverse order!
      ,COUNT(*)
       OVER (PARTITION BY id) AS cnt
   FROM t
 ) WITH DATA 
UNIQUE PRIMARY INDEX(id, rn)
ON COMMIT PRESERVE ROWS;

WITH RECURSIVE cte
(id, list, rn) AS
 (
   SELECT
      id
      ,CAST(string AS VARCHAR(1000)) -- define maximum size based on maximum number of rows 
      ,rn
   FROM vt
   WHERE rn = cnt

   UNION ALL

   SELECT
      vt.id
      ,cte.list || ',' || vt.string
      ,vt.rn
   FROM vt
   JOIN cte
   ON vt.id = cte.id
   AND vt.rn = cte.rn - 1
)
SELECT id, list
FROM cte
WHERE rn = 1;

There's one problem with this approach, it might need a lot of spool which is easy to see when you omit theWHERE rn = 1.