Concat multiple rows with a delimiter in Hive

Vaishak picture Vaishak · Mar 22, 2017 · Viewed 25k times · Source

I need to concat string values row wise with '~' as delimiter. I have the following data:

enter image description here

I need to concat 'Comment' column for each 'id' in the ascending order of 'row_id' with '~' as delimiter.

Expected output is as below:

enter image description here

GROUP_CONCAT is not an option since its not recognized in my Hive version. I can use collect_set or collect_list, but I won't be able to insert delimiter in between.

Is there any workaround?

Answer

David דודו Markovitz picture David דודו Markovitz · Mar 22, 2017

collect_list returns array, not string.
Array can be converted to delimited string using concat_ws.


This will work, with no specific order of comments.

select      id
           ,concat_ws('~',collect_list(comment)) as comments

from        mytable 

group by    id
;

+----+-------------+
| id |  comments   |
+----+-------------+
|  1 | ABC~PRQ~XYZ |
|  2 | LMN~OPQ     |
+----+-------------+