Combine many tables in Hive using UNION ALL?

baha-kev · Apr 24, 2013

I'm trying to append one variable from several tables together (aka row-bind, concatenate) to make one longer table with a single column in Hive. I think this is possible using UNION ALL, based on this question (HiveQL UNION ALL), but I'm not sure what an efficient way to accomplish this would be.

The pseudocode would look something like this:

CREATE TABLE tmp_combined AS
SELECT b.var1 FROM tmp_table1 b
UNION ALL
SELECT c.var1 FROM tmp_table2 c
UNION ALL
SELECT d.var1 FROM tmp_table3 d
UNION ALL
SELECT e.var1 FROM tmp_table4 e
UNION ALL
SELECT f.var1 FROM tmp_table5 f
UNION ALL
SELECT g.var1 FROM tmp_table6 g
UNION ALL
SELECT h.var1 FROM tmp_table7 h;

Any help is appreciated!

Answer

Marimuthu Kandasamy · Apr 24, 2013

Try the following, with the UNION ALL wrapped in a subquery:

-- CTAS: create tmp_combined from the stacked rows of all seven tables
CREATE TABLE tmp_combined AS
SELECT * FROM
(
    SELECT b.var1 FROM tmp_table1 b
    UNION ALL
    SELECT c.var1 FROM tmp_table2 c
    UNION ALL
    SELECT d.var1 FROM tmp_table3 d
    UNION ALL
    SELECT e.var1 FROM tmp_table4 e
    UNION ALL
    SELECT f.var1 FROM tmp_table5 f
    UNION ALL
    SELECT g.var1 FROM tmp_table6 g
    UNION ALL
    SELECT h.var1 FROM tmp_table7 h
) CombinedTable;

Run this statement first, in the same session: set hive.exec.parallel=true;

This lets Hive execute the independent SELECTs in parallel. Without it, they run one after another.
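
Putting the setting and the query together, a minimal sketch might look like the following. It uses only three of the tables to stay short, and it assumes var1 has the same type in every table. The thread-number setting and the final count query are additions for illustration, not part of the original answer.

```sql
-- Allow independent stages of a single query to run concurrently (off by default).
SET hive.exec.parallel=true;
-- Optional: upper limit on how many stages run at once (8 is Hive's default).
SET hive.exec.parallel.thread.number=8;

-- Stack var1 from each source table into one single-column table.
CREATE TABLE tmp_combined AS
SELECT u.var1
FROM (
    SELECT var1 FROM tmp_table1
    UNION ALL
    SELECT var1 FROM tmp_table2
    UNION ALL
    SELECT var1 FROM tmp_table3
) u;

-- Sanity check: UNION ALL keeps duplicates, so this count should equal
-- the sum of the row counts of the source tables.
SELECT COUNT(*) FROM tmp_combined;
```

Wrapping the UNION ALL in a subquery is the conservative choice: as far as I know, older Hive releases only accept a union inside a subquery, while newer ones also allow it at the top level of a CREATE TABLE ... AS statement.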