I'm trying to append one variable from several tables together (aka row-bind, concatenate) to make one longer table with a single column in Hive. I think this is possible using UNION ALL
based on this question ( HiveQL UNION ALL ), but I'm not sure an efficient way to accomplish this?
The pseudocode would look something like this:
CREATE TABLE tmp_combined AS
SELECT b.var1 FROM tmp_table1 b
UNION ALL
SELECT c.var1 FROM tmp_table2 c
UNION ALL
SELECT d.var1 FROM tmp_table3 d
UNION ALL
SELECT e.var1 FROM tmp_table4 e
UNION ALL
SELECT f.var1 FROM tmp_table5 f
UNION ALL
SELECT g.var1 FROM tmp_table6 g
UNION ALL
SELECT h.var1 FROM tmp_table7 h;
Any help is appreciated!
Try with following coding...
Select * into tmp_combined from
(
SELECT b.var1 FROM tmp_table1 b
UNION ALL
SELECT c.var1 FROM tmp_table2 c
UNION ALL
SELECT d.var1 FROM tmp_table3 d
UNION ALL
SELECT e.var1 FROM tmp_table4 e
UNION ALL
SELECT f.var1 FROM tmp_table5 f
UNION ALL
SELECT g.var1 FROM tmp_table6 g
UNION ALL
SELECT h.var1 FROM tmp_table7 h
) CombinedTable
Use with the statement : set hive.exec.parallel=true
This will execute different selects simultaneously otherwise it would be step by step.