Hive: dynamic partition adding to external table

Tomer picture Tomer · Sep 25, 2011 · Viewed 19.7k times · Source

I am running hive 071, processing existing data which is has the following directory layout:
-TableName
- d= (e.g. 2011-08-01)
- d=2011-08-02
- d=2011-08-03

... etc
under each date I have the date files.
now to load the data I'm using

CREATE EXTERNAL TABLE table_name (i int)  
PARTITIONED BY (date String)  
LOCATION '${hiveconf:basepath}/TableName';**  

I would like my hive script to be able to load the relevant partitions according to some input date, and number of days. so if I pass date='2011-08-03' and days='7'
The script should load the following partitions - d=2011-08-03
- d=2011-08-04
- d=2011-08-05
- d=2011-08-06
- d=2011-08-07
- d=2011-08-08
- d=2011-08-09

I havn't found any discent way to do it except explicitlly running:

ALTER TABLE table_name ADD PARTITION (d='2011-08-03');  
ALTER TABLE table_name ADD PARTITION (d='2011-08-04');  
ALTER TABLE table_name ADD PARTITION (d='2011-08-05');  
ALTER TABLE table_name ADD PARTITION (d='2011-08-06');  
ALTER TABLE table_name ADD PARTITION (d='2011-08-07');  
ALTER TABLE table_name ADD PARTITION (d='2011-08-08');  
ALTER TABLE table_name ADD PARTITION (d='2011-08-09');  

and then running my query

select count(1) from table_name;

however this is offcourse not automated according to the date and days input

Is there any way I can define to the external table to load partitions according to date range, or date arithmetics?

Answer

Ptah picture Ptah · Jul 30, 2015

I have a very similar issue where, after a migration, I have to recreate a table for which I have the data, but not the metadata. The solution seems to be, after recreating the table:

MSCK REPAIR TABLE table_name;

Explained here

This also mentions the "alter table X recover partitions" that OP commented on his own post. MSCK REPAIR TABLE table_name; works on non-Amazon-EMR implementations (Cloudera in my case).