How to set variables in HIVE scripts

user1678312 picture user1678312 · Sep 17, 2012 · Viewed 204.3k times · Source

I'm looking for the SQL equivalent of SET varname = value in Hive QL

I know I can do something like this:

SET CURRENT_DATE = '2012-09-16';
SELECT * FROM foo WHERE day >= @CURRENT_DATE

But then I get this error:

character '@' not supported here

Answer

libjack picture libjack · Sep 19, 2012

You need to use the special hiveconf for variable substitution. e.g.

hive> set CURRENT_DATE='2012-09-16';
hive> select * from foo where day >= ${hiveconf:CURRENT_DATE}

similarly, you could pass on command line:

% hive -hiveconf CURRENT_DATE='2012-09-16' -f test.hql

Note that there are env and system variables as well, so you can reference ${env:USER} for example.

To see all the available variables, from the command line, run

% hive -e 'set;'

or from the hive prompt, run

hive> set;

Update: I've started to use hivevar variables as well, putting them into hql snippets I can include from hive CLI using the source command (or pass as -i option from command line). The benefit here is that the variable can then be used with or without the hivevar prefix, and allow something akin to global vs local use.

So, assume have some setup.hql which sets a tablename variable:

set hivevar:tablename=mytable;

then, I can bring into hive:

hive> source /path/to/setup.hql;

and use in query:

hive> select * from ${tablename}

or

hive> select * from ${hivevar:tablename}

I could also set a "local" tablename, which would affect the use of ${tablename}, but not ${hivevar:tablename}

hive> set tablename=newtable;
hive> select * from ${tablename} -- uses 'newtable'

vs

hive> select * from ${hivevar:tablename} -- still uses the original 'mytable'

Probably doesn't mean too much from the CLI, but can have hql in a file that uses source, but set some of the variables "locally" to use in the rest of the script.