I'm looking for the Hive QL equivalent of SQL's SET varname = value.
I know I can do something like this:
SET CURRENT_DATE = '2012-09-16';
SELECT * FROM foo WHERE day >= @CURRENT_DATE
But then I get this error:
character '@' not supported here
You need to use the special hiveconf namespace for variable substitution, e.g.:
hive> set CURRENT_DATE='2012-09-16';
hive> select * from foo where day >= ${hiveconf:CURRENT_DATE}
Similarly, you could pass it on the command line:
% hive -hiveconf CURRENT_DATE='2012-09-16' -f test.hql
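One thing to watch with the command-line form is shell quoting: the shell consumes unescaped quotes before hive ever sees the argument. A minimal sketch of the difference, where show_arg is a hypothetical helper (not part of Hive) that just echoes what a program would receive:

```shell
#!/bin/sh
# show_arg is a hypothetical helper (not part of Hive): it prints its
# first argument exactly as a program such as hive would receive it.
show_arg() {
    printf '%s\n' "$1"
}

# The shell removes the single quotes, so the value arrives bare.
show_arg CURRENT_DATE='2012-09-16'      # prints CURRENT_DATE=2012-09-16

# Wrapping the whole argument in double quotes keeps the inner quotes.
show_arg "CURRENT_DATE='2012-09-16'"    # prints CURRENT_DATE='2012-09-16'
```

As far as I can tell the substitution is plain text replacement, so with the bare form the script would need to quote the reference itself, e.g. day >= '${hiveconf:CURRENT_DATE}'.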
Note that there are env and system variables as well, so you can reference ${env:USER}, for example.
To see all the available variables, from the command line, run
% hive -e 'set;'
or from the hive prompt, run
hive> set;
Update:
I've started to use hivevar variables as well, putting them into hql snippets that I can include from the hive CLI using the source command (or pass with the -i option on the command line).
The benefit here is that the variable can then be used with or without the hivevar prefix, which allows something akin to global vs. local use.
So, assume you have a setup.hql that sets a tablename variable:
set hivevar:tablename=mytable;
Then I can bring it into hive:
hive> source /path/to/setup.hql;
and use it in a query:
hive> select * from ${tablename}
or
hive> select * from ${hivevar:tablename}
I could also set a "local" tablename, which would affect the use of ${tablename}, but not ${hivevar:tablename}:
hive> set tablename=newtable;
hive> select * from ${tablename} -- uses 'newtable'
vs
hive> select * from ${hivevar:tablename} -- still uses the original 'mytable'
This probably doesn't matter much from the CLI, but it lets an hql file pull in shared variables with source and still set some of them "locally" for use in the rest of the script.
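The -i route mentioned above can be sketched as a small shell script. The temporary directory and the query.hql file name are illustrative assumptions; setup.hql and tablename are the ones from the answer.

```shell
#!/bin/sh
# Sketch only: the temporary directory and query.hql are illustrative.
workdir=$(mktemp -d)

# setup.hql defines the shared variable in the hivevar namespace.
cat > "$workdir/setup.hql" <<'EOF'
set hivevar:tablename=mytable;
EOF

# query.hql refers to the variable. The quoted 'EOF' stops the shell
# from expanding ${tablename} itself, leaving the substitution to hive.
cat > "$workdir/query.hql" <<'EOF'
select * from ${tablename};
EOF

# Run setup.hql as the initialization file, then the query file.
# Skipped when no hive binary is on the PATH.
if command -v hive >/dev/null 2>&1; then
    hive -i "$workdir/setup.hql" -f "$workdir/query.hql"
fi
```

Quoting the heredoc delimiter matters here: with an unquoted EOF the shell would try to expand ${tablename} before the file is even written.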