Passing parameters to Hive scripts

Like Pig and other scripting languages, Hive lets you write parameterized scripts, which makes them far easier to reuse. To take advantage of this, write your Hive scripts like this:

select yearid, sum(HR)
from   batting_stats
where  teamid = '${hiveconf:TEAMID}' 
group  by yearid
order  by yearid desc;

Note that the restriction on teamid is '${hiveconf:TEAMID}' rather than a literal value. This tells Hive to substitute the value of the TEAMID variable from the hiveconf namespace. You supply that value on the command line when you execute the script:

hive -f batting.hive -hiveconf TEAMID='LAA'
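
Because the team ID is supplied at run time, the same script can be reused for several teams. The following is a minimal sketch, assuming a POSIX shell. The function name run_batting_for_teams and the HIVE_CMD override are hypothetical helpers introduced here for illustration, and any team code other than LAA is an example value, not something from the data set above.

```shell
# Hypothetical helper: run batting.hive once per team ID given as an argument.
# HIVE_CMD is an assumed override so the loop can be dry-run (e.g. HIVE_CMD=echo);
# when it is unset, the real hive command is used.
run_batting_for_teams() {
    for team in "$@"; do
        "${HIVE_CMD:-hive}" -f batting.hive -hiveconf "TEAMID=${team}"
    done
}
```

Calling run_batting_for_teams LAA BOS would then invoke the script twice, once per team ID.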

If you reference the parameter in the script but fail to specify a value at run time, you won't get an error like you would with Pig. Instead, the restriction effectively becomes "where teamid = ''". If some rows have a blank teamid you might get a result back; if not, Hive goes through all the mechanics of executing the script and returns no rows.
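
Since Hive will not complain about the missing value, one option is to check for it before Hive is ever called. This is only a sketch under stated assumptions: run_batting and HIVE_CMD are hypothetical names introduced here, not part of Hive.

```shell
# Hypothetical wrapper: refuse to call Hive when no team ID was given,
# because Hive itself will not report the missing parameter.
# HIVE_CMD is an assumed override for dry runs; it defaults to hive.
run_batting() {
    if [ -z "${1:-}" ]; then
        echo "usage: run_batting TEAMID" >&2
        return 2
    fi
    "${HIVE_CMD:-hive}" -f batting.hive -hiveconf "TEAMID=$1"
}
```

With this in place, run_batting LAA behaves like the command shown earlier, while run_batting with no argument prints a usage message and returns a non-zero status instead of running a query that matches nothing.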
