Category Archives: hive

To copy or move: Implications of loading Hive managed table from HDFS versus local filesystem

When using the LOAD DATA statement to populate a Hive table, it’s important to understand what Hive does with the actual data files, depending on whether the input data resides on your local file system or in HDFS. For example, … Continue reading
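As the post title suggests, the difference comes down to copy versus move. Here is a minimal HiveQL sketch. The table name `batting_demo`, its columns, and both file paths are made up for illustration and do not come from the original post:

```sql
-- Hypothetical managed table, used only for illustration.
CREATE TABLE batting_demo (yearid INT, teamid STRING, hr INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- With LOCAL: the file is copied from the local file system into the
-- table's warehouse directory; the original local file stays in place.
LOAD DATA LOCAL INPATH '/tmp/batting.csv' INTO TABLE batting_demo;

-- Without LOCAL: the path refers to HDFS and the file is moved into the
-- table's warehouse directory, so it no longer exists at the source path.
LOAD DATA INPATH '/user/demo/batting.csv' INTO TABLE batting_demo;
```

If other jobs still need the HDFS source file, copy it somewhere else before loading, since the second statement relocates it.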

Posted in hadoop, hive, scripting, Uncategorized

Hive’s collection data types

Hive offers several collection data types: struct, map, and array. These data types don’t necessarily make a lot of sense if you are moving data from the well-structured world of the RDBMS, but if you are working directly with … Continue reading
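To show how the three collection types look in practice, here is a small HiveQL sketch. The table `players_demo`, its columns, and the delimiters are assumptions chosen for illustration, not taken from the original post:

```sql
-- Hypothetical table with one column of each collection type.
CREATE TABLE players_demo (
  name     STRING,
  teams    ARRAY<STRING>,                       -- ordered list of values
  stats    MAP<STRING, INT>,                    -- key/value pairs
  hometown STRUCT<city:STRING, state:STRING>    -- named fields
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  COLLECTION ITEMS TERMINATED BY '|'
  MAP KEYS TERMINATED BY ':';

-- Arrays are indexed from 0, maps are accessed by key,
-- and struct fields use dot notation.
SELECT name, teams[0], stats['HR'], hometown.city
FROM players_demo;
```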

Posted in hadoop, hive, scripting

Passing parameters to Hive scripts

Like Pig and other scripting languages, Hive lets you create parameterized scripts, which greatly increases their reusability. To take advantage of this, write your Hive scripts like this: select yearid, sum(HR) from batting_stats where teamid … Continue reading
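The query in the excerpt is cut off, so the following is only a sketch of one way to parameterize such a script with Hive's variable substitution. The file name `hr_by_year.hql`, the variable name `team`, the filter, and the grouping are all assumptions and may differ from the author's script:

```sql
-- hr_by_year.hql (hypothetical file name)
-- Run with, for example:
--   hive --hivevar team=BOS -f hr_by_year.hql
-- Hive substitutes ${hivevar:team} before the query is executed.
SELECT yearid, SUM(HR)
FROM batting_stats
WHERE teamid = '${hivevar:team}'
GROUP BY yearid;
```

Passing a different `--hivevar` value reuses the same script for another team without editing the file.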

Posted in hadoop, hive, scripting