Securing (and sharing) password information in Sqoop jobs

Sqoop is a utility that moves data between a relational database and HDFS: you can import from an RDBMS into Hadoop, or export from Hadoop back to the RDBMS.  One thing to keep in mind as you start building Sqoop jobs is that the password shouldn't be passed on the command line, where it can end up in shell history and in process listings that other users on the machine can see.

Sqoop has a couple of ways to secure this information.  One is to put the connection details in an options file with restricted permissions and pass that file to Sqoop at runtime.  For example:

1. Create a file (pg.parms in this example) in your UNIX/Linux home directory containing the connection information, with each option and each value on its own line:

--connect
jdbc:postgresql:// 
--username
hduser
--password
'password'

2. Secure that file by changing the permissions to owner read-only:

chmod 400 pg.parms

3. Modify the appropriate Sqoop jobs to use this file:

sqoop import --table mytable --options-file pg.parms
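
The three steps can be strung together in one script. What follows is only a rough sketch, assuming a POSIX shell. It writes the file under a restrictive umask so it is never briefly readable by other users before the chmod, and it checks the permissions before handing the file to Sqoop. PARMS_DIR and the check_parms helper are names made up for illustration, not part of Sqoop, and the connection values are the same placeholders used above. The file uses Sqoop's --username option with each option and value on its own line, which is the layout the Sqoop user guide shows for options files. Set PARMS_DIR to your home directory to match the steps above; by default the sketch writes to a temporary directory.

```shell
#!/bin/sh
# Sketch of steps 1-3. PARMS_DIR, the placeholder values, and the
# check_parms helper are assumptions for illustration, not part of Sqoop.
PARMS_DIR="${PARMS_DIR:-$(mktemp -d)}"
PARMS_FILE="$PARMS_DIR/pg.parms"

# Steps 1 and 2: write the file under a restrictive umask so it is never
# readable by group or others, then drop it to owner read-only.
rm -f "$PARMS_FILE"
(
  umask 077
  cat > "$PARMS_FILE" <<'EOF'
--connect
jdbc:postgresql://
--username
hduser
--password
'password'
EOF
)
chmod 400 "$PARMS_FILE"

# Return non-zero if the file is missing or group/others have any access.
check_parms() {
  [ -f "$1" ] || { echo "missing options file: $1" >&2; return 1; }
  # Characters 5-10 of the ls -l mode string are the group and other bits.
  [ "$(ls -ld "$1" | cut -c5-10)" = "------" ] || {
    echo "options file $1 is accessible to group/others" >&2
    return 1
  }
}

# Step 3: only hand the file to Sqoop if the check passes and Sqoop exists.
if check_parms "$PARMS_FILE" && command -v sqoop >/dev/null 2>&1; then
  sqoop import --table mytable --options-file "$PARMS_FILE"
fi
```

The permission check is optional, but it catches the case where someone later copies or restores the file with looser permissions.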

Another way to secure this information is with a password file stored in HDFS itself; I'll write that one up next.
