How to change sqoop metastore?

Chetan Shirke picture Chetan Shirke · Jun 6, 2014 · Viewed 7.6k times · Source

I am using sqoop 1.4.2 version. I am trying to change the sqoop metastore from default hsqldb to mysql.

I have configured following properties in sqoop-site.xml file.

    <property>
    <name>sqoop.metastore.client.enable.autoconnect</name>
    <value>false</value>
    <description>If true, Sqoop will connect to a local metastore
      for job management when no other metastore arguments are
      provided.
    </description>
  </property>
  <property>
    <name>sqoop.metastore.client.autoconnect.url</name>
    <value>jdbc:mysql://ip:3206/sqoop?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>sqoop.metastore.client.autoconnect.username</name>
    <value>userName</value>
  </property>
  <property>
    <name>sqoop.metastore.client.autoconnect.password</name>
    <value>password</value>
  </property>
</configuration>

When I try to create a sqoop jobs with meta-connect url it fails to connect to configured mysql db.

sqoop job --create --meta-connect {mysql_jdbc_url} sqoop job defination

it is throwing following exception.

    14/06/06 15:04:54 INFO sqoop.Sqoop: Running Sqoop version: 1.4.4.2.0.6.1-101
14/06/06 15:04:55 WARN hsqldb.HsqldbJobStorage: Could not interpret as a number: null
14/06/06 15:04:55 ERROR hsqldb.HsqldbJobStorage: Can not interpret metadata schema
14/06/06 15:04:55 ERROR hsqldb.HsqldbJobStorage: The metadata schema version is null
14/06/06 15:04:55 ERROR hsqldb.HsqldbJobStorage: The highest version supported is 0
14/06/06 15:04:55 ERROR hsqldb.HsqldbJobStorage: To use this version of Sqoop, you must downgrade your metadata schema.
14/06/06 15:04:55 ERROR tool.JobTool: I/O error performing job operation: java.io.IOException: Invalid metadata version.
        at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.init(HsqldbJobStorage.java:202)
        at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.open(HsqldbJobStorage.java:161)
        at org.apache.sqoop.tool.JobTool.run(JobTool.java:274)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:222)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:231)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:240)

Does sqoop 1.4.2 supports metastore other than hsql db?

Please suggest.

Answer

ravik picture ravik · Jul 23, 2014

The answer is Yes, in my case I am using PostgreSQL. I ran into this recently and I am using Version 1.4.4. I am not sure if what I did is the recommended way, but it works. Here are the steps I followed

  1. In sqoop-site.xml I configured it with, the connect string to my database, username and password.

  2. Created the following object in the database, as Sqoop was failing at it.

    CREATE TABLE SQOOP_ROOT ( version INT, propname VARCHAR(128) NOT NULL, propval VARCHAR(256), CONSTRAINT SQOOP_ROOT_unq UNIQUE (version, propname) );

  3. Inserted the following row (This seems to be the reason your script is failing)

    INSERT INTO SQOOP_ROOT VALUES( NULL, 'sqoop.hsqldb.job.storage.version', '0' );

I think the correct way might be is to download the source, and extend org.apache.sqoop.metastore.JobStorage with you DB implementation.