How to import/export HBase data via HDFS (Hadoop commands)

Hafiz Muhammad Shafiq · Sep 18, 2014 · Viewed 45.5k times

I saved data crawled by Nutch into HBase, whose underlying file system is HDFS. Then I copied my data (one HBase table) from HDFS directly to a local directory with the command

hadoop fs -copyToLocal /hbase/input ~/Documents/output

After that, I copied the data back into the HBase of another system with the following command:

hadoop fs -copyFromLocal ~/Documents/input /hbase/mydata

The data is saved in HDFS, and when I use the list command in the HBase shell it shows 'mydata' as another table, but when I run the scan command it says there is no table named 'mydata'.

What is the problem with the above procedure? In simple words:

  1. I want to copy an HBase table to my local file system using a Hadoop command
  2. Then, I want to save it directly into HDFS on another system using a Hadoop command
  3. Finally, I want the table to appear in HBase and display its data as in the original table

Answer

Nanda · Oct 9, 2014

If you want to export a table from one HBase cluster and import it into another, use either of the following methods:

Using Hadoop

  • Export

    $ bin/hadoop jar <path/to/hbase-{version}.jar> export \
         <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]
    

    NOTE: Copy the output directory in HDFS from the source to the destination cluster (see the sketch after this list)

  • Import

    $ bin/hadoop jar <path/to/hbase-{version}.jar> import <tablename> <inputdir>
    

Note: Both <outputdir> and <inputdir> are paths in HDFS.
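
Moving the exported directory between the two clusters (the NOTE above) can be done with distcp, or it can be staged through the local filesystem as in steps 1 and 2 of the question. A minimal sketch; the namenode addresses, ports, and paths below are placeholders:

    # Option A: copy HDFS-to-HDFS with distcp (both namenodes must be
    # reachable from the cluster running the job)
    $ hadoop distcp hdfs://src-namenode:8020/user/backup/mytable \
                    hdfs://dst-namenode:8020/user/backup/mytable

    # Option B: stage through the local filesystem
    $ hadoop fs -copyToLocal /user/backup/mytable ~/backup/mytable
    # ... transfer ~/backup/mytable to the destination machine, e.g. with scp ...
    $ hadoop fs -copyFromLocal ~/backup/mytable /user/backup/mytable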

Using HBase

  • Export

    $ bin/hbase org.apache.hadoop.hbase.mapreduce.Export \
       <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]
    
  • Copy the output directory in HDFS from the source to the destination cluster

  • Import

    $ bin/hbase org.apache.hadoop.hbase.mapreduce.Import <tablename> <inputdir>
    

    Reference: HBase tool to export and import
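
One caveat: the Import tool does not create the table for you, so the destination table must already exist with the same column families as the original before the import job runs. A minimal sketch in the HBase shell on the destination cluster, where 'mydata' and the column family 'cf' are placeholders (use the families from your source table):

    $ bin/hbase shell
    hbase> create 'mydata', 'cf'
    hbase> exit

    # then run the import against the copied directory
    $ bin/hbase org.apache.hadoop.hbase.mapreduce.Import mydata /user/backup/mytable

This is also why the procedure in the question fails: copying raw files under /hbase does not register the table in HBase's meta table, so the shell cannot scan it.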