Scan with filter using HBase shell

Gandalf StormCrow · Aug 31, 2011 · Viewed 77.8k times

Does anybody know how to scan records based on some scan filter, e.g.:

column:something = "somevalue"

Something like this, but from the HBase shell?

Answer

bhavanki · Sep 16, 2011

Try this. It's kind of ugly, but it works for me.

import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
import org.apache.hadoop.hbase.filter.SubstringComparator
import org.apache.hadoop.hbase.util.Bytes
# Match rows whose family:qualifier value contains 'somevalue'.
scan 't1', { COLUMNS => 'family:qualifier', FILTER =>
    SingleColumnValueFilter.new(
        Bytes.toBytes('family'),
        Bytes.toBytes('qualifier'),
        CompareFilter::CompareOp.valueOf('EQUAL'),
        SubstringComparator.new('somevalue'))
}

The HBase shell will include whatever you have in ~/.irbrc, so you can put something like this in there (I'm no Ruby expert, improvements are welcome):

# imports like above
# Scan `table` for rows whose family:qualifier value contains `substr`;
# `cols` lists the columns to return.
def scan_substr(table, family, qualifier, substr, *cols)
    scan table, { COLUMNS => cols, FILTER =>
        SingleColumnValueFilter.new(
            Bytes.toBytes(family), Bytes.toBytes(qualifier),
            CompareFilter::CompareOp.valueOf('EQUAL'),
            SubstringComparator.new(substr)) }
end

and then you can just say in the shell:

scan_substr 't1', 'family', 'qualifier', 'somevalue', 'family:qualifier'
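
If your HBase version supports the filter language in the shell (I believe it arrived around 0.92, but check your release), FILTER can also take a plain string, which avoids the Java imports. Below is a minimal sketch of a helper that builds such a string. The name substr_filter_string is made up for illustration and is not part of HBase, and the quote-doubling rule reflects my understanding of how the filter language escapes single quotes.

```ruby
# Hypothetical helper (not part of HBase): builds a filter-language string
# meant to be equivalent to SingleColumnValueFilter + SubstringComparator.
def substr_filter_string(family, qualifier, substr)
  # Assumption: arguments are wrapped in single quotes, and a literal single
  # quote inside an argument is written as two single quotes.
  esc = lambda { |s| s.to_s.gsub("'", "''") }
  "SingleColumnValueFilter('#{esc.call(family)}', '#{esc.call(qualifier)}', =, " \
    "'substring:#{esc.call(substr)}')"
end

# Intended use inside the HBase shell (not runnable in plain Ruby):
#   scan 't1', { COLUMNS => 'family:qualifier',
#                FILTER => substr_filter_string('family', 'qualifier', 'somevalue') }
```

Like the earlier snippets, this could go in ~/.irbrc. Since the helper only builds a string, you can print the result first to check it before handing it to scan.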