How to filter out rows with given column(not null)?

user1573269 picture user1573269 · Oct 12, 2012 · Viewed 8.5k times · Source

I want to do a hbase scan with filters. For example, my table has column family A,B,C, and A has a column X. Some rows have the column X and some do not. How can I implement the filter to filter out all the rows with column X?

Answer

ankitk picture ankitk · Oct 12, 2012

I guess you are looking for SingleColumnValueFilter in HBase. As mentioned in the API

To prevent the entire row from being emitted if the column is not found on a row, use setFilterIfMissing(boolean) on Filter object. Otherwise, if the column is found, the entire row will be emitted only if the value passes. If the value fails, the row will be filtered out.

But SingleColumnValueFilter would want a value to have Column X "CompareOp" to something, say bring this row if ColumnX == "X" or bring this row if ColumnX != "A sentinel value that ColumnX can never take" and setFilterIfMissing(true) so that if ColumnX has some value, it is returned.

I hope this nudges you in the right direction.