For example, for the HBase table 'test_table', the values inserted are:
Row1 - Val1 => t
Row1 - Val2 => t + 3
Row1 - Val3 => t + 5
Row2 - Val1 => t
Row2 - Val2 => t + 3
Row2 - Val3 => t + 5
A scan of 'test_table' where version = t + 4 should return:
Row1 - Val2 => t + 3
Row2 - Val2 => t + 3
How do I achieve timestamp-based scans in HBase, where each row returns its latest available value with a timestamp less than or equal to the given one?
Consider this table:
hbase(main):009:0> create 't1', { NAME => 'f1', VERSIONS => 100 }
hbase(main):010:0> put 't1', 'key1', 'f1:a', 'value1'
hbase(main):011:0> put 't1', 'key1', 'f1:a', 'value2'
hbase(main):012:0> put 't1', 'key1', 'f1:a', 'value3'
hbase(main):013:0> put 't1', 'key2', 'f1:a', 'value4'
hbase(main):014:0> put 't1', 'key2', 'f1:a', 'value5'
hbase(main):015:0> put 't1', 'key1', 'f1:a', 'value6'
Here's its scan in shell with all the versions:
hbase(main):003:0> scan 't1', {VERSIONS => 100 }
ROW COLUMN+CELL
key1 column=f1:a, timestamp=1416083314098, value=value6
key1 column=f1:a, timestamp=1416083294981, value=value3
key1 column=f1:a, timestamp=1416083293273, value=value2
key1 column=f1:a, timestamp=1416083291009, value=value1
key2 column=f1:a, timestamp=1416083305050, value=value5
key2 column=f1:a, timestamp=1416083299840, value=value4
Here's the scan limited to a specific timestamp, as you requested:
hbase(main):002:0> scan 't1', { TIMERANGE => [0, 1416083300000] }
ROW COLUMN+CELL
key1 column=f1:a, timestamp=1416083294981, value=value3
key2 column=f1:a, timestamp=1416083299840, value=value4
Here's the same in Java code:
package org.example.test;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

import java.io.IOException;

public class test {
    public static void main(String[] args) throws IOException {
        Scan s = new Scan();
        // Return only the newest version of each cell that falls inside the time range.
        s.setMaxVersions(1);
        // The range is half-open, [min, max): the upper bound itself is excluded.
        s.setTimeRange(0L, 1416083300000L);
        // try-with-resources closes the scanner and the table once the scan is done.
        try (HTable table = new HTable(HBaseConfiguration.create(), "t1");
             ResultScanner scanner = table.getScanner(s)) {
            for (Result rr = scanner.next(); rr != null; rr = scanner.next()) {
                System.out.println(Bytes.toString(rr.getRow()) + " => " +
                        Bytes.toString(rr.getValue(Bytes.toBytes("f1"), Bytes.toBytes("a"))));
            }
        }
    }
}
Be aware that the upper bound of the time range is exclusive. If you want the latest value of every key with a timestamp of at most T, set the upper bound of the range to T + 1.
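To make the exclusive upper bound concrete, here is a minimal in-memory sketch in plain Java. It does not call the HBase client at all: a TreeMap stands in for the versions of one cell, and the class TimeRangeSketch and the helper latestInRange are hypothetical names made up for this illustration. It only assumes the half-open [min, max) behaviour described above, and uses the key2 timestamps from the shell output.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

public class TimeRangeSketch {
    // Hypothetical helper (not part of HBase): returns the newest value whose
    // timestamp satisfies minTs <= timestamp < maxTs, or null if there is none.
    public static String latestInRange(NavigableMap<Long, String> versions,
                                       long minTs, long maxTs) {
        // subMap(from, true, to, false) is half-open, like the scan time range.
        NavigableMap<Long, String> inRange = versions.subMap(minTs, true, maxTs, false);
        return inRange.isEmpty() ? null : inRange.lastEntry().getValue();
    }

    public static void main(String[] args) {
        // Versions of key2 / f1:a, keyed by timestamp.
        NavigableMap<Long, String> key2 = new TreeMap<>();
        key2.put(1416083299840L, "value4");
        key2.put(1416083305050L, "value5");

        long t = 1416083305050L; // exactly the timestamp of value5

        // Upper bound T excludes the cell written at T.
        System.out.println(latestInRange(key2, 0L, t));     // prints value4
        // Upper bound T + 1 includes it.
        System.out.println(latestInRange(key2, 0L, t + 1)); // prints value5
    }
}
```

With the real client, the equivalent change would be passing T + 1 as the second argument of setTimeRange.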