Guarantees Provided by HBase
HBase provides atomicity, consistency and isolation, and durability guarantees. These guarantees are discussed in detail in the following sections.
Atomicity
Operations that mutate a single row are atomic. Specifically:
- An operation that returns a success code has completely succeeded.
- An operation that returns a failure code has completely failed.
- An operation that times out may have succeeded or may have failed. However, it cannot have been partially applied.
- This holds even if the mutation spans multiple column families within a row. HBase achieves this with fine-grained, per-row locking.
Note: HFiles are essentially immutable, so only the MemStore needs to participate. This makes the process very efficient.
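The all-or-nothing behavior of a single-row mutation can be illustrated with a small in-memory model. This is only a sketch: `ToyRowStore` and its methods are hypothetical names invented for illustration, not the HBase client API, and a Python lock stands in for HBase's per-row lock.

```python
import threading
from collections import defaultdict


class ToyRowStore:
    """Toy in-memory model of per-row locking (not the HBase API)."""

    def __init__(self):
        self._rows = {}                             # row key -> {(family, qualifier): value}
        self._locks = defaultdict(threading.Lock)   # one lock per row key
        self._locks_guard = threading.Lock()        # protects the lock table itself

    def _lock_for(self, row_key):
        with self._locks_guard:
            return self._locks[row_key]

    def put(self, row_key, cells):
        """Apply every cell in `cells` to one row, or none of them."""
        with self._lock_for(row_key):
            updated = dict(self._rows.get(row_key, {}))   # work on a private copy
            for column, value in cells.items():
                if value is None:
                    # Hypothetical validation failure in the middle of the mutation.
                    raise ValueError("cell value must not be None")
                updated[column] = value
            self._rows[row_key] = updated                 # publish all changes at once

    def get(self, row_key):
        with self._lock_for(row_key):
            return dict(self._rows.get(row_key, {}))


store = ToyRowStore()
# One mutation touching two column families ("info" and "stats") of the same row.
store.put("user1", {("info", "name"): "Ada", ("stats", "logins"): 1})

try:
    # This mutation fails on its second cell, so its first cell must not be visible.
    store.put("user1", {("info", "name"): "Bob", ("stats", "logins"): None})
except ValueError:
    pass

print(store.get("user1"))
```

Because the failed mutation only ever modified a private copy, the row still holds the values from the first, successful put.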
Operations that mutate multiple rows are not atomic. For example, a mutative operation on rows a, b, and c may return having mutated some but not all of the rows. In this case, the operation returns a list of codes, one per row, each of which may be a success, a failure, or a timeout.
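A minimal sketch of what this means for a caller, using a plain dictionary as the table. The `batch_put` function, the `Code` enum, and the `failing_rows` parameter are assumptions made up for this illustration; they do not correspond to HBase client calls.

```python
from enum import Enum


class Code(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    TIMEOUT = "timeout"


def batch_put(table, mutations, failing_rows=frozenset()):
    """Apply each row mutation independently and return one code per row.

    `failing_rows` is a hypothetical stand-in for rows whose server
    rejects the write; those rows are left untouched.
    """
    codes = []
    for row_key, cells in mutations:
        if row_key in failing_rows:
            codes.append(Code.FAILURE)
            continue
        table.setdefault(row_key, {}).update(cells)   # each row is applied on its own
        codes.append(Code.SUCCESS)
    return codes


table = {}
mutations = [("a", {"cf:q": 1}), ("b", {"cf:q": 2}), ("c", {"cf:q": 3})]
codes = batch_put(table, mutations, failing_rows={"b"})

# Rows a and c were mutated, row b was not: the batch as a whole is not atomic.
to_retry = [row for (row, _), code in zip(mutations, codes) if code is not Code.SUCCESS]
print(codes, to_retry)
```

The caller has to inspect the per-row codes and decide what to do with rows that failed or timed out, for example by retrying them.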
HBase provides a conditional operation, called checkAndPut, which executes atomically, like the typical compare-and-set (CAS) operation found in many hardware architectures.
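The compare-and-set behavior can be sketched with a toy row object. `ToyRow` and `check_and_put` are hypothetical names for illustration only, and treating `expected=None` as "the column must be absent" is an assumption of this sketch.

```python
import threading


class ToyRow:
    """Toy model of one row with an atomic conditional write (not the HBase API)."""

    def __init__(self):
        self._cells = {}
        self._lock = threading.Lock()   # stands in for the per-row lock

    def check_and_put(self, column, expected, new_value):
        """Write new_value only if the column currently holds `expected`.

        In this sketch, expected=None means "the column must be absent".
        Returns True if the write was applied, False otherwise.
        """
        with self._lock:                 # check and write happen as one step
            if self._cells.get(column) != expected:
                return False
            self._cells[column] = new_value
            return True

    def get(self, column):
        with self._lock:
            return self._cells.get(column)


row = ToyRow()
results = []


def try_claim(name):
    # Every client tries to become the owner, but only if nobody owns the row yet.
    results.append(row.check_and_put("owner", None, name))


threads = [threading.Thread(target=try_claim, args=(f"client-{i}",)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sum(results), row.get("owner"))   # exactly one claim succeeds
```

Without the atomic check, two clients could both observe an empty column and both write, and one write would silently overwrite the other.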
Consistency & Isolation
Single-row reads and writes are linearizable. When a client receives a successful response for a mutation, that mutation is immediately visible both to that client and to any client with which it later communicates through side channels.
HBase offers a scan operation for efficient iteration over multiple rows. This operation does not provide a consistent view of the table and does not exhibit snapshot isolation.
Instead:
- Any row returned by the scan is a consistent view, i.e., that version of the complete row existed at some point in time.
- A scan reflects all mutations committed before the scanner was constructed and may reflect some mutations committed after it was constructed.
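The difference between row-level consistency and a table-level snapshot can be made concrete with a toy scan over a dictionary. The `scan` generator below is a hypothetical illustration, not HBase's scanner. It assumes that each row is replaced as a whole, mirroring single-row atomicity, and that no rows are deleted while the scan runs.

```python
def scan(table):
    """Yield (row_key, row_copy) in key order, reading each row only when it is reached."""
    for row_key in sorted(table):
        yield row_key, dict(table[row_key])   # consistent copy of this one row


table = {
    "a": {"x": 1, "y": 1},
    "b": {"x": 1, "y": 1},
    "c": {"x": 1, "y": 1},
}

scanner = scan(table)
seen = [next(scanner)]            # the scan has read row "a" so far

# A concurrent writer now updates rows "a" and "c", each one atomically.
table["a"] = {"x": 2, "y": 2}
table["c"] = {"x": 2, "y": 2}

seen.extend(scanner)              # the scan continues with rows "b" and "c"

for row_key, row in seen:
    print(row_key, row)
```

Every returned row is internally consistent (`x` equals `y`), but the result mixes the old version of row `a` with the new version of row `c`. No single point in time ever had the table in that state, so the scan is not a snapshot.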
Durability
All visible data is also durable. This means that a read will never return data that has not been made durable on disk. Further, any mutative operation that returns a successful response has been made durable. Finally, any operation that has been made durable is stored on at least n different servers (DataNodes), where n is the configurable replication factor of HDFS.
Mapping between HBase and Bigtable
As mentioned earlier, HBase and Bigtable have very similar architectures, with slightly different names for the various components and different dependencies. The table below maps HBase concepts to the associated concepts in Bigtable.
HBase | Bigtable |
---|---|
region | tablet |
region server | tablet server |
ZooKeeper | Chubby |
HDFS | GFS |
HFile | SSTable |
MemStore | Memtable |