Appends and Read Operations in HBase
Let's examine why appends are more efficient than random writes and how HBase mitigates the resulting inefficiency of read operations.
Appends
Appends are more efficient than random writes, especially in a filesystem like HDFS. Region servers try to take advantage of this fact by employing the following components for storage and data retrieval.
MemStore
The MemStore acts as a write cache. Incoming writes are buffered in this in-memory data structure, where they can be sorted efficiently, and are periodically flushed to HDFS in sorted order.
HFile
This is the file in HDFS that stores sorted key-value entries on disk.
Write ahead log (WAL)
The WAL stores operations that have not yet been persisted to permanent storage and exist only in the MemStore. The WAL itself is stored in HDFS and is used for recovery if a region server fails.
BlockCache
BlockCache is the read cache that stores frequently read data in memory and evicts the least recently used data when the cache is full.
As a result, write operations first go through the WAL and the MemStore and eventually end up stored in HFiles, as shown in the following illustration:
Note: This pattern originates from a data structure called a log-structured merge (LSM) tree. P. O'Neil, E. Cheng, D. Gawlick, and E. O'Neil, "The log-structured merge-tree (LSM-tree)," Acta Informatica, Volume 33, Issue 4, 1996.
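To make the WAL → MemStore → HFile flow concrete, here is a minimal sketch of the write path in Java. This is not HBase's implementation: the names (WritePathSketch, FLUSH_THRESHOLD, and so on) are illustrative, and local files stand in for HDFS. A write is appended to the WAL first, then buffered in a sorted in-memory map playing the role of the MemStore, which is flushed to a new sorted file once it grows large enough:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

public class WritePathSketch {
    private final Path walPath;   // stand-in for the WAL file in HDFS
    private final Path storeDir;  // stand-in for the directory holding HFiles
    // A sorted map keeps keys ordered, so a flush produces a sorted file.
    private final ConcurrentSkipListMap<String, String> memStore =
            new ConcurrentSkipListMap<>();
    private static final int FLUSH_THRESHOLD = 1_000; // illustrative size limit
    private int fileCounter = 0;

    public WritePathSketch(Path walPath, Path storeDir) {
        this.walPath = walPath;
        this.storeDir = storeDir;
    }

    public synchronized void put(String key, String value) throws IOException {
        // 1. Append to the WAL first, so the write survives a crash.
        Files.writeString(walPath, key + "=" + value + "\n",
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        // 2. Buffer the write in the sorted, in-memory MemStore.
        memStore.put(key, value);
        // 3. Periodically flush the buffer to a new sorted file on disk.
        if (memStore.size() >= FLUSH_THRESHOLD) {
            flush();
        }
    }

    private void flush() throws IOException {
        Path file = storeDir.resolve("hfile-" + (fileCounter++) + ".txt");
        try (BufferedWriter out = Files.newBufferedWriter(file)) {
            // Iteration over a ConcurrentSkipListMap is in key order,
            // so the file is written as sorted key-value entries.
            for (Map.Entry<String, String> e : memStore.entrySet()) {
                out.write(e.getKey() + "=" + e.getValue());
                out.newLine();
            }
        }
        memStore.clear(); // the flushed entries no longer depend on the WAL
    }
}
```

Note that both steps of the write path are appends: the WAL is append-only, and each flush writes a whole new file sequentially, which is exactly the access pattern HDFS handles well.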
Inefficiency of read operations
Read operations have to read from the MemStore, the BlockCache, and the existing HFiles and merge the results. This can be quite inefficient, so several optimizations are used to keep reads fast.
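Here is a minimal sketch of this read path, assuming each source is modeled as a sorted map and the files are ordered from newest to oldest. The names are illustrative, and the BlockCache and multi-version merging are omitted for brevity:

```java
import java.util.List;
import java.util.NavigableMap;

public class ReadPathSketch {
    private final NavigableMap<String, String> memStore;      // most recent writes
    private final List<NavigableMap<String, String>> hfiles;  // ordered newest first

    public ReadPathSketch(NavigableMap<String, String> memStore,
                          List<NavigableMap<String, String>> hfilesNewestFirst) {
        this.memStore = memStore;
        this.hfiles = hfilesNewestFirst;
    }

    // Look up a single value: check the newest data first, since a newer
    // version of a key shadows older versions stored in earlier files.
    public String get(String key) {
        String value = memStore.get(key);               // 1. in-memory buffer
        if (value != null) {
            return value;
        }
        for (NavigableMap<String, String> hfile : hfiles) {
            value = hfile.get(key);                     // 2. files, newest first
            if (value != null) {
                return value;
            }
        }
        return null;                                    // key not present
    }
}
```

The loop makes the cost visible: in the worst case, a read inspects every file, which is why the optimizations below aim to skip as many files as possible.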
Optimizing read operations
As mentioned previously, columns are grouped by their column family and stored separately. As a result, only the HFiles containing the required column family need to be queried. All the entries in an HFile are stored in lexicographical order, and the file contains an index at its end, which can be kept in memory, so a read can locate the required data without scanning the whole file.
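Because the entries are sorted, such an index only needs to map the first key of each block to the block's offset in the file. The sketch below illustrates the idea; the class and method names are illustrative, not HBase's:

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

public class BlockIndexSketch {
    // Maps the first key of each block to that block's byte offset in the file.
    private final NavigableMap<String, Long> firstKeyToOffset = new TreeMap<>();

    public void addBlock(String firstKey, long offset) {
        firstKeyToOffset.put(firstKey, offset);
    }

    // Binary search (via floorEntry) for the only block that could contain
    // the key; returns -1 if the key sorts before the file's first block.
    public long blockOffsetFor(String key) {
        Map.Entry<String, Long> e = firstKeyToOffset.floorEntry(key);
        return e == null ? -1 : e.getValue();
    }
}
```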
Each HFile also records the time range of its entries, so files that cannot contain the requested data are not read unnecessarily.
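This pruning step can be sketched as a simple overlap check, assuming each file's metadata records the minimum and maximum timestamps of its entries (the field names below are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class TimeRangePruning {
    static class StoreFileMeta {
        final String path;
        final long minTimestamp; // earliest entry in the file
        final long maxTimestamp; // latest entry in the file

        StoreFileMeta(String path, long minTimestamp, long maxTimestamp) {
            this.path = path;
            this.minTimestamp = minTimestamp;
            this.maxTimestamp = maxTimestamp;
        }
    }

    // Keep only the files whose time range overlaps [queryMin, queryMax];
    // all other files cannot contain the requested data and are never read.
    static List<StoreFileMeta> candidates(List<StoreFileMeta> files,
                                          long queryMin, long queryMax) {
        List<StoreFileMeta> result = new ArrayList<>();
        for (StoreFileMeta f : files) {
            if (f.maxTimestamp >= queryMin && f.minTimestamp <= queryMax) {
                result.add(f);
            }
        }
        return result;
    }
}
```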
Bloom filters
Bloom filters reduce the number of HFiles that need to be read. These probabilistic data structures use minimal memory to determine that some data is definitely not contained in a file, so that file can be skipped entirely. There is also a background process, called compaction, that merges multiple HFiles into a single HFile, removing older versions of data that are no longer needed. This reduces the number of HFiles that need to be inspected during read operations.
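Here is a minimal Bloom filter sketch in Java, assuming a simple seeded hash. A production filter would size the bit array and choose the number of hash functions based on the expected key count and the target false-positive rate:

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

public class BloomFilterSketch {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    public BloomFilterSketch(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Adding a key sets hashCount bits derived from it.
    public void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(indexFor(key, i));
        }
    }

    // False means "definitely absent"; true means "possibly present".
    public boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(indexFor(key, i))) {
                return false;
            }
        }
        return true;
    }

    // Derive the i-th hash by seeding with the hash index; a real filter
    // would typically use double hashing instead.
    private int indexFor(String key, int i) {
        int h = i * 0x9E3779B9; // per-function seed
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            h = h * 31 + b;
        }
        return Math.floorMod(h, size);
    }
}
```

A lookup that returns false proves the key was never written to that file, so the read can skip it entirely; true only means the file might contain the key and must still be checked, which is the false-positive case.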