Determining Block Liveness

Let's explore how LFS determines whether a block is live or not.

Segment summary block

Given a data block D within an on-disk segment S, LFS must be able to determine whether D is live. To do so, LFS adds a little extra information to each segment that describes each block. Specifically, LFS includes, for each data block D, its inode number (which file it belongs to) and its offset (which block of the file this is). This information is recorded in a structure at the head of the segment known as the segment summary block.
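As a rough illustration of the bookkeeping described above, here is a minimal Python sketch of a segment that records an (inode number, offset) pair for every data block it holds. The names `SummaryEntry`, `Segment`, `append`, and `describe`, as well as the addresses and inode number, are hypothetical and chosen only for this example. Real LFS stores this information in an on-disk block, not in Python objects.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SummaryEntry:
    inode_number: int  # which file the block belongs to
    offset: int        # which block of that file this is


@dataclass
class Segment:
    start_address: int                           # disk address of the first data block
    summary: list = field(default_factory=list)  # one SummaryEntry per data block
    blocks: list = field(default_factory=list)   # the data blocks themselves

    def append(self, inode_number, offset, data):
        """Add a data block, record its owner in the summary, return its address."""
        self.summary.append(SummaryEntry(inode_number, offset))
        self.blocks.append(data)
        return self.start_address + len(self.blocks) - 1

    def describe(self, address):
        """Look up which file and offset the block at this address was written for."""
        return self.summary[address - self.start_address]


# Assumed values: a segment starting at address 100, file with inode number 7.
segment = Segment(start_address=100)
a0 = segment.append(inode_number=7, offset=0, data=b"first block")
a1 = segment.append(inode_number=7, offset=1, data=b"second block")
entry = segment.describe(a1)
```

The key point is that the summary is filled in at write time, so the cleaner can later ask of any block address which file and offset it was written for.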

Given this information, it is straightforward to determine whether a block is live or dead. For a block D located on disk at address A, look in the segment summary block and find its inode number N and offset T. Next, look in the imap to find where N lives and read N from disk (perhaps it is already in memory, which is even better). Finally, using the offset T, look in the inode (or some indirect block) to see where the inode thinks the T-th block of this file is on disk. If it points exactly to the disk address A, LFS can conclude that the block D is live. If it points anywhere else, LFS can conclude that D is not in use (i.e., it is dead) and thus know that this version is no longer needed.

Here is a pseudocode summary:

(N, T) = SegmentSummary[A];
inode = Read(imap[N]);
if (inode[T] == A)
    // block D is alive
else
    // block D is garbage

Here is a diagram depicting the mechanism, in which the segment summary block (marked SS) records that the data block at address A0 is actually a part of file k at offset 0. By checking the imap for k, you can find the inode, and see that it does indeed point to that location.
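The check can also be sketched as runnable Python. This is a simplified model, not LFS code: the dictionaries `segment_summary`, `imap`, and `disk` are hypothetical in-memory stand-ins for the on-disk structures, the concrete addresses and the inode number chosen for file k are arbitrary, indirect blocks are ignored, and treating a missing imap entry as "dead" is an assumption of this sketch.

```python
# Hypothetical stand-ins for on-disk structures (all values are arbitrary).
K = 7             # inode number of file k
A0 = 100          # disk address of data block D
INODE_ADDR = 101  # disk address of file k's inode

segment_summary = {A0: (K, 0)}           # address -> (inode number, offset)
imap = {K: INODE_ADDR}                   # inode number -> address of latest inode
disk = {INODE_ADDR: {"pointers": [A0]}}  # address -> inode contents


def is_live(address):
    """Return True if the latest inode still points at this block address."""
    inode_number, offset = segment_summary[address]
    inode_address = imap.get(inode_number)
    if inode_address is None:
        return False  # assumption: no imap entry means the file is gone
    pointers = disk[inode_address]["pointers"]
    return offset < len(pointers) and pointers[offset] == address


live_before = is_live(A0)  # the inode points at A0, so D is live

# Overwrite block 0 of file k: LFS writes new data (at 102) and a new
# inode (at 103), then updates the imap. The old copy at A0 is left behind.
segment_summary[102] = (K, 0)
disk[103] = {"pointers": [102]}
imap[K] = 103

live_after = is_live(A0)   # the inode now points at 102, so A0 is garbage
new_live = is_live(102)    # the freshly written copy is live
```

Note that nothing at address A0 changes when the file is overwritten. The old block becomes dead only because the newest inode stops pointing to it.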

There are some shortcuts LFS takes to make the process of determining liveness more efficient. For example, when a file is truncated or deleted, LFS increases its version number and records the new version number in the imap. By also recording the version number in the on-disk segment, LFS can short-circuit the longer check described above simply by comparing the on-disk version number with the version number in the imap, thus avoiding extra reads.
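A minimal sketch of this version-number shortcut follows. The layout of the `imap` and `segment_summary` dictionaries, the function names, and the concrete numbers are all assumptions made for illustration. The sketch only shows the idea that a version mismatch is enough to declare a block dead without reading the inode.

```python
# Hypothetical structures: the imap and each summary entry carry a version.
imap = {7: {"version": 1, "inode_address": 101}}
segment_summary = {100: {"inode_number": 7, "offset": 0, "version": 1}}


def known_dead_by_version(address):
    """Return True if the version numbers alone prove the block is dead."""
    entry = segment_summary[address]
    current = imap.get(entry["inode_number"])
    # Assumption: a missing imap entry is treated the same as a mismatch.
    return current is None or current["version"] != entry["version"]


def truncate_or_delete(inode_number):
    """Model a truncate or delete by bumping the file's version in the imap."""
    imap[inode_number]["version"] += 1


dead_before = known_dead_by_version(100)  # versions match: not provably dead
truncate_or_delete(7)
dead_after = known_dead_by_version(100)   # versions differ: dead, no inode read
```

When the versions match, the block is not necessarily live, since it may have been overwritten without a truncate or delete. In that case the full inode-based check is still needed.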
