One Last Problem: Lost Writes
This lesson discusses one more disk failure mode: lost writes.
Unfortunately, misdirected writes (discussed in the previous lesson) are not the last problem we will address. Specifically, some modern storage devices also have an issue known as a lost write, which occurs when the device informs the upper layer that a write has completed but, in fact, the write is never persisted. Thus, what remains on disk is the old contents of the block rather than the updated new contents.
The obvious question here is: do any of our checksumming strategies from above (e.g., basic checksums, or physical identity) help to detect lost writes? Unfortunately, the answer is no: the old block likely has a matching checksum, and the physical ID used above (disk number and block offset) will also be correct. Thus our final problem:
CRUX: HOW TO HANDLE LOST WRITES
How should a storage system or disk controller detect lost writes? What additional features are required from the checksum?
There are a number of possible solutions that can help.
Some systems add a checksum elsewhere in the system to detect lost writes. For example, Sun’s Zettabyte File System (ZFS) includes a checksum in each file system inode and indirect block for every block included within a file. Thus, even if the write to a data block itself is lost, the checksum within the inode will not match the old data. Such a scheme fails only if the writes to both the inode and the data are lost simultaneously, an unlikely (but unfortunately, possible!) situation.
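To make the idea concrete, here is a minimal sketch in Python of why a checksum stored alongside the data misses a lost write, while a checksum stored in the parent (the inode) catches it. This is a toy model, not how ZFS is implemented: the Disk class, its drop_next_write flag, the inode dictionary, and the helper names are all hypothetical, and CRC-32 stands in for whatever checksum a real system would use.

```python
import zlib


def checksum(data: bytes) -> int:
    # CRC-32 is used only for illustration; it is an assumption, not ZFS's checksum.
    return zlib.crc32(data)


class Disk:
    """Hypothetical toy disk that can silently drop one write."""

    def __init__(self):
        self.blocks = {}
        self.drop_next_write = False

    def write(self, addr: int, payload: bytes) -> None:
        if self.drop_next_write:
            # The write is acknowledged (no error), but nothing is persisted.
            self.drop_next_write = False
            return
        # Each block stores its own checksum next to its payload.
        self.blocks[addr] = (checksum(payload), payload)

    def read(self, addr: int):
        return self.blocks[addr]


def self_check(disk: Disk, addr: int) -> bool:
    """Verify a block against the checksum stored inside the block itself."""
    stored_sum, payload = disk.read(addr)
    return stored_sum == checksum(payload)


def parent_check(disk: Disk, inode: dict, addr: int) -> bool:
    """Verify a block against the checksum recorded in its parent (the inode)."""
    _, payload = disk.read(addr)
    return inode[addr] == checksum(payload)


def demo():
    disk = Disk()
    inode = {}  # maps block address -> expected checksum; assumed to persist

    # The first write reaches the disk.
    disk.write(7, b"old contents")
    inode[7] = checksum(b"old contents")

    # The second write is lost, but the inode update succeeds.
    disk.drop_next_write = True
    disk.write(7, b"new contents")
    inode[7] = checksum(b"new contents")

    return self_check(disk, 7), parent_check(disk, inode, 7)


if __name__ == "__main__":
    in_block_ok, parent_ok = demo()
    print("in-block checksum passes:", in_block_ok)
    print("inode checksum passes:", parent_ok)
```

In this sketch the stale block still passes its own check, because the old payload and the old checksum were written together and remain consistent with each other. The inode's checksum was computed over the new data, so the mismatch exposes the lost write. If the inode update were dropped as well, both checks would pass and the loss would go unnoticed, which is the failure case described above.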