Introduction to FSCK and Journaling

This lesson introduces the crash-inconsistency problem, which we will explore more throughout this chapter.

We'll cover the following

Crash-consistency problem

As we’ve seen thus far, the file system manages a set of data structures to implement the expected abstractions: files, directories, and all of the other metadata needed to support the basic abstraction that we expect from a file system. Unlike most data structures (for example, those found in memory of a running program), file system data structures must persist, i.e., they must survive over the long haul, stored on devices that retain data despite power loss (such as hard disks or flash-based SSDs).

Crash-consistency problem

One major challenge faced by a file system is how to update persistent data structures despite the presence of a power loss or system crash. Specifically, what happens if, right in the middle of updating on-disk structures, someone trips over the power cord and the machine loses power? Or the operating system encounters a bug and crashes? Because of power losses and crashes, updating a persistent data structure can be quite tricky, and leads to a new and interesting problem in file system implementation, known as the crash-consistency problem.

This problem is quite simple to understand. Imagine you have to update two on-disk structures, A and B, in order to complete a particular operation. Because the disk only services a single request at a time, one of these requests will reach the disk first (either A or B). If the system crashes or loses power after one write completes, the on-disk structure will be left in an inconsistent state. And thus, we have a problem that all file systems need to solve:

THE CRUX: HOW TO UPDATE THE DISK DESPITE CRASHES

The system may crash or lose power between any two writes, and thus the on-disk state may only partially get updated. After the crash, the system boots and wishes to mount the file system again (in order to access files and such). Given that crashes can occur at arbitrary points in time, how do we ensure the file system keeps the on-disk image in a reasonable state?

Get hands-on with 1400+ tech skills courses.

Introduction

Virtualization: Processes

Virtualization: Process API

Virtualization: Direct Execution

Virtualization: CPU Scheduling

Virtualization: Multi-Level Feedback

Virtualization: Lottery Scheduling

Virtualization: Multi-CPU Scheduling

Virtualization: Address Space

Virtualization: Memory API

Virtualization: Address Translation

Virtualization: Segmentation

Virtualization: Free Space Management

Virtualization: Introduction to Paging

Virtualization: Translation Lookaside Buffers

Virtualization: Advanced Page Tables

Virtualization: Swapping: Mechanisms

Virtualization: Swapping: Policies

Virtualization: Complete VM Systems

Concurrency: Concurrency and Threads

Concurrency: Thread API

Concurrency: Locks

Concurrency: Locked Data Structures

Concurrency: Conditional Variables

Concurrency: Semaphores

Concurrency: Concurrency Bugs

Concurrency: Event-Based Concurrency

Persistence: I/O Devices

Persistence: Hard Disk Drives

Persistence: Redundant Disk Arrays (RAID)

Persistence: Files and Directories

Persistence: File System Implementation

Persistence: Fast File System

Persistence: FSCK and Journaling

Persistence: Log-Structured File System

Persistence: Flash-based SSDs

Persistence: Data Integrity and Protection

Distribution: Distributed Systems

Distribution: Network File System (NFS)

Distribution: Andrew File System (AFS)

Introduction to FSCK and Journaling

Crash-consistency problem