The Linux Completely Fair Scheduler (CFS)
This lesson gives an introduction to the Linux Completely Fair Scheduler (or CFS) and explains how it works.
Despite these earlier works in fair-share scheduling, the current Linux approach achieves similar goals in an alternate manner. The scheduler, entitled the Completely Fair Scheduler (CFS), implements fair-share scheduling in a highly efficient and scalable way.
To achieve its efficiency goals, CFS aims to spend very little time making scheduling decisions, through both its inherent design and its clever use of data structures well-suited to the task. Recent studies have shown that scheduler efficiency is surprisingly important; specifically, in a study of Google data centers, Kanev et al. show that even after aggressive optimization, scheduling uses about 5% of overall data center CPU time. Reducing that overhead as much as possible is thus a key goal in modern scheduler architecture.
Basic operation
Whereas most schedulers are based around the concept of a fixed time slice, CFS operates a bit differently. Its goal is simple: to divide the CPU evenly among all competing processes. It does so through a simple counting-based technique known as virtual runtime (vruntime).
As each process runs, it accumulates vruntime. In the most basic case, each process's vruntime increases at the same rate, in proportion with physical (real) time. When a scheduling decision occurs, CFS will pick the process with the lowest vruntime to run next.
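This pick-the-minimum rule can be sketched in a few lines. Note this is a conceptual sketch, not kernel code: the real CFS keeps runnable processes in a red-black tree so the lowest-vruntime task can be found quickly, which a simple `min()` only approximates here.

```python
# Sketch: CFS picks the runnable process with the lowest vruntime.
# (The kernel uses a red-black tree; min() stands in for it here.)

processes = {"A": 30, "B": 10, "C": 20}  # name -> accumulated vruntime (ms)

def pick_next(procs):
    # Choose the process whose vruntime is smallest.
    return min(procs, key=procs.get)

print(pick_next(processes))  # → "B", the process with the least vruntime
```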
This raises a question: how does the scheduler know when to stop the currently running process, and run the next one? The tension here is clear: if CFS switches too often, fairness is increased, as CFS will ensure that each process receives its share of CPU even over minuscule time windows, but at the cost of performance (too much context switching); if CFS switches less often, performance is increased (reduced context switching), but at the cost of near-term fairness.
The sched_latency parameter
CFS manages this tension through various control parameters. The first is sched_latency. CFS uses this value to determine how long one process should run before considering a switch (effectively determining its time slice, but in a dynamic fashion). A typical sched_latency value is 48 (milliseconds); CFS divides this value by the number (n) of processes running on the CPU to determine the time slice for a process, thus ensuring that over this period of time, CFS will be completely fair.
For example, if there are n = 4 processes running, CFS divides the value of sched_latency by n to arrive at a per-process time slice of 12 ms. CFS then schedules the first job and runs it until it has used 12 ms of (virtual) runtime, and then checks to see if there is a job with lower vruntime to run instead. In this case, there is, and CFS would switch to one of the three other jobs, and so forth.
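The time-slice calculation itself is just a division; here is a minimal sketch using the 48 ms value from the text:

```python
# Sketch: per-process time slice = sched_latency / number of
# runnable processes (no minimum applied yet).

sched_latency = 48  # ms, the typical value mentioned above

def time_slice(n):
    # n is the number of processes currently running on the CPU.
    return sched_latency / n

print(time_slice(4))  # → 12.0 ms, matching the four-process example
```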
Example
The figure below shows an example where the four jobs (A, B, C, D) each run for two time slices in this fashion; two of them (C, D) then complete, leaving just two remaining, which then each run for 24 ms in round-robin fashion.
But what if there are “too many” processes running? Wouldn’t that lead to too small of a time slice, and thus too many context switches? Good question! And the answer is yes.
The min_granularity parameter
To address this issue, CFS adds another parameter, min_granularity, which is usually set to a value like 6 ms. CFS will never set the time slice of a process to less than this value, ensuring that not too much time is spent in scheduling overhead.
For example, if there are ten processes running, our original calculation would divide sched_latency by ten to determine the time slice (result: 4.8 ms). However, because of min_granularity, CFS will set the time slice of each process to 6 ms instead. Although CFS won't (quite) be perfectly fair over the target scheduling latency (sched_latency) of 48 ms, it will be close, while still achieving high CPU efficiency.
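With min_granularity added, the time-slice sketch gains a floor; again, this is a simplified model of the idea, not the kernel's actual code:

```python
# Sketch: the time slice is sched_latency / n, but never below
# min_granularity (values from the text: 48 ms and 6 ms).

sched_latency = 48   # ms
min_granularity = 6  # ms

def time_slice(n):
    # Clamp so a crowded CPU doesn't cause excessive context switching.
    return max(sched_latency / n, min_granularity)

print(time_slice(10))  # → 6 ms (floor applies; raw value would be 4.8 ms)
print(time_slice(4))   # → 12.0 ms (floor has no effect)
```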
Note that CFS utilizes a periodic timer interrupt, which means it can only make decisions at fixed time intervals. This interrupt goes off frequently (e.g., every 1 ms), giving CFS a chance to wake up and determine if the current job has reached the end of its run. If a job has a time slice that is not a perfect multiple of the timer interrupt interval, that is OK; CFS tracks vruntime precisely, which means that over the long haul, it will eventually approximate ideal sharing of the CPU.
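A toy simulation (a deliberate simplification, not kernel code) ties these pieces together: a 1 ms tick charges vruntime to the running process, and when a slice is used up, the lowest-vruntime process runs next. Over 96 ms with four processes and 12 ms slices, each process accumulates an equal share.

```python
# Toy simulation of tick-driven CFS-style scheduling.
TICK = 1    # ms, timer interrupt interval
SLICE = 12  # ms, e.g. 48 ms sched_latency / 4 processes

vruntime = {"A": 0, "B": 0, "C": 0, "D": 0}
current, used = "A", 0
for _ in range(96):  # simulate 96 ms of wall-clock time
    vruntime[current] += TICK  # charge the running process
    used += TICK
    if used >= SLICE:
        # Slice exhausted: switch to the lowest-vruntime process.
        current = min(vruntime, key=vruntime.get)
        used = 0

print(vruntime)  # each process ends with 24 ms of vruntime
```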