Policies: How to Allocate Files and Directories

This lesson builds on our discussion from the last lesson and shows how files and directories are allocated on FFS.

With the group structure in place, FFS now has to decide how to place files and directories and associated metadata on disk to improve performance. The basic mantra is simple: keep related stuff together (and its corollary, keep unrelated stuff far apart).

Allocation policy

Thus, to obey the mantra, FFS has to decide what is “related” and place it within the same block group. Conversely, unrelated items should be placed into different block groups. To achieve this end, FFS makes use of a few simple placement heuristics.

Placement of directories

The first is the placement of directories. FFS employs a simple approach. First, find the cylinder group with a low number of allocated directories (to balance directories across groups) and a high number of free inodes (to subsequently be able to allocate a bunch of files). Then, put the directory data and inode in that group. Of course, other heuristics could be used here (e.g., taking into account the number of free data blocks).

Placement of files

For files, FFS does two things. First, it makes sure (in the general case) to allocate the data blocks of a file in the same group as its inode, thus preventing long seeks between inode and data (as in the old file system). Second, it places all files that are in the same directory in the cylinder group of the directory they are in. Thus, if a user creates four files, /a/b, /a/c, /a/d, and b/f, FFS would try to place the first three near one another (same group) and the fourth far away (in some other group).

Let’s look at an example of such an allocation. In the example, assume that there are only 10 inodes and 10 data blocks in each group (both unrealistically small numbers) and that the three directories (the root directory /, /a, and /b) and four files (/a/c, /a/d, /a/e, /b/f) are placed within them per the FFS policies. Assume the regular files are each two blocks in size, and that the directories have just a single block of data. For this figure, we use the obvious symbols for each file or directory (i.e., / for the root directory, a for /a, f for /b/f, and so forth).

Note that the FFS policy does two positive things: the data blocks of each file are near each file’s inode, and files in the same directory are near one another (namely, /a/c, /a/d, and /a/e are all in Group 1, and directory /b and its file /b/f are near one another in Group 2).

In contrast, let’s now look at an inode allocation policy that simply spreads inodes across groups, trying to ensure that no group’s inode table fills up quickly. The final allocation might thus look something like this:

As you can see from the figure, while this policy does indeed keep file (and directory) data near its respective inode, files within a directory are arbitrarily spread around the disk, and thus name-based locality is not preserved. Access to files /a/c, /a/d, and /a/e now spans three groups instead of one as per the FFS approach.

The FFS policy heuristics are not based on extensive studies of file-system traffic or anything particularly nuanced; rather, they are based on good old-fashioned common sense (isn’t that what CS stands for after all?)Some people refer to common sense as horse sense, especially people who work regularly with horses. However, we have a feeling that this idiom may be lost as the “mechanized horse”, a.k.a. the car, gains in popularity. What will they invent next? A flying machine??!!. Files in a directory are often accessed together: imagine compiling a bunch of files and then linking them into a single executable. Because such namespace-based locality exists, FFS will often improve performance, making sure that seeks between related files are nice and short.

Get hands-on with 1400+ tech skills courses.