File Organization: The Inode
In this lesson, we discuss the layout of an inode while explaining the file organization in a vsfs.
We'll cover the following
One of the most important on-disk structures of a file system is the inode; virtually all file systems have a structure similar to this. The name inode is short for index node, the historical name given to it in
ASIDE: DATA STRUCTURE — THE INODE
The inode is the generic name that is used in many file systems to describe the structure that holds the metadata for a given file, such as its length, permissions, and the location of its constituent blocks. The name goes back at least as far as UNIX and probably further back to Multics if not earlier systems. It is short for index node, as the inode number is used to index into an array of on-disk inodes in order to find the inode of that number. As we’ll see, the design of the inode is one key part of file system design. Most modern systems have some kind of structure like this for every file they track, but perhaps call them different things (such as dnodes, fnodes, etc.).
Each inode is implicitly referred to by a number (called the i-number), which we’ve earlier called the low-level name of the file. In vsfs (and other simple file systems), given an i-number, you should directly be able to calculate where on the disk the corresponding inode is located. For example, take the inode table of vsfs as shown in the previous lesson: 20-KB in size (5 4-KB blocks) and thus consisting of 80 inodes (assuming each inode is 256 bytes). Further assume that the inode region starts at 12KB (i.e, the superblock starts at 0KB, the inode bitmap is at address 4KB, the data bitmap at 8KB, and thus the inode table comes right after). In vsfs, we thus have the following layout for the beginning of the file system partition (in closeup view):
To read inode number 32, the file system would first calculate the offset into the inode region ( or 8192), add it to the start
address of the inode table on disk (inodeStartAddr
= ), and thus
arrive upon the correct byte address of the desired block of inodes: .
Recall that disks are not byte addressable, but rather consist of a large
number of addressable sectors, usually 512 bytes. Thus, to fetch the block
of inodes that contains inode 32, the file system would issue a read to a sector , or 40, to fetch the desired inode block. More generally, the sector address sector of the inode block can be calculated as follows:
blk = (inumber * sizeof(inode_t)) / blockSize;sector = ((blk * blockSize) + inodeStartAddr) / sectorSize;
Inside each inode is virtually all of the information you need about a file:
- its type (e.g., regular file, directory, etc.),
- its size,
- the number of blocks allocated to it,
- protection information (such as who owns the file, as well as who can access it),
- some time information, including when the file was created, modified, or last accessed,
- information about where its data blocks reside on disk (e.g., pointers of some kind).
We refer to all such information about a file as metadata; in fact, any information inside the file system that isn’t pure user data is often referred to as such. An
1 example inode from
One of the most important decisions in the design of the inode is how it refers to where data blocks are. One simple approach would be to have one or more direct pointers (disk addresses) inside the inode; each pointer refers to one disk block that belongs to the file. Such an approach is limited: for example, if you want to have a file that is really big (e.g., bigger than the block size multiplied by the number of direct pointers in the inode), you are out of luck.
The multi-level index
To support bigger files, file system designers have had to introduce different structures within inodes. One common idea is to have a special pointer known as an indirect pointer. Instead of pointing to a block that contains user data, it points to a block that contains more pointers, each of which point to user data. Thus, an inode may have some fixed number of direct pointers (e.g., 12), and a single indirect pointer. If a file grows large enough, an indirect block is allocated (from the data-block region of the disk), and the inode’s slot for an indirect pointer is set to point to it. Assuming 4-KB blocks and 4-byte disk addresses, that adds another 1024 pointers; the file can grow to be or 4144KB.
Not surprisingly, in such an approach, you might want to support even larger files. To do so, just add another pointer to the inode: the double indirect pointer. This pointer refers to a block that contains pointers to indirect blocks, each of which contains pointers to data blocks. A double indirect block thus adds the possibility to grow files with an additional or 1-million 4KB blocks, in other words supporting files that are over 4GB in size. You may want even more, though, and we bet you know where this is headed: the triple indirect pointer.
Overall, this imbalanced tree is referred to as the multi-level index approach to pointing to file blocks. Let’s examine an example with twelve direct pointers, as well as both a single and a double indirect block. Assuming a block size of 4 KB, and 4-byte pointers, this structure can accommodate a file of just over 4 GB in size (i.e., ). Can you figure out how big of a file can be handled with the addition of a triple-indirect block? (hint: pretty big)
TIP: CONSIDER EXTENT-BASED APPROACHES
A different approach is to use extents instead of pointers. An extent is simply a disk pointer plus a length (in blocks); thus, instead of requiring a pointer for every block of a file, all one needs is a pointer and a length to specify the on-disk location of a file. Just a single extent is limiting, as one may have trouble finding a contiguous chunk of on-disk free space when allocating a file. Thus, extent-based file systems often allow for more than one extent, thus giving more freedom to the file system during file allocation.
In comparing the two approaches, pointer-based approaches are the most flexible but use a large amount of metadata per file (particularly for large files). Extent-based approaches are less flexible but more compact. In particular, they work well when there is enough free space on the disk and files can be laid out contiguously (which is the goal for virtually any file allocation policy anyhow).
Many file systems use a multi-level index, including commonly-used file systems such as Linux
You might be wondering: why use an imbalanced tree like this? Why not a different approach? Well, as it turns out, many researchers have studied file systems and how they are used, and virtually every time they find certain “truths” that hold across the decades. One such finding is that most files are small. This imbalanced design reflects such a reality; if most files are indeed small, it makes sense to optimize for this case. Thus, with a small number of direct pointers (12 is a typical number), an inode can directly point to 48 KB of data, needing one (or more) indirect blocks for larger files. See
Of course, in the space of inode design, many other possibilities exist. After all, the inode is just a data structure, and any data structure that stores the relevant information, and can query it effectively, is sufficient. As file system software is readily changed, you should be willing to explore different designs should workloads or technologies change.
Get hands-on with 1400+ tech skills courses.