AFS Version 1

This lesson discusses how version 1 of AFS works.


We will discuss two versions of AFS [1, 2]. The first version (which we will call AFSv1, although the original system was actually called the ITC distributed file system [2]) had some of the basic design in place but didn't scale as desired. This led to a redesign and the final protocol (which we will call AFSv2, or just AFS) [1]. We now discuss the first version.

[1] "Scale and Performance in a Distributed File System" by John H. Howard, Michael L. Kazar, Sherri G. Menees, David A. Nichols, M. Satyanarayanan, Robert N. Sidebotham, Michael J. West. ACM Transactions on Computer Systems (ACM TOCS), Volume 6:1, February 1988. The long journal version of the famous AFS system, still in use in a number of places throughout the world, and also probably the earliest clear thinking on how to build distributed file systems. A wonderful combination of the science of measurement and principled engineering.

[2] "The ITC Distributed File System: Principles and Design" by M. Satyanarayanan, J.H. Howard, D.A. Nichols, R.N. Sidebotham, A. Spector, M.J. West. SOSP '85, Orcas Island, Washington, December 1985. The older paper about a distributed file system. Much of the basic design of AFS is in place in this older system, but not the improvements for scale. The name change to "Andrew" is an homage to two people both named Andrew, Andrew Carnegie and Andrew Mellon. These two rich dudes started the Carnegie Institute of Technology and the Mellon Institute of Industrial Research, respectively, which eventually merged to become what is now known as Carnegie Mellon University.

Message       Description
TestAuth      Test whether a file has changed (used to validate cached entries)
GetFileStat   Get the stat info for a file
Fetch         Fetch the contents of file
Store         Store this file on the server
SetFileStat   Set the stat info for a file
ListDir       List the contents of a directory

Whole-file caching

One of the basic tenets of all versions of AFS is whole-file caching on the local disk of the client machine that is accessing a file. When you open() a file, the entire file (if it exists) is fetched from the server and stored in a file on your local disk. Subsequent application read() and write() operations are redirected to the local file system where the file is stored; thus, these operations require no network communication and are fast. Finally, upon close(), the file (if it has been modified) is flushed back to the server. Note the obvious contrasts with NFS, which caches blocks (not whole files, although NFS could of course cache every block of an entire file) and does so in client memory (not local disk).

Let’s get into the details a bit more. When a client application first calls open(), the AFS client-side code (which the AFS designers call Venus) sends a Fetch protocol message to the server. The Fetch message passes the entire pathname of the desired file (for example, /home/remzi/notes.txt) to the file server (the servers as a group are called Vice), which then traverses the pathname, finds the desired file, and ships the entire file back to the client. The client-side code then caches the file by writing it to the client's local disk. As we said above, subsequent read() and write() system calls are strictly local in AFS (no communication with the server occurs); they are just redirected to the local copy of the file. Because the read() and write() calls act just like calls to a local file system, once a block is accessed, it may also be cached in client memory. Thus, AFS also uses client memory to cache copies of blocks that it has on its local disk. Finally, when the application calls close(), the AFS client checks whether the file has been modified (i.e., whether it was opened for writing). If so, it flushes the new version back to the server with a Store protocol message, sending the entire file and pathname to the server for permanent storage.
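To make the open/read/write/close flow concrete, here is a minimal Python sketch. It is not the real Venus or Vice code: the class names ToyServer and ToyClient, the method names, and the cache-file naming scheme are all hypothetical, and the "server" is just an in-memory dictionary rather than a networked machine. It also tracks modification by noting writes, whereas the text above says the real client checks whether the file was opened for writing.

```python
import os
import tempfile


class ToyServer:
    """Stands in for Vice: maps full pathnames to whole-file contents."""

    def __init__(self):
        self.files = {}
        self.fetches = 0
        self.stores = 0

    def fetch(self, path):
        # Fetch: the client sends the full pathname, the server returns the whole file.
        self.fetches += 1
        return self.files[path]

    def store(self, path, data):
        # Store: the client ships the entire file back along with its pathname.
        self.stores += 1
        self.files[path] = data


class ToyClient:
    """Stands in for Venus: caches whole files on the client's local disk."""

    def __init__(self, server, cache_dir):
        self.server = server
        self.cache_dir = cache_dir
        self.dirty = set()  # paths written since they were opened

    def _local(self, path):
        # Hypothetical naming scheme for the on-disk cache file.
        return os.path.join(self.cache_dir, path.strip("/").replace("/", "_"))

    def afs_open(self, path):
        # Fetch the entire file and write it to local disk.
        data = self.server.fetch(path)
        with open(self._local(path), "wb") as f:
            f.write(data)

    def read(self, path):
        # Strictly local: no server communication.
        with open(self._local(path), "rb") as f:
            return f.read()

    def write(self, path, data):
        # Strictly local: append to the cached copy and remember it changed.
        with open(self._local(path), "ab") as f:
            f.write(data)
        self.dirty.add(path)

    def afs_close(self, path):
        # Flush the whole file back only if it was modified.
        if path in self.dirty:
            with open(self._local(path), "rb") as f:
                self.server.store(path, f.read())
            self.dirty.discard(path)


if __name__ == "__main__":
    server = ToyServer()
    server.files["/home/remzi/notes.txt"] = b"hello"
    with tempfile.TemporaryDirectory() as cache_dir:
        client = ToyClient(server, cache_dir)
        client.afs_open("/home/remzi/notes.txt")
        client.write("/home/remzi/notes.txt", b" world")
        client.afs_close("/home/remzi/notes.txt")
    print(server.files["/home/remzi/notes.txt"], server.fetches, server.stores)
```

Note that the server sees exactly one Fetch and one Store in this run, no matter how many read() and write() calls happen in between.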

The next time the file is accessed, AFSv1 does so much more efficiently. Specifically, the client-side code first contacts the server (using the TestAuth protocol message) to determine whether the file has changed. If not, the client uses the locally cached copy, thus improving performance by not transferring the whole file over the network again. The figure above shows some of the protocol messages in AFSv1. Note that this early version of the protocol only cached file contents; directories, for example, were only kept at the server.
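The revalidation step can be sketched in the same toy style. The text does not say how the server decides whether a file has changed, so the per-file version counter below is an assumption made for illustration, as are the names VersionedServer and ValidatingClient. The cache is also kept in a dictionary for brevity, whereas AFS keeps it on the client's local disk.

```python
class VersionedServer:
    """Toy server; the version counter is an assumed change-detection scheme."""

    def __init__(self):
        self.files = {}  # path -> (version, data)
        self.fetches = 0
        self.test_auths = 0

    def put(self, path, data):
        # Simulates some client storing a new version of the file.
        version = self.files.get(path, (0, b""))[0] + 1
        self.files[path] = (version, data)

    def fetch(self, path):
        # Fetch: return the whole file (plus its version, in this sketch).
        self.fetches += 1
        return self.files[path]

    def test_auth(self, path, cached_version):
        # TestAuth: report whether the client's cached copy is still valid.
        self.test_auths += 1
        return self.files[path][0] == cached_version


class ValidatingClient:
    """Toy client that validates its cached copy on every later open."""

    def __init__(self, server):
        self.server = server
        self.cache = {}  # path -> (version, data); on local disk in real AFS

    def afs_open(self, path):
        if path in self.cache:
            version, data = self.cache[path]
            # One small message instead of a whole-file transfer.
            if self.server.test_auth(path, version):
                return data
        # Not cached, or the file changed: fetch the whole file again.
        version, data = self.server.fetch(path)
        self.cache[path] = (version, data)
        return data


if __name__ == "__main__":
    server = VersionedServer()
    server.put("/home/remzi/notes.txt", b"v1")
    client = ValidatingClient(server)
    client.afs_open("/home/remzi/notes.txt")   # Fetch
    client.afs_open("/home/remzi/notes.txt")   # TestAuth only
    print(server.fetches, server.test_auths)
```

Even in the unchanged case the client still sends a TestAuth message on each open; what it saves is the transfer of the file contents.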
