Which Segment Are We Referring To?

Learn how the hardware is able to determine the segment to which a particular address refers.

The hardware uses segment registers during translation. How does it know the offset into a segment, and to which segment an address refers?

Explicit approach

One common approach, sometimes referred to as an explicit approach, is to chop up the address space into segments based on the top few bits of the virtual address; this technique was used in the VAX/VMS system (“Virtual Memory Management in the VAX/VMS Operating System” by Henry M. Levy and Peter H. Lipman, IEEE Computer, Volume 15:3, March 1982; a classic memory management system, with lots of common sense in its design, which we’ll study in more detail in a later chapter). In our example above, we have three segments; thus we need two bits to accomplish our task. If we use the top two bits of our 14-bit virtual address to select the segment, the virtual address is laid out as follows: bits 13 and 12 select the segment, and bits 11 through 0 hold the offset into that segment.

In our example, then, if the top two bits are 00, the hardware knows the virtual address is in the code segment, and thus uses the code base and bounds pair to relocate the address to the correct physical location. If the top two bits are 01, the hardware knows the address is in the heap, and thus uses the heap base and bounds. Let’s take our example heap virtual address from above (4200) and translate it, just to make sure this is clear. The virtual address 4200, in binary form, is 01 0000 0110 1000.

In the binary form of 4200 (01 0000 0110 1000), the top two bits (01) tell the hardware which segment we are referring to. The bottom 12 bits are the offset into the segment: 0000 0110 1000, or hex 0x068, or 104 in decimal. Thus, the hardware simply takes the top two bits to determine which segment register to use, and then takes the next 12 bits as the offset into the segment. By adding the base register to the offset, the hardware arrives at the final physical address. Note that the offset eases the bounds check too: we can simply check whether the offset is less than the bounds; if not, the address is illegal. Thus, if base and bounds were arrays (with one entry per segment), the hardware would be doing something like this to obtain the desired physical address:

// get top 2 bits of 14-bit VA
Segment = (VirtualAddress & SEG_MASK) >> SEG_SHIFT
// now get offset
Offset = VirtualAddress & OFFSET_MASK
if (Offset >= Bounds[Segment])
    RaiseException(PROTECTION_FAULT)
else
    PhysAddr = Base[Segment] + Offset
    Register = AccessMemory(PhysAddr)

In our running example, we can fill in values for the constants above. Specifically, SEG_MASK would be set to 0x3000, SEG_SHIFT to 12, and OFFSET_MASK to 0xFFF.
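To make the pseudocode concrete, here is a minimal C sketch using the constants just given. The contents of the Base and Bounds arrays are assumed, illustrative values (this section does not state them), the stack is left out because its handling is not covered here, and the function name translate and its return convention are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SEG_MASK    0x3000u  /* top 2 bits of a 14-bit virtual address */
#define SEG_SHIFT   12
#define OFFSET_MASK 0x0FFFu  /* bottom 12 bits */

/* ASSUMED register contents, chosen only for illustration:
 * index 0 = code, index 1 = heap. Entries 2 and 3 have a bound
 * of 0, so every access to them is rejected in this sketch. */
static const uint32_t Base[4]   = { 32768u, 34816u, 0u, 0u };
static const uint32_t Bounds[4] = {  2048u,  2048u, 0u, 0u };

/* Returns 0 and stores the physical address in *pa on success;
 * returns -1 to stand in for raising a protection fault. */
int translate(uint32_t va, uint32_t *pa)
{
    assert(va <= 0x3FFFu);  /* the address must fit in 14 bits */

    uint32_t seg    = (va & SEG_MASK) >> SEG_SHIFT;  /* which segment */
    uint32_t offset = va & OFFSET_MASK;              /* offset within it */

    if (offset >= Bounds[seg])
        return -1;          /* offset is outside the segment */

    *pa = Base[seg] + offset;
    return 0;
}
```

With these assumed values, translating virtual address 4200 selects segment 1 (the heap) with offset 104, so the result is 34816 + 104 = 34920. Virtual address 7000 also selects the heap, but its offset of 2904 exceeds the assumed 2048-byte bound, so the sketch reports a fault.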

You may also have noticed that when we use the top two bits, and we only have three segments (code, heap, stack), one segment of the address space goes unused. To fully utilize the virtual address space (and avoid an unused segment), some systems put code in the same segment as the heap and thus use only one bit to select which segment to use (Levy and Lipman, 1982).

Another issue with using the top bits of the address to select a segment is that it limits use of the virtual address space. Specifically, each segment is limited to a maximum size, which in our example is 4KB (using the top two bits to choose segments implies the 16KB address space gets chopped into four pieces of 4KB each). If a running program wishes to grow a segment (say the heap, or the stack) beyond that maximum, the program is out of luck.
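The size limit follows directly from how many address bits are left for the offset. The short C sketch below shows the arithmetic; the function name max_segment_size is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Largest possible segment, in bytes, when seg_bits of a
 * va_bits-wide virtual address are reserved to select the segment.
 * The remaining (va_bits - seg_bits) bits form the offset. */
uint32_t max_segment_size(unsigned va_bits, unsigned seg_bits)
{
    assert(seg_bits <= va_bits && va_bits < 32);
    return (uint32_t)1 << (va_bits - seg_bits);
}
```

For the 14-bit address space used here, two selector bits give max_segment_size(14, 2) = 4096 bytes (4KB), while a single selector bit would give max_segment_size(14, 1) = 8192 bytes (8KB), at the cost of distinguishing only two segments.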

Implicit approach

There are other ways for the hardware to determine which segment a particular address is in. In the implicit approach, the hardware determines the segment by noticing how the address was formed. If, for example, the address was generated from the program counter (i.e., it was an instruction fetch), then the address is within the code segment; if the address is based on the stack pointer or the base pointer, it must be in the stack segment; any other address must be in the heap.
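The implicit rule can be pictured as a simple mapping from how an address was formed to a segment. The C sketch below is only a software illustration of that rule, not a description of any real hardware; the enum names and the function segment_for are hypothetical.

```c
#include <assert.h>

enum segment { SEG_CODE, SEG_HEAP, SEG_STACK };

/* How the address was formed (hypothetical categories). */
enum origin {
    FROM_PROGRAM_COUNTER,        /* instruction fetch */
    FROM_STACK_OR_BASE_POINTER,  /* relative to the stack or base pointer */
    FROM_OTHER                   /* any other address */
};

/* Pick the segment from the way the address was generated. */
enum segment segment_for(enum origin how)
{
    switch (how) {
    case FROM_PROGRAM_COUNTER:       return SEG_CODE;
    case FROM_STACK_OR_BASE_POINTER: return SEG_STACK;
    case FROM_OTHER:                 return SEG_HEAP;
    }
    assert(!"unknown origin");  /* not reached for valid enum values */
    return SEG_HEAP;
}
```

Because no address bits are spent on selecting the segment in this scheme, the whole virtual address can serve as the offset.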
