Virtualizing Memory

In this lesson, you will study the concept of virtualizing memory.

Now, let’s consider memory. The model of physical memory presented by modern machines is very simple. Memory is just an array of bytes; to read memory, one must specify an address to be able to access the data stored there; to write (or update) memory, one must also specify the data to be written to the given address.

Memory is accessed all the time when a program is running. A program keeps all of its data structures in memory and accesses them through various instructions, like loads and stores or other explicit instructions that access memory in doing their work. Don’t forget that each instruction of the program is in memory too; thus memory is accessed on each instruction fetch.

Running mem.c

Let’s take a look at a program that allocates some memory by calling malloc():

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include "common.h"   // provides Spin()

int main(int argc, char *argv[]) {
    if (argc != 2) {
        fprintf(stderr, "usage: mem <value>\n");
        exit(1);
    }
    int *p;
    p = malloc(sizeof(int));                                                  //a1
    assert(p != NULL);
    printf("(%d) address pointed to by p: %p\n", (int) getpid(), (void *) p); //a2
    *p = atoi(argv[1]); // assign value to addr stored in p                   //a3
    while (1) {
        Spin(1);        // delay for about one second                         //a4
        *p = *p + 1;
        printf("(%d) p: %d\n", (int) getpid(), *p);
    }
    return 0;
}
A Program That Accesses Memory (mem.c)

The program does a couple of things. First, it allocates some memory (line a1). Then, it prints out the address of the memory (a2) and stores the value passed on the command line as argv[1] (0 in the runs shown here) into the first slot of the newly allocated memory (a3). Finally, it loops, delaying for a second (a4) and incrementing the value stored at the address held in p. With every print statement, it also prints out what is called the process identifier (the PID) of the running program. This PID is unique per running process.

The output of the program specified will be similar to the following:

prompt> ./mem 0
(2134) address pointed to by p: 0x200000
(2134) p: 1
(2134) p: 2
(2134) p: 3
(2134) p: 4
(2134) p: 5
^C

This first result shown above is not too interesting. The newly-allocated memory is at address 0x200000. As the program runs, it slowly updates the value and prints out the result.

Running multiple instances of mem.c

Now, let's run multiple instances of this same program at the same time to see what happens. The output below shows that each running program has allocated memory at the same address (0x200000), and yet each seems to be updating the value at 0x200000 independently! It is as if each running program has its own private memory, instead of sharing the same physical memory with other running programs.

prompt> ./mem 0 & ./mem 0 &
[1] 24113
[2] 24114
(24113) address pointed to by p: 0x200000
(24114) address pointed to by p: 0x200000
(24113) p: 1
(24114) p: 1
(24114) p: 2
(24113) p: 2
(24113) p: 3
(24114) p: 3
(24113) p: 4
(24114) p: 4
...


For the above example to work, you need to make sure address-space randomization is disabled; randomization, as it turns out, can be a good defense against certain kinds of security flaws. Read more about it on your own, especially if you want to learn how to break into computer systems via stack-smashing attacks. Not that we would recommend such a thing…

Indeed, that is exactly what is happening here as the OS is virtualizing memory. Each process accesses its own private virtual address space (sometimes just called its address space), which the OS somehow maps onto the physical memory of the machine. A memory reference within one running program does not affect the address space of other processes (or the OS itself); as far as the running program is concerned, it has physical memory all to itself. The reality, however, is that physical memory is a shared resource, managed by the operating system. Exactly how all of this is accomplished is also the subject of the first part of this course, on the topic of virtualization.
