Virtual memory, or virtual memory addressing, is a memory management technique used by multitasking computer operating systems in which non-contiguous memory is presented to software (a process) as contiguous memory. This contiguous memory is referred to as the virtual address space.
Virtual memory addressing is typically used in paged memory systems. This in turn is often combined with memory swapping (also known as anonymous memory paging), whereby memory pages stored in primary storage are written to secondary storage (often to a swap file or swap partition), thus freeing faster primary storage for other processes to use.
In technical terms, virtual memory allows software to run in a memory address space whose size and addressing are not necessarily tied to the computer's physical memory. To properly implement virtual memory, the CPU (or a device attached to it) must provide a way for the operating system to map virtual memory to physical memory, and to detect when an address is referenced that does not currently correspond to main memory, so that the needed data can be swapped in. While it would certainly be possible to provide virtual memory without the CPU's assistance, doing so would essentially require emulating a CPU that did provide the needed features.
Most computers possess four kinds of memory: registers in the CPU; CPU caches (generally some kind of static RAM), both inside and adjacent to the CPU; main memory (generally dynamic RAM), which the CPU can read and write directly and reasonably quickly; and disk storage, which is much slower but much larger. CPU register use is generally handled by the compiler, and this is not a great burden because data does not generally stay in a register very long. The decision of when to use cache and when to use main memory is generally dealt with by hardware, so both are usually regarded together by the programmer as simply physical memory.
Many applications require access to more information (code as well as data) than can be stored in physical memory. This is especially true when the operating system allows multiple processes/applications to run seemingly in parallel. The obvious response to the problem of the maximum size of the physical memory being less than that required for all running programs is for the application to keep some of its information on the disk, and move it back and forth to physical memory as needed, but there are a number of ways to do this.
One option is for the application software itself to be responsible both for deciding which information is to be kept where, and also for moving it back and forth. The programmer would do this by determining which sections of the program (and also its data) were mutually exclusive, and then arranging for loading and unloading the appropriate sections from physical memory, as needed. The disadvantage of this approach is that each application's programmer must spend time and effort on designing, implementing, and debugging this mechanism, instead of focusing on his or her application; this hampers programmers' efficiency. Also, if any programmer could truly choose which of their items of data to store in the physical memory at any one time, they could easily conflict with the decisions made by another programmer, who also wanted to use all the available physical memory at that point.
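A minimal sketch of this overlay-style approach follows; the file names, buffer size, and helper are invented for illustration. Two mutually exclusive sections share one fixed buffer, and the program explicitly loads whichever section it needs next, evicting the other.

```c
/* Sketch of programmer-managed overlays: two mutually exclusive sections
 * share one fixed buffer, and the program explicitly loads whichever one
 * it needs next.  File names and sizes are hypothetical. */
#include <stdio.h>
#include <string.h>

#define OVERLAY_BUF_SIZE (64 * 1024)

static unsigned char overlay_buf[OVERLAY_BUF_SIZE];
static char current_overlay[64] = "";

/* Load the named section from disk into the shared buffer, replacing
 * whatever section occupied it before. */
static int load_overlay(const char *filename)
{
    if (strcmp(current_overlay, filename) == 0)
        return 0;                              /* already resident */

    FILE *f = fopen(filename, "rb");
    if (!f)
        return -1;
    size_t n = fread(overlay_buf, 1, sizeof overlay_buf, f);
    fclose(f);

    snprintf(current_overlay, sizeof current_overlay, "%s", filename);
    printf("loaded %zu bytes of %s into the overlay area\n", n, filename);
    return 0;
}

int main(void)
{
    load_overlay("section_a.bin");   /* work with section A ...            */
    load_overlay("section_b.bin");   /* ... then evict it to use section B */
    return 0;
}
```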
Another option is to store some form of handle to the data rather than a direct pointer, and let the OS deal with swapping the data associated with those handles between the swap area and physical memory as needed. This works but has a couple of problems: it complicates application code, it requires applications to cooperate (they generally need the power to lock the data into physical memory to actually work on it), and it prevents the language's standard library from doing its own suballocations inside large blocks obtained from the OS to improve performance. The best-known example of this kind of arrangement is probably the 16-bit versions of Windows.
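The following is a rough sketch of the handle-and-lock pattern in C. The names are invented placeholders, not the real 16-bit Windows API, but the shape is similar: a block may be moved or swapped out while it is unlocked, so the application must lock the handle to obtain a usable pointer and must stop using that pointer once it unlocks the handle.

```c
/* Sketch of handle-based memory management with explicit lock/unlock.
 * All names are invented for illustration. */
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    void  *data;    /* current location; may change while unlocked        */
    size_t size;
    int    locked;  /* while locked, the block may not be moved or swapped */
} mem_handle_t;

static mem_handle_t *mem_alloc(size_t size)
{
    mem_handle_t *h = malloc(sizeof *h);
    if (!h)
        return NULL;
    h->data = malloc(size);
    if (!h->data) { free(h); return NULL; }
    h->size = size;
    h->locked = 0;
    return h;
}

static void *mem_lock(mem_handle_t *h)   { h->locked = 1; return h->data; }
static void  mem_unlock(mem_handle_t *h) { h->locked = 0; }

int main(void)
{
    mem_handle_t *h = mem_alloc(1024);
    if (!h) return 1;

    char *p = mem_lock(h);              /* pin the block, get a raw pointer */
    snprintf(p, h->size, "hello");      /* work on the data while pinned    */
    mem_unlock(h);                      /* p must not be used after this    */

    printf("%s\n", (char *)mem_lock(h));
    mem_unlock(h);
    return 0;
}
```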
The modern solution is to use virtual memory, in which a combination of special hardware and operating system software makes use of both kinds of memory to make it look as if the computer has a much larger main memory than it actually does, and to lay that space out differently at will. It does this in a way that is invisible to the rest of the software running on the computer. It usually provides the ability to simulate a main memory of almost any size; in practice there is a limit imposed by the size of the addresses. For a 32-bit system, the total size of the virtual address space is 2³² bytes, or approximately 4 gigabytes; for the newer 64-bit chips and operating systems, which use 64-bit or 48-bit addresses, it can be much larger. Many operating systems do not allow the entire address space to be used by applications, in order to simplify kernel access to application memory, but this is not a hard design requirement.
Virtual memory makes the job of the application programmer much simpler. No matter how much memory the application needs, it can act as if it has access to a main memory of that size and can place its data wherever in that virtual space it likes. The programmer can also largely ignore the need to manage the moving of data back and forth between the different kinds of memory. That said, if the programmer cares about performance when working with large volumes of data, they need to keep the data being accessed at any one time close together, touching as few pages as possible, to avoid unnecessary swapping.
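As a hedged illustration in C (sizes chosen arbitrarily), traversing a large two-dimensional array in the order it is laid out in memory touches each page once, while traversing it in the other order revisits every page on every pass and can cause far more paging when the array does not fit in physical memory.

```c
/* Why access order matters: row-by-row traversal of a row-major array has
 * good locality; column-by-column traversal jumps a full row between
 * consecutive accesses, landing on a different page almost every time. */
#define ROWS 4096
#define COLS 4096

int main(void)
{
    static int a[ROWS][COLS];   /* ~64 MB; sizes are illustrative only */
    long sum = 0;

    /* Good locality: consecutive addresses, few page transitions */
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            sum += a[i][j];

    /* Poor locality: each inner step jumps COLS * sizeof(int) bytes,
     * so nearly every access lands on a different page */
    for (int j = 0; j < COLS; j++)
        for (int i = 0; i < ROWS; i++)
            sum += a[i][j];

    return (int)(sum & 0x7f);
}
```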
Virtual memory is usually (but not necessarily) implemented using paging. In paging, the low order bits of the binary representation of the virtual address are preserved, and used directly as the low order bits of the actual physical address; the high order bits are treated as a key to one or more address translation tables, which provide the high order bits of the actual physical address.
For this reason, a range of consecutive addresses in the virtual address space whose size is a power of two will be translated into a corresponding range of consecutive physical addresses. The memory referenced by such a range is called a page. The page size is typically in the range of 512 to 8192 bytes (with 4 KB currently being very common), though page sizes of 4 megabytes or larger may be used for special purposes. (Using the same or a related mechanism, contiguous regions of virtual memory larger than a page are often mappable to contiguous physical memory for purposes other than virtualization, such as setting access and caching control bits.)
The operating system stores the address translation tables, the mappings from virtual to physical page numbers, in a data structure known as a page table.
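A minimal single-level translation might look like the following sketch in C; the page size, table size, and table contents are invented for illustration.

```c
/* Single-level translation sketch, assuming 4 KB pages: the low 12 bits of
 * the virtual address pass through as the offset, and the high bits index a
 * page table that supplies the physical frame number. */
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12                       /* 4 KB pages */
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define NUM_PAGES  16                       /* tiny toy address space */

/* page_table[virtual page number] = physical frame number */
static uint32_t page_table[NUM_PAGES] = { 7, 3, 12, 5 /* rest zero */ };

static uint32_t translate(uint32_t vaddr)
{
    uint32_t vpn    = vaddr >> PAGE_SHIFT;         /* high-order bits  */
    uint32_t offset = vaddr & (PAGE_SIZE - 1);     /* low-order bits   */
    uint32_t frame  = page_table[vpn];             /* table lookup     */
    return (frame << PAGE_SHIFT) | offset;         /* physical address */
}

int main(void)
{
    uint32_t v = (2u << PAGE_SHIFT) | 0x123;       /* page 2, offset 0x123 */
    printf("virtual 0x%08x -> physical 0x%08x\n", v, translate(v));
    return 0;
}
```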
If a page is marked as unavailable (perhaps because it is not present in physical memory, but is instead in the swap area), then when the CPU tries to reference a memory location in that page, the MMU responds by raising an exception (commonly called a page fault) with the CPU, which then jumps to a routine in the operating system. If the page is in the swap area, this routine invokes an operation called a page swap to bring in the required page.
The page swap operation involves a series of steps. First, it selects a page in memory, for example a page that has not been recently accessed and (preferably) has not been modified since it was last read from disk or the swap area. (See page replacement algorithms for details.) If the page has been modified, the routine writes the modified page out to the swap area. The next step is to read in the information in the needed page (the page corresponding to the virtual address the original program was trying to reference when the exception occurred) from the swap file. When the page has been read in, the tables for translating virtual addresses to physical addresses are updated to reflect the revised contents of physical memory. Once the page swap completes, the routine exits and the program is restarted, returning to the point that caused the exception and continuing as if nothing had happened.
It is also possible that a virtual page was marked as unavailable because the page was never previously allocated. In such cases, a page of physical memory is allocated and filled with zeros, the page table is modified to describe it, and the program is restarted as above.
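The following self-contained toy simulation sketches the fault-handling path just described, covering both the swap-in case and the zero-fill case. All names, sizes, and the round-robin replacement policy are invented for illustration and do not reflect any particular kernel.

```c
/* Toy simulation of page-fault handling: pick a victim frame, write it
 * back if dirty, then either swap in the needed page or zero-fill it,
 * and finally update the page table so the access can be retried. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PAGE_SIZE   4096
#define NUM_VPAGES  8      /* virtual pages in the toy address space */
#define NUM_FRAMES  2      /* physical frames, deliberately scarce   */

typedef struct {
    bool present, allocated, dirty;
    int  frame;            /* valid when present */
} pte_t;

static pte_t   page_table[NUM_VPAGES];
static uint8_t phys_mem[NUM_FRAMES][PAGE_SIZE];
static uint8_t swap_area[NUM_VPAGES][PAGE_SIZE];
static int     frame_owner[NUM_FRAMES] = { -1, -1 };
static int     next_victim;            /* trivial round-robin replacement */

static void handle_page_fault(int vpn)
{
    /* 1. Pick a victim frame; write it back to swap if it was modified. */
    int frame = next_victim;
    next_victim = (next_victim + 1) % NUM_FRAMES;
    int old = frame_owner[frame];
    if (old >= 0) {
        if (page_table[old].dirty)
            memcpy(swap_area[old], phys_mem[frame], PAGE_SIZE);
        page_table[old].present = false;
    }

    /* 2. Fill the frame: swap in, or zero-fill a never-allocated page. */
    if (page_table[vpn].allocated)
        memcpy(phys_mem[frame], swap_area[vpn], PAGE_SIZE);
    else {
        memset(phys_mem[frame], 0, PAGE_SIZE);
        page_table[vpn].allocated = true;
    }

    /* 3. Update the page table; the faulting access is then retried. */
    page_table[vpn].frame = frame;
    page_table[vpn].dirty = false;
    page_table[vpn].present = true;
    frame_owner[frame] = vpn;
}

/* A "memory access" that faults when the page is not resident. */
static uint8_t read_byte(int vpn, int offset)
{
    if (!page_table[vpn].present) {
        printf("page fault on virtual page %d\n", vpn);
        handle_page_fault(vpn);
    }
    return phys_mem[page_table[vpn].frame][offset];
}

int main(void)
{
    read_byte(0, 0);   /* fault: zero-filled          */
    read_byte(1, 0);   /* fault: zero-filled          */
    read_byte(2, 0);   /* fault: evicts virtual page 0 */
    read_byte(0, 0);   /* fault again: brought back in */
    return 0;
}
```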
The translation from virtual to physical addresses is implemented by an MMU (Memory Management Unit). This may be either a module of the CPU, or an auxiliary, closely coupled chip.
The operating system is responsible for deciding which parts of the program's simulated main memory are kept in physical memory. The operating system also maintains the translation tables which provide the mappings between virtual and physical addresses, for use by the MMU. Finally, when a virtual memory exception occurs, the operating system is responsible for allocating an area of physical memory to hold the missing information (and possibly in the process pushing something else out to disk), bringing the relevant information in from the disk, updating the translation tables, and finally resuming execution of the software that incurred the virtual memory exception.
In most computers, these translation tables are stored in physical memory. Therefore, a virtual memory reference might actually involve two or more physical memory references: one or more to retrieve the needed address translation from the page tables, and a final one to actually do the memory reference.
To minimize the performance penalty of address translation, most modern CPUs include an on-chip MMU and maintain a table of recently used virtual-to-physical translations, called a Translation Lookaside Buffer (TLB). Addresses with entries in the TLB require no additional memory references (and therefore time) to translate. However, the TLB can only maintain a fixed number of mappings between virtual and physical addresses; when the needed translation is not resident in the TLB, action will have to be taken to load it in.
On some processors, this is performed entirely in hardware; the MMU has to do additional memory references to load the required translations from the translation tables, but no other action is needed. In other processors, assistance from the operating system is needed; an exception is raised, and on this exception, the operating system replaces one of the entries in the TLB with an entry from the translation table, and the instruction which made the original memory reference is restarted.
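The following toy sketch (invented structures, not any real hardware or kernel interface) shows the TLB acting as a small cache in front of the page table: a hit needs no page-table reference, while a miss falls back to the table and installs the translation, evicting an older entry.

```c
/* TLB as a small cache in front of the page table: hit = translate
 * immediately; miss = walk the table, install the entry, and retry. */
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12
#define TLB_SIZE   4
#define NUM_PAGES  64

typedef struct { uint32_t vpn, frame; int valid; } tlb_entry_t;

static tlb_entry_t tlb[TLB_SIZE];
static uint32_t    page_table[NUM_PAGES];   /* vpn -> frame             */
static int         next_slot;               /* trivial FIFO replacement */

static uint32_t translate(uint32_t vaddr)
{
    uint32_t vpn    = vaddr >> PAGE_SHIFT;
    uint32_t offset = vaddr & ((1u << PAGE_SHIFT) - 1);

    /* TLB hit: no page-table reference needed */
    for (int i = 0; i < TLB_SIZE; i++)
        if (tlb[i].valid && tlb[i].vpn == vpn)
            return (tlb[i].frame << PAGE_SHIFT) | offset;

    /* TLB miss: consult the page table (an extra memory reference on real
     * hardware) and install the translation, evicting an older entry. */
    uint32_t frame = page_table[vpn];
    tlb[next_slot] = (tlb_entry_t){ .vpn = vpn, .frame = frame, .valid = 1 };
    next_slot = (next_slot + 1) % TLB_SIZE;
    return (frame << PAGE_SHIFT) | offset;
}

int main(void)
{
    for (uint32_t i = 0; i < NUM_PAGES; i++)
        page_table[i] = NUM_PAGES - 1 - i;           /* arbitrary mapping */

    uint32_t v = (5u << PAGE_SHIFT) | 0x10;
    printf("first access:  0x%08x\n", translate(v)); /* miss, then fill */
    printf("second access: 0x%08x\n", translate(v)); /* hit in the TLB  */
    return 0;
}
```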
The hardware that supports virtual memory almost always supports memory protection mechanisms as well. The MMU may have the ability to vary its operation according to the type of memory reference (for read, write or execution), as well as the privilege mode of the CPU at the time the memory reference was made. This allows the operating system to protect its own code and data (such as the translation tables used for virtual memory) from corruption by an erroneous application program and to protect application programs from each other and (to some extent) from themselves (e.g. by preventing writes to areas of memory which contain code).
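On Linux and most Unix-like systems, applications can observe this protection machinery directly through the standard mmap() and mprotect() calls. The short example below makes a writable page read-only, after which any write to it would be reported to the process as a protection fault (SIGSEGV).

```c
/* Make an anonymous page read-only with mprotect(); further writes to it
 * would trigger a protection fault delivered to the process as SIGSEGV. */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    long page = sysconf(_SC_PAGESIZE);
    char *p = mmap(NULL, page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    strcpy(p, "hello");                  /* allowed: page is writable */

    if (mprotect(p, page, PROT_READ) != 0) { perror("mprotect"); return 1; }
    printf("%s\n", p);                   /* reads are still allowed   */
    /* p[0] = 'H';  would now raise a protection fault (SIGSEGV)      */

    munmap(p, page);
    return 0;
}
```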
Virtual memory has been a feature of Microsoft Windows since Windows 3.0 in 1990; it was introduced in an attempt to slash the system requirements for the operating system in response to the failures of Windows 1.0 and Windows 2.0. 386SPART.PAR or WIN386.SWP is a hidden file created by Windows 3.x for use as a virtual memory swap file. It is generally found in the root directory, but it may appear elsewhere (typically in the WINDOWS directory). Its size depends on how much virtual memory the system has set up under Control Panel - Enhanced under "Virtual Memory". If a user moves or deletes this file, Windows will show a blue screen of death (BSoD) the next time it is started, with the message "The permanent swap file is corrupt", and will ask the user whether to delete the file (it asks this question whether or not the file exists).
Windows 95 uses a similar file, and the controls for it are located under Control Panel - System - Performance tab - Virtual Memory. Windows automatically sets the page file to start at 1.5× the size of physical memory and to expand up to 3× physical memory if necessary. If a user runs memory-intensive applications on a system with low physical memory, it is preferable to manually set these sizes to a value higher than the default.
Under NT-based versions of Windows (including Windows 2000 and Windows XP) the name is pagefile.sys. The default location of the page file is in the root directory of the partition where Windows is installed. Windows can be configured to use free space on any available drives for page files.
Some believe that a page file can become heavily fragmented and cause performance issues. The common advice given to avoid this problem is to set a single "locked" page file size so that Windows will not resize the page file. Others consider this problematic in the case that a Windows application requests more memory than the total size of physical and virtual memory: in this case, memory is not successfully allocated and programs (including Windows itself) may crash. Supporters of this view note that the page file is rarely read or written in sequential order, so the performance advantage of a completely sequential page file is minimal. It is, however, generally agreed that a large page file will allow use of memory-heavy applications, with no penalty except that more disk space is used.
Defragmenting the page file is also occasionally recommended to improve performance when a Windows system is chronically using much more memory than its total physical memory. In this case, while a defragmented page file can help slightly, performance concerns are much more effectively dealt with by adding more physical memory.