Windows Memory Management Basics
Memory Management
One of the most important and least understood tasks an administrator will have to do is manage the memory used by the operating system and applications. This section will cover the various types of operating system memory tuning and practical recommendations on how to use them. There are several types of memory tuning available, and each has its own advantages, disadvantages, and uses.
This is not intended as a complete guide to the operating system memory manager. Rather, this is a brief overview designed to give the reader enough of an understanding of the memory manager to comprehend the various memory tuning techniques available and the implications that can arise from their usage.
Before we move into the various types of memory tuning, however, we’re going to take a brief tour of the operating system memory manager. This will help us to understand when and why the various types of memory tuning should be employed.
Understanding The Memory Manager
Windows NT 3.1, Windows NT 4.0, and their successors (including Windows 2000 and Windows Server 2003; they will be referred to simply as Windows from here on) were all 32-bit operating systems until the release of the 64-bit versions of Windows Server 2003. You’re probably thinking “This is all very fascinating, but so what?” The “so what” part of this is answered directly by the fact that the amount of memory the operating system can address is directly tied to how many bits the operating system is. The 32-bit versions of Windows are able to address 2^32 bytes of memory out of the box, or 4,294,967,296 bytes. To put it more plainly, Windows can address 4 GB of memory. The obvious implication is that the more bits your operating system is, the more memory you can address, so a 64-bit operating system would be able to address at most 2^64 bytes of memory (which is 18,446,744,073,709,551,616 bytes, a really big number). From this discussion you would think that 32-bit versions of Windows can address only 4 GB of real memory, but that is not necessarily the case thanks to some creative engineering and a little smoke and mirrors.
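The arithmetic above is easy to check. The following is only an illustrative sketch; the helper name addressable_bytes is made up for this example and is not part of any Windows API.

```python
def addressable_bytes(address_bits: int) -> int:
    """Return how many distinct byte addresses fit in the given number of bits."""
    return 2 ** address_bits


GB = 2 ** 30  # one gigabyte as used in this section (2^30 bytes)

print(addressable_bytes(32))        # 4294967296
print(addressable_bytes(32) // GB)  # 4
print(addressable_bytes(64))        # 18446744073709551616
```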
However, before you get excited and start writing applications that will require 4 GB of memory just to load, there are a few ground rules about how all of this memory can be used. Windows utilizes something known as a memory split. This means that by default, the kernel can utilize up to 2 GB of memory in its own separate memory space and each application can utilize up to 2 GB of memory in its own private address space.

Figure 1
An example of a memory split
By now you are probably thinking, “This stuff is all old hat, so what?” Once again, there are some interesting reasons why the memory manager works this way. There are really two main reasons: operating system self protection and application protection. The kernel mode functions get 2 GB all to themselves so that they will have enough room to do whatever they need to do without running out of space. Plain and simple, applications will not run if the kernel space does not have enough memory. Along with this self protection, user mode applications cannot address memory in the kernel mode space, so they cannot accidentally corrupt kernel mode memory and cause a bugcheck.
Notice that previously we said that each application can utilize up to 2 GB of memory. That may seem a little odd given that natively, the operating system can only recognize 4 GB of memory total. For example, imagine a server with 4 GB of RAM (which we’ll call \\myserver for purposes of example) on which the kernel mode functions are using 1 GB of memory, leaving 3 GB for all other applications. Now, also imagine that an instance of SQL Server 2000 is active and using its full range of 2 GB of memory, leaving only 1 GB for all the other applications. Now, add in web services, native usermode applications, and management and monitoring processes, and we are running at a full 3.9 GB of memory used. What happens when a new process is started that needs 200 MB of memory? This single request would put us over the actual amount of memory that the system has in total. So now, with each application getting its own 2 GB range, the memory picture begins to look a little more like Figure 2.

Figure 2
To combat this hard limit of physical memory in the server, the memory manager uses disk space to substitute for real memory. This disk space is called the pagefile. Whenever there is not enough real memory left to satisfy requests, data that is currently in memory but not in use will be written to the pagefile on disk, freeing real memory to satisfy requests (actually, the process is somewhat more complicated than described here, but for our purposes it is sufficient).
Paging is a fairly generic term that refers to the memory manager writing from memory to disk. This can be writing data out to a pagefile or simply saving a file to disk. In this text, the term paging will be used to refer to writing to the pagefile for the purposes of memory management unless otherwise noted.
So far, we have 4 GB of memory in the operating system with 2 GB available to kernel mode functions and 2 GB available to each user mode application. If real memory becomes constrained, items in memory that have not been recently used will be written out to a pagefile to free real memory for more pressing needs, and our memory picture looks more like Figure 3. But what happens when an application needs some of its memory that has been written out to the pagefile?
Figure 3

So that applications do not have to keep track of memory in case it gets moved, or worry about overwriting the memory of other applications and processes, the memory manager virtualizes all of the memory addresses used by user mode applications and processes. For example, imagine that we have two processes, process A and process B, and both ask for 1 MB of memory. Both processes get an answer from the memory manager that they have been allocated one megabyte of memory starting at address 0x28394532. What happens when both applications write to that address? Will one of the processes erroneously overwrite the other’s data? Will one of the processes receive an error? Will one or both of the applications crash?
The answer is none of the above. Both processes will be able to write to seemingly the same address because of the virtualization that goes on behind the scenes. Usermode processes are never able to directly write to real memory and never actually know where their data resides. A user mode process can request a block of memory and write to it. In the meantime, the memory can be written out to the pagefile, and when the application next does something with the memory, the memory manager will go out to the pagefile and retrieve the data for the application. The key thing here is that the application never knew anything about the virtualization process; it simply thinks it is writing to a memory location.
To answer the original question, how does an app get its memory back once it is in the pagefile? It does not get it back. Whenever an application references memory in the pagefile, the memory manager will retrieve the data out of the pagefile for the application because, remember, the application is not responsible for keeping track of where the data really is.
Because of this virtualization, each application can write to the virtual locations 0x00000000 – 0x7FFFFFFF (or 2 GB) without affecting any other process. Each application thinks that its memory location 0x38344134 is the only one on the server when in fact a hundred applications may be simultaneously using that virtual location. All the while, in the background, the memory manager is writing memory for those applications and keeping track of where their data is in real memory and in the pagefile.
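The idea can be sketched with a toy model in which every process has its own table mapping virtual pages to physical frames. This is a deliberately simplified illustration, not how the Windows memory manager is implemented: the class name ToyMemoryManager, the 4 KB page size, and the one-value-per-frame storage are all assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed page size for this toy model


class ToyMemoryManager:
    """Toy model: each process has its own virtual-to-physical page map."""

    def __init__(self):
        self.next_frame = 0    # next unused physical frame number
        self.page_tables = {}  # process name -> {virtual page: frame}
        self.frames = {}       # frame number -> stored value (one per frame)

    def write(self, process, virtual_address, value):
        table = self.page_tables.setdefault(process, {})
        page = virtual_address // PAGE_SIZE
        if page not in table:  # first touch: hand out a fresh frame
            table[page] = self.next_frame
            self.next_frame += 1
        self.frames[table[page]] = value

    def read(self, process, virtual_address):
        page = virtual_address // PAGE_SIZE
        return self.frames[self.page_tables[process][page]]


mm = ToyMemoryManager()
mm.write("A", 0x28394532, "data from A")
mm.write("B", 0x28394532, "data from B")
print(mm.read("A", 0x28394532))  # data from A
print(mm.read("B", 0x28394532))  # data from B
```

Both processes use the same virtual address, yet neither overwrites the other, because the address is looked up in a different table for each process.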
Keeping this in mind, the previous statement that the kernel space and each application could utilize up to 2 GB of memory was somewhat incorrect. Now that the concept of virtual memory has been introduced, it’s far more accurate to state that the kernel mode space as well as each application can utilize up to 2 GB of virtual memory. There are certain portions of the memory manager, however, that cannot be paged to disk.
The kernel address space can also address 2 GB of virtual memory, but the key difference is that all processes in the kernel space share the same 2 GB. Each process does not get its own unique memory space to use without affecting other kernel mode processes. Any usermode thread that enters this space does its work and then returns, losing access to the memory it just used. Because of this, it is very important for drivers and kernel processes to be very good at handling memory. The results of kernel address space memory being corrupted or overwritten by rogue processes are disastrous indeed. Errors in the kernel mode space often lead to the dreaded blue screen of death.
Most memory in the kernel mode space is also virtualized and depending upon the function, some memory can be paged out to disk as well. There is a special type of memory that cannot be paged out to disk, however. One of the key items that resides in that space is the portion of the memory manager that handles virtual address translation and its related items and drivers for disk access. Those must remain resident in physical memory at all times (this will be addressed more later).
Now that you have been inundated with information about the memory manager, it is time to start to put all of that material into context to help you understand what impact memory tuning can have, both negatively and positively, on server performance. Before this is done, however, a few key points about the pagefile and its related functions will be emphasized, because they are absolutely crucial to understanding the good bits later on (namely, memory tuning).
For example, imagine that an application named memoryhog.exe references three pieces of data it has stored in memory: mem1, mem2, and mem3. Mem1 is right where memoryhog.exe left it. The memory manager has not shuffled it around at all. Mem2 hasn’t been moved by the memory manager either, but the virtual memory address no longer points to the data that is still in real memory. The data is not out on the pagefile yet. Mem3, however, has been moved into the pagefile by the memory manager. memoryhog.exe does not know this, of course; it simply references the virtual memory addresses it wrote to in the first place. Which reference will occur faster and why? The answer is mem1, and here is why.
When memoryhog.exe references mem1, nothing “extra” happens. It is right where memoryhog.exe left it and no extra looking and retrieving goes on. When mem2 is referenced, it is a little slower because the memory manager has to repoint the virtual address to the real memory, but the performance hit is negligible because the item still remained in real memory. Mem3, however, must be retrieved from disk, and will be the slowest of all. Disk access times are measured in milliseconds and memory access times are measured in nanoseconds, so every time memory has to be retrieved from disk, it takes approximately one million times longer to retrieve the data than had it still been resident in physical memory.
Whenever data is not where an application originally left it and the memory manager has to go retrieve the data, a pagefault is incurred. There are two types of pagefaults that interest us: soft pagefaults, as occurred with mem2, and hard pagefaults, as occurred with mem3.
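The three cases for memoryhog.exe can be summarized in a small sketch. The Location names and the classify_access helper are hypothetical labels invented for this illustration; they only restate the cases described above and do not correspond to any real memory manager interface.

```python
from enum import Enum


class Location(Enum):
    MAPPED = "in real memory, virtual address still mapped"
    IN_MEMORY_UNMAPPED = "in real memory, virtual address no longer mapped"
    PAGEFILE = "written out to the pagefile"


def classify_access(location):
    """Return the kind of pagefault, if any, that a reference would incur."""
    if location is Location.MAPPED:
        return "no fault"
    if location is Location.IN_MEMORY_UNMAPPED:
        return "soft pagefault"
    return "hard pagefault"


references = {
    "mem1": Location.MAPPED,
    "mem2": Location.IN_MEMORY_UNMAPPED,
    "mem3": Location.PAGEFILE,
}
for name, location in references.items():
    print(name, "->", classify_access(location))

# Rough ratio behind the "one million times" figure:
# nanoseconds per second divided by milliseconds per second.
print(10**9 // 10**3)  # 1000000
```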
Because hard pagefaults are so devastating to system performance, something was needed to help applications use and retain more data in physical memory. Those somethings are what this entire section has been leading up to: memory tuning. Memory tuning means modifying the way the memory manager works to get better performance for critical applications by reducing paging without hurting operating system performance; specifically, using the methods of 4 GB tuning (the /3GB and /USERVA switches), Physical Address Extensions (PAE), and Address Windowing Extensions (AWE).
Breaking the 2GB Barrier
The default 2 GB that a single user mode process can utilize is often insufficient for real world needs. In order to combat this and increase performance (or to prevent performance degradation with increased load), administrators and developers may need to tune the memory usage for applications and for the OS.
The following memory tuning options are discussed below:
· /3GB switch (also known as 4 GB tuning)
· Physical Address Extensions (PAE)
· Address Windowing Extensions (AWE)
While each option has its own advantages and drawbacks, they fundamentally all have the same goal: to increase performance of an application by modifying the way the memory manager works. Ultimately, all three of these options are an attempt to reduce the need to use the disk in the virtual memory process, as paging to disk can be expensive from a speed and I/O perspective. This may not be readily apparent at first, but because you have thoroughly read the previous section, you will understand this as the memory tuning methods are explained.
/3GB boot.ini Switch
/3GB is a switch configured in boot.ini, on the startup line for the specific installation of Windows, that allows an application such as SQL Server to utilize up to 3 GB of virtual memory in its private address space rather than the default maximum of 2 GB. The memory is virtual and can still be paged out if necessary. An example of a boot.ini entry with the switch:
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003, Enterprise" /fastdetect /3GB
The benefits of using /3GB are immediately obvious. An application can map one more GB of memory into its private memory space. Specifically, SQL is able to map 3 GB of data into its memory space, a gain of 50%.
Applications have to be specifically compiled with the IMAGE_FILE_LARGE_ADDRESS_AWARE flag to take advantage of the /3GB switch.
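Whether a given executable carries this flag can be checked by reading its PE header. The sketch below assumes the standard PE layout (the offset of the PE signature stored at 0x3C, followed by a 20-byte COFF header whose last field is Characteristics) and the flag value 0x0020; the function name is_large_address_aware is made up for this example.

```python
import struct

IMAGE_FILE_LARGE_ADDRESS_AWARE = 0x0020  # bit in the COFF Characteristics field


def is_large_address_aware(image: bytes) -> bool:
    """Check the large-address-aware bit in the header of a PE image."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ/PE image")
    # The 4-byte field at offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # Characteristics is the last 2-byte field of the 20-byte COFF header
    # that follows the 4-byte signature.
    (characteristics,) = struct.unpack_from("<H", image, pe_offset + 4 + 18)
    return bool(characteristics & IMAGE_FILE_LARGE_ADDRESS_AWARE)
```

Passing in the bytes of an executable (for example, read with open(path, "rb").read(), where path is whatever file you want to inspect) returns True if the flag is set.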
Applications such as SQL or Exchange that frequently refer to the same data multiple times to satisfy different requests can greatly benefit from an extra gigabyte of memory to map files into, because once the data is resident in memory, SQL or Exchange can refer to it multiple times while still having room for other data that needs to be swapped more frequently.
There are several considerations that an administrator will need to keep in mind before enabling the /3GB switch, however. Usage of this switch will limit the kernel memory space to the remaining 1 GB. In some cases, this can cause undesirable results if the kernel mode space is not large enough (such as a blue screen of death or degraded performance). It would never be wise to allow a single application to starve the kernel of memory, since this could not only affect other applications but could cause the kernel to not have enough memory to continue to function at all. Also, because the memory is still virtual, SQL, or any other application that uses a 3 GB virtual address space, may not necessarily realize a performance benefit.
The /USERVA boot.ini Switch
Because of these possible problems, a new subswitch has been introduced in Windows Server 2003 (as well as Windows XP Service Pack 1 or later) that can only be used in conjunction with the /3GB switch: the /USERVA switch. The /USERVA switch allows an administrator to set the size of the usermode virtual address space to a value between 2 GB and 3 GB. For example, assume an administrator determines (through testing and benchmarks) that the system will perform at its peak with a usermode virtual address space of 2.5 GB, and the kernel will be able to work comfortably with a 1.5 GB virtual address space. The administrator could then modify boot.ini to the following:
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003, Enterprise" /fastdetect /3GB /userva=2560
In some cases, restricting the kernel to only 1 GB of memory may not be desirable, but the kernel may not need 2 GB, either, so the /USERVA switch allows administrators a middle ground to the all or nothing approach of the /3GB switch.
If the /USERVA switch is put into boot.ini without the /3GB switch, it will be ignored.
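The effect of these two switches on the address space split can be sketched as a small calculation. This is a toy model of the rules described in this section only: the function name address_space_split is invented for the example, it does no range checking on the /USERVA value, and it is not a description of how Windows actually parses boot.ini.

```python
import re

TOTAL_MB = 4096  # size of the 32-bit virtual address space in MB


def address_space_split(boot_entry: str):
    """Return (user_mb, kernel_mb) implied by the switches in a boot.ini entry."""
    # Switches follow the quoted description; ignore everything before it.
    _, _, tail = boot_entry.rpartition('"')
    switches = tail.lower().split()
    user_mb = 2048                      # default 2 GB / 2 GB split
    if "/3gb" in switches:
        user_mb = 3072                  # /3GB on its own
        for switch in switches:
            match = re.fullmatch(r"/userva=(\d+)", switch)
            if match:                   # /USERVA only counts alongside /3GB
                user_mb = int(match.group(1))
    return user_mb, TOTAL_MB - user_mb


entry = (r'multi(0)disk(0)rdisk(0)partition(1)\WINDOWS='
         r'"Windows Server 2003, Enterprise" /fastdetect /3GB /userva=2560')
print(address_space_split(entry))  # (2560, 1536)
```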
Despite all of these ominous warnings, using the /3GB switch is a perfectly valid approach to memory tuning. The key to determining whether or not it will be useful in YOUR environment is testing. Usage trends and performance changes can only rarely be predicted, so before and after performance benchmarks and tests are essential to determining whether or not memory tuning is beneficial in a particular case.
You may have noticed that earlier, a mention was made that a condition may exist where total system memory usage is over 4 GB. What does an administrator do then since earlier we pointed out that a 32-bit operating system can only address 4 GB of memory? Simple, make the operating system use more than 32 bits.
Physical Address Extensions (PAE)
PAE is also enabled via a switch in boot.ini: the /PAE switch is added to the startup line for the specific installation of Windows. An example of a boot.ini entry with the switch:
multi(0)disk(0)rdisk(0)partition(2)\WINNT="Windows 2000 Advanced Server" /PAE /basevideo /sos
PAE is a hardware modification that allows a 32-bit hardware platform to address more than 4 GB. Essentially, PAE changes the addressing from 32-bit to 36-bit addressing mode on an Intel server (37-bit on AMD). The amount of addressable memory is 2^n bytes, where n is the number of address bits. So an Intel-based processor with PAE enabled will allow an operating system to address up to 64 GB of memory, since 2^36 bytes works out to 64 GB.
PAE allows the operating system to see and utilize more physical memory than otherwise possible. The obvious benefit to this is that with more real memory available, the operating system will have less need of a paging file, since it is more likely that real memory will be available to service memory requests.
The following is an example to illustrate the above. A server, named MyServer, has 4 GB of memory. SQL is using 2 GB, the various kernel functions are using 1.5 GB, and other applications, such as virus scanners, essential services, and support applications, are using 2 GB of memory. Whoops, that adds up to 5.5 GB of memory. Since normally a server can only support up to 4 GB of memory, this means that at LEAST 1.5 GB worth of data is paged out to disk at any given time (again, this is not strictly true, but this is a light overview of the memory manager). Constantly swapping 1.5 GB of data in and out of the pagefile is a performance intensive task, so it would really be nice if somehow the system could use more than 4 GB of real memory to decrease, if not eliminate, the pressure on the paging file.
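The numbers in this example can be worked through in a few lines. This sketch uses the same simplification as the text (everything that does not fit in real memory is assumed to be in the pagefile); the helper name paged_out_gb and the 8 GB comparison case are assumptions added for illustration.

```python
def paged_out_gb(physical_gb, demands_gb):
    """Minimum data (GB) that cannot fit in real memory, in this simplified model."""
    return max(0.0, sum(demands_gb.values()) - physical_gb)


demands = {"SQL": 2.0, "kernel": 1.5, "other applications": 2.0}
print(sum(demands.values()))       # 5.5
print(paged_out_gb(4.0, demands))  # 1.5
print(paged_out_gb(8.0, demands))  # 0.0 (hypothetical server with 8 GB usable)
```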
Enter PAE as the solution. With PAE, a server can recognize and use more than 4 GB of REAL memory. This is a great performance booster because, even though all memory is virtualized for applications (remember, meaning they never know where it is in real memory or if it has been pushed out to the pagefile), data does not get pushed out to the pagefile unless there is a deficiency of real memory. Because of this, increasing the amount of real memory available for use decreases the amount of pagefile usage that takes place and will therefore increase performance.
How does this help SQL? It does not help out directly. While applications directly take advantage of the /3GB switch, the performance gains that come from PAE are hidden from the application. In fact, applications are completely unaware of PAE or the notion of more or less than 4 GB of memory. Applications get an indirect performance boost because more of their data can remain resident in real memory on a system with more than 4 GB of memory and PAE enabled. That is good news, because applications do not need to do anything to take advantage of PAE; it just happens.
Combining /PAE and /3GB
For some, better news may be that you can combine both the /3GB and /PAE switches to allow applications (properly compiled, of course) to use up to 3 GB of memory and for more of that data to remain resident in real memory, providing a performance boost from both ends. There is a catch, however: as the saying goes, the candle that burns twice as bright also burns out twice as fast.
Earlier it was mentioned that the memory manager virtualizes memory for applications and for portions of the kernel so they do not have to do the work of keeping track of where their data is. Because of this, the memory manager must index where in real memory or in the pagefile data is, so that when an application asks for the data at 0x12345678, the memory manager can look in a translation table, see that the data is really in the pagefile, and go get it for the application. One of the structures involved in this translation and lookup process is called a page frame number (PFN).
Because PAE creates a larger range of physical addresses for the memory manager to keep track of and be able to index, the number of PFN entries that are required grows dramatically. On a system that is booted with both the /PAE and /3GB switches, an interesting thing happens. The amount of memory that must be indexed and translated in a lookup table is dramatically increased, more of the key data structures involved in that lookup process are needed, and the area of memory where those structures are stored, kernel mode memory, is capped at 1 GB.
This combination will exhaust system kernel space much earlier than normal. Because of this, the memory manager imposes a hard cap of 16 GB on a system booted with both the /3GB and /PAE boot options. Even if a system has 32 GB, if it is booted with both options then only 16 GB will be recognized. So if your application instance requires more than 16 GB of memory, you cannot mix these two memory models.
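The limits discussed so far can be collected into one small sketch. It encodes only the figures given in this section (4 GB without PAE, 64 GB with 36-bit PAE, 16 GB when /PAE and /3GB are combined) and ignores anything else that may restrict memory on a real server, such as limits of a particular Windows edition; the function name usable_memory_gb is made up for this example.

```python
def usable_memory_gb(installed_gb, pae=False, three_gb=False):
    """Physical memory the OS would recognize under the limits described above."""
    if not pae:
        limit = 4    # 2^32 bytes without PAE
    elif three_gb:
        limit = 16   # hard cap when /PAE and /3GB are combined
    else:
        limit = 64   # 2^36 bytes with 36-bit PAE addressing
    return min(installed_gb, limit)


print(usable_memory_gb(32))                           # 4
print(usable_memory_gb(32, pae=True))                 # 32
print(usable_memory_gb(32, pae=True, three_gb=True))  # 16
```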
Even though the memory manager imposes a hard limit of 16 GB in this configuration, it is possible to encounter problems even with lesser amounts of memory (say, 8 GB or 12 GB), so it is always a good idea to give the kernel as much room as possible either by reducing the memory load or by using the /USERVA sub-switch to increase kernel memory.
But what if 3 GB is not enough for an application? What if an application needs to use 5 GB of real memory and cannot ever have that memory paged out to disk? There is a way to have an application use gigabytes of memory and ensure that data mapped into that real memory is never written to the pagefile: Address Windowing Extensions (AWE), which SQL Server 2000 supports.
Address Windowing Extensions
AWE is an API set that complements PAE. Unlike PAE, AWE is not a boot option and applications are not ignorant of its existence or function. Quite the opposite: applications must directly invoke the AWE APIs in order to use them. Applications must be developed using the AWE API set in order to benefit from it. Do not worry, the specifics of the AWE API are beyond the scope of this book and will not be covered here.
To learn about specific APIs and get code samples, please see MSDN online at http://msdn.microsoft.com.
Since we are not getting into code samples of AWE or even looking at the API set, we’re going to look simply at the functionality it provides and how it affects memory usage.
Because an application’s virtual address space can only be extended to 3 GB on the IA32 platform (the industry name for the Intel 32-bit platform), applications (and database applications in particular, since they deal with such large datasets and do a lot of swapping of data from disk to memory) needed a method to map large portions of data into memory and keep that data in real memory at all times to increase performance. This also allows an application to map and remap data into its virtual memory space very quickly without failures.
AWE does just that. AWE allows an application to “reserve” chunks of real memory that cannot be paged out to the paging file or otherwise manipulated except by the reserving application. AWE will keep the data in real memory at all times. Because the memory manager cannot manage this memory, it will never be swapped out to the paging file. The application is now completely responsible for handling the memory in a responsible manner.
Because this technique is only useful if an application is able to reserve large chunks of memory, it is a technology that best complements existing technology, namely PAE, rather than existing as a standalone memory tuning solution.
The same dangers exist with AWE that exist with all of the other memory tuning techniques, however. Using any one or a combination of these methods to boost SQL performance via memory tuning can starve other applications or even the kernel itself of memory that it needs to perform adequately. In some cases, the starvation caused by weighing memory too heavily in favor of an application can cause performance degradation in other areas severe enough to cause an instance’s performance to drop.