VMware ballooning is one of the harder concepts to grasp, and there are a lot of misunderstandings out there about this feature. I have been discussing it with customers and students for the last 5 years. This is my attempt to explain ballooning.
VMware ballooning is a memory reclamation technique used when an ESXi host is running low on memory. You should not see ballooning if your host is performing as it should. To understand ballooning, we have to take a look at the following picture:
This picture shows the three levels of memory in a virtual environment. In a physical world we would only have the two top levels (virtual memory & guest physical memory), but in the virtual world we also have the host physical memory. What is important to know is that the hypervisor (ESXi) has no knowledge of what is happening inside the virtual machine (grey area). The hypervisor maps memory when the virtual machine asks for it. It backs the request with “host physical memory”, but only if memory is available. If memory is not available, the memory can be mapped to the .vswp file on a VMFS or NFS datastore instead. The virtual machine has no knowledge of whether its memory is mapped to physical memory or to disk. This is called hypervisor swapping, and the VMkernel uses it only as a last resort.
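To make the bottom layer a bit more concrete, here is a minimal Python sketch of the mapping decision described above. It is a toy model, not ESXi code: the `ToyHost` class, its `map_guest_page` method and the page counts are all made up for illustration, and it assumes the simplest possible policy (use RAM while any is left, otherwise fall back to the .vswp file).

```python
from dataclasses import dataclass, field


@dataclass
class ToyHost:
    """Toy model of the host physical memory layer (hypothetical, not ESXi code)."""
    total_pages: int
    # (vm name, guest physical page) -> "ram" or "vswp"
    backing: dict = field(default_factory=dict)

    def ram_in_use(self):
        # Count how many guest pages are currently backed by host RAM.
        return sum(1 for where in self.backing.values() if where == "ram")

    def map_guest_page(self, vm, guest_page):
        # Back a guest physical page with RAM if any is left, else with the swap file.
        key = (vm, guest_page)
        if key not in self.backing:
            has_ram = self.ram_in_use() < self.total_pages
            self.backing[key] = "ram" if has_ram else "vswp"
        return self.backing[key]


host = ToyHost(total_pages=2)
placements = [host.map_guest_page("vm1", page) for page in range(3)]
print(placements)  # ['ram', 'ram', 'vswp']
```

Notice that the caller (the virtual machine) gets a mapping either way. Nothing in the guest's view tells it that the third page landed on disk, which is exactly why hypervisor swapping hurts.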
Ballooning, in short, is a process where the hypervisor reclaims memory from the virtual machine. It happens when the ESXi host is running out of physical memory: the memory demand of the virtual machines is too high for the host to handle.
Let's take a high-level example:
- Inside a virtual machine you start an application, for instance Solitaire.
- Solitaire will ask the guest operating system (in this case Windows) for memory. Windows will give it memory and map it from virtual memory -> guest physical memory.
- Next, the hypervisor sees the request for memory and maps guest physical memory -> host physical memory.
- Now everything is perfect. You play Solitaire for a few hours, and then you close it down.
- When you close Solitaire, the guest operating system will mark the memory as “free” and make it available for other applications. BUT since the hypervisor does not have access to Windows’ “free memory” list, the memory will still be mapped in “host physical memory” and keep putting memory load on the ESXi host.
- This is where ballooning comes into play. If the ESXi host is running low on memory, the hypervisor will ask the “balloon” driver installed inside the virtual machine (with VMware Tools) to “inflate”.
- The balloon driver will inflate, and because it is “inside” the operating system it will start by taking memory from the “free list”. The hypervisor will detect what memory the balloon driver has claimed and will free it up on the “host physical memory” layer!
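The steps above can be sketched as a small Python simulation. Again, this is a toy model under my own assumptions and not how VMware Tools or the VMkernel are implemented: `ToyGuest`, its methods and the process name "explorer" are hypothetical, pages are just integers, and the guest never swaps.

```python
class ToyGuest:
    """Toy guest OS with a free list; host backing is tracked alongside it."""

    def __init__(self, pages):
        self.free_list = set(range(pages))  # guest physical pages the OS considers free
        self.owner = {}                     # guest page -> process holding it
        self.host_backed = set()            # pages the hypervisor backs with host memory

    def allocate(self, process, count):
        if count > len(self.free_list):
            raise MemoryError("toy guest is out of free pages")
        for _ in range(count):
            page = self.free_list.pop()
            self.owner[page] = process
            self.host_backed.add(page)      # hypervisor maps the page when it is used

    def exit_process(self, process):
        for page in [p for p, o in self.owner.items() if o == process]:
            del self.owner[page]
            self.free_list.add(page)        # guest frees it; the hypervisor cannot see this

    def inflate_balloon(self, count):
        count = min(count, len(self.free_list))
        for _ in range(count):
            page = self.free_list.pop()
            self.owner[page] = "balloon"    # balloon driver holds the page inside the guest
            self.host_backed.discard(page)  # hypervisor can now release the host memory


guest = ToyGuest(pages=10)
guest.allocate("explorer", 4)
guest.allocate("solitaire", 4)
backed_while_running = len(guest.host_backed)

guest.exit_process("solitaire")
backed_after_exit = len(guest.host_backed)     # unchanged: host still backs the freed pages

guest.inflate_balloon(6)
backed_after_balloon = len(guest.host_backed)  # only the pages still in real use remain

print(backed_while_running, backed_after_exit, backed_after_balloon)  # 8 8 4
```

The interesting number is the middle one: closing Solitaire frees memory inside the guest, but the host load only drops once the balloon has inflated.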
The balloon driver can inflate up to a maximum of 65% of the VM's configured memory. For instance, in a VM with 1000MB of memory the balloon can inflate to 650MB, so in case of full inflation the hypervisor gets 650MB of memory reclaimed from this particular VM. The downside is that you risk your VM doing guest OS swapping to its page file! Just remember that page file swapping is better than hypervisor swapping. Hypervisor swapping happens without the guest operating system being aware of it, whereas with page file swapping it is the OS that decides which pages to swap to disk. The way to avoid ballooning is not to uninstall the balloon driver but to create a “Memory Reservation” for the virtual machine.
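The arithmetic is simple enough to write down. The helper below is a hypothetical function of mine, not a VMware API. The 65% figure comes from the text above; the way the reservation is handled is a simplifying assumption (reserved memory is treated as simply off limits for the balloon), so read the reservation numbers as an illustration rather than as exact ESXi behaviour.

```python
def balloon_ceiling_mb(configured_mb, reservation_mb=0, max_percent=65):
    """Rough upper bound on what the balloon can reclaim from one VM, in MB.

    Assumption (mine): memory covered by a reservation cannot be ballooned,
    so the ceiling is the smaller of the 65% limit and the unreserved memory.
    """
    percent_limit = configured_mb * max_percent // 100
    unreserved = max(configured_mb - reservation_mb, 0)
    return min(percent_limit, unreserved)


print(balloon_ceiling_mb(1000))                       # 650
print(balloon_ceiling_mb(1000, reservation_mb=600))   # 400
print(balloon_ceiling_mb(1000, reservation_mb=1000))  # 0
```

The last line is the case described above: with the memory fully reserved there is nothing left for the balloon to take.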
To check for ballooning you can either open ESXTOP or the vCenter Performance Graphs.