
KSM me baby

One common method for increasing density in virtualized workloads today is deduplication. Any storage resource can be deduplicated: the process scans for identical blocks, stores only one copy, and replaces the rest with pointers to that copy.

In the above image the blue blocks are identical. As you can see, only one is stored in hypervisor memory, and it is shared among three different virtual machines. For environments that run identical operating systems or applications, this can free up some memory and allow for greater density.
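The scan-and-point idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 4096-byte page size, the `dedupe_pages` helper, and the three hypothetical VMs are made up for the example, and it leaves out the copy-on-write step a real hypervisor needs when a guest writes to a shared page.

```python
import hashlib

PAGE_SIZE = 4096  # bytes; assumed page size for this illustration


def dedupe_pages(pages):
    """Store each distinct page once and return (store, refs).

    store maps a content hash to the single stored copy;
    refs holds, per input page, the hash that points at that copy.
    """
    store = {}
    refs = []
    for page in pages:
        digest = hashlib.sha256(page).hexdigest()
        # Compare the bytes as well, so a hash collision can never
        # merge two different pages.
        if digest in store and store[digest] != page:
            raise ValueError("hash collision between different pages")
        store.setdefault(digest, page)
        refs.append(digest)
    return store, refs


# Three hypothetical VMs, each with one page identical to the others
# (the "blue block") and one page unique to that VM.
shared = b"\x00" * PAGE_SIZE
vm_pages = [shared, b"\x01" * PAGE_SIZE,
            shared, b"\x02" * PAGE_SIZE,
            shared, b"\x03" * PAGE_SIZE]

store, refs = dedupe_pages(vm_pages)
saved = (len(refs) - len(store)) * PAGE_SIZE
print(f"{len(refs)} pages referenced, {len(store)} stored, {saved} bytes saved")
```

With six pages in and four stored, the sketch saves two pages, or 8192 bytes.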

You might think that something as simple as a hash scan and copy-on-write (COW) pointer replacement would be available in any type 1 hypervisor. VMware ESXi used to have this functionality enabled by default. It is called Transparent Page Sharing (TPS). Beginning with patches to the ESXi 5.x releases, and in ESXi 6.0 onward, sharing between virtual machines has been disabled by default. Pages are now only shared between virtual machines that have the same salt value, so re-enabling it means either giving the virtual machines a matching salt in their .vmx files or changing the host-level salting setting. This decision was likely made due to security implications that are more theoretical than concrete, the two obvious issues being the weakening of ASLR and the possibility of side-channel memory disclosure or Rowhammer-style attacks. Large page support also made TPS less useful, as an entire 2MB memory page is far less likely to be identical to another than a 4KB page is. As such, VMware documentation suggests that this feature is mainly useful during memory overcommit to avoid OOM situations.

Microsoft’s Hyper-V platform has no such memory deduplication technology available.

The Linux kernel has had full-scale memory deduplication available since 2009, when the 2.6.32 kernel merged Kernel Samepage Merging (KSM). It is a fully configurable kernel daemon (ksmd) that scans memory regions an application has marked as mergeable, as QEMU does for KVM guests, and combines identical pages to reduce memory usage. The addition of KSM allowed the KVM hypervisor at the time to run 52 instances of Windows XP, each allocated 1GB of memory, on a 16GB hypervisor host. That’s 325% of the host’s physical memory, or a 225% increase in memory density! Let’s see what it is doing on a production host with minimal OS re-use. This is far from an ideal scenario, but it should yield some benefit.

root@cluster0n0:~# cat /sys/kernel/mm/ksm/pages_sharing 
326715

So we have 326,715 pages being deduplicated against an existing copy. Those are 4096 bytes each. Some simple math shows us we are looking at a savings of about 1.25GiB (1.34GB) of memory due to KSM. Most of this is likely down to the two Windows servers running on the host.
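The arithmetic is easy to check with a short Python sketch. The helper names are made up for illustration, and the 4096-byte page size is an assumption that holds on typical x86-64 hosts (`getconf PAGESIZE` will confirm it on yours).

```python
PAGE_SIZE = 4096  # bytes per page; assumed, verify with `getconf PAGESIZE`


def read_counter(path="/sys/kernel/mm/ksm/pages_sharing"):
    """Read a single integer KSM counter from sysfs."""
    with open(path) as f:
        return int(f.read().strip())


def ksm_saved_bytes(pages_sharing, page_size=PAGE_SIZE):
    """Approximate memory saved: one page per deduplicated mapping."""
    return pages_sharing * page_size


# Value taken from the host output above; on a live host you could use
# read_counter() instead.
saved = ksm_saved_bytes(326715)
print(f"{saved} bytes = {saved / 10**9:.2f} GB = {saved / 2**30:.2f} GiB")
```

For the host above this prints `1338224640 bytes = 1.34 GB = 1.25 GiB`.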

For a more detailed academic overview you can check out this paper. KSM is fully tunable, and you can read about the available options in the Red Hat documentation.
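As a rough starting point for tuning, here is a small Python sketch that reads a few of the KSM controls from sysfs. The `read_ksm_settings` helper is hypothetical. The three file names (`run`, `pages_to_scan`, `sleep_millisecs`) are standard KSM sysfs entries, but check what your own kernel exposes under /sys/kernel/mm/ksm before relying on them.

```python
from pathlib import Path

KSM_DIR = Path("/sys/kernel/mm/ksm")
# A few common KSM controls: whether ksmd is running, how many pages it
# scans per pass, and how long it sleeps between passes.
TUNABLES = ("run", "pages_to_scan", "sleep_millisecs")


def read_ksm_settings(ksm_dir=KSM_DIR, names=TUNABLES):
    """Return {name: int value} for each tunable present in ksm_dir."""
    settings = {}
    for name in names:
        path = Path(ksm_dir) / name
        # Skip anything missing, e.g. on a kernel built without KSM.
        if path.is_file():
            settings[name] = int(path.read_text().strip())
    return settings


if __name__ == "__main__":
    for name, value in read_ksm_settings().items():
        print(f"{name} = {value}")
```

Reading these files works as a regular user. Changing them means writing to the same files as root.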

kyle
