<translate>
<!--T:1-->
KSM (Kernel Samepage Merging) is a memory de-duplication feature developed by Red Hat. It first appeared in Linux kernel version 2.6.32.
<!--T:2-->
KSM replaces RAM pages of identical content with a single write-protected page. If a process later tries to modify that page, the kernel automatically copies it to a new page first, which makes the de-duplication mechanism transparent to applications. This strategy is commonly known as COW (''Copy On Write'').
<!--T:3-->
Although KSM was originally developed for use with KVM, it can be used with OpenVZ containers as well.
== Performance and overhead ==<!--T:4-->
KSM is most effective when the hardware node (HN) has plenty of RAM and is running many containers with the same applications (e.g. an HN running 50 containers with Apache, PHP and MySQL).
<!--T:5-->
The amount of RAM that can be saved is in the 30-50% range depending on the application, with applications that have a mixed I/O and CPU footprint (e.g. MySQL and Apache) getting the best results.<ref>[http://www.researchgate.net/publication/220946080_An_Empirical_Study_on_Memory_Sharing_of_Virtual_Machines_for_Server_Consolidation ''An Empirical Study on Memory Sharing of Virtual Machines for Server Consolidation'']</ref>
<!--T:6-->
KSM and Virtuozzo use entirely different strategies for RAM deduplication. KSM provides a constantly running daemon (called ''ksmd'') which scans the HN's memory and gradually merges identical RAM pages over time, whereas Virtuozzo merges requests from different containers to the same physical binaries on disk. As a result, Virtuozzo incurs no overhead at all, while KSM has a 5-10% CPU overhead depending on its configuration (scanning the HN's RAM faster requires more CPU power).
== Enabling KSM on the hardware node ==<!--T:7-->
First, you'll want to verify that KSM support is present and enabled in your OpenVZ kernel:
[root@HN ~]# grep KSM /boot/config-`uname -r`
CONFIG_KSM=y
<!--T:8-->
The KSM daemon is controlled by sysfs files in <code>/sys/kernel/mm/ksm/</code>, readable by all but writable only by root. On your hardware node, if KSM is not yet active, you'll see these default values:
<nowiki>[root@HN ~]# grep -H '' /sys/kernel/mm/ksm/*
/sys/kernel/mm/ksm/sleep_millisecs:20</nowiki>
<!--T:9-->
For the meaning of each parameter, please refer to https://www.kernel.org/doc/Documentation/vm/ksm.txt
<!--T:10-->
To start ''ksmd'', issue
[root@HN ~]# echo 1 > /sys/kernel/mm/ksm/run
<!--T:11-->
You can copy the same command into <code>/etc/rc.local</code> on the HN to make it persistent at boot.
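A slightly more defensive variant for <code>/etc/rc.local</code> could look like the sketch below. It only writes to the <code>run</code> file if it exists and is writable, so boot does not fail noisily on a kernel without KSM support. The function name <code>ksm_enable</code> and the <code>KSM_DIR</code> override are hypothetical and exist only to make the snippet easy to test.

```shell
# Sketch of a boot-time helper for /etc/rc.local on the HN.
# KSM_DIR is a hypothetical override (useful for testing); on a real node
# the default sysfs directory is used.
ksm_enable() {
    ksm_dir="${KSM_DIR:-/sys/kernel/mm/ksm}"
    # Bail out if the control file is missing or not writable
    # (e.g. kernel built without CONFIG_KSM, or not running as root).
    if [ ! -w "$ksm_dir/run" ]; then
        echo "KSM control file $ksm_dir/run is not writable, skipping" >&2
        return 1
    fi
    # Writing 1 starts ksmd scanning, exactly like the manual command.
    echo 1 > "$ksm_dir/run"
}

# In /etc/rc.local one would then simply call:
#   ksm_enable
```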
<!--T:12-->
Verify that ''ksmd'' is running:
[root@HN ~]# ps aux | grep ksmd
root 989187 0.0 0.0 103252 896 pts/0 S+ 09:30 0:00 grep ksmd
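Since ''ksmd'' is a kernel thread, it may appear in the process list even while it is idle, so the <code>run</code> file in sysfs is a more direct indicator of its state. According to the kernel documentation, the file holds 0 (stopped, merged pages kept), 1 (running) or 2 (stopped, all merged pages unmerged). The helper below is a minimal sketch; the name <code>ksm_state</code> and the <code>KSM_DIR</code> override are hypothetical.

```shell
# Print the ksmd state as recorded in the sysfs "run" file.
# KSM_DIR is a hypothetical override for testing; it defaults to the
# real sysfs directory on the HN.
ksm_state() {
    ksm_dir="${KSM_DIR:-/sys/kernel/mm/ksm}"
    case "$(cat "$ksm_dir/run" 2>/dev/null)" in
        0) echo "stopped" ;;      # ksmd not scanning, merged pages are kept
        1) echo "running" ;;      # ksmd is scanning and merging pages
        2) echo "unmerging" ;;    # ksmd stopped, merged pages are unmerged
        *) echo "unknown"; return 1 ;;
    esac
}

# Usage on the HN:
#   ksm_state
```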
== Enabling memory deduplication libraries in containers ==<!--T:13-->
In order to have KSM consider a memory page as a candidate for deduplication, the application itself must mark it as mergeable:
<!--T:14-->
:''KSM only operates on those areas of address space which an application has advised to be likely candidates for merging, by using the madvise(2) system call: int madvise(addr, length, MADV_MERGEABLE).''
<!--T:15-->
Note that most applications do not use madvise() at all: that's why KSM is generally used in conjunction with KVM, which takes care of marking pages as mergeable on behalf of the applications running within each virtual machine.
<!--T:16-->
Luckily, we can override the default behaviour of each application using the '''ksm_preload''' package (http://vleu.net/ksm_preload/), available in the CentOS base repo.
<!--T:17-->
:''Linux ≥ 2.6.32 features a memory-saving mechanism that works by deduplicating areas of memory that are identical in different processes (even if they were generated at runtime and after the fork() of their common ancestors).''
<!--T:18-->
:''This mechanism requires the application to opt-in using the madvise() syscall. KSM Preload enables legacy applications (about any current application) to leverage this system by calling madvise(…, MADV_MERGEABLE) on every heap-allocated pages.''
<!--T:19-->
Do not use the usual <code># yum install ksm_preload</code> command inside your containers, as it will install an unnecessary stream of dependencies. Assuming your container is running on a recent CentOS 6.x template, issue the following commands instead:
<!--T:20-->[root@container /]# cd /usr/local/src
[root@container src]# yum install -y yum-downloadonly
''...bunch of output...''
exiting because --downloadonly specified ''// this is OK''
<!--T:21-->
We can now install just the ksm_preload RPM:
<!--T:22-->[root@container src]# rpm -i ksm_preload-0.10-3.el6.x86_64.rpm --nodeps
== Enabling memory deduplication in applications ==<!--T:23-->
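To make an application take advantage of ksm_preload and use KSM on the HN, add this line to its startup script, before the command that launches the application (assuming your container is running CentOS 6.x x86_64). The variable must be exported so that the launched process inherits it:
export LD_PRELOAD=/usr/lib64/libksm_preload.so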
<!--T:24-->
For example, to make Percona Server use KSM, modify its startup script as follows:
[root@container /]# nano /etc/init.d/mysql
''...''
<!--T:25-->
Then (re)start your Percona Server as usual.
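To confirm that the restarted process really picked up the preloaded library, one option is to look for it in the process's memory map, since shared objects loaded through <code>LD_PRELOAD</code> are listed in <code>/proc/&lt;PID&gt;/maps</code>. The following is a minimal sketch; the function name <code>has_ksm_preload</code> and the PID in the usage comment are hypothetical.

```shell
# Succeed if the given maps file (normally /proc/<PID>/maps) lists the
# ksm_preload library among the objects mapped into the process.
has_ksm_preload() {
    maps_file="$1"
    grep -q 'libksm_preload\.so' "$maps_file"
}

# Usage inside the container, with a hypothetical PID of 1234:
#   has_ksm_preload /proc/1234/maps && echo "ksm_preload is loaded"
```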
== How to check efficiency of KSM ==<!--T:26-->
To check if KSM is actually reducing memory usage, issue this command on the HN:
[root@HN /]# cat /sys/kernel/mm/ksm/pages_sharing
If the value is greater than 0, you're saving memory. Refer to https://www.kernel.org/doc/Documentation/vm/ksm.txt for more details.
<!--T:27-->
To see all the KSM parameters, issue the following command on the HN:
<nowiki>grep -H '' /sys/kernel/mm/ksm/*</nowiki>
<!--T:28-->
On [https://gist.github.com/wankdanker/1206923 this page] you'll find a simple script which displays the same information in MB.
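As a rough estimate, the memory saved is approximately <code>pages_sharing</code> multiplied by the page size, since the kernel documentation describes <code>pages_sharing</code> as the number of additional sites sharing the merged pages. The sketch below is not the linked script; the function name <code>ksm_saved_mb</code> and the <code>KSM_DIR</code> and <code>PAGE_SIZE_BYTES</code> overrides are hypothetical.

```shell
# Estimate the memory saved by KSM, in whole megabytes.
# KSM_DIR and PAGE_SIZE_BYTES are hypothetical overrides for testing;
# by default the real sysfs directory and the system page size are used.
ksm_saved_mb() {
    ksm_dir="${KSM_DIR:-/sys/kernel/mm/ksm}"
    page_size="${PAGE_SIZE_BYTES:-$(getconf PAGESIZE)}"
    pages_sharing=$(cat "$ksm_dir/pages_sharing") || return 1
    # pages * bytes-per-page, converted to MB (integer division, rounds down)
    echo $(( $pages_sharing * $page_size / 1024 / 1024 ))
}

# Usage on the HN:
#   echo "KSM is saving about $(ksm_saved_mb) MB"
```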
== Tuning ==<!--T:29-->
On a production machine, you'll want to modify some of the default values. A saner value for <code>/sys/kernel/mm/ksm/sleep_millisecs</code> is usually between 50 and 250 (YMMV though):
<!--T:30-->[root@HN ~]# echo 50 > /sys/kernel/mm/ksm/sleep_millisecs
== Caveats ==<!--T:31-->
The ksmd daemon takes one or two minutes to start deduplicating memory and requires several more minutes to reach a stable state. During the boot phase your HN could start swapping if you have heavily overcommitted your RAM. You might want to use more aggressive settings (higher <code>pages_to_scan</code>, lower <code>sleep_millisecs</code>) at the beginning, effectively trading CPU utilization for a lower chance of swapping to disk, and then relax them after 10 minutes or so. Another possibility is to place your swap on an SSD.
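The "aggressive first, relaxed later" approach could be sketched as follows. The concrete numbers (1000 pages and 20 ms during warm-up, then 100 pages and 50 ms) are illustrative assumptions, not recommendations, and the function names and the <code>KSM_DIR</code> override are hypothetical.

```shell
# Set the two main ksmd tunables: ksm_tune <pages_to_scan> <sleep_millisecs>
# KSM_DIR is a hypothetical override for testing; it defaults to sysfs.
ksm_tune() {
    ksm_dir="${KSM_DIR:-/sys/kernel/mm/ksm}"
    echo "$1" > "$ksm_dir/pages_to_scan" &&
    echo "$2" > "$ksm_dir/sleep_millisecs"
}

# Scan aggressively for a while, then fall back to relaxed settings:
# ksm_warmup <seconds>
ksm_warmup() {
    ksm_tune 1000 20 || return 1   # aggressive phase (illustrative values)
    sleep "$1"                     # let ksmd catch up during boot
    ksm_tune 100 50                # relaxed phase (illustrative values)
}

# Usage from /etc/rc.local on the HN, relaxing after 10 minutes
# without blocking the rest of the boot sequence:
#   ksm_warmup 600 &
```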
== References ==<!--T:32-->
<!--T:33-->
<references/>
== External links ==<!--T:34-->
<!--T:35-->
* [[wikipedia:Kernel same-page merging]]
</translate>
[[Category: HOWTO]]