Difference between revisions of "Performance tuning"

From OpenVZ Virtuozzo Containers Wiki
(kernels with different versions)
(Marked this version for translation)
 
(21 intermediate revisions by 8 users not shown)
Line 1: Line 1:
This page describes how to do correct performance measurements on OpenVZ system.
+
<translate>
 +
<!--T:1-->
 +
This page describes how to improve the performance of an OpenVZ system.
  
= Test conditions =
+
== HW node environment tuning == <!--T:2-->
  
== kernels with different versions ==
+
=== Disable unnecessary services === <!--T:3-->
* If you want to compare performance of the kernel on different hosts, or measure OpenVZ performance overhead, it's strongly recommended to compare the same kernel version with similar .config file and on the same linux distribution.
 
  
* If you compare kernels with different versions, please check all .config options that are differ, especially _DEBUG_ options. For example, on unixbench pipe throughput test on 2.6.18 kernel, disabled CONFIG_DEBUG_HIGHMEM option will increase performance up to <font color=red>20%</font>
+
<!--T:4-->
 +
Disable all default services that you do not need to use and then reboot your host.  
  
== running services ==
+
<!--T:5-->
* Before performing test measurement, you should disable '''all''' default services in your runlevel and then '''reboot''' your host.
+
For example, the <code>audit</code> daemon can significantly decrease performance of linux kernel system calls (up to ~<font color=red>20%</font>) even if you do not use any audit rules, or even if you just stopped this service without host reboot!
  
It is not enough to just stop services, because some services, like <code>audit</code> will affect performance even if the daemon is already stopped. For some unixbench tests it can be <font color=red>~20%</font> overhead if <code>auditd</code> was started once on a host before reboot.
+
<!--T:6-->
 +
To setup default services, use <code>chkconfig</code> or <code>ntsysv</code> in RedHat, or <code>rc-update</code> in Gentoo, <code>update-rc.d</code> on Debian
  
On RedHat distributions use <code>chkconfig</code> or <code>ntsysv</code> utility to disable default services. (<code>rc-update</code> in Gentoo, <code>update-rc.dv</code> for Debian)
+
=== Shell scripts performance improvement === <!--T:7-->
  
== filesystem tests ==
+
<!--T:8-->
* If you perform filesystem tests, please keep in mind filesystem type, block size, mount options and so on.
+
To improve performance of small shell scripts, which spends a lot of time starting the shell binary itself (like the shell scripts test from the [http://www.tux.org/pub/tux/niemi/unixbench/ unixbench] package), you can set your <code>LANG</code> environment variable to <code>"C"</code>.
  
For example, ext3 filesystem performance highly depends on journal type and mount options.
+
<!--T:9-->
 +
To see current settings, type  
  
* Also please always note/report IO-scheduler type. Different IO-schedulers can highly affect your tests results (up to <font color=red>30%</font>).
+
  <!--T:10-->
 +
# locale
  
If your kernel support different IO-schedulers, you can get/set the type here:
+
<!--T:11-->
 +
If you want to change it only for the current shell session, do:
  
  # cat /sys/block/hda/queue/scheduler
+
  <!--T:12-->
noop anticipatory deadline [cfq]
+
# export LANG=C
# echo noop > /sys/block/hda/queue/scheduler
 
  
== network isolation ==
+
<!--T:13-->
* You should disable local network/internet connection if your tests doesn't require it.
+
If you want to change the default value, modify the <code>/etc/sysconfig/i18n</code> file.
  
== CPU distribution inside VE on SMP hosts ==
+
<!--T:14-->
* If the number of VE's in your host is more than CPUs number, and there are many tasks/tests running inside each VE, and that tasks are scheduled quite often, it's better to give just one CPU for each VE. In this case the VirtualCPU-scheduler performance overhead can be significantly decreased, and performance can increase up to <font color=red>100%</font>!
+
If your default <code>LANG</code> environment variable was set to something like <code>en_US.UTF-8</code>, you can reduce shell (bash) startup time up to ~<font color=red>15%</font> with <code>LANG=C</code>. 
  
To set the number of CPUs available inside VE use:
+
== Container tuning == <!--T:15-->
  
# vzctl set $VEID --cpus N
+
=== CPU distribution inside container on SMP hosts === <!--T:16-->
  
== network performance ==
+
<!--T:17-->
* please do not use file transferring utilities to test the network performance, because the bottleneck of these tests is usually file system performance - not TCP/IP stack
+
If the total number of containers in your host is more than CPUs number, and there are many '''threads''' running inside each container it is better to give just a single VCPU to each container.
 +
In this case thread memory locality will significantly reduce overhead on SMP memory coherence and overall performance can be increased up to ~<font color=red>50-100%</font>!
  
== network checksumming ==
+
<!--T:18-->
'''TODO'''
+
To set the number of CPUs available inside a container, use:
 +
 
 +
<!--T:19-->
 +
# vzctl set $CTID --cpus N
 +
 
 +
=== Network checksumming === <!--T:20-->
 +
 
 +
<!--T:21-->
 +
RHEL 5 based kernel supports IP checksum offload.
 +
If network ethernet cards in your host support IP checksum offload then you can switch this feature on also for the virtual network devices (venet, veth).
 +
 
 +
<!--T:22-->
 +
To check current offload setting for the hardware ethernet card (eth0, for instance) type
 +
 
 +
  <!--T:23-->
 +
# ethtool -k eth0
 +
 +
Make sure that tx/rx features are switched on.
 +
 
 +
<!--T:24-->
 +
To see current offload settings for the venet0 device, type
 +
 
 +
  <!--T:25-->
 +
# ethtool -k venet0
 +
 
 +
<!--T:26-->
 +
To set offload settings on for the venet0 device, type
 +
 
 +
  <!--T:27-->
 +
# ethtool -K venet0 tx on sg on
 +
 
 +
<!--T:28-->
 +
Note, that 'tx on/off' enables/disables both tx and rx checksumming features for the all venet devices for all containers and HN.
 +
 
 +
<!--T:29-->
 +
The same applies to the veth device except that 'tx on/off' enables/disables tx and rx checksumming features for only given virtual ethernet device in HN and corresponding container.
 +
 
 +
=== Shell scripts performance improvement === <!--T:30-->
 +
 
 +
<!--T:31-->
 +
Please note, that on container creation the default <code>LANG</code> value will be the same as in the HW node. So you can tune it in node (see [[#Shell scripts performance improvement]] above), or set it in container the same way.
 +
 
 +
<!--T:32-->
 +
The second important thing is the locale cache. On <code>rpm</code> based distributions, usually it is created by the <code>glibc-common-XXX.rpm</code> post install script and it can be up to 50 MBytes on some distributions. So on some container templates it can be missed to save disk space. But you can always create it inside container later by the following command (you must be the root user): 
 +
 
 +
<!--T:33-->
 +
# build-locale-archive
 +
 
 +
<!--T:34-->
 +
And again, in some cases shell (bash) startup time can be reduced up to ~<font color=red>15%</font>.
 +
</translate>
 +
 
 +
[[Category: HOWTO]]
 +
[[Category: Troubleshooting]]

Latest revision as of 08:42, 26 December 2015

<translate> This page describes how to improve the performance of an OpenVZ system.

HW node environment tuning

Disable unnecessary services

Disable all default services that you do not need, then reboot the host.

For example, the audit daemon can significantly decrease the performance of Linux kernel system calls (by up to ~20%) even if you do not use any audit rules, and even if you have only stopped the service without rebooting the host.

To configure which services start by default, use chkconfig or ntsysv on Red Hat, rc-update on Gentoo, or update-rc.d on Debian.
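As a rough sketch of how the tools named above are typically invoked, the helper below only prints the command that would disable a boot-time service; it does not run anything. The function name disable_cmd and the choice of auditd as the example service are assumptions made for illustration, and the Gentoo line assumes the service is in the "default" runlevel.

```shell
# Print (do not run) the command that disables a boot-time service
# for the given service-management tool.
# disable_cmd is a hypothetical helper name, not part of any distribution.
disable_cmd() {
    tool=$1
    service=$2
    case "$tool" in
        chkconfig)   echo "chkconfig $service off" ;;
        rc-update)   echo "rc-update del $service default" ;;
        update-rc.d) echo "update-rc.d $service disable" ;;
        *)           echo "unknown tool: $tool" >&2; return 1 ;;
    esac
}

# Example: show the Red Hat style command for the audit daemon.
disable_cmd chkconfig auditd
```

Review the printed command, run it as root, and then reboot the host so that the change takes full effect.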

Shell scripts performance improvement

To improve the performance of small shell scripts that spend most of their time starting the shell binary itself (like the shell scripts test from the unixbench package), set the LANG environment variable to "C".

To see the current settings, type:

  # locale

To change it only for the current shell session, run:

  # export LANG=C

To change the default value, modify the /etc/sysconfig/i18n file (on Red Hat based distributions).

If your default LANG was set to something like en_US.UTF-8, setting LANG=C can reduce shell (bash) startup time by up to ~15%.
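One way to check the effect on your own host is to start the shell many times under each LANG value and compare the elapsed time. The sketch below is only an illustration: spawn_shells is a hypothetical helper name, and it falls back to sh if bash is not installed. LC_ALL is set to empty for each run because a non-empty LC_ALL would override LANG.

```shell
# Start the shell COUNT times under the given LANG value.
# spawn_shells is a hypothetical helper name used for illustration.
spawn_shells() {
    lang=$1
    count=$2
    # Prefer bash (the shell discussed above); fall back to sh.
    shell_bin=$(command -v bash || command -v sh)
    i=0
    while [ "$i" -lt "$count" ]; do
        # An empty LC_ALL is ignored, so LANG takes effect.
        LC_ALL= LANG=$lang "$shell_bin" -c ':'
        i=$((i + 1))
    done
    echo "started $shell_bin $i times with LANG=$lang"
}

# Small demonstration run.
spawn_shells C 20
```

In bash, the two settings can then be compared with "time spawn_shells C 500" and "time spawn_shells en_US.UTF-8 500". The difference depends on the distribution and on whether the locale is actually installed.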

Container tuning

CPU distribution inside container on SMP hosts

If the total number of containers on your host exceeds the number of CPUs, and many threads run inside each container, it is better to give each container a single VCPU. In this case thread memory locality significantly reduces the overhead of SMP memory coherence, and overall performance can increase by up to ~50-100%.

To set the number of CPUs available inside a container, use:

  # vzctl set $CTID --cpus N
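To apply the same limit to many containers, the command can be generated for each container ID. The following is a minimal dry-run sketch: print_cpu_limits is a hypothetical helper name, the IDs 101 and 102 are made-up sample input, and the --save flag (which stores the setting in the container configuration) is added here as an assumption about what you want.

```shell
# Read container IDs (one per line) on stdin and print the vzctl command
# that would limit each container to the given number of CPUs.
# Nothing is executed; the commands are only printed.
print_cpu_limits() {
    cpus=$1
    while read -r ctid; do
        # Skip empty lines.
        [ -n "$ctid" ] || continue
        echo "vzctl set $ctid --cpus $cpus --save"
    done
}

# Demonstration with made-up container IDs.
printf '101\n102\n' | print_cpu_limits 1
```

On a real hardware node, the IDs could come from "vzlist -H -o ctid" instead of the sample input. Review the printed commands before running them as root.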

Network checksumming

RHEL 5 based kernels support IP checksum offload. If the Ethernet cards in your host support IP checksum offload, you can also switch this feature on for the virtual network devices (venet, veth).

To check the current offload settings for a hardware Ethernet card (eth0, for instance), type:

  # ethtool -k eth0

Make sure that the tx/rx features are switched on.

To see the current offload settings for the venet0 device, type:

  # ethtool -k venet0

To switch the offload settings on for the venet0 device, type:

  # ethtool -K venet0 tx on sg on

Note that 'tx on/off' enables or disables both the tx and rx checksumming features for all venet devices, in all containers and on the hardware node (HN).

The same applies to the veth device, except that 'tx on/off' enables or disables tx and rx checksumming only for the given virtual Ethernet device on the HN and in the corresponding container.
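The check described above can be scripted by filtering the output of "ethtool -k". The sketch below is an illustration only: checksum_features_off is a hypothetical helper name, and the sample input is hand-written to resemble the "name: on/off" lines that ethtool prints, so the exact format on your version may differ.

```shell
# Read "ethtool -k DEVICE" output on stdin and print the names of the
# rx/tx checksumming features that are not switched on.
checksum_features_off() {
    awk -F': *' '
        /^[ \t]*(rx|tx)-checksumming:/ && $2 !~ /^on/ {
            sub(/^[ \t]+/, "", $1)   # strip leading indentation
            print $1
        }'
}

# Demonstration with hand-written sample input (not real ethtool output).
printf '%s\n' \
    'Features for eth0:' \
    'rx-checksumming: on' \
    'tx-checksumming: off' | checksum_features_off
```

On a real host the function would be fed from the tool itself, for example "ethtool -k venet0 | checksum_features_off". No output means both features are reported as on.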

Shell scripts performance improvement

Note that when a container is created, its default LANG value is the same as on the HW node. You can therefore tune it on the node (see #Shell scripts performance improvement above) or set it inside the container in the same way.

The second important thing is the locale cache. On rpm based distributions it is usually created by the glibc-common-XXX.rpm post-install script, and it can be up to 50 MB on some distributions. Some container templates therefore omit it to save disk space. You can always create it inside the container later with the following command (you must be the root user):

  # build-locale-archive

Again, in some cases this can reduce shell (bash) startup time by up to ~15%. </translate>
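Before rebuilding the locale cache, it may help to check whether it is already present in the container. The sketch below assumes the usual glibc location /usr/lib/locale/locale-archive, which may differ on your distribution; locale_archive_status is a hypothetical helper name.

```shell
# Report whether a non-empty locale archive exists.
# The default path is an assumption based on the usual glibc layout.
locale_archive_status() {
    archive=${1:-/usr/lib/locale/locale-archive}
    if [ -s "$archive" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

# Check the default location.
locale_archive_status
```

If the result is "missing", running build-locale-archive as root inside the container should create the archive.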