Difference between revisions of "Resource shortage"

From OpenVZ Virtuozzo Containers Wiki
(cpulimit: warn -> warning)
(Additional explanation of cpulimit on multi-core servers.)

Revision as of 10:04, 21 March 2009

Sometimes you see strange failures from some programs inside your container. In some cases, it means one of the resources controlled by OpenVZ has hit its limit.

The first thing to do is to check the contents of the /proc/user_beancounters file in your container. The last column of output is the fail counter. Each time a resource hits the limit, the fail counter is incremented. So, if you see non-zero values in the failcnt column that means something is wrong.
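The check described above can be sketched as a small shell function. This is only an illustration: the function name ubc_failures is made up for this example, and it assumes the column layout shown in the sample output on this page, with failcnt as the last column and the resource name six columns from the right.

```shell
# List every beancounter whose failcnt (last column) is non-zero.
# Reads /proc/user_beancounters by default; another file may be given
# as the first argument (an assumption made here to ease testing).
ubc_failures() {
    # Fields are counted from the right because the first row of each
    # container carries an extra "uid:" field on the left.
    # NF >= 6 skips the "Version:" line; the header row is skipped
    # because the word "failcnt" evaluates to 0 in a numeric context.
    awk 'NF >= 6 && $NF + 0 > 0 { print $(NF - 5), $NF }' \
        "${1:-/proc/user_beancounters}"
}
```

Running ubc_failures inside the container prints one line per affected resource (name and fail count), and prints nothing when all counters are zero.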

There are two ways to fix the situation: reconfigure (in some cases recompile) the application, or change the resource management settings.

UBC parameters

Here is an example of current UBC values obtained from /proc/user_beancounters file in container 123:

# cat /proc/user_beancounters
Version: 2.5
       uid  resource           held    maxheld    barrier      limit    failcnt
       123: kmemsize         836919    1005343    2752512    2936012          0
            lockedpages           0          0         32         32          0
            privvmpages        4587       7289      49152      53575          0
            shmpages             39         39       8192       8192          0
            dummy                 0          0          0          0          0
            numproc              20         26         65         65          0
            physpages          2267       2399          0 2147483647          0
            vmguarpages           0          0       6144 2147483647          0
            oomguarpages       2267       2399       6144 2147483647          0
            numtcpsock            3          3         80         80          0
            numflock              3          4        100        110          0
            numpty                1          1         16         16          0
            numsiginfo            0          1        256        256          0
            tcpsndbuf             0          0     319488     524288          0
            tcprcvbuf             0          0     319488     524288          0
            othersockbuf       6684       7888     132096     336896          0
            dgramrcvbuf           0       8372     132096     132096          0
            numothersock          8         10         80         80          0
            dcachesize        87672      92168    1048576    1097728          0
            numfile             238        306       2048       2048          0
            dummy                 0          0          0          0          0
            dummy                 0          0          0          0          0
            dummy                 0          0          0          0          0
            numiptent            10         16        128        128          0

You can see whether you hit the limit for some UBC parameters by analyzing the last column (named failcnt). It shows the number of failures for this counter, i.e. the number of times the parameter hit its limit. Usually what you need to do is to increase the parameter in question. But you need to do it carefully, and here is how.

  1. Get the current values for the parameter's barrier and limit. For example, we want to increase kmemsize values. From /proc/user_beancounters we see that kmemsize barrier is 2752512, and its limit is 2936012.
  2. Increase the values. Say, we want to double kmemsize. This is how it can be done using built-in bash arithmetic:
    # vzctl set 123 --kmemsize $((2752512*2)):$((2936012*2)) --save
    

    By using the --save flag, we indicate we want to apply the new setting to the running container and save it in the configuration file (from which the settings will be taken during next container start).

  3. Check the new configuration. Issue the following command:
    # vzcfgvalidate /etc/vz/conf/123.conf
    

    If something is wrong, you need to fix it as suggested by the utility.
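Steps 1 and 2 above can be combined in a short helper. The following is a minimal sketch, not an official tool: the function name ubc_scale_cmd and its arguments are invented for this example, it assumes the /proc/user_beancounters layout shown earlier, and it only prints the vzctl command so that you can review it before running it.

```shell
# Print (do not run) a vzctl command that multiplies the barrier and
# limit of one UBC parameter by an integer factor.
# Usage: ubc_scale_cmd CTID PARAM FACTOR [FILE]
# FILE defaults to /proc/user_beancounters; the first matching row is
# used, so the file is expected to describe a single container.
ubc_scale_cmd() {
    ctid=$1 param=$2 factor=$3 file=${4:-/proc/user_beancounters}
    # Barrier and limit are the 3rd and 2nd columns from the right.
    set -- $(awk -v p="$param" \
        'NF >= 6 && $(NF - 5) == p { print $(NF - 2), $(NF - 1); exit }' "$file")
    [ $# -eq 2 ] || { echo "parameter $param not found" >&2; return 1; }
    echo "vzctl set $ctid --$param $(( $1 * factor )):$(( $2 * factor )) --save"
}
```

With the sample values from this page, ubc_scale_cmd 123 kmemsize 2 prints the same command as in step 2. The printed command must be run on the host system, and the result should still be checked with vzcfgvalidate as in step 3.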

For a more in-depth explanation of the different parameters, their meaning and how to set them properly, see setting UBC parameters.

Disk quota

To check if your container exceeded its disk quota, use the following commands (inside a container):

# df
Filesystem           1K-blocks      Used Available Use% Mounted on
simfs                  1048576    327664    720912  32% /
# df -i
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
simfs                 200000   18857  181143   10% /

The first command shows disk space usage and the second command shows inode usage (you can roughly treat the inode count as the number of files and directories on your system).

If one of the commands shows a usage of 100%, you have exceeded one of the disk quota limits.

You can increase the limit from the host system (CT0 aka VE0) only. This is how:

  1. Get the current values for disk quota:
    # vzquota stat 123
       resource          usage       softlimit      hardlimit    grace
      1k-blocks         327664         1048576        1153434
         inodes          18857          200000         220000
    
  2. To increase the disk space quota, use vzctl set --diskspace. For example, we want to increase it by a factor of 2:
    # vzctl set 123 --diskspace $(( 1048576*2 )):$(( 1153434*2 )) --save
    
  3. To increase the disk inodes quota, use vzctl set --diskinodes. For example, we want to increase it by a factor of 3:
    # vzctl set 123 --diskinodes $(( 200000*3 )):$(( 220000*3 )) --save
    
Note: the shell does not support floating-point arithmetic, i.e. you cannot use expressions like $(( 220000*1.5 )). To use floating point, try bc instead, something like this: $(echo "220000*1.5" | bc).
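As an alternative to bc, a fractional factor can also be applied with awk, which has the side benefit of printing a plain integer. This is only a sketch: the helper name scale_quota is hypothetical, and awk is swapped in here for the bc approach mentioned in the note.

```shell
# Multiply an integer quota value by a fractional factor and print the
# result as an integer. The fractional part is truncated, not rounded.
# Usage: scale_quota VALUE FACTOR
scale_quota() {
    awk -v value="$1" -v factor="$2" \
        'BEGIN { printf "%.0f\n", int(value * factor) }'
}
```

For example, to raise the inode quota from the example above by a factor of 1.5, something like vzctl set 123 --diskinodes $(scale_quota 200000 1.5):$(scale_quota 220000 1.5) --save could be used on the host system.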

CPU

There are two parameters controlling the fair CPU scheduler in OpenVZ: cpuunits and cpulimit.

cpuunits

The cpuunits value is set with vzctl set, for example:

vzctl set 101 --cpuunits 1000 --save

If you set cpuunits for one container to one value and for another container to a different value, the CPU time allotted to the two containers will be in the ratio of the two values. Let's use a real example.

We did the following:

vzctl set 101 --cpuunits 1000 --save
vzctl set 102 --cpuunits 2000 --save
vzctl set 103 --cpuunits 3000 --save

If we started a CPU-intensive application in each container, then 103 would be given three times as much CPU time as 101, and 102 would get twice as much as 101 (that is, two thirds of what 103 gets). Here's how to determine the actual shares.

Add the three values: 1000 + 2000 + 3000 = 6000.

101 gets 1000/6000, or 1/6 of the time (about 16.7%).
102 gets 2000/6000, or 1/3 of the time (about 33.3%).
103 gets 3000/6000, or 1/2 of the time (50%).
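The same arithmetic can be done for any set of containers. Here is a minimal sketch; the function name cpu_shares and the CTID=cpuunits argument format are made up for this example, and the result describes only the case where every listed container is competing for CPU at the same time.

```shell
# Print each container's share of CPU time under full contention.
# Arguments are CTID=cpuunits pairs, e.g. 101=1000.
cpu_shares() {
    printf '%s\n' "$@" |
    awk -F= '{ id[NR] = $1; units[NR] = $2; total += $2 }
             END {
                 # Share = own cpuunits divided by the sum of all cpuunits.
                 for (i = 1; i <= NR; i++)
                     printf "%s %.1f%%\n", id[i], 100 * units[i] / total
             }'
}
```

Calling cpu_shares 101=1000 102=2000 103=3000 reproduces the figures above: 16.7%, 33.3% and 50.0%.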

cpulimit

The cpulimit parameter sets an absolute upper limit on the CPU time a container may use, expressed as a percentage. For instance:

vzctl set 101 --cpulimit 25 --save

says that container 101 cannot ever have more than 25 percent of a CPU even if the CPU is idle for the other 75% of the time. The limit is calculated as a percentage of a single CPU, not as a percentage of the server's CPU resources as a whole. In other words, if you have more than one CPU, you can set a cpulimit > 100. In a quad-core server, setting cpulimit to 100 permits a container to consume one entire core (and not 100% of the server).
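Because the limit is relative to a single CPU, a limit meant as a share of the whole server has to be multiplied by the number of CPUs. A small sketch of that conversion follows; the function name is hypothetical, and the CPU count is passed in explicitly rather than detected.

```shell
# Convert "percent of the whole server" into a cpulimit value, which is
# expressed as a percentage of a single CPU.
# Usage: cpulimit_for_server_share PERCENT_OF_SERVER NUMBER_OF_CPUS
cpulimit_for_server_share() {
    echo $(( $1 * $2 ))
}
```

For example, cpulimit_for_server_share 25 4 prints 100: a quarter of a quad-core server corresponds to one entire core, which matches the cpulimit of 100 described above.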

Warning: cpulimit is not yet implemented in kernels > 2.6.18 (i.e. development ones). Use a stable kernel if you want this feature.