Setting UBC parameters

From OpenVZ Virtuozzo Containers Wiki
Revision as of 22:25, 12 July 2011 by 165.98.114.94 (talk) (Are you aware of a script that would run as a cron job and do the OOM math on a regular basis and email me if I've hit the red zone or maybe at 85-90%?)

This page summarizes a forum thread. To keep it simple and close to the original format, it is presented as questions and answers.

Question: I want to know how much RAM is allocated to a particular container. I ran cat /proc/user_beancounters from inside my container and got the results below, but I am not sure how to interpret them.

root@srv1 [~]# cat /proc/user_beancounters
Version: 2.5
      uid  resource           held    maxheld    barrier      limit    failcnt
    10039: kmemsize        5125208    5128321   40098656   44108521          0
           lockedpages           0          0        881        881          0
           privvmpages       77431      77666     750000     825000          0
           shmpages           9051       9051      33324      33324          0
           dummy                 0          0          0          0          0
           numproc              67         67        440        440          0
           physpages         44243      44371          0 2147483647          0
           vmguarpages           0          0     125000 2147483647          0
           oomguarpages      59239      59367     125000 2147483647          0
           numtcpsock           37         38        440        440          0
           numflock              3          3        704        704          0
           numpty                1          1         44         44          0
           numsiginfo            0          1       1024       1024          0
           tcpsndbuf         79920      88800    4212558    6014798          0
           tcprcvbuf          2220       4440    4212558    6014798          0
           othersockbuf      19552      91280    2106279    3908519          0
           dgramrcvbuf           0       2220    2106279    2106279          0
           numothersock         18         20        440        440          0
           dcachesize       406435     410022    8750726    9013248          0
           numfile            1080       1081       7040       7040          0
           dummy                 0          0          0          0          0
           dummy                 0          0          0          0          0
           dummy                 0          0          0          0          0
           numiptent            71         71        512        512          0

Answer: Let's look at just these UBC parameters:

           kmemsize        5125208    5128321   40098656   44108521          0
           privvmpages       77431      77666     750000     825000          0
           physpages         44243      44371          0 2147483647          0

  • kmemsize: Kernel memory; 5125208 bytes are in use. Kernel memory holds process information and cannot be swapped.
  • privvmpages: Private virtual memory (in fact mem + swap): 77431 pages are ALLOCATED, but probably not all used (each page is 4 KB).
  • physpages: Physical pages: 44243 pages of memory are really USED, out of what was allocated above.
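
To make the page arithmetic concrete, here is a minimal sketch that converts the held values above into bytes and MiB. It assumes the 4096-byte page size stated in the answer; the helper names pages_to_bytes and pages_to_mib are made up for illustration.

```shell
#!/bin/bash
# Assumption: one page is 4096 bytes, as stated in the answer above.
PAGE_SIZE=4096

# Hypothetical helper: page count -> bytes.
pages_to_bytes() {
    echo $(( $1 * PAGE_SIZE ))
}

# Hypothetical helper: page count -> whole MiB (rounded down).
pages_to_mib() {
    echo $(( $1 * PAGE_SIZE / 1048576 ))
}

echo "privvmpages held: $(pages_to_bytes 77431) bytes, about $(pages_to_mib 77431) MiB"
echo "physpages held:   $(pages_to_bytes 44243) bytes, about $(pages_to_mib 44243) MiB"
```

Integer division rounds down, which is good enough for a quick reading of the counters.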

Question: Which parameters tell us how much RAM is guaranteed and how far we can burst?

           vmguarpages           0          0     125000 2147483647          0
           oomguarpages      59239      59367     125000 2147483647          0

The guarantee is the vmguarpages barrier = 125000 pages. In any case, the container will be able to allocate that amount of memory. Well... in almost any case, with one exception...

...when the whole node is in an out-of-memory situation. In that case, the guarantee is the oomguarpages barrier (the same 125000 pages in this example). This means that no container process will be killed if the container's memory usage is less than 125000 * 4096 bytes. If it is more, some process can be killed, and the fail counter (failcnt) of oomguarpages will be increased.

The maximum amount of memory that can be allocated is defined by privvmpages (see the previous answer).

Question: Can we elaborate further?

With the help of your answer, we can say that the guaranteed RAM (vmguarpages) in the given example is 125000 * 4096 = 512000000 bytes (488.28125 MB).

Further, we may also calculate privvmpages (RAM+SWAP) as 750000 * 4096 = 3072000000 bytes (2929.6875 MB).

Would I be wrong to note that the RAM to SWAP allocation ratio is about 1:5? There seems to be a wide gap between the two, am I right? I used to think the ratio of RAM to SWAP should usually be 1:2.

What I also fail to understand is that, of the two parameters (vmguarpages and oomguarpages), only oomguarpages seems to be used to some extent. vmguarpages seems to be completely unused. What does this situation indicate? Or am I missing something?

Answer:

You are a bit wrong here. Let me explain it more carefully.

The vmguarpages barrier is a guarantee: in any case (except out-of-memory, aka OOM) you will be able to allocate 125000 pages. Well, you will probably be able to allocate more, up to 750000 pages (and if you are a cool high-priority process, you will be able to allocate 825000 pages, but definitely not more). We don't say anything about RAM or SWAP here. These pages can be swapped out by the hardware node's kernel if needed, and can live in RAM if possible. Frankly, you can't say "I want this container to have X MB RAM and Y MB SWAP". There is no such parameter.

vmguarpages does not account for anything, so its current usage is always zero.

The 'current usage' field of the oomguarpages parameter accounts for the total amount of RAM+SWAP used.

The physpages parameter accounts for the total amount of RAM used by the container (memory shared between containers is divided in equal fractions and accounted in physpages for all containers which use it). It cannot be limited; it just shows the current situation.

Let's do some math here.

1. Normal situation:

guarantee: vmguarpages barrier = 125000 pages = 488 MB.

current usage: physpages current value = 44243 physical (RAM) pages = 172 MB.

current usage (RAM + SWAP): oomguarpages current value = 59239 pages = 231 MB.

may allocate (but with no guarantee that the allocation will succeed) up to privvmpages barrier = 750000 pages = 2929 MB, and system processes up to privvmpages limit = 825000 pages = 3222 MB.
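
The figures for the normal situation can be pulled out of the beancounters output with a short script. This is a sketch, not an official tool: ubc_value and to_mib are hypothetical helpers, the sample text is copied from the output at the top of this page, and 4096-byte pages are assumed. On a real system you would feed it /proc/user_beancounters instead of the sample.

```shell
#!/bin/bash
# Sample lines copied from the example output on this page.
SAMPLE='    10039: kmemsize        5125208    5128321   40098656   44108521          0
           privvmpages       77431      77666     750000     825000          0
           physpages         44243      44371          0 2147483647          0
           vmguarpages           0          0     125000 2147483647          0
           oomguarpages      59239      59367     125000 2147483647          0'

# ubc_value RESOURCE COLUMN (hypothetical helper, reads beancounters text on stdin)
# COLUMN: 1=held 2=maxheld 3=barrier 4=limit 5=failcnt
# The "uid:" prefix on the first line of a block shifts the fields, so the
# resource name is located first and the column is counted from there.
ubc_value() {
    awk -v res="$1" -v col="$2" '
        { for (i = 1; i <= NF; i++) if ($i == res) { print $(i + col); exit } }
    '
}

# Pages -> whole MiB, assuming 4096-byte pages.
to_mib() {
    echo $(( $1 * 4096 / 1048576 ))
}

guarantee=$(printf '%s\n' "$SAMPLE" | ubc_value vmguarpages 3)
ram_used=$(printf '%s\n' "$SAMPLE" | ubc_value physpages 1)
ram_swap_used=$(printf '%s\n' "$SAMPLE" | ubc_value oomguarpages 1)
may_allocate=$(printf '%s\n' "$SAMPLE" | ubc_value privvmpages 3)
hard_limit=$(printf '%s\n' "$SAMPLE" | ubc_value privvmpages 4)

echo "guarantee:          $(to_mib "$guarantee") MiB"
echo "RAM used:           $(to_mib "$ram_used") MiB"
echo "RAM + SWAP used:    $(to_mib "$ram_swap_used") MiB"
echo "may allocate up to: $(to_mib "$may_allocate") MiB"
echo "hard limit:         $(to_mib "$hard_limit") MiB"
```

The helper prints only the first match, which is enough inside a container, where a single block is visible.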

2. OOM situation.

If your oomguarpages current value + socket buffers current value + kmemsize current value is more than the oomguarpages barrier, you are the bad guy here and we'll kill your process.

Question: Can you illustrate the OOM situation in more detail?

OK, here is the math. OOM situation on the node means that there is no free memory (or sometimes that there is no free low memory).

In that case, the system will count the sum of:

oomguarpages current value (MEM+SWAP actual usage) = 59239 pages = 242642944 bytes

socket buffers current value: 79920 + 2220 + 19552 + 0 = 101692 bytes

kmemsize current value: 5125208 bytes

The sum is 247869844 bytes. If this sum is more than the oomguarpages barrier = 125000 pages = 512000000 bytes (it is not), some of the processes in the container can be killed in an OOM situation.
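
The same sum can be reproduced with shell arithmetic. This is only a worked example with the held values from the output above hard-coded; the variable names are made up, and 4096-byte pages are assumed.

```shell
#!/bin/bash
# Held values taken from the example output on this page.
oomguarpages_held=59239      # pages (MEM+SWAP actually used)
oomguarpages_barrier=125000  # pages
tcpsndbuf=79920              # bytes
tcprcvbuf=2220               # bytes
othersockbuf=19552           # bytes
dgramrcvbuf=0                # bytes
kmemsize=5125208             # bytes

# Assumption: one page is 4096 bytes.
socket_buffers=$(( tcpsndbuf + tcprcvbuf + othersockbuf + dgramrcvbuf ))
oom_usage=$(( oomguarpages_held * 4096 + socket_buffers + kmemsize ))
oom_barrier=$(( oomguarpages_barrier * 4096 ))
percent=$(( oom_usage * 100 / oom_barrier ))

echo "usage:   $oom_usage bytes"
echo "barrier: $oom_barrier bytes"
echo "percent: $percent%"

if [ "$oom_usage" -gt "$oom_barrier" ]; then
    echo "above the barrier: processes may be killed in an OOM situation"
else
    echo "below the barrier"
fi
```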

Question: Is there a way for a container account holder (me) to know that the server has reached an OOM situation?

Answer: 'free' inside a container will show memory statistics for the hardware node (total memory, used memory, total swap, used swap). It is normal for a Linux system to use as much RAM as possible, so the free memory size is usually no more than a few megabytes.

Question: Are you aware of a script that would run as a cron job, do the OOM math on a regular basis, and email me if I've hit the red zone, or maybe at 85-90%?

Answer: We don't have such a script, but you can just take a look at the last column of /proc/user_beancounters. Nonzero values there mean that resource allocations were rejected.
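
Checking the last column can be automated with a few lines of awk. The sketch below is an illustration only: list_failures is a hypothetical helper, and the sample text is made up for the demonstration (the failcnt of 3 on privvmpages does not come from the real output above).

```shell
#!/bin/bash
# Made-up sample in /proc/user_beancounters format; the failcnt of 3 is invented.
SAMPLE='Version: 2.5
      uid  resource           held    maxheld    barrier      limit    failcnt
    10039: kmemsize        5125208    5128321   40098656   44108521          0
           privvmpages       77431      77666     750000     825000          3
           numproc              67         67        440        440          0'

# list_failures (hypothetical helper): read beancounters text on stdin and
# print "resource failcnt" for every line whose last column is nonzero.
# Counting from the end of the line sidesteps the optional "uid:" prefix.
list_failures() {
    awk 'NF >= 6 && $NF ~ /^[0-9]+$/ && $NF > 0 { print $(NF - 5), $NF }'
}

failures=$(printf '%s\n' "$SAMPLE" | list_failures)

if [ -n "$failures" ]; then
    echo "Resources with rejected allocations:"
    echo "$failures"
fi
```

To check the live counters, run list_failures < /proc/user_beancounters instead of using the sample. Because cron normally mails a job's output to the crontab owner, a job that prints only when something failed works as a basic alert. Keep in mind that the fail counters appear to be cumulative, so a nonzero value does not necessarily mean a recent failure.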

Rojoblandino wrote:
Thank you for everything. Unfortunately, the only way I can help is by writing code,
so I am leaving something here that I hope you will enjoy and find very useful. And yes,
there was no such script, but now there is one. Greetings and have fun.
#!/bin/bash
# Openvz Containers Memory Status
# Version 0.8
# Reference: http://wiki.openvz.org/Setting_UBC_parameters

# The program is placed under the GNU Public License (GPL)
# Author: Roberto Blandino Cisneros.
# Under any change please notify me
# rojoblandino@yahoo.com


# Advantages or changes:
# 1) Calculation for vz containers.
# 2) Percentage calculation.
# 3) Rejects invalid characters and out-of-range entries.
# 4) Uses the same order as the vzlist output.

# The usage of this script is easy.
# Each container gets a number according to its
# position in the vzlist output. If you have three
# containers, e.g. eva, gira and huna, and they are
# listed in that order, they will be 1, 2 and 3
# respectively.
# It is as simple as that. If you want to see
# the status of the gira container, type in
# your shell:
# $> bash vmstatus.sh 2
# And you will see the status of this container.

# First thing to do is to count the containers.
# vzlist prints one header line, so subtract 1
# from the number of lines it outputs.
vzn=$(vzlist | wc -l)
n=$((vzn - 1))

case $# in
    1)
        # Get the container to be calculated
        vm_param=$1

        # Just avoiding our human error: accept digits only
        case $vm_param in
            ''|*[!0-9]*)
                echo "That's not a valid number!"
                exit 1
                ;;
        esac

        # Avoiding another mistake: maybe the user forgot how many containers exist
        if [ "$vm_param" -lt 1 ] || [ "$vm_param" -gt "$n" ]; then
                echo "Container out of range, please try a number between 1 and $n"
                exit 1
        fi

        # Take the real position of the container in the vzlist output.
        # To skip the header line we add one, which means the id
        # will be taken from the 2nd line onward.
        vm=$((vm_param + 1))

        # Uncomment this and comment the line above if you want to reverse the order
        #vm=$(( (n - vm_param) + 2 ))

        # Take the id from the vzlist output
        id_vm=`vzlist | head -$vm | tail -1|awk '{print $1}'`

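        # Get the number of the line where the container's block starts.
        # The match is anchored to "<id>:" at the start of the line so that
        # the id cannot match a counter value elsewhere in the file.
        val=$(grep -n "^ *$id_vm:" /proc/user_beancounters | awk -F: '{print $1}')

        # Each block is 24 lines long, so its last line is 23 lines further down
        val=$((val + 23))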

        # If you want to see all values of the vm uncomment this
        #head -$val /proc/user_beancounters  | tail -24

        # The math start here.
        # To understand what i am doing here i will repeat again that
        # you must read carefully the page http://wiki.openvz.org/Setting_UBC_parameters
        # the only thing i do was to get the values explained already there.
        kmemsize=`head -$val /proc/user_beancounters | tail -24 | head -1 | tail -1 | awk '{print $3}'`
        oomguarpages=`head -$val /proc/user_beancounters | tail -24 | head -9 | tail -1 | awk '{print $2}'`
        tcpsndbuf=`head -$val /proc/user_beancounters | tail -24 | head -14 | tail -1 | awk '{print $2}'`
        tcprcvbuf=`head -$val /proc/user_beancounters  | tail -24 | head -15 | tail -1 | awk '{print $2}'`
        othersockbuf=`head -$val /proc/user_beancounters | tail -24 | head -16 | tail -1 | awk '{print $2}'`
        dgramrcvbuf=`head -$val /proc/user_beancounters | tail -24 | head -17 | tail -1 | awk '{print $2}'`
        oomguarpages_barrier=`head -$val /proc/user_beancounters | tail -24 | head -9 | tail -1 | awk '{print $4}'`

        # Three current values need to be summed:
        # oomguarpages current value (MEM+SWAP actual usage), converted from pages to bytes
        # socket buffers current value
        # kmemsize current value
        resp=$(( (oomguarpages * 4096) + tcpsndbuf + tcprcvbuf + othersockbuf + dgramrcvbuf + kmemsize ))

        # Calculating the OOM barrier in bytes
        oom=$(( oomguarpages_barrier * 4096 ))

        # Getting the usage as a percentage of the barrier
        porc=$(( (resp * 100) / oom ))

        # Showing the ID and Percentage of the container
        echo `head -$val /proc/user_beancounters | tail -24| head -1 | awk -F: '{print $1}'`" Status: "$porc"%"

        # Showing the two important values.
        echo -e "System Count:\t-\tOOM Barrier:"
        echo -e $resp"\t-\t"$oom

        # Uncomment this if you just want to know whether usage is below the barrier;
        # the container is healthy if resp is less than oom
        # if let "resp<oom";then
        # Or keep the following if you want to evaluate it as a percentage.
        # 75% is the default; change it to your own warning threshold.
        if let "porc<75";then
            echo "It is ok!"
        else
            echo "Warning: some processes may be killed in an OOM situation!"
        fi
        ;;
    *)
        echo "Just one parameter is allowed for this script"
        echo "e.g. "$0" 1 "
        ;;

esac

Question: Can you please elaborate on the difference between oomguarpages and privvmpages?

Both seem to show mem + swap. Is the difference between the two about pages that are allocated but not used? I mean, privvmpages shows the allocated mem + swap (some of which might not be used), whereas oomguarpages shows the mem + swap that is actually used?

Answer:

Yes, oomguarpages current value shows actual usage of MEM+SWAP.