Setting UBC parameters
This page summarizes a forum thread. To stay close to the original, it keeps the question-and-answer format.
Contents
- 1 Question: I want to know how much RAM is allocated to a particular container. I ran cat /proc/user_beancounters from inside my container and got the results below, but I am not sure how to interpret them.
- 2 Question: Which parameters tell us how much RAM is guaranteed and how much we can burst up to?
- 3 Question: Can we elaborate further?
- 4 Question: Can you illustrate the OOM situation in more detail?
- 5 Question: Is there a way for a container account holder (me) to know that a server has reached an OOM situation?
- 6 Are you aware of a script that would run as a cron job and do the OOM math on a regular basis and email me if I've hit the red zone or maybe at 85-90%?
- 7 Can you please elaborate the difference between oomguarpages and privvmpages?
Question: I want to know how much RAM is allocated to a particular container. I ran cat /proc/user_beancounters from inside my container and got the results below, but I am not sure how to interpret them.
root@srv1 [~]# cat /proc/user_beancounters
Version: 2.5
uid resource held maxheld barrier limit failcnt
10039: kmemsize 5125208 5128321 40098656 44108521 0
lockedpages 0 0 881 881 0
privvmpages 77431 77666 750000 825000 0
shmpages 9051 9051 33324 33324 0
dummy 0 0 0 0 0
numproc 67 67 440 440 0
physpages 44243 44371 0 2147483647 0
vmguarpages 0 0 125000 2147483647 0
oomguarpages 59239 59367 125000 2147483647 0
numtcpsock 37 38 440 440 0
numflock 3 3 704 704 0
numpty 1 1 44 44 0
numsiginfo 0 1 1024 1024 0
tcpsndbuf 79920 88800 4212558 6014798 0
tcprcvbuf 2220 4440 4212558 6014798 0
othersockbuf 19552 91280 2106279 3908519 0
dgramrcvbuf 0 2220 2106279 2106279 0
numothersock 18 20 440 440 0
dcachesize 406435 410022 8750726 9013248 0
numfile 1080 1081 7040 7040 0
dummy 0 0 0 0 0
dummy 0 0 0 0 0
dummy 0 0 0 0 0
numiptent 71 71 512 512 0
Answer: Let's just look at these three UBC parameters:
kmemsize 5125208 5128321 40098656 44108521 0
privvmpages 77431 77666 750000 825000 0
physpages 44243 44371 0 2147483647 0
- kmemsize: Kernel memory. 5125208 bytes are in use. Kernel memory holds process information and cannot be swapped.
- privvmpages: Private virtual memory (in fact, RAM + swap). 77431 pages are ALLOCATED, though not necessarily used (each page is 4 KB).
- physpages: Physical pages. 44243 pages of memory are really USED, out of what was allocated above.
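The page counts above can be turned into megabytes with a few lines of shell arithmetic. This is only a minimal sketch: the helper name pages_to_mb is made up for illustration, and it assumes the 4 KB page size stated above (on a real system, `getconf PAGESIZE` reports the actual value). "MB" here means 1024 * 1024 bytes, as elsewhere on this page.

```shell
# Assumption: 4096-byte pages, as stated in the answer above.
PAGE_SIZE=4096

# pages_to_mb (hypothetical helper): convert a page count to whole MB,
# rounded down. 1 MB = 1024 * 1024 bytes here.
pages_to_mb() {
    echo $(( $1 * PAGE_SIZE / 1024 / 1024 ))
}

# "held" values copied from the example output above.
echo "privvmpages held: $(pages_to_mb 77431) MB allocated"
echo "physpages held:   $(pages_to_mb 44243) MB actually in RAM"
# kmemsize is already reported in bytes, so no page conversion is needed.
echo "kmemsize held:    $(( 5125208 / 1024 / 1024 )) MB of kernel memory"
```

With the sample values this reports 302 MB allocated, 172 MB in RAM and 4 MB of kernel memory.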
Question: Which parameters tell us how much RAM is guaranteed and how much we can burst up to?
vmguarpages 0 0 125000 2147483647 0
oomguarpages 59239 59367 125000 2147483647 0
The guarantee is the vmguarpages barrier = 125000 pages. In any case, the container will be able to allocate that amount of memory. Well... in almost any case, with one exception...
...when the whole node is in an out-of-memory situation. In that case, the guarantee is the oomguarpages barrier (the same 125000 pages in this example). It means that no container process will be killed if the memory usage of the container is less than 125000 * 4096 bytes. If it is more, some process can be killed, and the failcnt of oomguarpages will be increased.
The maximum amount of memory that can be allocated is defined by privvmpages (see the previous answer).
Question: Can we elaborate further?
With the help of your answer, we can say that the guaranteed RAM (vmguarpages) in the given example is 125000 * 4096 = 512000000 bytes (488.28125 MB).
Further, we may also calculate privvmpages (RAM+SWAP) as 750000 * 4096 = 3072000000 bytes (2929.6875 MB).
Would I be wrong to note that the RAM to SWAP allocation ratio is about 1:5? It seems there is a wide gap between the two, am I right? I used to think the ratio of RAM to SWAP should usually be 1:2.
What I also fail to understand is that, of the two parameters (vmguarpages and oomguarpages), only oomguarpages seems to be used to some extent. vmguarpages seems to be completely unused. What does this situation indicate? Or am I missing something?
Answer:
You are a bit wrong here. Let me explain it more carefully.
The vmguarpages barrier is a guarantee: in any case (except out-of-memory, aka OOM) you will be able to allocate 125000 pages. Well, you will probably be able to allocate more, up to 750000 pages (and if you are a high-priority process, you will be able to allocate 825000 pages, but definitely not more). We don't say anything about RAM or SWAP here. These pages can be swapped out by the hardware node kernel if needed, and can live in RAM if possible. Frankly, you can't say "I want this container to have X MB RAM and Y MB SWAP". There is no such parameter.
vmguarpages does not account for anything, so its current usage is always zero.
The 'current usage' field of the oomguarpages parameter accounts for the total amount of RAM+SWAP used.
The physpages parameter accounts for the total amount of RAM used by the container (memory shared between containers is divided in equal fractions and accounted in physpages for all containers that use it). It cannot be limited; it just shows the current situation.
Let's do some math here.
1. Normal situation:
guarantee: vmguarpages barrier = 125000 pages = 488 MB.
current usage: physpages current value = 44243 physical (RAM) pages = 172 MB.
current usage (RAM + SWAP): oomguarpages current value = 59239 pages = 231 MB.
may allocate (with no guarantee that the allocation will succeed) up to the privvmpages barrier = 750000 pages = 2929 MB, and system processes up to the privvmpages limit = 825000 pages = 3222 MB.
2. OOM situation.
If your oomguarpages current value + socket buffers current value + kmemsize current value is more than the oomguarpages barrier, you are the bad guy here and we'll kill your process.
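The normal-situation figures listed above can be read straight out of the beancounter text. The following is a rough sketch, not an official tool: ubc_field and mb are hypothetical helper names, the sample rows are copied from the output at the top of this page, and 4 KB pages are assumed. The only subtlety is that the first row of each container carries a "uid:" prefix, which shifts its columns by one.

```shell
# ubc_field (hypothetical helper): print one column of one resource row
# from beancounter text on stdin.
# Columns after the resource name: held maxheld barrier limit failcnt.
ubc_field() {
    awk -v res="$1" -v col="$2" '
        BEGIN {
            n = split("held maxheld barrier limit failcnt", names, " ")
            for (i = 1; i <= n; i++) idx[names[i]] = i
        }
        {
            # The first row of a container starts with "uid:", so shift by one.
            off = ($1 ~ /:$/) ? 1 : 0
            if ($(1 + off) == res) { print $(1 + off + idx[col]); exit }
        }'
}

# mb (hypothetical helper): pages -> whole MB, assuming 4096-byte pages.
mb() { echo $(( $1 * 4096 / 1048576 )); }

# Rows copied from the example output on this page.
sample='10039: kmemsize 5125208 5128321 40098656 44108521 0
privvmpages 77431 77666 750000 825000 0
physpages 44243 44371 0 2147483647 0
vmguarpages 0 0 125000 2147483647 0
oomguarpages 59239 59367 125000 2147483647 0'

guar=$(printf '%s\n' "$sample" | ubc_field vmguarpages barrier)
ram=$(printf '%s\n' "$sample" | ubc_field physpages held)
used=$(printf '%s\n' "$sample" | ubc_field oomguarpages held)
soft=$(printf '%s\n' "$sample" | ubc_field privvmpages barrier)
hard=$(printf '%s\n' "$sample" | ubc_field privvmpages limit)

echo "guarantee:          $(mb "$guar") MB"
echo "RAM in use:         $(mb "$ram") MB"
echo "RAM + SWAP in use:  $(mb "$used") MB"
echo "may allocate up to: $(mb "$soft") MB (system processes: $(mb "$hard") MB)"
```

On a live container the same helper could read `/proc/user_beancounters` instead of the sample variable.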
Question: Can you illustrate the OOM situation in more detail?
OK, here is the math. OOM situation on the node means that there is no free memory (or sometimes that there is no free low memory).
In that case, the system will count the sum of:
oomguarpages current value (MEM+SWAP actual usage) = 59239 pages = 242642944 bytes
socket buffers current value: 79920 + 2220 + 19552 + 0 = 101692 bytes
kmemsize current value: 5125208 bytes
The sum is 247869844 bytes. If this sum were more than the oomguarpages barrier = 125000 pages = 512000000 bytes (it is not), some of the processes in the container could be killed in an OOM situation.
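The same sum can be checked with shell arithmetic. This is a small sketch using the values from the example output, with the 4 KB page size assumed as before; the variable names are chosen only for illustration.

```shell
PAGE_SIZE=4096   # assumption: 4 KB pages, as used throughout this page

# Current ("held") values copied from the example output.
oomguarpages_held=59239
oomguarpages_barrier=125000
tcpsndbuf=79920
tcprcvbuf=2220
othersockbuf=19552
dgramrcvbuf=0
kmemsize=5125208

# RAM+SWAP usage in bytes, plus socket buffers, plus kernel memory.
usage=$(( oomguarpages_held * PAGE_SIZE + tcpsndbuf + tcprcvbuf + othersockbuf + dgramrcvbuf + kmemsize ))
barrier=$(( oomguarpages_barrier * PAGE_SIZE ))
percent=$(( usage * 100 / barrier ))

echo "usage:   $usage bytes"      # 247869844
echo "barrier: $barrier bytes"    # 512000000
echo "ratio:   ${percent}%"       # 48%

if [ "$usage" -gt "$barrier" ]; then
    echo "Above the oomguarpages barrier: processes can be killed on OOM."
else
    echo "Below the oomguarpages barrier."
fi
```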
Question: Is there a way for a container account holder (me) to know that a server has reached an OOM situation?
Running 'free' inside a container shows memory statistics for the hardware node (total memory, used memory, total swap, used swap). It is normal for a Linux system to use as much RAM as possible, so the free memory size is usually no more than a few megabytes.
Are you aware of a script that would run as a cron job and do the OOM math on a regular basis and email me if I've hit the red zone or maybe at 85-90%?
We don't have such a script, but you can just take a look at the last column (failcnt) of /proc/user_beancounters. Nonzero values there mean that resource allocations were rejected.
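Checking the last column by hand can be automated. The snippet below is a minimal sketch of that check, not an existing OpenVZ tool: failed_resources is a made-up helper name, and it simply prints every resource whose failcnt is nonzero.

```shell
# failed_resources (hypothetical helper): read beancounter text on stdin
# and print "resource failcnt" for every row with a nonzero last column.
failed_resources() {
    awk '
        # Skip the "Version:" line and the column-header line.
        $1 == "Version:" || $1 == "uid" { next }
        {
            # The first row of a container starts with "uid:", so shift by one.
            off = ($1 ~ /:$/) ? 1 : 0
            name = $(1 + off)
            if (name != "dummy" && $NF + 0 > 0) print name, $NF
        }'
}

# Usage inside a container (reading the file normally requires root):
#   failed_resources < /proc/user_beancounters
```

Empty output means nothing has been rejected so far. From a cron job, the output could be piped to a mailer, assuming one is installed on the system.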
Thank you for everything. Unfortunately, the only way I can help is with code, so I am leaving something here that I hope you will enjoy and find very useful.
#!/bin/bash
# Version 0.6
# Reference: http://wiki.openvz.org/Setting_UBC_parameters
# Author: Roberto Blandino Cisneros.
# rojoblandino@yahoo.com
#
# The usage of this script is easy.
# To understand how it works, you must first know
# the quantity of your containers, so that is the
# first thing we do.
# Each container gets a number. If you have three
# containers, e.g. eva gira huna, the number is taken
# from the order in which they are shown in
# /proc/user_beancounters.
# If the eva, gira and huna containers are shown in that
# order, they will be 1, 2 and 3 respectively.
# It is as simple as that. If you want to see
# the status of the gira container, type in
# your shell:
# $> bash vmstatus.sh 2
# And you will see the status of this container.

# The first thing to do is to get the list of containers
vzn=`vzlist | wc -l`
# Save the count into a variable and drop the first
# garbage line by subtracting 1.
let "n=vzn-1"

case $# in
1)
    # We get the container to be calculated
    vm=$1

    # Guard against human error: reject input that contains letters
    datachk=`echo $vm | grep -e '[a-z]' | wc -l`
    if let "datachk>0"; then
        echo "That's not a valid number!"
        exit
    fi

    # Guard against another mistake: maybe we forgot the quantity of our containers
    if let "vm>n"; then
        echo "Container out of range, please try a number between 1 and "$n
        exit
    fi

    # Calculate the lines to be read from the output of user_beancounters.
    # Because there is a first garbage line, I need 26 lines from the
    # beginning, then I need to show the last 24.
    # (n-vm) is used because I do not want to see it backward. If you want
    # to change the order, swap the comment with the following line:
    # let "val=26+(24*vm)"
    let "val=26+(24*(n-vm))"

    # If you want to see all values of the container, uncomment this
    #head -$val /proc/user_beancounters | tail -24

    # The math starts here.
    # To understand what I am doing here, I repeat that you must read
    # carefully the page http://wiki.openvz.org/Setting_UBC_parameters
    # The only thing I do is get the values already explained there.
    # (The first row carries the "uid:" prefix, so kmemsize is in column 3.)
    kmemsize=`head -$val /proc/user_beancounters | tail -24 | head -1 | tail -1 | awk '{print $3}'`
    oomguarpages=`head -$val /proc/user_beancounters | tail -24 | head -9 | tail -1 | awk '{print $2}'`
    tcpsndbuf=`head -$val /proc/user_beancounters | tail -24 | head -14 | tail -1 | awk '{print $2}'`
    tcprcvbuf=`head -$val /proc/user_beancounters | tail -24 | head -15 | tail -1 | awk '{print $2}'`
    othersockbuf=`head -$val /proc/user_beancounters | tail -24 | head -16 | tail -1 | awk '{print $2}'`
    dgramrcvbuf=`head -$val /proc/user_beancounters | tail -24 | head -17 | tail -1 | awk '{print $2}'`
    oomguarpages_barrier=`head -$val /proc/user_beancounters | tail -24 | head -9 | tail -1 | awk '{print $4}'`

    # We need to add up three current values:
    # oomguarpages current value (MEM+SWAP actual usage)
    # socket buffers current value
    # kmemsize current value
    let "resp=((oomguarpages*4096)+tcpsndbuf+tcprcvbuf+othersockbuf+dgramrcvbuf+kmemsize)"

    # We need to calculate the OOM barrier
    let "oom=(oomguarpages_barrier*4096)"

    # Getting the percentage
    let "porc=(resp*100)/oom"

    # We show the container ID and percentage
    echo `head -$val /proc/user_beancounters | tail -24 | head -1 | awk -F: '{print $1}'`" Status: "$porc"%"

    # We show the two important values.
    echo -e "System Count:\t-\tOOM Barrier:"
    echo -e $resp"\t-\t"$oom

    # Uncomment this if you just want to know whether usage is below the barrier;
    # the container is healthy if resp is less than oom
    # if let "resp<oom"; then
    # Or keep the following if you want to evaluate it as a percentage.
    # 75% is the default; change it to your own warning threshold.
    if let "porc<75"; then
        echo "It is ok!"
    else
        echo "Some of the processes may be killed!"
    fi
    ;;
*)
    echo "Just one parameter is allowed for this script"
    echo "e.g. $0 1"
    ;;
esac
Can you please elaborate the difference between oomguarpages and privvmpages?
Both seem to show mem + swap. Is the difference between the two about allocated but unused pages? I mean, privvmpages shows allocated mem + swap (some of which might not be used), whereas oomguarpages shows mem + swap that is allocated and actually used?
Answer:
Yes, oomguarpages current value shows actual usage of MEM+SWAP.