Setting UBC parameters
This page summarizes a forum thread. To stay close to the original format, it is presented as questions and answers.
Contents
- 1 Question: I want to know how much RAM is allocated to a particular container. I ran cat /proc/user_beancounters from inside my container and I have the following results below. But I am not sure how to interpret the results.
- 2 Question: What parameters would tell us how much RAM is guaranteed and how much we can burst up to?
- 3 Question: can we elaborate further?
- 4 Question: can you illustrate the OOM situation in more detail?
- 5 Question: Is there a way to know for a container account holder (me) that a server has reached an OOM situation?
- 6 Are you aware of a script that would run as a cron job and do the OOM math on a regular basis and email me if I've hit the red zone or maybe at 85-90%?
- 7 Can you please elaborate the difference between oomguarpages and privvmpages?
Question: I want to know how much RAM is allocated to a particular container. I ran cat /proc/user_beancounters from inside my container and I have the following results below. But I am not sure how to interpret the results.
root@srv1 [~]# cat /proc/user_beancounters
Version: 2.5
      uid  resource           held    maxheld    barrier      limit    failcnt
    10039: kmemsize        5125208    5128321   40098656   44108521          0
           lockedpages           0          0        881        881          0
           privvmpages       77431      77666     750000     825000          0
           shmpages           9051       9051      33324      33324          0
           dummy                 0          0          0          0          0
           numproc              67         67        440        440          0
           physpages         44243      44371          0 2147483647          0
           vmguarpages           0          0     125000 2147483647          0
           oomguarpages      59239      59367     125000 2147483647          0
           numtcpsock           37         38        440        440          0
           numflock              3          3        704        704          0
           numpty                1          1         44         44          0
           numsiginfo            0          1       1024       1024          0
           tcpsndbuf         79920      88800    4212558    6014798          0
           tcprcvbuf          2220       4440    4212558    6014798          0
           othersockbuf      19552      91280    2106279    3908519          0
           dgramrcvbuf           0       2220    2106279    2106279          0
           numothersock         18         20        440        440          0
           dcachesize       406435     410022    8750726    9013248          0
           numfile            1080       1081       7040       7040          0
           dummy                 0          0          0          0          0
           dummy                 0          0          0          0          0
           dummy                 0          0          0          0          0
           numiptent            71         71        512        512          0
Answer: Let's look at just three of the UBC parameters.
           kmemsize        5125208    5128321   40098656   44108521          0
           privvmpages       77431      77666     750000     825000          0
           physpages         44243      44371          0 2147483647          0
- kmemsize: Kernel memory; 5125208 bytes are in use. Kernel memory holds process information and cannot be swapped out.
- privvmpages: Private virtual memory (in effect, RAM + swap): 77431 pages are ALLOCATED (but probably not all used). Each page is 4 KB.
- physpages: Physical pages: 44243 pages of memory are really USED, out of what was allocated above.
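As a rough illustration of the three bullets above, the following Python sketch parses beancounter output into a dictionary and converts page counts to megabytes. The function name `parse_beancounters` and the embedded sample are illustrative and not part of any OpenVZ tool. The 4096-byte page size is taken from the text, and the parser assumes output for a single container, as seen from inside it.

```python
PAGE_SIZE = 4096  # bytes per page, as stated in the text (assumed for this node)

# Three rows copied from the listing above; inside a container you would
# read /proc/user_beancounters instead.
SAMPLE = """\
Version: 2.5
      uid  resource           held    maxheld    barrier      limit    failcnt
    10039: kmemsize        5125208    5128321   40098656   44108521          0
           privvmpages       77431      77666     750000     825000          0
           physpages         44243      44371          0 2147483647          0
"""

FIELDS = ("held", "maxheld", "barrier", "limit", "failcnt")


def parse_beancounters(text):
    """Return {resource: {held, maxheld, barrier, limit, failcnt}}."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        # A data row ends with five integers; the resource name sits just
        # before them (an optional "uid:" token may come first).
        if len(parts) < 6 or not all(p.isdigit() for p in parts[-5:]):
            continue
        result[parts[-6]] = dict(zip(FIELDS, map(int, parts[-5:])))
    return result


def pages_to_mb(pages):
    """Convert a page count to megabytes (1 MB = 1024 * 1024 bytes)."""
    return pages * PAGE_SIZE / (1024 * 1024)


ubc = parse_beancounters(SAMPLE)
print("kernel memory used (bytes):", ubc["kmemsize"]["held"])
print("allocated (MB): %.1f" % pages_to_mb(ubc["privvmpages"]["held"]))
print("really used (MB): %.1f" % pages_to_mb(ubc["physpages"]["held"]))
```

Note that kmemsize is already expressed in bytes, so only the page-based parameters go through the conversion.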
Question: What parameters would tell us how much RAM is guaranteed and how much we can burst up to?
           vmguarpages           0          0     125000 2147483647          0
           oomguarpages      59239      59367     125000 2147483647          0
The guarantee is the vmguarpages barrier = 125000 pages. The container will be able to allocate that much memory in any case. Well... in almost any case, with one exception...
...when the whole node is in an out-of-memory situation. In that case, the guarantee is the oomguarpages barrier (the same 125000 pages in this example). It means that no container process will be killed if the container's memory usage is less than 125000 * 4096 bytes. If it is more, some process can be killed, and the oomguarpages fail counter (failcnt) will be incremented.
The maximum amount of memory that can be allocated is defined by privvmpages (see the previous answer).
Question: can we elaborate further?
With the help of your answer, we can say that the guaranteed RAM (vmguarpages) in the given example is 125000 * 4096 = 512000000 bytes (488.28125 MB).
Further, we may also calculate privvmpages (RAM + swap) as 750000 * 4096 = 3072000000 bytes (2929.6875 MB).
Would I be wrong to note that the RAM to swap allocation ratio is about 1:5? There seems to be a wide gap between the two, am I right? I used to think the ratio of RAM to swap should usually be 1:2.
What I also fail to understand is that, of the two parameters (vmguarpages and oomguarpages), only oomguarpages seems to be used to some extent. vmguarpages seems to be completely unused. What does this situation indicate? Or am I missing something?
Answer:
You are a bit wrong here. Let me explain it more carefully.
The vmguarpages barrier is a guarantee: in any case (except out-of-memory, aka OOM) you will be able to allocate 125000 pages. You will probably be able to allocate more, up to 750000 pages (and a high-priority process will be able to allocate up to 825000 pages, but definitely not more). None of this says anything about RAM or swap. These pages can be swapped out by the hardware node's kernel if needed, and can live in RAM if possible. Frankly, you can't say "I want this container to have X MB RAM and Y MB swap". There is no such parameter.
vmguarpages does not account for anything, so its current usage is always zero.
The 'current usage' field of oomguarpages accounts for the total amount of RAM + swap used.
physpages accounts for the total amount of RAM used by the container (memory shared between containers is divided in equal fractions among all containers that use it and accounted in each one's physpages). It cannot be limited; it just shows the current situation.
Let's do some math here.
1. Normal situation:
guarantee: vmguarpages barrier = 125000 pages = 488 MB.
current usage: physpages current value = 44243 physical (RAM) pages = 172 MB.
current usage (RAM + SWAP): oomguarpages current value = 59239 pages = 231 MB.
may allocate (but with no guarantee that the allocation will succeed) up to privvmpages barrier = 750000 pages = 2929 MB; system processes may allocate up to privvmpages limit = 825000 pages = 3222 MB.
2. OOM situation.
If the sum of your oomguarpages current value, socket buffers current value, and kmemsize current value is more than the oomguarpages barrier, you are the bad guy here and one of your processes may be killed.
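The page-to-megabyte conversions listed under the normal situation can be checked with a few lines of Python. This is only a sketch: the 4096-byte page size comes from earlier in the thread, the helper name `pages_to_mb` is illustrative, and MB here means 1024 * 1024 bytes, which is what the figures above correspond to.

```python
PAGE_SIZE = 4096  # bytes per page (taken from the thread; an assumption elsewhere)


def pages_to_mb(pages):
    """Convert a page count to megabytes (1 MB = 1024 * 1024 bytes)."""
    return pages * PAGE_SIZE / (1024 * 1024)


# Values copied from the example /proc/user_beancounters output.
normal_situation = {
    "guarantee (vmguarpages barrier)": 125000,
    "RAM in use (physpages held)": 44243,
    "RAM + swap in use (oomguarpages held)": 59239,
    "may allocate (privvmpages barrier)": 750000,
    "system processes may allocate (privvmpages limit)": 825000,
}

for label, pages in normal_situation.items():
    print(f"{label}: {pages} pages = {pages_to_mb(pages):.1f} MB")
```

The MB figures quoted in the list above are these values with the fractional part dropped.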
Question: can you illustrate the OOM situation in more detail?
OK, here is the math. An OOM situation on the node means that there is no free memory (or sometimes that there is no free low memory).
In that case, the system will compute the sum of:
oomguarpages current value (RAM + swap actual usage) = 59239 pages = 242642944 bytes
socket buffers current value: 79920 + 2220 + 19552 + 0 = 101692 bytes
kmemsize current value: 5125208 bytes
The sum is 247869844 bytes. If this sum were more than the oomguarpages barrier = 125000 pages = 512000000 bytes (it is not), some of the processes in the container could be killed in an OOM situation.
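The same sum can be written as a short Python sketch. The function name `oom_usage_bytes` is illustrative, and the 4096-byte page size is the one used throughout this thread. The four socket buffer values are the held values of tcpsndbuf, tcprcvbuf, othersockbuf, and dgramrcvbuf from the example output.

```python
PAGE_SIZE = 4096  # bytes per page, as used throughout the thread


def oom_usage_bytes(oomguarpages_held, socket_buffers, kmemsize_held):
    """Sum the three components compared against the oomguarpages barrier.

    oomguarpages_held is in pages; the other two arguments are in bytes.
    """
    return oomguarpages_held * PAGE_SIZE + sum(socket_buffers) + kmemsize_held


usage = oom_usage_bytes(
    oomguarpages_held=59239,
    socket_buffers=[79920, 2220, 19552, 0],
    kmemsize_held=5125208,
)
guarantee = 125000 * PAGE_SIZE  # oomguarpages barrier in bytes

print("usage:", usage, "bytes")
print("guarantee:", guarantee, "bytes")
print("processes may be killed on OOM:", usage > guarantee)
```

With the example values the usage is a little under half of the guarantee, so this container would not be a candidate.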
Question: Is there a way to know for a container account holder (me) that a server has reached an OOM situation?
'free' inside a container will show memory statistics for the hardware node (total memory, used memory, total swap, used swap). It is normal for a Linux system to use as much RAM as possible, so the free memory reported is usually no more than a few megabytes.
Are you aware of a script that would run as a cron job and do the OOM math on a regular basis and email me if I've hit the red zone or maybe at 85-90%?
We don't have such a script, but you can just take a look at the last column (failcnt) of /proc/user_beancounters. Nonzero values there mean that resource allocations were rejected.
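For readers who want to automate this, here is a minimal Python sketch of such a check. It is not an official OpenVZ script. The names `parse`, `check`, and `threshold` are hypothetical, the 4096-byte page size is assumed from this thread, and the OOM sum follows the math given in the previous answer. The sketch runs against a pasted sample; in real use you would pass it the contents of /proc/user_beancounters and print the warnings, since cron normally mails a job's output to its owner when mail is configured.

```python
PAGE_SIZE = 4096  # bytes per page (assumption taken from this thread)
SOCKET_BUFFERS = ("tcpsndbuf", "tcprcvbuf", "othersockbuf", "dgramrcvbuf")
FIELDS = ("held", "maxheld", "barrier", "limit", "failcnt")

# Rows copied from the example output; replace with the real file contents.
SAMPLE = """\
    10039: kmemsize        5125208    5128321   40098656   44108521          0
           privvmpages       77431      77666     750000     825000          0
           oomguarpages      59239      59367     125000 2147483647          0
           tcpsndbuf         79920      88800    4212558    6014798          0
           tcprcvbuf          2220       4440    4212558    6014798          0
           othersockbuf      19552      91280    2106279    3908519          0
           dgramrcvbuf           0       2220    2106279    2106279          0
"""


def parse(text):
    """Return {resource: {held, maxheld, barrier, limit, failcnt}}."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 6 and all(p.isdigit() for p in parts[-5:]):
            rows[parts[-6]] = dict(zip(FIELDS, map(int, parts[-5:])))
    return rows


def check(ubc, threshold=0.85):
    """Return a list of warning strings; an empty list means all is well."""
    warnings = []
    # 1. Any nonzero failcnt means an allocation was rejected.
    for name, row in ubc.items():
        if row["failcnt"] > 0:
            warnings.append(f"{name}: failcnt={row['failcnt']}")
    # 2. Compare the OOM sum with the oomguarpages barrier.
    usage = (ubc["oomguarpages"]["held"] * PAGE_SIZE
             + sum(ubc[name]["held"] for name in SOCKET_BUFFERS)
             + ubc["kmemsize"]["held"])
    guarantee = ubc["oomguarpages"]["barrier"] * PAGE_SIZE
    if guarantee > 0 and usage / guarantee >= threshold:
        warnings.append(
            f"OOM usage at {100 * usage / guarantee:.0f}% of oomguarpages barrier")
    return warnings


for warning in check(parse(SAMPLE)):
    print(warning)
```

With the sample values nothing is printed, because every failcnt is zero and the OOM sum is below the 85% threshold.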
Can you please elaborate the difference between oomguarpages and privvmpages?
Both seem to show mem + swap. Is the difference between the two about allocated but unused pages? I mean, does privvmpages show allocated mem + swap (some of which might not be used), whereas oomguarpages shows mem + swap that is actually used?
Answer:
Yes, oomguarpages current value shows actual usage of MEM+SWAP.
