=== Are you aware of a script that would run as a cron job and do the OOM math on a regular basis and email me if I've hit the red zone or maybe at 85-90%? ===
We don't have such a script, but you can simply look at the last column (failcnt) of /proc/user_beancounters. Nonzero values there mean that resource allocations were rejected.
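A check of that last column could be sketched roughly as follows. This is only a minimal sketch, not an official tool: the function name <code>ubc_failures</code> is hypothetical, and it assumes the usual layout in which failcnt is the last field of each resource line. The file path is a parameter so that the function can be tried on a saved copy.

```shell
#!/bin/bash
# Sketch: print "uid resource failcnt" for every resource whose
# failcnt (last column) is nonzero.
# ubc_failures is a hypothetical name; the layout of the file is assumed.
ubc_failures() {
    local file="${1:-/proc/user_beancounters}"
    awk '
        /^Version/ || $1 == "uid" { next }   # skip the two header lines
        # The first line of a block starts with "<uid>:" before the resource name
        $1 ~ /:$/  { uid = $1; sub(/:$/, "", uid); res = $2; fail = $NF }
        # Every other line starts with the resource name
        $1 !~ /:$/ { res = $1; fail = $NF }
        fail + 0 > 0 { print uid, res, fail }
    ' "$file"
}
```

Run from cron, any output of <code>ubc_failures</code> could be piped to whatever mail command is installed on the node; no output means no allocation has been rejected so far.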
Thank you for everything. Unfortunately, the only way I can help is with code, so I am leaving something here that I hope you will enjoy and find very useful.
<code>
#!/bin/bash
# Version 0.5
# Ref: http://wiki.openvz.org/Setting_UBC_parameters
# Author: Roberto Blandino Cisneros.
# rojoblandino@yahoo.com
# Using this script is easy.
# It first counts your containers, because each
# container is addressed by a position number.
# Suppose you have three containers,
# e.g. eva, gira and huna.
# The number is taken from the order in which
# they are shown in /proc/user_beancounters.
# If the eva, gira and huna containers are shown
# in that order, they will be 1, 2 and 3
# respectively.
# It is as simple as that. If you want to see
# the status of the gira container, type in
# your shell:
# $> bash vmstatus.sh 2
# And you will see the status of this container.
# The first thing to do is to count the containers
vzn=`vzlist | wc -l`
# vzlist prints a header line first, so subtract 1
# to get the number of containers.
let "n=vzn-1"
case $# in
1)
# We get the container to be calculated
vm=$1
# Guard against human error: accept digits only
datachk=`echo "$vm" | grep -c '[^0-9]'`
if let "datachk>0";then
echo "That's not a valid number!"
exit 1
fi
# Guard against another mistake: maybe we forgot how many containers we have
if let "vm<1 || vm>n";then
echo "Container out of range, please try a number between 1 and "$n
exit 1
fi
# Calculate how many lines to skip in the output of user_beancounters.
# The file starts with header lines and each container block is assumed
# to take 24 lines, so the first block ends at line 26; we then show
# the last 24 lines of the selection.
# (n-vm) is used because I do not want to see the order backward. If you
# want to change the order, swap the comment with the following line:
# let "val=26+(24*vm)"
let "val=26+(24*(n-vm))"
# If you want to see all values of the vm uncomment this
#cat /proc/user_beancounters | head -$val | tail -24
# The math starts here.
# To understand what I am doing here, read the page
# http://wiki.openvz.org/Setting_UBC_parameters carefully;
# all I do is fetch the values already explained there.
# Read the 24-line block of the selected container once, then pick rows by number.
block=`head -$val /proc/user_beancounters | tail -24`
kmemsize=`echo "$block" | awk 'NR==1 {print $3}'`
oomguarpages=`echo "$block" | awk 'NR==9 {print $2}'`
tcpsndbuf=`echo "$block" | awk 'NR==14 {print $2}'`
tcprcvbuf=`echo "$block" | awk 'NR==15 {print $2}'`
othersockbuf=`echo "$block" | awk 'NR==16 {print $2}'`
dgramrcvbuf=`echo "$block" | awk 'NR==17 {print $2}'`
oomguarpages_barrier=`echo "$block" | awk 'NR==9 {print $4}'`
# We need to add up three current values:
# oomguarpages current value (MEM+SWAP actual usage), in 4096-byte pages
# socket buffers current value
# kmemsize current value
let "resp=((oomguarpages*4096)+tcpsndbuf+tcprcvbuf+othersockbuf+dgramrcvbuf+kmemsize)"
# We need to calculate the OOM barrier in bytes
let "oom=(oomguarpages_barrier*4096)"
# Express the usage as a percentage of the barrier
let "porc=(resp*100)/oom"
# We show the container ID and Percentage
echo `cat /proc/user_beancounters | head -$val | tail -24| head -1 | awk -F: '{print $1}'`" Status: "$porc"%"
# We show the two important values.
echo -e "System Count:\t-\tOOM Barrier:"
echo -e $resp"\t-\t"$oom
# Uncomment this if you just want to know whether the container is
# below the barrier; it is healthy if resp is less than oom.
# if let "resp<oom";then
# Or keep the following line if you want to evaluate it as a percentage.
# 75% is the default; change it to your own warning threshold.
if let "porc<75";then
echo "It is ok!"
else
echo "Some of the processes will be killed!"
fi
;;
*)
echo "Just one parameter is allowed for this script"
echo "e.g. "$0" 1 "
;;
esac
</code>
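For the cron use case asked about above, the same arithmetic could be sketched so that a container is selected by its ID instead of its position. This is only a sketch under assumptions: the function name <code>ubc_oom_percent</code> is hypothetical, a page size of 4096 bytes is assumed as in the script above, and resources are looked up by name rather than by row number.

```shell
#!/bin/bash
# Sketch: print (oomguarpages*4096 + socket buffers + kmemsize) as a
# percentage of the oomguarpages barrier for one container ID.
# ubc_oom_percent is a hypothetical name; 4096-byte pages are assumed.
ubc_oom_percent() {
    local ctid="$1" file="${2:-/proc/user_beancounters}"
    awk -v ctid="$ctid" '
        # First line of a block: "<uid>: resource held maxheld barrier ..."
        $1 ~ /^[0-9]+:$/ { cur = $1; sub(/:$/, "", cur); name = $2; held = $3; barrier = $5 }
        # Other lines: "resource held maxheld barrier ..."
        $1 !~ /:$/       { name = $1; held = $2; barrier = $4 }
        cur != ctid      { next }             # ignore other containers
        name == "kmemsize" { used += held }
        name ~ /^(tcpsndbuf|tcprcvbuf|othersockbuf|dgramrcvbuf)$/ { used += held }
        name == "oomguarpages" { used += held * 4096; limit = barrier * 4096 }
        END { if (limit > 0) printf "%d\n", used * 100 / limit; else exit 1 }
    ' "$file"
}

# Hypothetical cron usage (assumes container ID 101 and a working mail command):
#   pct=$(ubc_oom_percent 101) && [ "$pct" -ge 85 ] && \
#       echo "CT 101 is at ${pct}% of its OOM barrier" | mail -s "UBC warning" root
```

Looking resources up by name avoids depending on the number of lines per container, which the positional script above has to assume.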
=== Can you please elaborate on the difference between oomguarpages and privvmpages? ===