Monitoring OpenVZ resources using Nagios and SNMP

== snmpd configuration ==
 
 
Debian Etch example:
 
Edit '''/etc/default/snmpd''': remove ''-u snmp'' and replace ''127.0.0.1'' with your IP address (e.g. 207.46.250.119). Full '''/etc/default/snmpd''' example:
 
<pre>
export MIBDIRS=/usr/share/snmp/mibs
...
TRAPDOPTS='-Lsd -p /var/run/snmptrapd.pid'
</pre>
 
For Debian 4.x:
 
<pre>
export MIBDIRS=/usr/share/snmp/mibs
SNMPDRUN=yes
SNMPDOPTS='-Lsd -Lf /dev/null  -I -smux -p /var/run/snmpd.pid'
TRAPDRUN=no
TRAPDOPTS='-Lsd -p /var/run/snmptrapd.pid'
</pre>
 
 
Create a user (my_username) and add the new MIB extensions. The password needs a minimum of 8 characters, and the username should consist of characters only:
 
<pre>
/etc/init.d/snmpd stop
echo rouser my_username priv >> /etc/snmp/snmpd.conf
echo "extend  .1.3.6.1.4.1.2021.51  beancounters  /bin/cat /proc/user_beancounters" >> /etc/snmp/snmpd.conf
echo "extend  .1.3.6.1.4.1.2021.52  vzquota  /bin/cat /proc/vz/vzquota" >> /etc/snmp/snmpd.conf
echo  createUser my_username MD5 my_password DES >> /var/lib/snmp/snmpd.conf
/etc/init.d/snmpd start
</pre>
 
 
(Note that the createUser command goes into a separate file. On CentOS 5 this file is /var/net-snmp/snmpd.conf. Make sure you stop snmpd before adding the createUser line there, since snmpd rewrites this file when it shuts down!)
 
 
Testing snmp:
 
<pre>
snmpwalk  -v 3  -u my_username -l authPriv  -a MD5 -A my_password -x DES -X my_password  $(hostname -i)
</pre>
 
 
Warning: the minimum pass phrase length is 8 characters.
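The 8-character minimum can be checked before the createUser line is written. The following is only a minimal sketch: the function name <code>check_passphrase_length</code> is hypothetical and is not part of net-snmp.

```shell
# Hypothetical helper: succeed only if the pass phrase has at least 8 characters.
check_passphrase_length() {
    # ${#1} expands to the length of the first argument
    [ "${#1}" -ge 8 ]
}

if check_passphrase_length "my_password"; then
    echo "pass phrase length ok"
else
    echo "pass phrase too short (minimum is 8 characters)"
fi
```

A check like this could be run before appending the createUser line, so that a too-short pass phrase is caught before snmpd is restarted.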
 
 
== nagios configuration ==
 
=== example nagios configuration ===
 
Add to the Nagios configuration:
 
<pre>
define command {
command_name check_snmp_openvz_on_port
# command_line /usr/local/bin/check_snmp_openvz.sh  $HOSTADDRESS$ PORT    USER    PASSWORD
command_line /usr/local/bin/check_snmp_openvz.sh  $HOSTADDRESS$ $ARG1$  $ARG2$  $ARG3$
}
</pre>
 
 
<pre>
define host {
        host_name  openvz-server
        alias      OpenVZ Server
        address    207.46.250.119
        use        generic-host
        contact_groups  admins
        }
</pre>
 
 
<pre>
define service{
        use                            generic-service
        host_name                      openvz-server
        service_description            Virtual Machines Limits
        check_command                  check_snmp_openvz_on_port!161!my_username!my_password
        max_check_attempts              1
        }
</pre>
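Nagios splits the <code>check_command</code> value on <code>!</code> and fills <code>$ARG1$</code>, <code>$ARG2$</code> and <code>$ARG3$</code> from the pieces. The sketch below only imitates that substitution in shell to show which command line results from the definitions above; the function name <code>expand_check_command</code> is hypothetical, and the real expansion is done by Nagios itself.

```shell
# Hypothetical illustration of the macro expansion.
# $1 = check_command value, $2 = host address ($HOSTADDRESS$)
expand_check_command() (
    host=$2
    IFS='!'       # split on the Nagios argument separator
    set -f        # no filename globbing while splitting
    set -- $1     # now $1 = command name, $2..$4 = ARG1..ARG3
    echo "/usr/local/bin/check_snmp_openvz.sh $host $2 $3 $4"
)

expand_check_command 'check_snmp_openvz_on_port!161!my_username!my_password' 207.46.250.119
```

The function body runs in a subshell, so the changed <code>IFS</code> and the <code>set -f</code> do not leak into the calling shell.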
 
 
=== nagios plugin ===
 
The plugin is a shell script:
 
<source lang="bash">
#!/bin/bash
# /usr/local/bin/check_snmp_openvz.sh
HOST=$1
PORT=$2
USER=$3
PASS=$4
export FILE=/tmp/$HOST.beancounters
RET=0
RET1=0  # beancounters result; stays 0 on the first run, when $FILE does not exist yet

DATA_TMP=`snmpwalk  -v 3  -u $USER -l authPriv  -a MD5 -A $PASS -x DES -X $PASS $HOST:$PORT .1.3.6.1.4.1.2021.51.4`
if [ "$?" != "0" ]; then
        echo "Unknown snmp error"
        exit 3  # 3 = UNKNOWN for Nagios
fi

DATA=`echo "$DATA_TMP"| perl -ne '/"(.*)"/ ; print "$1\n" ;'`

if [ -f $FILE ]; then
# No -n switch: the code reads STDIN itself, and -n would swallow the first line.
echo "$DATA" | perl -e '
use Data::Dumper;
my $file=$ENV{"FILE"};
my $ret=0 ;
my $vid ;
my $resource ;
my $held ;
my $maxheld ;
my $barrier ;
my $limit ;
my $failcnt ;
my %beancounters ;
my %beancounters_old ;
# read and parse current data
while(<STDIN>){
        my %vmachine;
        if ( /\D*(\d+):.*/ ){ $vid=$1; $beancounters{$vid}=\%vmachine ; }
        if ( /^[\W\d]+([a-z]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+).*/ ) {
                $resource=$1 ;
                $held=$2 ;
                $maxheld=$3 ;
                $barrier=$4 ;
                $limit=$5 ;
                $failcnt=$6 ;
                ${beancounters{$vid}}{$resource}=[$held , $maxheld , $barrier , $limit ,$failcnt ];
                if ( ($held  > $barrier) && ($barrier != 0) ) {
                        print "WARNING: Limits on $vid: $resource  held->$held , barrier->$barrier ( limit->$limit ) " ;
                        $ret=1;
                }
        }
}

# read and parse old data
open(MYINPUTFILE, "<$file");
while(<MYINPUTFILE>){
        my %vmachine;
        if ( /\D*(\d+):.*/ ){ $vid=$1; $beancounters_old{$vid}=\%vmachine ; }
        if ( /^[\W\d]+([a-z]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+).*/ ) {
                $resource=$1 ;
                $held=$2 ;
                $maxheld=$3 ;
                $barrier=$4 ;
                $limit=$5 ;
                $failcnt=$6 ;
                ${beancounters_old{$vid}}{$resource}=[$held , $maxheld , $barrier , $limit ,$failcnt ];
        }
}

# compare failcnt of every resource with the previous run
foreach my $vmachine_id (keys %beancounters) {
        foreach my $resource (keys %{$beancounters{$vmachine_id}} ) {
                if ( defined($beancounters{$vmachine_id}{$resource}[4]) && defined($beancounters_old{$vmachine_id}{$resource}[4]) ){
                        my $failcnt=$beancounters{$vmachine_id}{$resource}[4];
                        my $failcnt_old=$beancounters_old{$vmachine_id}{$resource}[4];
                        my $held=$beancounters{$vmachine_id}{$resource}[0];
                        my $maxheld=$beancounters{$vmachine_id}{$resource}[1];
                        my $barrier=$beancounters{$vmachine_id}{$resource}[2];
                        my $limit=$beancounters{$vmachine_id}{$resource}[3];
                        if ( $failcnt_old < $failcnt ){
                                print "CRITICAL: Increased failcnt  $vmachine_id: $resource from $failcnt_old to $failcnt (held->$held , maxheld->$maxheld , barrier->$barrier , limit->$limit ) " ;
                                $ret=2;
                        }
                }
        }

}

# if ($ret == 0 ) { print "Ok. \n" ; }
# print Dumper(%beancounters_old) ;
# print "\n";
exit($ret);
'

RET1=$?
fi

# save current data for the next run
echo "$DATA" > $FILE
#####################################################################################
######### quota check
#####################################################################################

# run snmpwalk on its own so that $? is its exit status, not that of perl
DATA_TMP=`snmpwalk  -v 3  -u $USER -l authPriv  -a MD5 -A $PASS -x DES -X $PASS $HOST:$PORT .1.3.6.1.4.1.2021.52.4`
if [ "$?" != "0" ]; then
        echo "Unknown snmp error"
        exit 3  # 3 = UNKNOWN for Nagios
fi

DATA=`echo "$DATA_TMP"| perl -ne '/"(.*)"/ ; print "$1\n" ;'`

echo "$DATA" | perl -e '
my $vid ;
my $ret=0 ;
while(<STDIN>){
        my %vid;
        if ( /\D*(\d+):.*/ ){ $vid=$1; }
        if ( /\s*(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+).*/ ){
                $resource=$1 ;
                $usage=$2 ;
                $softlimit=$3 ;
                $hardlimit=$4 ;
                $time=$5 ;
                $expire=$6 ;
                if ( $usage >= $softlimit ){
                        print "WARNING: VZquota limit exceeded on $vid: $resource  usage->$usage, softlimit->$softlimit, hardlimit->$hardlimit, time->$time, expire->$expire  " ;
                        $ret=1;
                }
        }
}
exit($ret);
'
RET2=$?

#####################################################################################
########### return
#####################################################################################

# report the worse of the two results
if [ $RET1 -gt $RET2 ]; then
        RET=$RET1
else
        RET=$RET2
fi

if [  $RET  = 0  ]; then
        echo Ok.
fi
exit $RET
</source>
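The barrier check at the heart of the plugin can be tried out in isolation. The following is a minimal sketch that uses awk instead of the Perl code above; the function name <code>check_barriers</code> and the sample values are made up for illustration, and only the column layout (resource, held, maxheld, barrier, limit, failcnt) is taken from <code>/proc/user_beancounters</code>.

```shell
# Sketch: flag every resource whose "held" value exceeds a non-zero barrier.
# Reads user_beancounters-style text on stdin; exits 1 if anything is flagged.
check_barriers() {
    awk '
        # A container block starts with "<id>:"; remember the id and drop that field.
        $1 ~ /^[0-9]+:$/ {
            vid = $1
            sub(/:$/, "", vid)
            $1 = ""
            $0 = $0          # force awk to re-split the remaining fields
        }
        # Resource rows: name held maxheld barrier limit failcnt
        NF == 6 && $1 ~ /^[a-z]+$/ && $2 ~ /^[0-9]+$/ {
            if ($4 + 0 != 0 && $2 + 0 > $4 + 0) {
                printf "WARNING: %s %s held=%s barrier=%s limit=%s\n", vid, $1, $2, $4, $5
                ret = 1
            }
        }
        END { exit ret }
    '
}

# Invented sample data in the user_beancounters layout.
sample='Version: 2.5
       uid  resource           held    maxheld    barrier      limit    failcnt
      101:  kmemsize         803866    1246758    2457600    2621440          0
            numproc              50         60         40         40          3'

printf '%s\n' "$sample" | check_barriers || echo "exit status: $?"
```

As in the plugin, a barrier of 0 is not treated as exceeded, and the exit status follows the Nagios convention (1 = WARNING).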
 
 
=== check_vzquota Without SNMP ===
 
<source lang="bash">
#!/bin/bash
RET=0
DATA=`echo;sudo /usr/sbin/vzlist -1 2>/dev/null | xargs -I {} bash -c "echo {}:;sudo /usr/sbin/vzquota stat {} | sed 's/\*//g'"`
if [ -z "$DATA" ]; then
        VPS_err=$(sudo /usr/sbin/vzlist -1 2>&1 1>/dev/null)
        if [ -n "$VPS_err" ] && [ "$VPS_err" == "Container(s) not found" ]; then
                echo "OK - $VPS_err";
                exit 0;
        else
                if [ -n "$VPS_err" ]; then
                        echo "UNKNOWN - Error: $VPS_err";
                else
                        echo "UNKNOWN - VZquota stats are not readable or empty. Maybe it is only readable for root and this script should be called by sudo.";
                fi
                exit 3;
        fi
fi

# No -n switch: the code reads STDIN itself, and -n would swallow the first line.
echo "$DATA" | perl -e '
my $vid ;
my $ret=0 ;
my $crit="";
my $warn="";
my $ok="";
while(<STDIN>){
        my %vid;
        if ( /^(\d+):.*/ ){ $vid=$1; }
        if ( /\D*(\d+):.*/ ){ $vid=$1; }
        if ( /\s*(\S+)\s+(\d+)\s+(\d+)\s+(\d+).*/ ){
                $resource=$1 ;
                $usage=$2 ;
                $softlimit=$3 ;
                $hardlimit=$4 ;
                if ( $usage >= $hardlimit ){
                        $crit=$crit."VZquota limit exceeded on $vid: $resource  usage->$usage, softlimit->$softlimit, hardlimit->$hardlimit  " ;
                        $ret=2;
                } elsif ( $usage >= $softlimit ){
                        $warn=$warn."VZquota limit exceeded on $vid: $resource  usage->$usage, softlimit->$softlimit, hardlimit->$hardlimit  " ;
                        $ret=1;
                }
                $ok=$ok."$vid:$resource $usage/$softlimit\n";
        }
}
if ($ret == 0) {
        print "OK - click on service-link for details...\n$ok";
} elsif ($ret == 1)  {
        print "WARNING - $warn\n";
} else {
        print "CRITICAL - $crit\n";
}
exit($ret);
'
RET=$?
exit $RET
</source>
 
The script calls <code>/usr/sbin/vzlist</code> and <code>/usr/sbin/vzquota</code> through sudo. Normally sudo asks for a password, which check_nrpe cannot supply. You therefore need to append a line like the following to <code>/etc/sudoers</code> (adapt the user name and paths to your system):

<pre>
nagios  ALL=NOPASSWD: /usr/sbin/vzlist, /usr/sbin/vzquota
</pre>
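The way check_vzquota maps usage to a Nagios state can be summarized in a few lines. This is a minimal sketch of the same comparisons in plain shell; the function name <code>quota_state</code> and the numbers are invented for illustration.

```shell
# Sketch: map a usage value to a Nagios state, mirroring the comparisons in the plugin.
# $1 = usage, $2 = softlimit, $3 = hardlimit
quota_state() {
    if [ "$1" -ge "$3" ]; then
        echo CRITICAL; return 2
    elif [ "$1" -ge "$2" ]; then
        echo WARNING; return 1
    fi
    echo OK; return 0
}

# Invented example values: usage 1050, softlimit 1000, hardlimit 1100
quota_state 1050 1000 1100 || echo "exit status: $?"
```

The hard limit is tested first, so a usage at or above both limits is reported as CRITICAL rather than WARNING.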
 
 
=== check_ubc Without SNMP ===
 
<source lang="bash">
#!/bin/bash
# Servicestate description can have a http-link to the openvz-wiki
# in case that a resource is warning/critical. To use it:
# 1. set "escape_html_tags=0" in nagios/etc/cgi.cfg
# 2. set "my $linked=1;" in the first perl lines in this script
#
export FILE=/tmp/check_ubc
RET=0
ubc_file='/proc/user_beancounters';
DATA='';
if [ -r $ubc_file ]; then
        DATA=`cat $ubc_file`
fi
if [ -z "$DATA" ]; then
        echo "UNKNOWN - $ubc_file is not readable or empty. Maybe it is only readable for root and this script should be called by sudo.";
        exit 3;
fi

if [ -f $FILE ]; then
# No -n switch: the code reads STDIN itself, and -n would swallow the first line.
echo "$DATA" | perl -e '
use Data::Dumper;
my $linked=1;  # 0:plain text output, 1:resourcename is a http-link to OpenVZ-wiki
my $file=$ENV{"FILE"};
my $ret=0 ;
my $vid ;
my $resource ;
my $held ;
my $maxheld ;
my $barrier ;
my $limit ;
my $failcnt ;
my %beancounters ;
my %beancounters_old ;
# read and parse current data
while(<STDIN>){
        my %vmachine;
        if ( /\D*(\d+):.*/ ){ $vid=$1; $beancounters{$vid}=\%vmachine ; }
        if ( /^[\W\d]+([a-z]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+).*/ ) {
                $resource=$1 ;
                $held=$2 ;
                $maxheld=$3 ;
                $barrier=$4 ;
                $limit=$5 ;
                $failcnt=$6 ;
                ${beancounters{$vid}}{$resource}=[$held , $maxheld , $barrier , $limit ,$failcnt ];
                if ( ($held  > $barrier) && ($barrier != 0) ) {
                        print "WARNING: Limits on $vid: ".&url($resource,$linked)."  held->$held , barrier->$barrier ( limit->$limit ) " ;
                        $ret=1;
                }
                #print "$vid:$resource $held Barrier:$barrier ";
        }
}

# read and parse old data
open(MYINPUTFILE, "<$file");
while(<MYINPUTFILE>){
        my %vmachine;
        if ( /\D*(\d+):.*/ ){ $vid=$1; $beancounters_old{$vid}=\%vmachine ; }
        if ( /^[\W\d]+([a-z]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+).*/ ) {
                $resource=$1 ;
                $held=$2 ;
                $maxheld=$3 ;
                $barrier=$4 ;
                $limit=$5 ;
                $failcnt=$6 ;
                ${beancounters_old{$vid}}{$resource}=[$held , $maxheld , $barrier , $limit ,$failcnt ];
        }
}

# compare failcnt of every resource with the previous run
foreach my $vmachine_id (keys %beancounters) {
        foreach my $resource (keys %{$beancounters{$vmachine_id}} ) {
                if ( defined($beancounters{$vmachine_id}{$resource}[4]) && defined($beancounters_old{$vmachine_id}{$resource}[4]) ){
                        my $failcnt=$beancounters{$vmachine_id}{$resource}[4];
                        my $failcnt_old=$beancounters_old{$vmachine_id}{$resource}[4];
                        my $held=$beancounters{$vmachine_id}{$resource}[0];
                        my $maxheld=$beancounters{$vmachine_id}{$resource}[1];
                        my $barrier=$beancounters{$vmachine_id}{$resource}[2];
                        my $limit=$beancounters{$vmachine_id}{$resource}[3];
                        if ( $failcnt_old < $failcnt ){
                                print "CRITICAL: Increased failcnt  $vmachine_id: ".url($resource,$linked)." from $failcnt_old to $failcnt (held->$held , maxheld->$maxheld , barrier->$barrier , limit->$limit ) " ;
                                $ret=2;
                        }
                        #print "$vmachine_id: Old_Failcnt: $failcnt_old Failcnt: $failcnt \n";
                }
        }

}
sub url {
        my ($name,$with_link) = @_;
        if ($with_link) {
                return "<a target=\"_blank\" href=\"http://wiki.openvz.org/".$name."#".$name."\">$name</a>";
        } else {
                return $name;
        }
}
if ($ret == 0 ) { print "OK: All bean counters fine \n" ; }
# print Dumper(%beancounters_old) ;
# print "\n";
exit($ret);
'

RET=$?
fi

# save current data for the next run
echo "$DATA" > $FILE
exit $RET
</source>
 
The script needs to read <code>/proc/user_beancounters</code>, which is normally readable only by root. You therefore need to append a line like the following to <code>/etc/sudoers</code> (adapt the user name and path to your system):

<pre>
nagios  ALL=NOPASSWD: /usr/local/nagios/libexec/check_ubc
</pre>

Also reflect this in your <code>nrpe.cfg</code>, so that the script is called with sudo:

<pre>
command[check_ubc]=sudo /usr/local/nagios/libexec/check_ubc
</pre>
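Both beancounter plugins raise CRITICAL when a failcnt value has grown since the previous run, which they detect by comparing the current data with the copy saved in <code>$FILE</code>. The following is a minimal sketch of that comparison using awk rather than the Perl code above; the function name <code>failcnt_increased</code> and the sample values are invented for illustration.

```shell
# Sketch: report resources whose failcnt grew between two saved snapshots.
# $1 = old snapshot file, $2 = new snapshot file; exits 2 if any counter increased.
failcnt_increased() {
    awk '
        # Container line "<id>: ...": remember the id, then re-split without it.
        $1 ~ /^[0-9]+:$/ { vid = $1; sub(/:$/, "", vid); $1 = ""; $0 = $0 }
        # Skip everything that is not a resource row with a numeric failcnt.
        NF != 6 || $6 !~ /^[0-9]+$/ { next }
        # First file: remember the old failcnt per container and resource.
        FNR == NR { old[vid, $1] = $6; next }
        # Second file: compare with the remembered value.
        ((vid, $1) in old) && $6 + 0 > old[vid, $1] + 0 {
            printf "CRITICAL: %s %s failcnt %s -> %s\n", vid, $1, old[vid, $1], $6
            ret = 2
        }
        END { exit ret }
    ' "$1" "$2"
}

# Invented sample snapshots: failcnt of numproc rises from 3 to 5.
old=$(mktemp)
new=$(mktemp)
printf '%s\n' '101: numproc 50 60 40 40 3' > "$old"
printf '%s\n' '101: numproc 50 60 40 40 5' > "$new"
failcnt_increased "$old" "$new" || echo "exit status: $?"
rm -f "$old" "$new"
```

Resources that are missing from the old snapshot are ignored, which matches the <code>defined()</code> checks in the plugins.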
 
 
[[Category: Monitoring]]
 
