UBC Monitoring

From OpenVZ Virtuozzo Containers Wiki
Latest revision as of 21:41, 20 May 2008

Monitoring /proc/user_beancounters to see whether any containers are hitting their limits is an important part of managing an OpenVZ installation. When a container hits a key limit, its applications can malfunction in strange ways, and the problem can only be diagnosed by realising that the cause is a UBC configuration issue.

A small script, failcnt.py, can run via cron on the host node and watch all containers for failures. It emails the host node admin a summary of the failures.

To get the latest version using bzr, run:

 bzr branch lp:tkmisc
 cd tkmisc/vz_failcnt

It can also be downloaded from:

https://launchpad.net/tkmisc/ or http://bazaar.launchpad.net/~toykeeper/tkmisc/trunk/files

Then, to install it, create /etc/cron.hourly/vz-failcnt, containing:

 #!/bin/sh
 /path/to/tkmisc/vz_failcnt/failcnt.py

When run periodically from cron, this script detects changes in the failcnt field of /proc/user_beancounters.
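To illustrate the idea of detecting failcnt changes, here is a minimal Python sketch. It is not the code of failcnt.py itself. The function names parse_failcnt and new_failures are hypothetical, and the sample text assumes the usual layout of /proc/user_beancounters: a "Version:" line, a header line, then one row per resource, with the container ID given as "uid:" on the first row of each container and failcnt as the last column.

```python
def parse_failcnt(text):
    """Return {(container_id, resource): failcnt} parsed from beancounter text.

    Assumes the usual /proc/user_beancounters layout (an assumption, not
    taken from failcnt.py): rows of resource, held, maxheld, barrier,
    limit, failcnt, with an optional leading "uid:" field.
    """
    counts = {}
    ctid = None
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the "Version:" line and the column header.
        if not fields or fields[0] in ("Version:", "uid"):
            continue
        # A leading "NNN:" field starts the rows of a new container.
        if fields[0].endswith(":"):
            ctid = int(fields[0][:-1])
            fields = fields[1:]
        # Ignore anything that is not a full six-column resource row.
        if ctid is None or len(fields) != 6:
            continue
        counts[(ctid, fields[0])] = int(fields[5])
    return counts


def new_failures(previous, current):
    """Return {(container_id, resource): increase} for counters that grew."""
    return {
        key: cnt - previous.get(key, 0)
        for key, cnt in current.items()
        if cnt > previous.get(key, 0)
    }


# Hypothetical snapshots taken on two consecutive runs.
BEFORE = """Version: 2.5
       uid  resource      held  maxheld  barrier  limit  failcnt
      101:  kmemsize   1000000  2000000  9000000  9500000      0
            numproc         20       64       64       64      2
"""
AFTER = BEFORE.replace("64      2", "64      5")

for (ctid, resource), delta in new_failures(
        parse_failcnt(BEFORE), parse_failcnt(AFTER)).items():
    print(f"container {ctid}: {resource} failcnt increased by {delta}")
```

On a host node, the current snapshot would come from reading /proc/user_beancounters (typically as root), and the previous one would have to be saved somewhere between cron runs. Where to store it is left open here.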