UBC Monitoring
Latest revision as of 21:41, 20 May 2008
Monitoring /proc/user_beancounters to see whether any containers are hitting their limits is an important part of managing an OpenVZ installation. Applications in a container that hits a key limit can malfunction in strange ways, and the cause is often found only once you realise that it is a UBC configuration issue.
A small script, failcnt.py, can be run from cron on the host node to watch all containers (VEs) for failures. It emails the host node admin a failure summary.
To get the latest version using bzr, run:
    bzr branch lp:tkmisc
    cd tkmisc/vz_failcnt
It can also be downloaded from:
    https://launchpad.net/tkmisc/
or
    http://bazaar.launchpad.net/~toykeeper/tkmisc/trunk/files
Then, to install it, create /etc/cron.hourly/vz-failcnt (the file must be executable for cron to run it), containing:
    #!/bin/sh
    /path/to/tkmisc/vz_failcnt/failcnt.py
When run periodically from cron, this script detects changes in the failcnt field of /proc/user_beancounters.
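To illustrate the idea of detecting failcnt changes, here is a minimal sketch. It is not the code of failcnt.py itself; the function names (parse_beancounters, new_failures, read_counters) are hypothetical, and it assumes the usual /proc/user_beancounters layout: a "Version:" line, a header line, and rows of resource, held, maxheld, barrier, limit, failcnt, where the first row of each container is prefixed with "<id>:".

```python
def parse_beancounters(text):
    """Return {(ctid, resource): failcnt} from the text of user_beancounters.

    Assumed layout: each container's first row starts with "<id>:", and
    every row ends with the columns held, maxheld, barrier, limit, failcnt.
    """
    counters = {}
    ctid = None
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the "Version:" line and the column header.
        if not fields or fields[0] in ("Version:", "uid"):
            continue
        # A leading "<id>:" token starts a new container section.
        if fields[0].endswith(":"):
            ctid = int(fields[0][:-1])
            fields = fields[1:]
        # Expect: resource held maxheld barrier limit failcnt
        if ctid is None or len(fields) != 6:
            continue
        counters[(ctid, fields[0])] = int(fields[5])
    return counters


def new_failures(previous, current):
    """Return {(ctid, resource): increase} for counters that grew."""
    changes = {}
    for key, count in current.items():
        delta = count - previous.get(key, 0)
        if delta > 0:
            changes[key] = delta
    return changes


def read_counters(path="/proc/user_beancounters"):
    """Read and parse the live file (requires root on an OpenVZ host)."""
    with open(path) as handle:
        return parse_beancounters(handle.read())
```

A periodic job could store the result of read_counters() from the previous run, call new_failures(previous, current) on the next run, and report any non-empty result. How the previous snapshot is stored and how the report is sent are left open here.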