Tuesday, November 6, 2012

Monitoring apache-status on aegir servers with nagios / check_mk

Apache's server-status page exposes performance information that is useful for server and application tuning. Getting access to this information and graphing it with nagios is not terribly hard, but add in check_mk and the Aegir platform and things get a little more complicated.

In the following steps I will demonstrate how to install the check_mk agent, install the check_apachestatus_auto.pl script and its dependencies, add the proper stanza for server-status to aegir, and finally add the check to mrpe (check_mk's replacement for nrpe).

Assumptions: RedHat/CentOS/Scientific Linux, 64-bit, EPEL, Aegir, root access.

# Install check_mk agent:
* First, install the check_mk agent rpm:
yum install http://mathias-kettner.de/download/check_mk-agent-1.2.0p3-1.noarch.rpm --nogpgcheck
* We don't want just anyone to poll the data from check_mk, so edit /etc/xinetd.d/check_mk and add the IP of your nagios server to the 'only_from' line:
only_from     = 127.0.0.1 nagios_server_ip
* The check_mk agent operates through xinetd on port 6556. Make sure xinetd will start at boot and is currently running:
chkconfig xinetd on ; service xinetd start
* Hopefully you are running a firewall. To poke a hole in an iptables-based firewall, add a rule similar to the following to /etc/sysconfig/iptables, above any REJECT rule:
-A INPUT -s nagios_server_ip -p tcp -m tcp --dport 6556 -j ACCEPT
Then restart iptables with the service command:
service iptables restart
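Before moving on, it can help to confirm that the agent really answers. The snippet below is only a minimal sketch: it assumes the agent output begins with a <<<check_mk>>> section that contains a "Version:" line, and agent_version is a hypothetical helper name, not something shipped with check_mk.

```shell
# Hypothetical helper: read check_mk agent output on stdin and print the
# agent version found in the <<<check_mk>>> section (assumed layout).
agent_version() {
    awk '
        /^<<<check_mk>>>/ { in_section = 1; next }   # start of the section we want
        /^<<</            { in_section = 0 }         # any other section header ends it
        in_section && $1 == "Version:" { print $2; exit }
    '
}
```

On the Aegir box you could run "check_mk_agent | agent_version". From the nagios server, piping the output of a TCP client such as nc (if installed) connected to port 6556 into the same function also exercises xinetd and the firewall rule.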
# Install the check_apachestatus_auto.pl plugin
* Install the nagios-plugins-perl rpm from EPEL. This provides the /usr/lib64/nagios/plugins/utils.pm file and creates the plugin directory structure:
yum install nagios-plugins-perl
* Download the plugin from http://blog.spreendigital.de/nagios/?#check_apachestatus_auto, unpack it into /usr/lib64/nagios/plugins, and modify it to find utils.pm in /usr/lib64/nagios/plugins:
wget -O /tmp/check_apachestatus_auto.tgz http://blog.spreendigital.de/wp-content/uploads/2009/07/check_apachestatus_auto.tgz
tar zxvf /tmp/check_apachestatus_auto.tgz -C /usr/lib64/nagios/plugins/
sed -i 's|/usr/local/nagios/libexec|/usr/lib64/nagios/plugins|g' /usr/lib64/nagios/plugins/check_apachestatus_auto.pl
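If you want to see what the path rewrite did, something like the following sketch may help. It assumes GNU sed (as on RedHat/CentOS) and assumes the plugin pulls in its library directory with a Perl "use lib" line; fix_plugin_libdir is a hypothetical helper name.

```shell
# Hypothetical helper: point a nagios perl plugin at a different library
# directory, then print the resulting "use lib" line(s) for inspection.
# Assumes the plugin references /usr/local/nagios/libexec via "use lib".
fix_plugin_libdir() {
    plugin=$1
    libdir=${2:-/usr/lib64/nagios/plugins}
    # "|" as the sed delimiter avoids escaping every slash in the paths.
    sed -i "s|/usr/local/nagios/libexec|$libdir|g" "$plugin"
    grep -E '^[[:space:]]*use lib' "$plugin"
}
```

Running the function a second time is harmless, since the old path no longer appears in the file. A quick "perl -c" on the plugin afterwards should report whether Perl can now find utils.pm.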
# Modify apache to display server-status
* Typically you could enable server-status by un-commenting the correct stanza in /etc/httpd/conf/httpd.conf, but on an aegir system any GET request for http://localhost/server-status will be answered by aegir. If you do some digging you will find the /var/aegir/config/server_master/apache/pre.d directory, which is included before any virtual hosts. This is where you need to put a config file for server-status:
cat << EOF >> /var/aegir/config/server_master/apache/pre.d/nagios.conf
<VirtualHost *:80>
ServerName localhost
<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>
</VirtualHost>
EOF
* Reload apache to read your new config file:
service httpd reload
* Verify it's working using curl. This should dump the raw HTML from the server-status page to your screen:
curl localhost/server-status
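mod_status also offers a machine-readable variant of the page at /server-status?auto, made up of "Key: value" lines, which is presumably what a plugin named check_apachestatus_auto reads. As a rough sketch, a single value can be pulled out like this (status_field is a hypothetical helper name):

```shell
# Hypothetical helper: read "Key: value" lines (the ?auto format) on stdin
# and print the value for the requested key.
status_field() {
    awk -F': ' -v key="$1" '$1 == key { print $2; exit }'
}
```

For example, "curl -s 'http://localhost/server-status?auto' | status_field BusyWorkers" should print the number of busy workers if the stanza above is active.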
# Setup mrpe to execute the plugin
* This is one of the easier steps: simply create /etc/check_mk/mrpe.cfg and add a line with the check alias and the plugin command:
mkdir -p /etc/check_mk
echo "Apache_Status /usr/lib64/nagios/plugins/check_apachestatus_auto.pl -H localhost" >> /etc/check_mk/mrpe.cfg
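Because the echo above appends, running it twice leaves a duplicate line in mrpe.cfg. A small sketch of an idempotent variant follows; add_mrpe_check is a hypothetical helper name, and it assumes the check alias contains no regular expression metacharacters.

```shell
# Hypothetical helper: add "ALIAS COMMAND" to an mrpe config file unless a
# line for that alias is already present.
add_mrpe_check() {
    cfg=$1
    name=$2
    cmd=$3
    mkdir -p "$(dirname "$cfg")"
    touch "$cfg"
    # Only append when no existing line starts with the alias.
    if ! grep -q "^$name[[:space:]]" "$cfg"; then
        echo "$name $cmd" >> "$cfg"
    fi
}
```

It would be called as: add_mrpe_check /etc/check_mk/mrpe.cfg Apache_Status "/usr/lib64/nagios/plugins/check_apachestatus_auto.pl -H localhost"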
# Now that we have installed the check_mk agent, the check_apachestatus_auto.pl script and the mrpe.cfg file, you can re-inventory the node from your check_mk server. Note the mrpe line in the following output:
cpu.loads         1 new checks
cpu.threads       1 new checks
df                7 new checks
diskstat          1 new checks
kernel            3 new checks
kernel.util       1 new checks
lnx_if            1 new checks
mem.used          1 new checks
mounts            7 new checks
mrpe              1 new checks
ntp.time          1 new checks
postfix_mailq     1 new checks
tcp_conn_stats    1 new checks
uptime            1 new checks
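For reference, the inventory and reload on the check_mk server can be sketched as below. This assumes the check_mk command line tool is available as cmk (on some installs it may only be present as check_mk), that -I inventories new checks for a host and that -O regenerates the configuration and reloads nagios. reinventory is a hypothetical wrapper name, and CMK is a hypothetical override variable.

```shell
# Hypothetical wrapper: inventory new checks for one host, then reload nagios.
# Set CMK to use a different binary name, e.g. CMK=check_mk.
reinventory() {
    host=$1
    "${CMK:-cmk}" -I "$host" && "${CMK:-cmk}" -O
}
```

Calling "reinventory your_aegir_host" on the check_mk server should produce output similar to the listing above.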
# Once you have reloaded nagios, check_mk will watch the server-status page and produce nice graphs like this: