Debian Clusters for Education and Research: The Missing Manual

Creating Your Own Nagios Plugin

The Good News

Creating your own custom Nagios plugin turns out to be surprisingly easy. This is a selling point for Nagios: it's easily customizable and extensible. As far as I can tell, you can implement your plugin in just about any language (Bash has worked well for me for simple ones), as long as you keep a few minor points in mind:

  • This script will be run by the nagios user, so it can't run anything requiring root privileges.
  • The script must execute locally, unless you're implementing this with the NRPE Nagios client plugin.
  • The script must exit with one of these specific values:
    • 0 - all ok
    • 1 - warning
    • 2 - critical
    • 3 - unknown
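The exit values above can be sketched in Bash as follows. This is only an illustration: check_threshold and the STATE_* variable names are made up for this example and are not part of Nagios. The function prints a one-line message and returns the status that matches it.

```shell
#!/bin/bash
# Exit statuses Nagios understands (the variable names are only for readability).
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

# Hypothetical check: compare a whole number against warning and critical
# thresholds, print a one-line message, and return the matching status.
check_threshold() {
    local value=$1 warn=$2 crit=$3
    local re='^[0-9]+$'

    # Anything that is not a plain whole number counts as "unknown".
    if ! [[ $value =~ $re && $warn =~ $re && $crit =~ $re ]]; then
        echo "UNKNOWN - usage: check_threshold <value> <warn> <crit>"
        return "$STATE_UNKNOWN"
    fi

    # 10# forces base 10 so that a value like 08 is not read as octal.
    if (( 10#$value >= 10#$crit )); then
        echo "CRITICAL - value is $value"
        return "$STATE_CRITICAL"
    elif (( 10#$value >= 10#$warn )); then
        echo "WARNING - value is $value"
        return "$STATE_WARNING"
    fi

    echo "OK - value is $value"
    return "$STATE_OK"
}

# In a standalone plugin file, the last lines would hand the status to Nagios:
#   check_threshold "$@"
#   exit $?
```

Running check_threshold 15 10 20 prints "WARNING - value is 15" and returns 1.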

Implementing your plugin is basically a three-step process: writing the plugin, adding the commands for the plugin, and then adding the service(s) for the plugin.

Writing the Plugin

As I stated above, you can use just about any language to create your plugin, as far as I know, as long as it can print to standard output (like Bash's echo or C's printf) and exit with an integer status (0, 1, 2, or 3). For one of my simpler plugins, I decided to check whether the pbs_mom on any given node was up and available. I wrote a little script in Bash to do this: Nagios Pbs_Mom Plugin.

Notice that my script expects the hostname of the target node to be given to it. Nagios will take care of that part for us when we set up the command. Notice also that my script ends with different values depending on the status of the service (the pbs_mom service, in this case).

Nagios keeps all the basic plugins in /usr/lib/nagios/plugins/, and I like to put mine in /usr/lib/nagios/plugins/local/, but you can put yours wherever you want. You'll be specifying the full path to the plugin later, anyway.

When you're writing the plugin, remember that you need to indicate to Nagios what kind of result you found by returning 0 (ok), 1 (warning), 2 (critical), or 3 (unknown). Standard output from the program will also be captured and displayed in the web interface, so it's helpful to write a little message about what the success/error was.
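To make the shape of such a plugin concrete, here is a minimal sketch. It is not the Nagios Pbs_Mom Plugin script linked above, only an illustration built on some assumptions: that pbs_mom accepts TCP connections on port 15002 (adjust PBS_MOM_PORT if your installation differs), that Bash's /dev/tcp redirection is available, and that the timeout command from coreutils is installed. The function name check_pbsmom is chosen to match the command defined later.

```shell
#!/bin/bash
# Hypothetical sketch of a pbs_mom check; not the script linked above.
# Assumption: pbs_mom accepts TCP connections on port 15002.
PBS_MOM_PORT=${PBS_MOM_PORT:-15002}

check_pbsmom() {
    local host=$1

    # No hostname given: Nagios should see this as "unknown".
    if [[ -z $host ]]; then
        echo "PBSMOM UNKNOWN - usage: check_pbsmom <host>"
        return 3
    fi

    # Bash opens a TCP connection when redirecting to /dev/tcp/<host>/<port>.
    # timeout gives up after 5 seconds so the check cannot hang.
    if timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$PBS_MOM_PORT" 2>/dev/null; then
        echo "PBSMOM OK - pbs_mom on $host is accepting connections"
        return 0
    fi

    echo "PBSMOM CRITICAL - no response from pbs_mom on $host (port $PBS_MOM_PORT)"
    return 2
}

# In a standalone plugin file, the last lines would be:
#   check_pbsmom "$1"
#   exit $?
```

The message printed on standard output is what shows up in the web interface, and the return value is what Nagios uses to pick the state. A port check only tells you that something is listening, so a real plugin may want a stronger test.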

Adding the Plugin Commands to Nagios

Each one of the built-in plugins comes with a file describing its "commands", the names you put under check_command in a service definition. (If this doesn't make sense to you, you might want to check out the Nagios installation and configuration tutorial.) For instance, the ping service runs a command called check_ping, which is defined in /etc/nagios-plugins/config/ping.cfg. You'll need to create a new file in this same directory. You can name it whatever you want, as long as it ends in .cfg, but make sure you choose a name that correlates well with whatever the plugin will monitor.

A plugin can have more than one possible command. For each command, you'll need an entry in the .cfg file with this format:

define command {
        command_name <command name>
        command_line <full path to plugin> $HOSTADDRESS$ 
}

My command for the pbs_mom plugin looks like this:

define command {
        command_name check_pbsmom
        command_line /usr/lib/nagios/plugins/local/check_pbsmom $HOSTADDRESS$
}

You can add extra arguments as needed with whatever flags they need, as well. For instance, one of the default Nagios plugins has a command defined in ping.cfg as shown below:

# 'check_ping' command definition
define command{
        command_name    check_ping
        command_line    /usr/lib/nagios/plugins/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$
}

Adding the Plugin Service(s) to Nagios

Finally, the command can be implemented as a service in Nagios. You can put this under /etc/nagios2/conf.d/services_nagios2.cfg or your own custom-made .cfg file in the same directory. At any rate, you'll need to define a service with the syntax shown below:

define service {
        hostgroup_name <hostgroup to check>
        service_description <short description>
        check_command <command>
        use generic-service
}
  • hostgroup_name is where you list all of the hostgroups that this check should be performed on. You can have more than one hostgroup listed as long as they're comma-separated.
  • service_description is a short name (a few letters) for the service. By convention these are typically uppercase, such as FTP, WEB, or MPD. They will be displayed in the Nagios web interface.
  • check_command is the command you defined in the previous section of this tutorial.
  • use generic-service means that this should use the template called generic-service. To further customize it, you can write your own template.

My service definition for the pbs_mom plugin looks like this:

define service {
        hostgroup_name raptor_nodes
        service_description PBSMOM
        check_command check_pbsmom
        use generic-service
}

Continuing the example with extra arguments, the built-in ping service looks like this:

# check that ping-only hosts are up
define service {
        hostgroup_name                  ping-servers
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        use                             generic-service
        notification_interval           0 ; set > 0 if you want to be renotified
}

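Notice the arguments are separated with exclamation marks. The first argument after the command name becomes $ARG1$ in the command definition, the second becomes $ARG2$, and so on.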
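To see how the exclamation-mark syntax lines up with the command definition, here is a rough Bash imitation of the substitution. It is not how Nagios is implemented, just a sketch; expand_check_command is a made-up helper, and the host address 10.0.0.5 is an example value.

```shell
#!/bin/bash
# Hypothetical helper that imitates Nagios macro substitution for illustration.
#   $1 = check_command value from the service definition
#   $2 = command_line from the command definition
#   $3 = address of the host being checked
expand_check_command() {
    local check_command=$1 command_line=$2 host_address=$3
    local -a parts
    local i result

    # Split on "!": parts[0] is the command name, parts[1..] are the arguments.
    local IFS='!'
    read -r -a parts <<< "$check_command"

    # Fill in the host address, then each $ARGn$ in turn.
    result=${command_line//'$HOSTADDRESS$'/$host_address}
    for (( i = 1; i < ${#parts[@]}; i++ )); do
        result=${result//"\$ARG${i}\$"/${parts[i]}}
    done

    echo "$result"
}

expand_check_command 'check_ping!100.0,20%!500.0,60%' \
    '/usr/lib/nagios/plugins/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$' \
    '10.0.0.5'
# prints: /usr/lib/nagios/plugins/check_ping -H 10.0.0.5 -w 100.0,20% -c 500.0,60%
```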

Restarting Nagios

At this point, you're ready to restart Nagios with

/etc/init.d/nagios2 restart

If the restart fails, open the file named at the beginning of the error message and check the line number it reports. Once Nagios starts correctly, open up the web interface and make sure you're getting the correct output. If you see a response saying the return code is "out of bounds", make sure the path in your command definition points to where the plugin actually lives!
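An "out of bounds" response usually means the plugin exited with a value other than 0 to 3. A shell exits with 127 when it cannot find a command and with 126 when the file exists but cannot be executed, so a quick path check can save some guessing. The sketch below uses a made-up helper name, check_plugin_path, and tests permissions for whoever runs it, so ideally run it as the nagios user.

```shell
#!/bin/bash
# Hypothetical helper: report whether a plugin path can be run by the current user.
check_plugin_path() {
    local plugin=$1

    if [[ ! -e $plugin ]]; then
        # A missing command is what makes a shell exit with 127.
        echo "missing: $plugin does not exist"
        return 1
    elif [[ ! -f $plugin || ! -x $plugin ]]; then
        # A file that cannot be executed makes a shell exit with 126.
        echo "not executable: $plugin"
        return 1
    fi

    echo "ok: $plugin exists and is executable"
    return 0
}

# Example with the path used in the command definition earlier:
#   check_plugin_path /usr/lib/nagios/plugins/local/check_pbsmom
```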
