Sunday, July 5, 2015

Nagios - Basic Tutorial



Nagios is a powerful monitoring system that enables organizations to identify and resolve infrastructure problems before they affect business processes.

Nagios is an open source network monitoring solution. It can be used for anything from simply checking whether a network host is still up, to monitoring specific services on remote hosts, to triggering corrective action when a problem is detected. Nagios can raise alerts about the issues it sees by mail, phone, fax, pager, and so on.

Nagios periodically polls an agent on each remote system using plugins. NRPE (Nagios Remote Plugin Executor) is that agent: it lets you execute Nagios plugins remotely on other Linux/Unix machines, so you can monitor remote machine metrics (disk usage, CPU load, etc.).
In this article we will see how to configure Nagios and perform basic monitoring of a remote system.

1) Install Nagios from Source
Download the latest Nagios source from: http://prdownloads.sourceforge.net/sourceforge/nagios
Extract the source, cd to the extracted location, and execute the commands below in sequence. The nagios user and the nagios and nagcmd groups named in the options must already exist on the system.
./configure --with-nagios-group=nagios --with-command-group=nagcmd
make all
make install
make install-commandmode
make install-init
make install-config
make install-webconf

Nagios provides a web console that is served through an HTTP web server (Apache in this setup). We use the console to view the configured hosts and services and to monitor them.

The last command, "make install-webconf", installs the nagios.conf file into the /etc/httpd/conf.d location, where it is used for the Nagios web console.

Once the installation is done, we need to perform some basic post-installation steps.

Nagios Configuration Changes
Set up a directory for the object configuration files. Uncomment the cfg_dir line in the main Nagios configuration file, nagios.cfg (installed as /usr/local/nagios/etc/nagios.cfg):
cfg_dir=/usr/local/nagios/etc/servers

Once the line is uncommented, create the directory it points to:
mkdir /usr/local/nagios/etc/servers
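
The two steps above can also be scripted. The following is only a sketch: the helper name enable_cfg_dir is made up for this example, and it assumes the line in nagios.cfg is commented out with a single leading "#", as it is in the sample configuration.

```shell
# Hypothetical helper (not part of Nagios): uncomment the cfg_dir line
# for one directory in a nagios.cfg file and create that directory.
enable_cfg_dir() {
    cfg_file=$1   # path to nagios.cfg
    obj_dir=$2    # directory named on the cfg_dir line

    # Remove the leading "#" from the matching line only.
    # sed keeps a backup copy of the original file as <cfg_file>.bak
    sed -i.bak "s|^#[[:space:]]*\(cfg_dir=$obj_dir\)[[:space:]]*\$|\1|" "$cfg_file"

    # Create the directory that the cfg_dir line points to.
    mkdir -p "$obj_dir"

    # Succeed only if the active (uncommented) line is now present.
    grep -q "^cfg_dir=$obj_dir\$" "$cfg_file"
}
```

For the paths used in this article it would be called as: enable_cfg_dir /usr/local/nagios/etc/nagios.cfg /usr/local/nagios/etc/servers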

Apache Server Modifications
Nagios needs a directory that requires authentication, along with some settings for its CGI scripts. These settings are in a file called nagios.conf in the /etc/httpd/conf.d directory.
The following changes need to be made to that file:
1) Uncomment the Order, Deny and Allow directives in both Directory blocks.
2) Uncomment the AuthUserFile directive, which points to the file used to authenticate users of the web console. We configure the user in the next step.
 AuthUserFile /usr/local/nagios/etc/htpasswd.users

Nagios User
The next step is to create the user ID that we will use to log in to the Nagios web console. Use the command below:
[root@localhost conf.d]# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
New password:
Re-type new password:
Adding password for user nagiosadmin

We created the user nagiosadmin for logging in to the Nagios web console. Once all the configuration is done, the Nagios console can be accessed at http://<IP address>:<port>/nagios/

2) Install Nagios Plugins 
In order to run Nagios, we also need the Nagios plugins. Plugins are stand-alone extensions to Nagios Core that provide the low-level intelligence on how to monitor anything and everything with Nagios Core.

Plugins process command-line arguments, go about the business of performing a specific type of check, and then return the results to Nagios Core for further processing. Plugins can either be compiled binaries (written in C, C++, etc) or executable scripts (shell, Perl, PHP, etc).
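
To make this concrete, here is a minimal sketch of a plugin written as a shell function. The name check_file_count, the directory argument and the thresholds are all invented for this example. What is not invented is the convention it follows: a plugin prints one line of status text and reports its state through the exit code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN).

```shell
# Hypothetical plugin: count the entries in a directory and compare the
# count against warning and critical thresholds.
# Return codes follow the plugin convention:
#   0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN
check_file_count() {
    dir=$1
    warn=$2
    crit=$3

    if [ ! -d "$dir" ]; then
        echo "FILES UNKNOWN - $dir is not a directory"
        return 3
    fi

    # Count the entries (including hidden ones); tr strips any padding from wc.
    count=$(ls -A "$dir" | wc -l | tr -d ' ')

    # The text after "|" is performance data: label=value;warn;crit
    if [ "$count" -ge "$crit" ]; then
        echo "FILES CRITICAL - $count entries in $dir | files=$count;$warn;$crit"
        return 2
    elif [ "$count" -ge "$warn" ]; then
        echo "FILES WARNING - $count entries in $dir | files=$count;$warn;$crit"
        return 1
    fi
    echo "FILES OK - $count entries in $dir | files=$count;$warn;$crit"
    return 0
}
```

A stand-alone plugin script would end by calling the function with its command-line arguments and exiting with the function's return code, since the exit code is how the state gets reported.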

Download the latest source of the Nagios plugins from http://nagios-plugins.org/download/
Extract it, cd to the extracted location, and run the commands below:
./configure --with-nagios-user=nagios --with-nagios-group=nagios --with-openssl
make
make install
  
3) Install NRPE - Nagios Remote Plugin Executor 
Let's take an example. It is easy enough to monitor whether an HTTP or SMTP service is available on a remote machine, but how can we determine whether the disk is running out of space, or whether the load average has risen? These things cannot easily be determined without local access to the system. One way to accomplish this is to run the checks over SSH, for example with the check_by_ssh plugin, but an even better way is the Nagios Remote Plugin Executor (NRPE) daemon.

NRPE runs checks on a system remote from the central Nagios server, allowing Nagios to query it as if the checks were run locally. In general, Nagios talks to NRPE, asks it to run a specific check, waits for the response, and logs it along with everything else it watches. These are checks that can only be run locally: the number of users, load average, disk space usage, available memory, whether the local system can query DNS, and so on. Compared with running each check over SSH, NRPE has much lower overhead, which makes it faster and more efficient.

Download the latest NRPE source from http://sourceforge.net/projects/nagios/files/
Extract it, cd to the extracted location, and run the commands below:
./configure --enable-command-args --with-nagios-user=nagios --with-nagios-group=nagios --with-openssl=/usr/bin/openssl --with-ssl-lib=/usr/lib/x86_64-linux-gnu
make all
make install
make install-xinetd
make install-daemon-config

Configure Xinetd
xinetd (extended Internet daemon) is an open-source super-server daemon that runs on many Unix-like systems and manages Internet-based connectivity. Here we use it to start NRPE on demand.
Once the installation of NRPE is done, we need to perform some basic post-installation steps.
Open the file "/etc/xinetd.d/nrpe":
# default: on
# description: NRPE (Nagios Remote Plugin Executor)
service nrpe
{
        flags           = REUSE
        socket_type     = stream
        port            = 5666
        wait            = no
        user            = nagios
        group           = nagios
        server          = /usr/local/nagios/bin/nrpe
        server_args     = -c /usr/local/nagios/etc/nrpe.cfg --inetd
        log_on_failure  += USERID
        disable         = no
        only_from       = 127.0.0.1 172.16.202.96
}

Add the IP address of the Nagios server to the only_from entry, after 127.0.0.1.
Next, open the /etc/services file and add the following entry for the NRPE daemon at the bottom of the file:
nrpe            5666/tcp                 NRPE
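
If you script this step, it helps to make it safe to run more than once. A small sketch follows; the helper name add_nrpe_service is invented here, and the file path is passed in so the function can be tried on a copy before it touches /etc/services.

```shell
# Hypothetical helper: append the nrpe entry to a services file,
# but only if no nrpe entry is there yet.
add_nrpe_service() {
    services_file=$1

    if grep -q '^nrpe[[:space:]]' "$services_file"; then
        echo "nrpe entry already present in $services_file"
    else
        # Same entry as shown above.
        printf 'nrpe            5666/tcp                 NRPE\n' >> "$services_file"
        echo "nrpe entry added to $services_file"
    fi
}
```

On the real system it would be run as root: add_nrpe_service /etc/services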

Configure Firewall Rules 
Make sure that the firewall on the local machine allows the NRPE daemon to be accessed from remote servers. To do this, run the following iptables command.

[root@tecmint]# iptables -A INPUT -p tcp -m tcp --dport 5666 -j ACCEPT

Run the following command to save the new iptables rule so that it survives system reboots.

[root@tecmint]# service iptables save

With this, the installation is complete. We can now restart the services:
service httpd restart
service xinetd restart
service nagios restart

Basic Checks
Once the services have started with no issues, we can perform some basic checks to make sure they are up and running fine.

1) Access the web console at http://<IP address>:<port>/nagios and make sure we can log in with the credentials created earlier.

2) Make sure the nrpe port is active using
[root@localhost logs]# netstat -at | grep nrpe
tcp6       0      0 [::]:nrpe               [::]:*                  LISTEN    

3) Verify that the NRPE daemon is functioning properly. Run the “check_nrpe” command that was installed earlier for testing purposes.

[root@localhost logs]# /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
NRPE v2.15

The check_nrpe plugin is under /usr/local/nagios/libexec, inside the same location where Nagios was installed.

4) Check the Nagios configuration using
[root@localhost etc]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Nagios Core 4.1.0rc1
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 02-18-2015
License: GPL

Website: http://www.nagios.org
Reading configuration data...
   Read main config file okay...
   Read object config files okay...

Running pre-flight check on configuration data...

Checking objects...
          Checked 13 services.
          Checked 2 hosts.
          Checked 1 host groups.
          Checked 0 service groups.
          Checked 1 contacts.
          Checked 1 contact groups.
          Checked 25 commands.
          Checked 5 time periods.
          Checked 0 host escalations.
          Checked 0 service escalations.
Checking for circular paths...
          Checked 2 hosts
          Checked 0 service dependencies
          Checked 0 host dependencies
          Checked 5 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check

Now if we access the Nagios web console, it will look like this:
[Screenshot: Nagios web console]


Customize NRPE commands 
The default NRPE configuration file that was installed has several command definitions that are used to monitor the local machine. The sample configuration file is located at /usr/local/nagios/etc/nrpe.cfg:

[root@localhost]# vi /usr/local/nagios/etc/nrpe.cfg

Using the command names defined in that file, we can run checks like these:

[root@localhost etc]# /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1 -c check_users
USERS OK - 3 users currently logged in |users=3;5;10;0

[root@localhost etc]# /usr/local/nagios/libexec/check_nrpe -H localhost -c check_load
OK - load average: 3.90, 4.37, 3.94|load1=3.900;15.000;30.000;0; load5=4.370;10.000;25.000;0; load15=3.940;5.000;20.000;0;

[root@localhost etc]# /usr/local/nagios/libexec/check_nrpe -H localhost -c check_hda1
DISK OK - free space: /boot 154 MB (84% inode=99%);| /boot=29MB;154;173;0;193

[root@localhost etc]# /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1 -c check_load
OK - load average: 0.03, 0.07, 0.12|load1=0.030;15.000;30.000;0; load5=0.070;10.000;25.000;0; load15=0.120;5.000;20.000;0;

[root@localhost etc]# /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1 -c check_total_procs
PROCS WARNING: 198 processes | procs=198;150;200;0;

[root@localhost etc]# /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1 -c check_zombie_procs
PROCS OK: 0 processes with STATE = Z | procs=0;5;10;0;
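
Each output line above has two parts: the human-readable status before the "|" and the performance data after it. Performance data items have the form label=value;warn;crit;min;max, so users=3;5;10;0 means 3 users, a warning threshold of 5, a critical threshold of 10 and a minimum of 0. As a rough sketch of how to pull those fields apart in shell (the function name parse_perfdata is made up for this example):

```shell
# Hypothetical helper: print label, value, warning and critical thresholds
# for every performance data item in one line of plugin output.
parse_perfdata() {
    output=$1

    # Keep only the part after the first "|"; no "|" means no perfdata.
    case $output in
        *"|"*) perf=${output#*"|"} ;;
        *)     perf="" ;;
    esac

    # Items are separated by whitespace.
    for item in $perf; do
        label=${item%%=*}
        rest=${item#*=}

        # Fields inside an item are separated by ";": value;warn;crit;min;max
        old_ifs=$IFS
        IFS=';'
        set -- $rest
        IFS=$old_ifs

        echo "$label value=$1 warn=$2 crit=$3"
    done
}

parse_perfdata 'USERS OK - 3 users currently logged in |users=3;5;10;0'
```

For the check_users line this prints "users value=3 warn=5 crit=10". For the check_load output it prints one line each for load1, load5 and load15.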

Verify NRPE Daemon Remotely 
Make sure that the check_nrpe plugin can communicate with the NRPE daemon on the remote Linux host. Replace the IP address in the command below with the IP address of your remote Linux host.

[root@localhost etc]# /usr/local/nagios/libexec/check_nrpe -H 172.16.202.96
NRPE v2.15

Adding Remote Linux Host to Nagios Monitoring Server 
To add a remote host you need to create two new files, “hosts.cfg” and “services.cfg”, under the “/usr/local/nagios/etc/” location.

[root@tecmint]# cd /usr/local/nagios/etc/
[root@tecmint]# touch hosts.cfg
[root@tecmint]# touch services.cfg

Now add these two files to the main Nagios configuration file. Open the nagios.cfg file with any editor.

[root@tecmint]# vi /usr/local/nagios/etc/nagios.cfg

Now add the two newly created files as shown below.

# You can specify individual object config files as shown below:
cfg_file=/usr/local/nagios/etc/hosts.cfg
cfg_file=/usr/local/nagios/etc/services.cfg

[root@localhost etc]# cat hosts.cfg
define host{
name                            linux-box               ; Name of this template
use                             generic-host            ; Inherit default values
check_period                    24x7       
check_interval                  5      
retry_interval                  1      
max_check_attempts              10     
check_command                   check-host-alive
notification_period             24x7   
notification_interval           30     
notification_options            d,r    
contact_groups                  admins 
register                        0                       ; DONT REGISTER THIS - ITS A TEMPLATE
}

## Default
define host{
use                             linux-box               ; Inherit default values from a template
host_name                       vx111a.jas.com          ; The name we're giving to this server
alias                           RHEL 7                  ; A longer name for the server
address                         172.16.202.96           ; IP address of Remote Linux host
}

  
[root@localhost etc]# cat services.cfg
define service{
        use                     generic-service
        host_name               vx111a.jas.com
        service_description     CPU Load
        check_command           check_nrpe!check_load
        }

define service{
        use                     generic-service
        host_name               vx111a.jas.com
        service_description     Total Processes
        check_command           check_nrpe!check_total_procs
        }

define service{
        use                     generic-service
        host_name               vx111a.jas.com
        service_description     Current Users
        check_command           check_nrpe!check_users
        }

define service{
        use                     generic-service
        host_name               vx111a.jas.com
        service_description     SSH Monitoring
        check_command           check_nrpe!check_ssh
        }

define service{
        use                     generic-service
        host_name               vx111a.jas.com
        service_description     FTP Monitoring
        check_command           check_nrpe!check_ftp
        }
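Now the check_nrpe command definition needs to be created in the commands.cfg file. Note that check_ssh and check_ftp are not among the sample NRPE commands run earlier, so those two services only work if commands with these names are defined in nrpe.cfg on the remote host.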

[root@tecmint]# vi /usr/local/nagios/etc/objects/commands.cfg

define command{
        command_name check_nrpe
        command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }
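
To see how the service definitions and this command definition fit together: Nagios splits check_command at the "!" into a command name and its argument, then fills in the macros in command_line. The sketch below imitates that substitution in shell so you can see the resulting command. The function name expand_check_command is invented, and it assumes $USER1$ is /usr/local/nagios/libexec, which matches the plugin path used throughout this article.

```shell
# Hypothetical helper: show the command line that results from a
# check_nrpe service check for a given host address.
expand_check_command() {
    check_command=$1   # e.g. check_nrpe!check_load
    host_address=$2    # the "address" value from the host definition

    user1=/usr/local/nagios/libexec   # assumed value of $USER1$
    arg1=${check_command#*!}          # text after "!" becomes $ARG1$

    # command_line from the check_nrpe definition above.
    template='$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$'

    # Replace each macro; [$] matches a literal dollar sign.
    printf '%s\n' "$template" | sed \
        -e "s|[$]USER1[$]|$user1|" \
        -e "s|[$]HOSTADDRESS[$]|$host_address|" \
        -e "s|[$]ARG1[$]|$arg1|"
}

expand_check_command 'check_nrpe!check_load' 172.16.202.96
```

This prints /usr/local/nagios/libexec/check_nrpe -H 172.16.202.96 -c check_load, which is the same kind of command we ran by hand in the earlier sections.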

Now we can see in the Nagios web console that the remote machine is being monitored:
[Screenshot: remote host in the Nagios web console]


While working with Nagios, a few packages were needed for it to work. They can be installed as follows:

yum install openssl-devel*
yum install xinetd.x86_64*
yum install php.x86_64*


More to come, Happy Learning
