I assume you have already installed and configured Nagios on the Nagios monitoring server. If not, follow the instructions here. Once your Nagios server is ready, you'll need to follow these steps to monitor your network infrastructure.
1. Enable the switch configuration file in nagios.cfg
Edit the Nagios configuration file and uncomment the switch.cfg line (remove the leading #).
# vim /usr/local/nagios/etc/nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/switch.cfg
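If you would rather script this step than edit the file by hand, here is a minimal sketch of the same change. It assumes the line is present in nagios.cfg but commented out with a leading #. The function name enable_cfg_file and the sample text are my own illustrations; the code only transforms a string and does not write to your real nagios.cfg.

```python
import re

SWITCH_CFG = "/usr/local/nagios/etc/objects/switch.cfg"


def enable_cfg_file(config_text: str, cfg_path: str = SWITCH_CFG) -> str:
    """Return config_text with the '#cfg_file=<cfg_path>' line uncommented."""
    pattern = re.compile(
        r"^[ \t]*#[ \t]*(cfg_file=" + re.escape(cfg_path) + r")[ \t]*$",
        re.MULTILINE,
    )
    return pattern.sub(r"\1", config_text)


# Illustrative excerpt only, not a complete nagios.cfg.
sample = (
    "cfg_file=/usr/local/nagios/etc/objects/localhost.cfg\n"
    "#cfg_file=/usr/local/nagios/etc/objects/switch.cfg\n"
)
print(enable_cfg_file(sample))
```

Lines that are already uncommented are left untouched, so running it twice gives the same result.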
2. Define hosts for Switch/Router/Firewall
Open the configuration file and change the host_name, alias, and address fields to appropriate values for the switch.
# vim /usr/local/nagios/etc/objects/switch.cfg
# Define the switch that we'll be monitoring
define host{
use generic-switch ; Inherit default values from a template
host_name catalyst-4500 ; The name we're giving to this switch
alias Cisco Catalyst 4500 Switch ; A longer name associated with the switch
address 192.168.1.195 ; IP address of the switch
hostgroups switches ; Host groups this switch is associated with
}
3. Monitoring services for Switch/Router/Firewall
Add the following service definition to monitor packet loss and round trip average between the Nagios host and the switch every 5 minutes under normal conditions.
# Create a service to PING the switch
define service{
use generic-service ; Inherit values from a template
host_name catalyst-4500 ; The name of the host the service is associated with
service_description PING ; The service description
check_command check_ping!200.0,20%!600.0,60% ; The command used to monitor the service
normal_check_interval 5 ; Check the service every 5 minutes under normal conditions
retry_check_interval 1 ; Re-check the service every minute until its final/hard state is determined
}
With the thresholds passed to check_ping above, the service will be:
CRITICAL if the round trip average (RTA) is greater than 600 milliseconds or the packet loss is 60% or more
WARNING if the RTA is greater than 200 ms or the packet loss is 20% or more
OK if the RTA is less than 200 ms and the packet loss is less than 20%
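To make the threshold rules concrete, here is a small sketch that parses the two check_ping arguments and applies the three rules exactly as they are worded above. It is not the plugin's source code, and the function names parse_threshold and ping_state are hypothetical.

```python
def parse_threshold(arg: str) -> tuple:
    """Split an argument such as '200.0,20%' into (rta_ms, loss_percent)."""
    rta, loss = arg.split(",")
    return float(rta), float(loss.rstrip("%"))


def ping_state(rta_ms: float, loss_pct: float, warn: tuple, crit: tuple) -> str:
    """Classify one ping result using the rules listed in the text."""
    if rta_ms > crit[0] or loss_pct >= crit[1]:
        return "CRITICAL"
    if rta_ms > warn[0] or loss_pct >= warn[1]:
        return "WARNING"
    return "OK"


# The check_command from the service definition: name!warning!critical
_name, warn_arg, crit_arg = "check_ping!200.0,20%!600.0,60%".split("!")
warn, crit = parse_threshold(warn_arg), parse_threshold(crit_arg)

print(ping_state(35.0, 0.0, warn, crit))
print(ping_state(250.0, 0.0, warn, crit))
print(ping_state(35.0, 60.0, warn, crit))
```

The critical rule is checked first, so a result that crosses both thresholds is reported as CRITICAL rather than WARNING.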
# Monitor uptime via SNMP
define service{
use generic-service ; Inherit values from a template
host_name catalyst-4500
service_description Uptime
check_command check_snmp!-C public -o sysUpTime.0
}
# Monitor Port 1 status via SNMP
define service{
use generic-service ; Inherit values from a template
host_name catalyst-4500
service_description Port 1 Link Status
check_command check_snmp!-C public -o ifOperStatus.1 -r 1 -m RFC1213-MIB
}
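It helps to see what Nagios actually runs for a check_command such as the ones above. The part before the first ! is the command name, and each following part becomes $ARG1$, $ARG2$ and so on in that command's command_line. The sketch below imitates this substitution. It assumes the check_snmp command is defined as "$USER1$/check_snmp -H $HOSTADDRESS$ $ARG1$" and that $USER1$ points to /usr/local/nagios/libexec, so check commands.cfg and resource.cfg on your own server. The function expand_check_command is hypothetical and only covers the macros used here.

```python
def expand_check_command(check_command: str, command_line: str, macros: dict):
    """Split 'name!arg1!arg2...' and substitute the macros into command_line."""
    name, *args = check_command.split("!")
    values = dict(macros)
    for index, arg in enumerate(args, start=1):
        values[f"$ARG{index}$"] = arg
    expanded = command_line
    for macro, value in values.items():
        expanded = expanded.replace(macro, value)
    return name, expanded


# Assumed command definition and macro values; verify them on your system.
COMMAND_LINE = "$USER1$/check_snmp -H $HOSTADDRESS$ $ARG1$"
MACROS = {
    "$USER1$": "/usr/local/nagios/libexec",
    "$HOSTADDRESS$": "192.168.1.195",
}

name, expanded = expand_check_command(
    "check_snmp!-C public -o ifOperStatus.1 -r 1 -m RFC1213-MIB",
    COMMAND_LINE,
    MACROS,
)
print(name)
print(expanded)
```

Running the printed command by hand from a shell is a quick way to debug a service before handing it over to Nagios.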
Repeat this procedure for the router as well. To monitor a firewall, you'll need to download the appropriate plugin and define the services. If you are using a Cisco ASA, you can download the plugin from here.
4. Monitor your Bandwidth
You need to install MRTG if you want to monitor bandwidth usage on your switches or routers. You can then raise an alert when traffic rates exceed the thresholds you specify, using the check_mrtgtraf plugin. The log file path in the example below should point to the MRTG log file on your system.
# Monitor bandwidth via MRTG logs
define service{
use generic-service ; Inherit values from a template
host_name catalyst-4500
service_description Port 1 Bandwidth Usage
check_command check_local_mrtgtraf!/var/lib/mrtg/192.168.1.195_1.log!AVG!1000000,1000000!5000000,5000000!10
}
In the example above, the "/var/lib/mrtg/192.168.1.195_1.log" option that gets passed to the check_local_mrtgtraf command tells the plugin which MRTG log file to read from. The "AVG" option tells it to use average bandwidth statistics. The "1000000,1000000" options are the warning thresholds (in bytes per second) for the incoming and outgoing traffic rates, in that order. The "5000000,5000000" options are the critical thresholds (in bytes per second) for the incoming and outgoing traffic rates. The "10" option causes the plugin to return a CRITICAL state if the MRTG log file is older than 10 minutes (it should be updated every 5 minutes).
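As a rough illustration of how those five arguments fit together, the sketch below splits the check_command into named fields and compares a pair of traffic rates against them. Each threshold pair is read as incoming,outgoing. The comparison rule (strictly greater than the threshold) is my assumption about the plugin's behaviour, and the names MrtgTrafArgs, parse_mrtgtraf and traffic_state are hypothetical.

```python
from typing import NamedTuple


class MrtgTrafArgs(NamedTuple):
    log_file: str
    aggregation: str
    warn_in: int
    warn_out: int
    crit_in: int
    crit_out: int
    expire_minutes: int


def parse_mrtgtraf(check_command: str) -> MrtgTrafArgs:
    """Split 'name!log!AVG|MAX!warn_in,warn_out!crit_in,crit_out!minutes'."""
    _name, log_file, aggregation, warn, crit, expire = check_command.split("!")
    warn_in, warn_out = (int(v) for v in warn.split(","))
    crit_in, crit_out = (int(v) for v in crit.split(","))
    return MrtgTrafArgs(log_file, aggregation, warn_in, warn_out,
                        crit_in, crit_out, int(expire))


def traffic_state(in_rate: int, out_rate: int, args: MrtgTrafArgs) -> str:
    """Classify incoming/outgoing rates; assumed rule: rate > threshold."""
    if in_rate > args.crit_in or out_rate > args.crit_out:
        return "CRITICAL"
    if in_rate > args.warn_in or out_rate > args.warn_out:
        return "WARNING"
    return "OK"


args = parse_mrtgtraf(
    "check_local_mrtgtraf!/var/lib/mrtg/192.168.1.195_1.log"
    "!AVG!1000000,1000000!5000000,5000000!10"
)
print(args)
print(traffic_state(1500000, 200000, args))
```

The rates passed to traffic_state here are made-up numbers; on a real system they come from the MRTG log file named in the first argument.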
5. Verify configuration and restart Nagios.
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
# /etc/init.d/nagios restart
Stopping nagios: [ OK ]
Starting nagios: [ OK ]
Note 1:
If you want to monitor all the ports of the switch, add an entry for every port when defining the services.
check_command check_snmp!-C public -o ifOperStatus.1 -r 1 -m RFC1213-MIB, -o ifOperStatus.2 -r 1 -m RFC1213-MIB, -o ifOperStatus.3 -r 1 -m RFC1213-MIB ...
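Another way to cover every port, which sticks to the single-OID form used in step 3, is to define one service per port and let a script write the definitions. The sketch below does that. It assumes the port number is the same as the interface index in ifOperStatus, which is not true on every device, so check the indexes with snmpwalk first. The function names are hypothetical.

```python
def port_status_service(host_name: str, port: int, community: str = "public") -> str:
    """Build one Nagios service definition for the link status of one port."""
    return "\n".join([
        "define service{",
        "    use                  generic-service",
        f"    host_name            {host_name}",
        f"    service_description  Port {port} Link Status",
        f"    check_command        check_snmp!-C {community}"
        f" -o ifOperStatus.{port} -r 1 -m RFC1213-MIB",
        "    }",
    ])


def port_status_services(host_name: str, ports) -> str:
    """Join the definitions for several ports, separated by blank lines."""
    return "\n\n".join(port_status_service(host_name, p) for p in ports)


# Ports 1 to 3 only, to keep the output short.
print(port_status_services("catalyst-4500", range(1, 4)))
```

You can paste the output into switch.cfg and then verify the configuration as described in step 5.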
Note 2:
You can monitor your router/firewall using SNMP if you know the object identifier (OID) for the router/firewall, which you can find using snmpwalk.
# snmpwalk -v1 -c public 192.168.1.205 -m ALL .1
Here 192.168.1.205 is the IP address of your router/firewall.
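A full snmpwalk can print thousands of lines, so it is handy to filter the output for the objects you care about. The sketch below parses lines of the form "NAME = TYPE: value" and keeps those whose name contains a keyword. The sample text is made up for illustration and is not output from a real device; find_oids is a hypothetical helper.

```python
import re

# One snmpwalk output line: "<name> = <TYPE>: <value>"
LINE = re.compile(r"^(?P<oid>\S+) = (?P<type>[\w-]+): (?P<value>.*)$")

# Made-up sample in the usual snmpwalk output format.
SAMPLE = """\
IF-MIB::ifDescr.1 = STRING: GigabitEthernet1/1
IF-MIB::ifDescr.2 = STRING: GigabitEthernet1/2
IF-MIB::ifOperStatus.1 = INTEGER: up(1)
IF-MIB::ifOperStatus.2 = INTEGER: down(2)
"""


def find_oids(output: str, keyword: str) -> list:
    """Return (name, value) pairs for every line whose name contains keyword."""
    matches = []
    for line in output.splitlines():
        parsed = LINE.match(line)
        if parsed and keyword in parsed.group("oid"):
            matches.append((parsed.group("oid"), parsed.group("value")))
    return matches


for oid, value in find_oids(SAMPLE, "ifOperStatus"):
    print(oid, "->", value)
```

On a real system you would feed the function the text captured from the snmpwalk command above instead of SAMPLE.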
Note 3:
You can monitor your remote Linux/Windows hosts using SNMP, but I'm not sure about the reliability of SNMP. One reason is that SNMP runs over UDP, which does not guarantee delivery, and the other is that no acknowledgement is defined for SNMP traps.
Note 4:
There are a few occasions where we prefer UDP over TCP, especially when we don't require any acknowledgement or when losing a few packets doesn't make any difference.
1. UDP is used for broadcast and multicast, as TCP doesn't support broadcast/multicast.
2. UDP is faster: there is no acknowledgement and lost packets are not resent, which is why it is widely used for videoconferencing.