Please note that this guide assumes you have a basic understanding of virtual machines and Linux. If you’re unfamiliar with either, I will be linking to guides for everything, and it may be worth reading through the basic setup guides for installing Linux inside of VirtualBox before you start.

Also please note that most of the ideas in this guide come from Jeremy Thelen; I just had the time to type it up and make the VM.

What is Nagios?

Nagios is an open source network monitoring tool that runs on a Linux server. That server can be either a physical machine, if you have one, or a VM, and Nagios is accessed through a web GUI. It can keep track of the status of most devices on your network, organize them by type, and keep logs so that you know when something went down or came back. In the case of Clear-Com gear, I’m mostly going to demonstrate pings as the one module to use, but there are dozens of supported checks, including port availability for HTTP and SSH, SNMP monitoring, and more. For switches, for example, it can give you all kinds of stats for usage, link status, etc.

Using this app, you can get a snapshot of the network status of your system from any PC on the network, because it’s a web GUI. As you can see here, I have my Nagios instance pinging my Eclipse Frame’s CPU NIC, each network card in my frame, both of my LQs, and my Netgear switch. Devices can be organized into groups for easy sorting, and it’s giving me close to real-time updates about devices. When you are done, your dashboard should include your whole system, whether it be an Eclipse with IPAs, LQs, IPTs, etc. If it has a NIC, we can add it.

This can help you troubleshoot a particular device dropping off the network, find out how long a device has been up, or have confidence in a remote device staying connected over a long period of time. Network monitoring software is a common tool used by IT professionals to see their system at a glance. There are many arguably easier-to-use options that aren’t free and open source. Nagios itself has a paid version called Nagios XI, but for our purposes, Nagios Core is simple, free, and gives us an easy way to track the network status of the entire system over time. But fair warning: if you’re not used to dealing with Linux and looking at JSON-like configuration files, this could be a challenge.

In this demo, I’m going to show you how to build this config, with just some simple hosts organized by category and with ping, HTTP, and SNMP as the only modules I’m using.

Setup

If you’d like to skip setting it up from scratch, you can download my Ubuntu image here, which has Nagios already installed and is set to a DHCP address. You might be lacking some context on how to configure the application for your specific system though, especially if you have not used Linux or virtualization software before. [The password to the one user login is admin]

https://clearcom.ftpstream.com/download/OFCqla5fPzeF7CBnGbXQ/misc files/Greg's VM's/nagios.ova

Using a VM is probably the best way to set this up on an existing system. I use VirtualBox and prefer Ubuntu for the Linux distro, since it tends to be the easiest to use, but most of the common distros will work.

https://ubuntu.com/tutorials/how-to-run-ubuntu-desktop-on-a-virtual-machine-using-virtualbox#1-overview

A guide to installing an Ubuntu VM inside of VirtualBox can be found here. Both are free and open source.
Personally, I find that giving your VM a specific NIC to use, as opposed to using a bridged mode adapter, tends to make it easier to run these web GUI based services. That way you can give your VM a static IP so it’s not moving around on DHCP every time you spin it down and back up. You do that inside of VirtualBox after installing the Ubuntu image, under the settings for the VM. Most of the time, I will use a USB NIC, give it a static IP inside of Windows, and then attach it to the VM. That way, I know where all my VM traffic is coming from, and what its IP is going to be.

Once you’ve completed the Ubuntu install inside of VirtualBox, follow the Nagios Quickstart guide for Ubuntu:

https://support.nagios.com/kb/article/nagios-core-installing-nagios-core-from-source-96.html#Ubuntu

Just open up a terminal and copy and paste each line in sequence from this quickstart guide. Make sure to take the correct line for the version of Ubuntu you’re using, probably 20.xx if you followed the guide above, and make sure your VM is connected to the internet before you start. Once you’ve run everything and have started the nagios service inside of Linux, you should be able to see the Nagios web GUI at the IP address of your VM with /nagios appended, e.g. 192.168.1.1/nagios, and an “It works!” Apache test page at 192.168.1.1. If you don’t know the IP of your VM, the ‘ip addr’ command in terminal can tell you.
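
If you want to double-check the address from the terminal, something like the rough sketch below can help. It assumes the ip tool from iproute2 is installed (it normally is on Ubuntu). The helper names list_ipv4 and nagios_url are made up for this example and are not part of Nagios.

```shell
#!/bin/sh
# list_ipv4 and nagios_url are hypothetical helper names, not Nagios tools.

# Read the one-line-per-address output of `ip -4 -o addr show` on stdin
# and print "<interface> <address>" with the /prefix length removed.
list_ipv4() {
    awk '$3 == "inet" { sub(/\/.*/, "", $4); print $2, $4 }'
}

# Build the web GUI URL for a given VM address.
nagios_url() {
    printf 'http://%s/nagios/\n' "$1"
}

# On the VM: list the addresses (skipped if the ip tool is missing).
if command -v ip >/dev/null 2>&1; then
    ip -4 -o addr show | list_ipv4
fi

# Example with the address used in the text above:
nagios_url 192.168.1.1
```

Paste the printed URL into a browser on any PC that can reach the VM.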

Configuration

Nagios Core Manual - Table of Contents

It’s worth going through the basics of the manual before continuing!

By default, all that you will see in the GUI on a clean install of Nagios is a localhost pointing at the Linux VM itself. In order to configure it to your system, we need to get our hands dirty in the CFGs that control what gets displayed in the web GUI. You can find these at /usr/local/nagios/etc. There you’ll find nagios.cfg, the primary config file, and the objects folder, where the hosts themselves get defined.

For a longer explanation, see here: https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/4/en/objectdefinitions.html

For this demo, and because all I want is a dashboard that sends pings and not much of anything else, I’m going to be using the localhost.cfg file inside of the objects folder, and just put my entire system in that. Why did I choose the localhost.cfg file? Only because by default, that’s the one host CFG that isn’t commented out in the nagios.cfg file, the primary config. There are several .cfgs in that folder which can be used to further subdivide the system, so that you can have specific services run for specific devices. If all you’re looking for is some categories of hosts and just a couple of services, like ping, check HTTP, etc., that will get you all the way there. This is using a fraction of a fraction of the functionality of Nagios, and I’d highly recommend reading through the documentation linked above to see more.

The File Permissions Issue

If you’ve never dealt with Linux, you might not be familiar with groups and permissions issues. By default, if you made a user when you installed Ubuntu and then followed the guide, you will not have the ability to edit these .cfg files.

Here’s a decent explanation of the chown command and what to do with it. I ‘777’d my objects folder and my nagios.cfg file (that is, I ran chmod 777 on them, so that any user can make changes) so that I could edit them easily. Please be careful if this is on an unsecured network, and take reasonable security precautions when messing around with file permissions.
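
A less drastic alternative to 777 is to add your own user to the nagios group. This is only a sketch: it assumes the source install created a group called nagios and left the config files writable by that group, which you should confirm on your VM first. show_perms is a made-up helper name.

```shell
#!/bin/sh
# show_perms is a hypothetical helper: print octal mode, owner:group and path.
# It relies on GNU stat, which ships with Ubuntu.
show_perms() {
    stat -c '%a %U:%G %n' "$@"
}

# Inspect the current state first (skipped if the folder does not exist).
CFG_DIR=/usr/local/nagios/etc
if [ -d "$CFG_DIR" ]; then
    show_perms "$CFG_DIR/nagios.cfg" "$CFG_DIR/objects"
fi

# Then, on the VM, run these by hand. They are commented out here because
# they change the system, and the group name "nagios" is an assumption:
#   sudo usermod -a -G nagios "$USER"       # add yourself to the group
#   sudo chmod -R g+w "$CFG_DIR/objects"    # only if the group cannot write
# Log out and back in so the new group membership takes effect.
```

If show_perms reports a mode like 664 and a group of nagios, joining the group should be enough and the chmod line is not needed.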

Now that we have that out of the way….

Below is my localhost.cfg file for this demo:

###############################################################################
#
# HOST DEFINITION
#
###############################################################################

# Define a host for each device to monitor


define host {

	use                 	linux-server       	; Name of host template to use
                                                	; This host definition will inherit all variables that are defined
                                                	; in (or inherited by) the linux-server host template definition.
	host_name           	ecl
	alias               	Eclipse CPU
	address             	10.50.1.11
}

define host {

	use                 	linux-server       	; Name of host template to use
                                                	; This host definition will inherit all variables that are defined
                                                	; in (or inherited by) the linux-server host template definition.
	host_name           	eque
	alias               	ECLIPSE E-QUE
	address             	10.50.1.21
}
define host {

	use                 	linux-server       	; Name of host template to use
                                                	; This host definition will inherit all variables that are defined
                                                	; in (or inherited by) the linux-server host template definition.
	host_name           	ivc
	alias               	Eclipse IVC
	address             	10.50.1.22
}

define host {

	use                 	linux-server       	; Name of host template to use
                                                	; This host definition will inherit all variables that are defined
                                                	; in (or inherited by) the linux-server host template definition.
	host_name           	ipa
	alias               	Eclipse IPA
	address             	10.50.1.23
}

define host {

	use                 	linux-server       	; Name of host template to use
                                                	; This host definition will inherit all variables that are defined
                                                	; in (or inherited by) the linux-server host template definition.
	host_name           	lq1
	alias               	LQ 1 - SIP
	address             	10.50.1.31
}
define host {

	use                 	linux-server       	; Name of host template to use
                                                	; This host definition will inherit all variables that are defined
                                                	; in (or inherited by) the linux-server host template definition.
	host_name           	lq2
	alias               	LQ2 - Gen-IC
	address             	10.50.1.32
}
###############################################################################
#
# HOST GROUP DEFINITION
#
###############################################################################

# Define a hostgroup for each category shown in the web GUI

define hostgroup {

	hostgroup_name      	cc-fr       	; The name of the hostgroup
	alias               	CLEAR COM FRAME       	; Long name of the group
	members             	ipa,ecl,eque,ivc           	; Comma separated list of hosts that belong to this group
}

define hostgroup {

	hostgroup_name      	cc-lq      	; The name of the hostgroup
	alias               	CLEAR COM LQs       	; Long name of the group
	members             	lq1,lq2           	; Comma separated list of hosts that belong to this group
}

###############################################################################
#
# SERVICE DEFINITIONS
#
###############################################################################

# Define a service to "ping" every host

define service {

	use                 	local-service       	; Name of service template to use
	host_name           	ipa,ecl,eque,ivc,lq1,lq2
	service_description 	PING
	check_command       	check_ping!100.0,20%!500.0,60%
}


# Define a service to check HTTP on selected hosts.
# Notifications are disabled for this service.

define service {

	use                 	local-service       	; Name of service template to use
	host_name           	ipa,lq1
	service_description 	HTTP
	check_command       	check_http
	notifications_enabled   0
}

This code is pretty human-readable, so I won’t go too deep, but what I have done here is define hosts with their IPs, in this case my Eclipse cards and my LQs. This could be expanded just by copying and pasting the syntax and changing the name and IP. The Host Group definition defines categories that will display in the web GUI as you see above. The Service Definition defines which checks we want to run against our hosts, and which hosts in this file get which check. The standard use for this .cfg file is for Linux servers, so if you installed this from scratch, you’ll have way more services in here than just check_http and ping. I just got rid of all the ones I didn’t need or want. These services reference plugins in the /usr/local/nagios/libexec folder. The quickstart guide has you install the basic 65 plugins, but there are many, many more available if you are looking to do something in particular. I’m using the check_ping and check_http plugins. The main difference between the two is that check_http looks to see if it can get a response from the web server on port 80, rather than just a ping reply.
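
Since every host block has the same shape, the copy-and-paste step can also be scripted. The following is just one way to do it: make_host is a made-up helper, and the linux-server template is the one used in the file above.

```shell
#!/bin/sh
# make_host is a hypothetical helper that prints one host definition.
# Arguments: host_name, alias, address.
make_host() {
    cat <<EOF
define host {
    use                     linux-server
    host_name               $1
    alias                   $2
    address                 $3
}

EOF
}

# Example: the two LQs from the file above.
make_host lq1 "LQ 1 - SIP" 10.50.1.31
make_host lq2 "LQ2 - Gen-IC" 10.50.1.32
```

Redirect the output into a scratch file, look it over, and paste it into localhost.cfg once it looks right. Remember to add the new host names to a host group and to the service lines as well.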

Every plugin, including something simple like check_ping, has settings that can be changed by adding arguments to the check command, separated by ! characters.

You can see above that the only configuration for the plugin here is which values cause ‘warning’ and ‘critical’ messages in the web GUI: a warning at 100 ms round trip time or 20% packet loss, and a critical at 500 ms or 60% packet loss. There are many others that could be changed if I wanted to.
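
To make the argument format concrete: Nagios splits the check_command string on ! and, in the stock commands.cfg, hands the first piece to check_ping as the -w (warning) threshold and the second as -c (critical). The sketch below splits the string the same way and, if the plugin is installed at the path the quickstart uses, runs the equivalent check by hand. split_check_command is a made-up name.

```shell
#!/bin/sh
# split_check_command is a hypothetical helper: print the command name and
# each "!"-separated argument on its own line.
split_check_command() {
    printf '%s\n' "$1" | tr '!' '\n'
}

split_check_command 'check_ping!100.0,20%!500.0,60%'

# Equivalent manual run against the Eclipse CPU card from the file above.
# -w and -c take "<round trip ms>,<packet loss>%"; -p is the packet count.
PLUGIN=/usr/local/nagios/libexec/check_ping
if [ -x "$PLUGIN" ]; then
    "$PLUGIN" -H 10.50.1.11 -w 100.0,20% -c 500.0,60% -p 5
fi
```

Running a plugin by hand like this is a quick way to try different thresholds before putting them in the .cfg.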

For this demo, all I’ve done is enter the IPs of all of my devices, name them as hosts, and add them to host groups as you can see. If you take my Ubuntu image above, this should be all you need to change. Make sure to restart the Nagios service after you make changes to any .cfg so that it picks them up. You can do that from the web GUI or with this line in terminal:

sudo systemctl restart nagios.service
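
A typo in a .cfg can stop Nagios from starting, so it can be worth running its built-in config check (the -v option) before restarting. Here is a minimal sketch, assuming the install paths from the source quickstart. restart_if_valid is a made-up helper name.

```shell
#!/bin/sh
# restart_if_valid is a hypothetical helper: run a check command, and run
# the restart command only if the check succeeds.
restart_if_valid() {
    check_cmd=$1
    restart_cmd=$2
    if sh -c "$check_cmd"; then
        sh -c "$restart_cmd"
    else
        echo "Config check failed; not restarting." >&2
        return 1
    fi
}

# Assumed install paths from the source quickstart.
NAGIOS_BIN=/usr/local/nagios/bin/nagios
NAGIOS_CFG=/usr/local/nagios/etc/nagios.cfg

# Only attempted where Nagios is actually installed.
if [ -x "$NAGIOS_BIN" ]; then
    restart_if_valid "sudo $NAGIOS_BIN -v $NAGIOS_CFG" \
                     "sudo systemctl restart nagios.service"
fi
```

If the check fails, Nagios prints the file and line it did not like, and the running service is left alone.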

My choice to use localhost.cfg is more or less arbitrary. It just happens that when Nagios is installed, most of the cfgs are commented out in the nagios.cfg file, the “main” configuration (meaning they won’t display). I have allowed the localhost and switch cfgs into my setup, and you can see that here, in a portion of the nagios.cfg file:

# OBJECT CONFIGURATION FILE(S)
# These are the object configuration files in which you define hosts,
# host groups, contacts, contact groups, services, etc.
# You can split your object definitions across several config files
# if you wish (as shown below), or keep them all in a single config file.

# You can specify individual object config files as shown below:
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg

# Definitions for monitoring the local (Linux) host
cfg_file=/usr/local/nagios/etc/objects/localhost.cfg

# Definitions for monitoring a Windows machine
#cfg_file=/usr/local/nagios/etc/objects/windows.cfg

# Definitions for monitoring a router/switch
cfg_file=/usr/local/nagios/etc/objects/switch.cfg

# Definitions for monitoring a network printer
#cfg_file=/usr/local/nagios/etc/objects/printer.cfg

As you can see, anything that is commented out with a # will not appear. Besides the shared command, contact, time period, and template files, the only .cfgs in the live config are switch.cfg and localhost.cfg.
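
If you lose track of which files are live, a one-line filter can list them. This is a small sketch, assuming the nagios.cfg path from the excerpt above. active_cfg_files is a made-up helper name.

```shell
#!/bin/sh
# active_cfg_files is a hypothetical helper: print the path of every cfg_file
# line in the given nagios.cfg that is not commented out.
active_cfg_files() {
    grep -E '^[[:space:]]*cfg_file=' "$1" | cut -d= -f2
}

# Only attempted where the main config is readable.
MAIN_CFG=/usr/local/nagios/etc/nagios.cfg
if [ -r "$MAIN_CFG" ]; then
    active_cfg_files "$MAIN_CFG"
fi
```

Any file that does not show up in that list is ignored by Nagios, no matter what is in it.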

Switch Monitoring via SNMP

So far, I have shown the localhost.cfg file, and how I made the host groups ‘cc-fr’ and ‘cc-lq’. The last group you can see in the screenshot above of my finished demo is the switch category, which comes from that switch.cfg file.

As you can see, configuration of the switch monitoring side is very similar to the cfg before; the only difference is that we’ll be capturing SNMP data from the switch. Just like before, you’ll need to enter the IP of the switch, and define which services run based on the hostname defined at the top. The “check snmp” plugin is where things get interesting. Everything that produces SNMP has a tree that we’ll need values out of. The tree is described by a MIB (Management Information Base), and each variable in it has a numeric address, called an OID. The easiest way to find, in the case above, what the OID of link status is, would be to use a MIB browser. Any will work, but here is the one I used:

In order for this to work, SNMP needs to be enabled on the switch. In my case, on a Netgear M4250, I had to enable reading SNMP values from ‘0.0.0.0’, which allows any host on the network to read SNMP values from the switch.

All I’ve set up in this config is link status for every interface on my switch, and uptime. Link status (ifOperStatus in the standard interfaces MIB) is a variable that returns 1 for up and 2 for down, plus a few less common states such as testing or unknown. As you’ve probably already guessed, that long string of numbers next to check_command in the switch.cfg is the OID, which says where in the SNMP tree that variable is (one for each interface). In my case, using a Netgear M4250, 1.3.6.1.2.1.2.2.1.8 is the OID for getting to the link status “folder”, and then each link is represented by the last integer. The “-r 1” after that OID tells Nagios that the value 1, which we know means up, is the “good” value, and any other value returned will be “critical” in Nagios as such:

You can get more complicated with SNMP values, to see more, read here:

There is one service in there you may have noticed doesn’t have a -r 1 after the OID, and that is Uptime. That’s because there is no “bad value” for it where it would need to show as “critical”; I just want to know how long the switch has been up. I would encourage you to use the MIB browser to look at what information your switch can provide via SNMP. Anything in that tree can be put into your Nagios config, and just like in the configuration for the other hosts above, you can track multiple switches just by adding another host definition, and then adding the second hostname on every service line.
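
As a rough text version of what one link-status service and a manual test of it could look like, here is a sketch. Treat all of it as assumptions to adapt: the host names sw1 and sw2, the community string public, the address 10.50.1.2, and the generic-service template are placeholders, and link_status_oid and make_link_service are made-up helper names. The -H, -C, -o and -r options are standard check_snmp options.

```shell
#!/bin/sh
IF_OPER_STATUS=1.3.6.1.2.1.2.2.1.8   # link status subtree from the text above

# Hypothetical helper: OID for the link status of interface index $1.
link_status_oid() {
    printf '%s.%s\n' "$IF_OPER_STATUS" "$1"
}

# Hypothetical helper: print a service definition for one port.
# Arguments: comma-separated host names, interface index, SNMP community.
make_link_service() {
    cat <<EOF
define service {
    use                     generic-service
    host_name               $1
    service_description     Port $2 Link Status
    check_command           check_snmp!-C $3 -o $(link_status_oid "$2") -r 1
}

EOF
}

# Two switches sharing one service line, as described above.
make_link_service sw1,sw2 1 public

# Manual test of the same check (skipped if the plugin is not installed).
# 10.50.1.2 is a placeholder switch address.
PLUGIN=/usr/local/nagios/libexec/check_snmp
if [ -x "$PLUGIN" ]; then
    "$PLUGIN" -H 10.50.1.2 -C public -o "$(link_status_oid 1)" -r 1
fi
```

For an uptime service, the same pattern applies without the -r 1, since there is no value that should count as critical.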

Using the GUI

Once you’ve gotten it set up, using the GUI is pretty easy, and there are YouTube tutorials for almost every page.

https://www.youtube.com/watch?v=jUDrjgEDb2A&t=1s

My go-to page for looking at the system at a glance is the host groups page, screenshotted above. This categorizes everything based on host groups and gives you a summary of alerts on each item you have.

I would also point you to the host alert history page, which you can get to by clicking on a host and then “View Alert History for this Host”. You can see logs for as long as your Nagios VM is online, and sort by type. This can be a powerful tool in your pocket for troubleshooting intermittent network issues.

In conclusion, this is a very complex application which, particularly if you’re not familiar with Linux, can be intimidating. But take it from me, and I’m not a Linux guy at all: once you figure out how to configure the system, it will be an extremely powerful tool to troubleshoot and monitor your network, for Clear-Com gear and anything else.
