Enterprise Monitoring - Nagios

From KynetxDocs

Overview

Nagios is a host and service monitor that alerts the Kynetx IT Operations staff to potential problems. The monitoring daemon runs periodic checks on hosts and services using external "plugins", which return status information to Nagios. When problems are encountered, the daemon can send notifications to the Kynetx IT Operations staff in a variety of ways (email, instant message, SMS, etc.). Current status information, historical logs, and reports can all be accessed via a web browser. A public portal will be made available in the coming weeks so that interested parties can see the health of the Kynetx Network Services in real time.
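
To make the plugin mechanism above more concrete, here is a minimal sketch of a plugin written in Python. It relies on the standard Nagios plugin convention that a check reports its result through its exit code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN) plus a one-line status message. The disk check itself, the function names, and the 80%/90% thresholds are illustrative assumptions, not the checks actually deployed at Kynetx.

```python
import shutil

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(percent_used, warn=80.0, crit=90.0):
    """Map a disk usage percentage to an (exit_code, status_line) pair.

    The warn/crit thresholds are hypothetical defaults for this sketch.
    """
    if percent_used >= crit:
        code = CRITICAL
    elif percent_used >= warn:
        code = WARNING
    else:
        code = OK
    return code, f"DISK {LABELS[code]} - {percent_used:.1f}% used"


def check_disk(path="/", warn=80.0, crit=90.0):
    """Measure usage of `path`; a failure to measure is reported as UNKNOWN."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    return evaluate(100.0 * usage.used / usage.total, warn, crit)


def main():
    code, line = check_disk("/")
    print(line)   # the status text the daemon records for this check
    return code   # a real plugin would finish with sys.exit(main())
```

Keeping the threshold logic in `evaluate` separate from the measurement makes the check easy to test without touching a real filesystem.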

Features

Nagios has a lot of features, making it a very powerful monitoring tool. Some of the major features are listed below:

   * Monitoring of network services (SMTP, POP3, HTTP, NNTP, PING, etc.)
   * Monitoring of host resources (processor load, disk and memory usage, running processes, log files, etc.)
   * Monitoring of environmental factors such as temperature
   * Simple plugin design that allows users to easily develop their own host and service checks
   * Ability to define network host hierarchy, allowing detection of and distinction between hosts that are down and those that are unreachable
   * Contact notifications when service or host problems occur and get resolved (via email, pager, or other user-defined method)
   * Optional escalation of host and service notifications to different contact groups
   * Ability to define event handlers to be run during service or host events for proactive problem resolution
   * Support for implementing redundant and distributed monitoring servers
   * Retention of host and service status across program restarts
   * Scheduled downtime for suppressing host and service notifications during periods of planned outages
   * Ability to acknowledge problems via the web interface
   * Web interface for viewing current network status, notification and problem history, log file, etc.
   * Simple authorization scheme that allows you to restrict what users can see and do from the web interface
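
The distinction between "down" and "unreachable" hosts in the list above can be illustrated with a short Python sketch. It is a simplified model of the idea, not Nagios's actual implementation: a host that fails its check is treated as DOWN if it has no parents or at least one parent is UP, and as UNREACHABLE if every parent is itself failing. The host names (`router1`, `switch1`, `web1`) and the `classify` function are hypothetical, and the hierarchy is assumed to contain no cycles.

```python
def classify(host, parents, responds):
    """Return 'UP', 'DOWN', or 'UNREACHABLE' for `host`.

    parents  -- dict mapping a host to the list of its parent hosts
    responds -- dict mapping a host to True if its own check succeeded
    """
    if responds[host]:
        return "UP"
    host_parents = parents.get(host, [])
    if not host_parents:
        # Nothing sits between the monitor and this host, so it is down.
        return "DOWN"
    if any(classify(p, parents, responds) == "UP" for p in host_parents):
        # A working path exists up to the host, so the host itself failed.
        return "DOWN"
    # Every route to the host is broken; its own state cannot be determined.
    return "UNREACHABLE"


# Hypothetical topology: monitor -> router1 -> switch1 -> web1
parents = {"switch1": ["router1"], "web1": ["switch1"]}
responds = {"router1": True, "switch1": False, "web1": False}

states = {h: classify(h, parents, responds) for h in responds}
```

In this example only `switch1` is reported as DOWN, while `web1` behind it is UNREACHABLE, which points operations staff at the root cause instead of at every host behind the failed device.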

Screenshot

[Image: Nagios screenshot]

Monitored Items

Item                        Monitoring Period  Alert Type and Target
--------------------------  -----------------  ------------------------------------------------------------
"Is Alive?" Ping            24x7               Email and Pager - IT Operations Team
Load Balancers              24x7               Email and Pager - IT Operations Team
KNS Rules Server Process    24x7               Email and Pager - IT Operations Team
HTTP Web Servers            24x7               Email and Pager - IT Operations Team
HTTPS Web Servers           24x7               Email and Pager - IT Operations Team
MySQL Servers               24x7               Email and Pager - IT Operations Team
SSH Process                 24x7               Email and Pager - IT Operations Team
DNS Servers                 24x7               Email and Pager - IT Operations Team
SMTP Servers                24x7               Email and Pager - IT Operations Team
IMAP Servers                24x7               Email and Pager - IT Operations Team
POP3 Servers                24x7               Email and Pager - IT Operations Team
ETL Run Times               24x7               Email and Pager - IT Operations Team / Data Warehouse Team
ETL Process                 24x7               Email and Pager - IT Operations Team / Data Warehouse Team