Enterprise Monitoring - Nagios
Overview
Nagios is a host and service monitor used to alert the Kynetx IT Operations staff to potential problems. The monitoring daemon runs intermittent checks on hosts and services using external "plugins", which return status information to Nagios. When problems are encountered, the daemon can send notifications to the Kynetx IT Operations staff in a variety of ways (email, instant message, SMS, etc.). Current status information, historical logs, and reports can all be accessed via a web browser. A public portal will be made available in the coming weeks so that interested parties can see the health of the Kynetx Network Services in real time.
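To make the plugin contract concrete, here is a minimal sketch of a check written in Python. The exit codes (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN) and the single status line on standard output follow the standard Nagios plugin convention. Everything else is an assumption for illustration only: the function names, the "EXAMPLE" label, and the "higher is worse" threshold logic are hypothetical and are not taken from the Kynetx configuration.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Map a measured value to a status code, assuming higher values are worse."""
    if value is None:
        return UNKNOWN  # no measurement could be taken
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def format_output(label, value, status):
    """Build the single status line that Nagios reads from standard output."""
    return f"{label} {STATUS_NAMES[status]} - value={value}"


def main(argv):
    """Hypothetical command line: check_example.py <value> <warn> <crit>."""
    try:
        value, warn, crit = (float(arg) for arg in argv[1:4])
    except ValueError:
        # Raised for missing arguments as well as non-numeric ones.
        print("EXAMPLE UNKNOWN - usage: check_example.py <value> <warn> <crit>")
        return UNKNOWN
    status = evaluate(value, warn, crit)
    print(format_output("EXAMPLE", value, status))
    return status
```

A real plugin would end with `sys.exit(main(sys.argv))` so that the status code becomes the process exit code, which is what the daemon inspects after each check.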
Features
Nagios has a broad feature set, which makes it a powerful monitoring tool. Some of the major features are listed below:
* Monitoring of network services (SMTP, POP3, HTTP, NNTP, PING, etc.)
* Monitoring of host resources (processor load, disk and memory usage, running processes, log files, etc.)
* Monitoring of environmental factors such as temperature
* Simple plugin design that allows users to easily develop their own host and service checks
* Ability to define network host hierarchy, allowing detection of and distinction between hosts that are down and those that are unreachable
* Contact notifications when service or host problems occur and get resolved (via email, pager, or other user-defined method)
* Optional escalation of host and service notifications to different contact groups
* Ability to define event handlers to be run during service or host events for proactive problem resolution
* Support for implementing redundant and distributed monitoring servers
* Retention of host and service status across program restarts
* Scheduled downtime for suppressing host and service notifications during periods of planned outages
* Ability to acknowledge problems via the web interface
* Web interface for viewing current network status, notification and problem history, log file, etc.
* Simple authorization scheme that allows you to restrict what users can see and do from the web interface
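The distinction between "down" and "unreachable" hosts in the list above can be sketched as follows. This is a simplified illustration of the rule, not the Nagios implementation: a host whose check fails is treated as DOWN if it has no parents or if at least one parent is UP, and as UNREACHABLE if every parent is itself DOWN or UNREACHABLE. The host names (`router`, `lb1`, `web1`) are hypothetical, and the sketch assumes the parent hierarchy contains no cycles.

```python
def classify(host, parents, failed):
    """Return "UP", "DOWN", or "UNREACHABLE" for one host.

    parents: dict mapping a host name to the list of its parent host names
    failed:  set of host names whose own check failed
    """
    if host not in failed:
        return "UP"
    host_parents = parents.get(host, [])
    # A failing host with no parents sits next to the monitoring server,
    # so nothing upstream can explain the failure.
    if not host_parents:
        return "DOWN"
    # If any parent is UP, the path to the host works, so the host itself is DOWN.
    if any(classify(p, parents, failed) == "UP" for p in host_parents):
        return "DOWN"
    # Every parent is DOWN or UNREACHABLE, so the host cannot be reached.
    return "UNREACHABLE"


# Hypothetical hierarchy: monitoring server -> router -> lb1 -> web1
parents = {"router": [], "lb1": ["router"], "web1": ["lb1"]}

print(classify("web1", parents, {"web1"}))         # DOWN
print(classify("web1", parents, {"lb1", "web1"}))  # UNREACHABLE
```

In the second call the load balancer is the real fault, so a notification for `web1` would only add noise; marking it UNREACHABLE rather than DOWN is what lets the alert focus on the root cause.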
Monitored Items
| Item | Monitoring Period | Alert Type and Target |
|---|---|---|
| "Is Alive?" Ping | 24x7 | Email and Pager - IT Operations Team |
| Load Balancers | 24x7 | Email and Pager - IT Operations Team |
| KNS Rules Server Process | 24x7 | Email and Pager - IT Operations Team |
| HTTP Web Servers | 24x7 | Email and Pager - IT Operations Team |
| HTTPS Web Servers | 24x7 | Email and Pager - IT Operations Team |
| MySQL Servers | 24x7 | Email and Pager - IT Operations Team |
| SSH Process | 24x7 | Email and Pager - IT Operations Team |
| DNS Servers | 24x7 | Email and Pager - IT Operations Team |
| SMTP Servers | 24x7 | Email and Pager - IT Operations Team |
| IMAP Servers | 24x7 | Email and Pager - IT Operations Team |
| POP3 Servers | 24x7 | Email and Pager - IT Operations Team |
| ETL Run Times | 24x7 | Email and Pager - IT Operations Team / Data Warehouse Team |
| ETL Process | 24x7 | Email and Pager - IT Operations Team / Data Warehouse Team |
