Health Modules
Health modules, or health tests, test for the criteria you specify in a health policy.
There are two types of health module: alert and metrics. Alert modules (sometimes called legacy modules) monitor system infrastructure and report health status only. When the conditions specified in the health policy for these monitored systems are met, these modules raise health alerts. Metrics modules (sometimes called telegraf modules) collect statistics (sometimes called time series data) that you can view on the health monitoring dashboard. You can create custom dashboards with your preferred health metrics, allowing you to monitor statistics or troubleshoot appliance health issues.
Module |
Type |
Description |
||
---|---|---|---|---|
AMP Connection Status |
Metrics |
The module alerts if the device cannot connect to the AMP cloud or Cisco AMP Private Cloud after an initial successful connection, or if the private cloud cannot contact the public AMP cloud. Disabled by default. |
||
AMP Threat Grid Connectivity |
Metrics |
The module alerts if the device cannot connect to the AMP Threat Grid cloud after an initial successful connection. |
||
ASP Drop |
Metrics |
Monitors the connections dropped by the data plane accelerated security path. |
||
Automatic Application Bypass |
Alert |
Monitors bypassed detection applications. |
||
Chassis Environment Status |
Alert |
Monitors chassis parameters such as fan speed and chassis temperature, and enables you to set a warning threshold and critical threshold for temperature. The Critical Chassis Temperature (Celsius) default value is |
||
Cluster/HA Failover Status |
Alert |
For threat defense clusters, alerts when a unit joins, leaves, or is elected primary. |
||
Configuration Resource Utilization |
Alert |
Alerts if the size of your deployed configurations puts a device at risk of running out of memory. The alert shows you how much memory your configurations require, and by how much this exceeds the available memory. If this happens, reevaluate your configurations. You may be able to reduce the number or complexity of access control rules or intrusion policies. |
||
Connection Statistics |
Metrics |
Monitors connection statistics and NAT translation counts. |
||
CPU Usage (per core) |
Metrics |
Alerts when CPU core use exceeds a configurable threshold. |
||
Critical Process Statistics |
Metrics |
Monitors the state of critical processes, their resource consumption, and the restart counts. |
||
CPU Usage Data Plane |
Metrics |
Alerts when data plane CPU use exceeds a configurable threshold. |
||
Memory Usage Data Plane |
Metrics |
Alerts when data plane memory use exceeds a configurable threshold. |
||
Deployed Configuration Statistics |
Metrics |
Monitors statistics about the deployed configuration, such as the number of ACEs and IPS rules. |
||
Disk Status |
Alert |
Alerts if there is an issue with the hard disk or RAID controller. If this module alerts, contact Cisco TAC. This condition prevents upgrade. |
||
Disk Usage |
Metrics |
This module compares disk usage on the appliance’s hard drive to the limits configured for the module and alerts when usage exceeds the thresholds configured for the module. This module also alerts when the system excessively deletes files in monitored disk usage categories, or when disk usage excluding those categories reaches excessive levels, based on module thresholds. Use the Disk Usage health status module to monitor disk usage for the |
||
File System Integrity Check |
Alert |
This module performs a file system integrity check and runs if the system has CC mode or UCAPL mode enabled, or if the system runs an image signed with a DEV key. |
||
Firewall Threat Defense HA |
Alert |
Alerts if a threat defense high availability pair is split brain. |
||
Firewall Threat Defense Platform Faults |
Alert |
Monitors Secure Firewall 1000/3100/4200 platform faults and generates health alerts for the faults. A platform fault represents a failure in the threat defense instance or an alarm threshold that has been raised. During its lifecycle, a platform fault can change from one state or severity to another. Each fault includes information about the operational state of the affected object at the time the fault was raised. If the fault is transitional and the failure is resolved, then the object transitions to a functional state. For more information, see the Cisco Firepower 1000/2100 FXOS Faults and Error Messages Guide. |
||
Flow Offload Statistics |
Metrics |
Monitors hardware flow offload. |
||
Hardware Alarms |
Alert |
This module determines if hardware needs to be replaced on a physical managed device and alerts based on the hardware status. It also reports on the status of hardware-related daemons. |
||
Inline Link Mismatch Alarms |
Alert |
Alerts if inline pair interfaces negotiate different speeds. |
||
Interface Status |
Alert |
Determines if the device currently collects traffic and alerts based on the traffic status of physical interfaces and aggregate interfaces. For physical interfaces, the information includes interface name, link state, and bandwidth. For aggregate interfaces, the information includes interface name, number of active links, and total aggregate bandwidth. |
||
Intrusion and File Event Rate |
Alert |
Alerts if intrusion events per second exceed a configurable threshold. We recommend a warning threshold of 1.5 times your average intrusion event rate, and a critical threshold of 2.5 times. For example, for an average event rate on a network segment of 20 events per second, we recommend a warning value of 30 and a critical value of 50. The critical limit must be lower than 1000 and higher than the warning limit. Event rates for your devices are available on System ( |
||
Link State Propagation |
Alert |
For the ISA 3000, alerts when an interface in an inline set fails. |
||
Memory Usage |
Alert |
Alerts when memory use exceeds configurable thresholds. For appliances with more than 4 GB of memory, the preset alert thresholds are based on a formula that accounts for proportions of available memory likely to cause system problems. On >4 GB appliances, because the interval between Warning and Critical thresholds may be very narrow, it is recommended that you manually set the Warning Threshold % value to Complex access control policies and rules can command significant resources and negatively affect performance. |
||
Network Card Reset |
Alert |
Alerts when a network card restarts due to hardware failure. |
||
NTP Statistics |
Metrics |
Monitors NTP synchronization status. Disabled by default. |
||
Firewall Management Center Access Configuration Changes |
Alert |
Monitors configuration changes made on the management center directly using the configure network management-data-interface command. This module alerts when there is a conflict between the existing management center configuration and the out-of-band configuration changes made. |
||
Power Supply |
Alert |
Monitors the management center power supply units and alerts if any fault is detected or if it requires replacement. |
||
Process Status |
Alert |
Alerts when processes on the appliance exit or terminate outside of the process manager. If a process is deliberately exited outside of the process manager, the module status changes to Warning and the health event message indicates which process exited, until the module runs again and the process has restarted. If a process terminates abnormally or crashes outside of the process manager, the module status changes to Critical and the health event message indicates the terminated process, until the module runs again and the process has restarted. |
||
Routing Statistics |
Metrics |
Monitors the current state of the routing table. |
||
Snort 3 Statistics |
Metrics |
Collects Snort 3 statistics for events, flows, and packets. |
||
CPU Usage Snort |
Metrics |
This module checks that the average CPU usage of the Snort processes on the device is not overloaded and alerts when CPU usage exceeds the percentages configured for the module. The Warning Threshold % default value is |
||
Snort Identity Memory Usage |
Alert |
Enables you to set a warning threshold for Snort identity processing and alerts when memory usage exceeds the level configured for the module. The Critical Threshold % default value is This health module specifically keeps track of the total space used for the user identity information in Snort. It displays the current memory usage details, the total number of user-to-IP bindings, and user-group mapping details. Snort records these details in a file. If the memory usage file is not available, the Health Alert for this module displays Waiting for data. This could happen during a Snort restart due to a new install or a major update, switch from Snort 2 to Snort 3 or back, or major policy deployment. Depending on the health monitoring cycle, and when the file is available, the warning disappears, and the health monitor displays the details for this module with its status turned Green. |
||
Memory Usage Snort |
Metrics |
This module checks the percentage of allocated memory used by the Snort process and alerts when memory usage exceeds the percentages configured for the module. The Warning Threshold % default value is |
||
Snort Reconfiguring Detection |
Metrics |
Alerts if a device reconfiguration has failed. This module detects reconfiguration failure for both Snort 2 and Snort 3 instances. |
||
Snort Statistics |
Metrics |
Monitors Snort statistics for events, flows, and packets. |
||
SSE Connection Status |
Metrics |
The module alerts if the device cannot connect to the security services exchange cloud after an initial successful connection. Disabled by default. |
||
CPU Usage System |
Metrics |
This module checks that the average CPU usage of all system processes on the device is not overloaded and alerts when CPU usage exceeds the percentages configured for the module. The Warning Threshold % default value is |
||
Threat Data Updates on Devices |
Alert |
Certain intelligence data and configurations that devices use to detect threats are updated on the management center from the cloud every 30 minutes. This module alerts you if this information has not been updated on the devices within the time period you have specified. Monitored updates include:
By default, this module sends a warning after 1 hour and a critical alert after 24 hours. If this module indicates failure on the management center or on any devices, verify that the management center can reach the devices. |
||
VPN Statistics |
Metrics |
Monitors site-to-site and remote access VPN tunnels between threat defense devices. |
||
XTLS Counters |
Metrics |
Monitors XTLS/SSL flows, memory and cache effectiveness. Disabled by default. |
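
The threshold guidance in the Intrusion and File Event Rate row above can be sketched in a few lines of Python. This is only an illustration of the arithmetic (1.5 times and 2.5 times the average rate, with the critical limit below 1000 and above the warning limit). The function name `recommended_event_rate_thresholds` is hypothetical and is not part of any management center API, and rounding to whole events per second is an assumption.

```python
from __future__ import annotations


def recommended_event_rate_thresholds(average_eps: float) -> tuple[int, int]:
    """Return (warning, critical) thresholds in events per second.

    Hypothetical helper: applies the recommended multipliers of 1.5 and 2.5
    to an average intrusion event rate. Rounding to whole numbers is an
    assumption made for this sketch.
    """
    if average_eps <= 0:
        raise ValueError("average event rate must be positive")

    warning = round(average_eps * 1.5)
    critical = round(average_eps * 2.5)

    # The critical limit must be lower than 1000 and higher than the warning limit.
    if critical >= 1000:
        raise ValueError(f"critical value {critical} must be lower than 1000")
    if critical <= warning:
        raise ValueError("critical value must be higher than the warning value")

    return warning, critical


if __name__ == "__main__":
    warning, critical = recommended_event_rate_thresholds(20)
    print(f"warning={warning} critical={critical}")
```

For an average of 20 events per second, the sketch yields the same warning value of 30 and critical value of 50 given in the table. An average high enough to push the critical value to 1000 or more raises an error rather than returning an invalid pair.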
Module |
Type |
Description |
||
---|---|---|---|---|
AMP for Endpoints Status |
Alert |
The module alerts if the management center cannot connect to the AMP cloud or Cisco AMP Private Cloud after an initial successful connection, or if the private cloud cannot contact the public AMP cloud. It also alerts if you deregister an AMP cloud connection using the Secure Endpoint management console. |
||
AMP for Firepower Status |
Alert |
Alerts if:
If your management center loses connectivity to the Internet, the system may take up to 30 minutes to generate a health alert. |
||
Appliance Heartbeat |
Alert |
This module determines if an appliance heartbeat is being heard from the appliance and alerts based on the appliance heartbeat status. |
||
CPU Usage (per core) |
Metrics |
This module checks that the CPU usage on all the cores is not overloaded and alerts when CPU usage exceeds the thresholds configured for the module. The Warning Threshold % default value is |
||
Critical Process Statistics |
Metrics |
Monitors the state of critical processes, their resource consumption, and the restart counts. |
||
Dynamic Attributes Connector |
||||
Database |
Alert |
Alerts if the configuration database size is too large. It also monitors the system for database schema or configuration data (sometimes called EO) integrity issues. If this module alerts, contact Cisco TAC. This condition prevents upgrade. |
||
Discovery Host Limit |
Alert |
This module determines if the number of hosts the management center can monitor is approaching the limit and alerts based on the warning level configured for the module. For more information, see Host Limit. |
||
Disk Status |
Alert |
This module examines the performance of the hard disk and malware storage pack (if installed) on the appliance. This module generates a Warning (yellow) health alert when the hard disk and RAID controller (if installed) are in danger of failing, or if an additional hard drive is installed that is not a malware storage pack. This module generates an Alert (red) health alert when an installed malware storage pack cannot be detected. |
||
Disk Usage |
Metrics |
This module compares disk usage on the appliance’s hard drive and malware storage pack to the limits configured for the module and alerts when usage exceeds the thresholds configured for the module. This module also alerts when the system excessively deletes files in monitored disk usage categories, or when disk usage excluding those categories reaches excessive levels, based on module thresholds. Use the Disk Usage health status module to monitor disk usage for the |
||
eStream Status |
Alert |
Monitors connections to third-party client applications that use the Event Streamer on the management center. |
||
Event Backlog Status |
Alert |
Alerts if the backlog of event data awaiting transmission from the device to the management center has grown continuously for more than 30 minutes. To reduce the backlog, evaluate your bandwidth and consider logging fewer events. |
||
Event Monitor |
Metrics |
This module monitors the overall incoming event rate to the management center. |
||
File System Integrity Check |
Alert |
This module performs a file system integrity check and runs if the system has CC mode or UCAPL mode enabled, or if the system runs an image signed with a DEV key. This module is enabled by default. |
||
Firewall Management Center HA Status |
Alert |
Monitors management center high availability. This module generates alerts if the HA pairs are not synchronized and if there is a discrepancy in the number of managed devices between the active and standby units. |
||
Hardware Statistics |
Metrics |
Monitors management center hardware: fan speed, temperature, and power supply. Alerts when values exceed configurable thresholds. |
||
Health Monitor Process |
Alert |
Monitors the health process itself, and alerts if there have been no health events within a configurable number of minutes. |
||
ISE Connection Monitor |
Alert |
This module monitors the status of the server connections between the Cisco Identity Services Engine (ISE) and the management center. ISE provides additional user data, device type data, device location data, SGTs (Security Group Tags), and SXP (SGT Exchange Protocol) services. |
||
License Monitor |
Alert |
This module monitors expiration of Classic licenses. |
||
Local Malware Analysis |
Alert |
This module monitors ClamAV updates for Local Malware Analysis. |
||
Memory Usage |
Alert |
This module compares memory usage on the appliance to the limits configured for the module and alerts when usage exceeds the levels configured for the module. When calculating the memory usage, the management center Memory Usage health module monitors and includes the usage of RAM, swap memory, and cache memory. For appliances with more than 4 GB of memory, the preset alert thresholds are based on a formula that accounts for proportions of available memory likely to cause system problems. On >4 GB appliances, because the interval between Warning and Critical thresholds may be very narrow, it is recommended that you manually set the Warning Threshold % value to Beginning with Version 6.6.0, the minimum required RAM for management center virtual upgrades to Version 6.6.0+ is 28 GB, and the recommended RAM for management center virtual deployments is 32 GB. We recommend you do not decrease the default settings: 32 GB RAM for most management center virtual instances, 64 GB for the management center virtual 300 (VMware only).
Complex access control policies and rules can command significant resources and negatively affect performance. |
||
MySQL Statistics |
Metrics |
Monitors the status of the MySQL database, including the database size, number of active connections, and memory use. |
||
Process Status |
Alert |
Alerts when processes on the appliance exit or terminate outside of the process manager. If a process is deliberately exited outside of the process manager, the module status changes to Warning and the health event message indicates which process exited, until the module runs again and the process has restarted. If a process terminates abnormally or crashes outside of the process manager, the module status changes to Critical and the health event message indicates the terminated process, until the module runs again and the process has restarted. |
||
RabbitMQ Status |
Metrics |
Monitors and collects RabbitMQ statistics. |
||
Realm |
Alert |
Allows you to set a warning threshold for realm or user mismatches, which are:
For more information, see . This module also displays health alerts when you try to download more users than the maximum number of downloaded users supported per realm. The maximum number of downloaded users for a single realm depends on your management center model. For more information, see User Limit in the Cisco Secure Firewall Management Center Device Configuration Guide |
||
RRD Server Process |
Alert |
Alerts if the round robin data (RRD) server that stores time series data has restarted since the last time it updated. You can configure additional warning and critical thresholds for consecutive restarts. |
||
Security Intelligence |
Alert |
Alerts if Security Intelligence is in use and the management center cannot update a feed, or feed data is corrupt or contains no recognizable IP addresses. See also the Threat Data Updates on Devices module. |
||
Smart License Monitor |
Alert |
Monitors Smart Licensing status and alerts if:
|
||
Threat Data Updates on Devices |
Alert |
Certain intelligence data and configurations that devices use to detect threats are updated on the management center from the cloud every 30 minutes. This module alerts you if this information has not been updated on the devices within the time period you have specified. Monitored updates include:
By default, this module sends a warning after 1 hour and a critical alert after 24 hours. If this module indicates failure on the management center or on any devices, verify that the management center can reach the devices. |
||
Time Series Data (RRD) Monitor |
Alert |
This module tracks the presence of corrupt files in the directory where time series data (such as correlation event counts) are stored and alerts when files are flagged as corrupt and removed. |
||
Time Synchronization Status |
Alert |
This module tracks the synchronization of a device clock that obtains time using NTP with the clock on the NTP server and alerts if the difference in the clocks is more than ten seconds. |
||
Unresolved Groups Monitor |
Alert |
Monitors Foreign Security Principals (FSPs) that are groups used in policies. Security principals are Active Directory objects, like authenticated user groups, to which security can be applied in access control policies. This module generates a warning alert for unresolved groups that exist but are not used in policies, and a critical alert for unresolved groups that are used in policies. |
||
URL Filtering Monitor |
Alert |
Monitors connectivity with the Cisco cloud, which is required for downloading URL filtering data and doing URL filtering lookups. |
||
VPN Tunnel Status |
Alert |
Alerts when VPN tunnels are down. Supported for both remote access and site-to-site VPN. |
||
Zero-Touch Provisioning |
Alert |
Alerts if there is a failure when registering a device using the serial number. It also shows errors related to zero-touch provisioning capable management centers in high availability. |
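
The default timing described in the Threat Data Updates on Devices row above (a warning after 1 hour, a critical alert after 24 hours) can be pictured as a simple age check. The following Python sketch is an illustration only: the function name, the status strings, and the choice to treat "after" as strictly greater than the limit are all assumptions, and the real module may evaluate update age differently.

```python
from datetime import timedelta

# Defaults taken from the module description; both are configurable in practice.
WARNING_AFTER = timedelta(hours=1)
CRITICAL_AFTER = timedelta(hours=24)


def threat_data_update_status(
    age: timedelta,
    warning_after: timedelta = WARNING_AFTER,
    critical_after: timedelta = CRITICAL_AFTER,
) -> str:
    """Classify how stale a device's threat data is (hypothetical helper).

    `age` is the time elapsed since the device last received an update.
    The comparison is strict, which is an assumption of this sketch.
    """
    if age > critical_after:
        return "critical"
    if age > warning_after:
        return "warning"
    return "normal"


if __name__ == "__main__":
    for hours in (0.5, 2, 25):
        print(hours, threat_data_update_status(timedelta(hours=hours)))
```

Passing custom `warning_after` and `critical_after` values mirrors the fact that the time period is something you specify in the health policy.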