Health Modules

Health modules, or health tests, test for the criteria you specify in a health policy.

There are two types of health module: alert and metrics. Alert modules (sometimes called legacy modules) monitor system infrastructure and report health status only. When the conditions specified in the health policy for these monitored systems are met, these modules raise health alerts. Metrics modules (sometimes called telegraf modules) collect statistics (sometimes called time series data) that you can view on the health monitoring dashboard. You can create custom dashboards with your preferred health metrics, allowing you to monitor statistics or troubleshoot appliance health issues.

Device Health Modules

Module

Type

Description

AMP Connection Status

Metrics

The module alerts if the device cannot connect to the AMP cloud or Cisco AMP Private Cloud after an initial successful connection, or if the private cloud cannot contact the public AMP cloud. Disabled by default.

AMP Threat Grid Connectivity

Metrics

The module alerts if the device cannot connect to the AMP Threat Grid cloud after an initial successful connection.

ASP Drop

Metrics

Monitors the connections dropped by the data plane accelerated security path.

Automatic Application Bypass

Alert

Monitors bypassed detection applications.

Chassis Environment Status

Alert

Monitors chassis parameters such as fan speed and chassis temperature, and enables you to set a warning threshold and critical threshold for temperature. The Critical Chassis Temperature (Celsius) default value is 85. The Warning Chassis Temperature (Celsius) default value is 75.
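As a rough illustration of how the two temperature thresholds relate, the following Python sketch classifies a single reading against the default values given above. The function name, the status strings, and the use of an inclusive comparison are assumptions made for this example; the source does not state whether a reading equal to a threshold triggers the alert.

```python
def chassis_temperature_status(celsius, warning=75, critical=85):
    """Classify a chassis temperature reading (hypothetical helper).

    The defaults mirror the documented Warning (75) and Critical (85)
    Chassis Temperature values. Treating a reading equal to a threshold
    as exceeding it is an assumption of this sketch.
    """
    if warning >= critical:
        raise ValueError("warning threshold must be below critical threshold")
    if celsius >= critical:
        return "critical"
    if celsius >= warning:
        return "warning"
    return "normal"
```

For example, under these assumptions a reading of 80 degrees Celsius falls between the two defaults and would be classified as a warning.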

Cluster/HA Failover Status

Alert

For threat defense clusters, alerts when a unit joins, leaves, or is elected primary.

Configuration Resource Utilization

Alert

Alerts if the size of your deployed configurations puts a device at risk of running out of memory.

The alert shows you how much memory your configurations require, and by how much this exceeds the available memory. If this happens, reevaluate your configurations. You may be able to reduce the number or complexity of access control rules or intrusion policies.

Connection Statistics

Metrics

Monitors connection statistics and NAT translation counts.

CPU Usage (per core)

Metrics

Alerts when CPU core use exceeds a configurable threshold.

Critical Process Statistics

Metrics

Monitors the state of critical processes, their resource consumption, and the restart counts.

CPU Usage Data Plane

Metrics

Alerts when data plane CPU use exceeds a configurable threshold.

Memory Usage Data Plane

Metrics

Alerts when data plane memory use exceeds a configurable threshold.

Deployed Configuration Statistics

Metrics

Monitors statistics about the deployed configuration, such as the number of ACEs and IPS rules.

Disk Status

Alert

Alerts if there is an issue with the hard disk or RAID controller. If this module alerts, contact Cisco TAC. This will prevent upgrade.

Disk Usage

Metrics

This module compares disk usage on the appliance’s hard drive to the limits configured for the module and alerts when usage exceeds the thresholds configured for the module. This module also alerts when the system excessively deletes files in monitored disk usage categories, or when disk usage excluding those categories reaches excessive levels, based on module thresholds.

Use the Disk Usage health status module to monitor disk usage for the / and /volume partitions on the appliance and track draining frequency. Although the disk usage module lists the /boot partition as a monitored partition, the size of the partition is static so the module does not alert on the boot partition.

File System Integrity Check

Alert

This module performs a file system integrity check and runs if the system has CC mode or UCAPL mode enabled, or if the system runs an image signed with a DEV key.

Firewall Threat Defense HA

Alert

Alerts if a threat defense high availability pair is split brain.

Firewall Threat Defense Platform Faults

Alert

Monitors Secure Firewall 1000/3100/4200 platform faults and generates health alerts for the faults.

A platform fault represents a failure in the threat defense instance or an alarm threshold that has been raised. During the lifecycle of a platform fault, it can change from one state or severity to another. Each fault includes information about the operational state of the affected object at the time the fault was raised. If the fault is transitional and the failure is resolved, then the object transitions to a functional state. For more information, see the Cisco Firepower 1000/2100 FXOS Faults and Error Messages Guide.

Flow Offload Statistics

Metrics

Monitors hardware flow offload.

Hardware Alarms

Alert

This module determines if hardware needs to be replaced on a physical managed device and alerts based on the hardware status. It also reports on the status of hardware-related daemons.

Inline Link Mismatch Alarms

Alert

Alerts if inline pair interfaces negotiate different speeds.

Interface Status

Alert

Determines if the device currently collects traffic and alerts based on the traffic status of physical interfaces and aggregate interfaces. For physical interfaces, the information includes interface name, link state, and bandwidth. For aggregate interfaces, the information includes interface name, number of active links, and total aggregate bandwidth.

Note

This module also monitors traffic flow on the high availability standby device. Although a standby device is not expected to receive traffic, the management center still alerts that the interface is not receiving any traffic. The same alerting principle applies when some of the subinterfaces on a port channel do not receive traffic.

If you use the show interface CLI command to view the interface statistics of your device, the input and output rates in the command output can differ from the traffic rates that appear in the interface module.

This module displays the traffic rates according to the values from Lina. Because Lina and the management center interface statistics use different sampling intervals, throughput values in the management center GUI can differ from the throughput values that appear in the device CLI output.

Intrusion and File Event Rate

Alert

Alerts if intrusion events per second exceed a configurable threshold.

We recommend a warning threshold of 1.5 times your average intrusion event rate, and a critical threshold of 2.5 times that rate. For example, for an average event rate on a network segment of 20 events per second, we recommend a warning value of 30 and a critical value of 50. The critical limit must be lower than 1000 and higher than the warning limit.
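The recommendation above is simple arithmetic, sketched here in Python. The function name is hypothetical and the code is not part of the product; it only applies the 1.5x and 2.5x multipliers and checks the stated constraint on the critical limit.

```python
def recommended_event_rate_thresholds(average_eps):
    """Return (warning, critical) thresholds for an average event rate.

    Applies the documented multipliers: 1.5 times the average for the
    warning value and 2.5 times the average for the critical value.
    """
    warning = 1.5 * average_eps
    critical = 2.5 * average_eps
    # The critical limit must be higher than the warning limit
    # and lower than 1000.
    if not (warning < critical < 1000):
        raise ValueError("computed thresholds violate the module limits")
    return warning, critical


print(recommended_event_rate_thresholds(20))  # (30.0, 50.0)
```

An average rate of 400 events per second or more produces a critical value of 1000 or higher, so the sketch rejects it rather than returning a value the module would not accept.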

Event rates for your devices are available on System (system gear icon) > Monitoring > Statistics. If the rate is zero, the Snort process may be down or the device may not be sending events.

Link State Propagation

Alert

For the ISA 3000, alerts when an interface in an inline set fails.

Memory Usage

Alert

Alerts when memory use exceeds configurable thresholds.

For appliances with more than 4 GB of memory, the preset alert thresholds are based on a formula that accounts for proportions of available memory likely to cause system problems. On >4 GB appliances, because the interval between Warning and Critical thresholds may be very narrow, we recommend that you manually set the Warning Threshold % value to 50. This helps ensure that you receive memory alerts for your appliance in time to address the issue.

Complex access control policies and rules can command significant resources and negatively affect performance.

Network Card Reset

Alert

Alerts when a network card restarts due to hardware failure.

NTP Statistics

Metrics

Monitors NTP synchronization status. Disabled by default.

Firewall Management Center Access Configuration Changes

Alert

Monitors configuration changes made on the management center directly using the configure network management-data-interface command. This module alerts when there is a conflict between the existing management center configuration and the out-of-band configuration changes.

Power Supply

Alert

Monitors the management center power supply units and alerts if any fault is detected or if it requires replacement.

Process Status

Alert

Alerts when processes on the appliance exit or terminate outside of the process manager.

If a process is deliberately exited outside of the process manager, the module status changes to Warning and the health event message indicates which process exited, until the module runs again and the process has restarted. If a process terminates abnormally or crashes outside of the process manager, the module status changes to Critical and the health event message indicates the terminated process, until the module runs again and the process has restarted.

Routing Statistics

Metrics

Monitors the current state of the routing table.

Snort 3 Statistics

Metrics

Collects Snort 3 statistics for events, flows, and packets.

CPU Usage Snort

Metrics

This module checks that the average CPU usage of the Snort processes on the device is not overloaded and alerts when CPU usage exceeds the percentages configured for the module. The Warning Threshold % default value is 80. The Critical Threshold % default value is 90.

Snort Identity Memory Usage

Alert

Enables you to set a warning threshold for Snort identity processing and alerts when memory usage exceeds the level configured for the module. The Critical Threshold % default value is 80.

This health module specifically keeps track of the total space used for the user identity information in Snort. It displays the current memory usage details, the total number of user-to-IP bindings, and user-group mapping details. Snort records these details in a file. If the memory usage file is not available, the Health Alert for this module displays Waiting for data. This could happen during a Snort restart due to a new install or a major update, switch from Snort 2 to Snort 3 or back, or major policy deployment. Depending on the health monitoring cycle, and when the file is available, the warning disappears, and the health monitor displays the details for this module with its status turned Green.

Memory Usage Snort

Metrics

This module checks the percentage of allocated memory used by the Snort process and alerts when memory usage exceeds the percentages configured for the module. The Warning Threshold % default value is 80. The Critical Threshold % default value is 90.

Snort Reconfiguring Detection

Metrics

Alerts if a device reconfiguration has failed. This module detects reconfiguration failure for both Snort 2 and Snort 3 instances.

Snort Statistics

Metrics

Monitors Snort statistics for events, flows, and packets.

SSE Connection Status

Metrics

The module alerts if the device cannot connect to the security services exchange cloud after an initial successful connection. Disabled by default.

CPU Usage System

Metrics

This module checks that the average CPU usage of all system processes on the device is not overloaded and alerts when CPU usage exceeds the percentages configured for the module. The Warning Threshold % default value is 80. The Critical Threshold % default value is 90.

Threat Data Updates on Devices

Alert

Certain intelligence data and configurations that devices use to detect threats are updated on the management center from the cloud every 30 minutes.

This module alerts you if this information has not been updated on the devices within the time period you have specified.

Monitored updates include:

  • Local URL category and reputation data

  • Security Intelligence URL lists and feeds, including global Block and Do Not Block lists and URLs from Threat Intelligence Director

  • Security Intelligence network lists and feeds (IP addresses), including global Block and Do Not Block lists and IP addresses from Threat Intelligence Director

  • Security Intelligence DNS lists and feeds, including global Block and Do Not Block lists and domains from Threat Intelligence Director

  • Local malware analysis signatures (from ClamAV)

  • SHA lists from Threat Intelligence Director, as listed on the Objects > Object Management > Security Intelligence > Network Lists and Feeds page

  • Dynamic analysis settings configured on the Integration > AMP > Dynamic Analysis Connections page

  • Threat Configuration settings related to expiration of cached URLs, including the Cached URLs Expire setting on the Integration > Other Integrations > Cloud Services page. (Updates to the URL cache are not monitored by this module.)

  • Communication issues with the Cisco cloud for sending events. See the Cisco Cloud box on the Integration > Other Integrations > Cloud Services page.

Note

Threat Intelligence Director updates are included only if TID is configured on your system and you have feeds.

By default, this module sends a warning after 1 hour and a critical alert after 24 hours.
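To make the default timing concrete, the following Python sketch maps the age of the most recent successful update to a status, using the 1-hour and 24-hour defaults stated above. The function name, the status strings, and the inclusive comparison are assumptions for illustration only; the module's actual evaluation logic is not described in the source.

```python
from datetime import timedelta


def update_staleness_status(
    age,
    warning_after=timedelta(hours=1),
    critical_after=timedelta(hours=24),
):
    """Classify how stale threat data is (hypothetical helper).

    `age` is the time elapsed since the data was last updated on a device.
    The defaults mirror the documented warning and critical periods.
    """
    if age >= critical_after:
        return "critical"
    if age >= warning_after:
        return "warning"
    return "normal"


print(update_staleness_status(timedelta(hours=2)))  # warning
```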

If this module indicates failure on the management center or on any devices, verify that the management center can reach the devices.

VPN Statistics

Metrics

Monitors site-to-site and remote access VPN tunnels between threat defense devices.

XTLS Counters

Metrics

Monitors XTLS/SSL flows, memory and cache effectiveness. Disabled by default.

Management Center Health Modules

Module

Type

Description

AMP for Endpoints Status

Alert

The module alerts if the management center cannot connect to the AMP cloud or Cisco AMP Private Cloud after an initial successful connection, or if the private cloud cannot contact the public AMP cloud. It also alerts if you deregister an AMP cloud connection using the Secure Endpoint management console.

AMP for Firepower Status

Alert

Alerts if:

  • The management center cannot contact the AMP cloud (public or private) or the Secure Malware Analytics Cloud or Appliance, or the AMP private cloud cannot contact the public AMP cloud.

  • The encryption keys used for the connection are invalid.

  • A device cannot contact the Secure Malware Analytics Cloud or Secure Malware Analytics Appliance to submit files for dynamic analysis.

  • An excessive number of files are detected in network traffic based on the file policy configuration.

If your management center loses connectivity to the Internet, the system may take up to 30 minutes to generate a health alert.

Appliance Heartbeat

Alert

This module determines if an appliance heartbeat is being heard from the appliance and alerts based on the appliance heartbeat status.

CPU Usage (per core)

Metrics

This module checks that the CPU usage on all the cores is not overloaded and alerts when CPU usage exceeds the thresholds configured for the module. The Warning Threshold % default value is 80. The Critical Threshold % default value is 90.

Critical Process Statistics

Metrics

Monitors the state of critical processes, their resource consumption, and the restart counts.

Dynamic Attributes Connector

Database

Alert

Alerts if the configuration database size is too big. It also monitors the system for database schema or configuration data (sometimes called EO) integrity issues. If this module alerts, contact Cisco TAC. This will prevent upgrade.

Discovery Host Limit

Alert

This module determines if the number of hosts the management center can monitor is approaching the limit and alerts based on the warning level configured for the module. For more information, see Host Limit.

Disk Status

Alert

This module examines the performance of the hard disk and malware storage pack (if installed) on the appliance.

This module generates a Warning (yellow) health alert when the hard disk and RAID controller (if installed) are in danger of failing, or if an additional hard drive is installed that is not a malware storage pack. This module generates an Alert (red) health alert when an installed malware storage pack cannot be detected.

Disk Usage

Metrics

This module compares disk usage on the appliance’s hard drive and malware storage pack to the limits configured for the module and alerts when usage exceeds the thresholds configured for the module. This module also alerts when the system excessively deletes files in monitored disk usage categories, or when disk usage excluding those categories reaches excessive levels, based on module thresholds.

Use the Disk Usage health status module to monitor disk usage for the / and /volume partitions on the appliance and track draining frequency. Although the disk usage module lists the /boot partition as a monitored partition, the size of the partition is static so the module does not alert on the boot partition.

eStream Status

Alert

Monitors connections to third-party client applications that use the Event Streamer on the management center.

Event Backlog Status

Alert

Alerts if the backlog of event data awaiting transmission from the device to the management center has grown continuously for more than 30 minutes.

To reduce the backlog, evaluate your bandwidth and consider logging fewer events.
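The "grown continuously for more than 30 minutes" condition can be pictured with a small Python sketch. It assumes a hypothetical list of (minute, backlog size) samples ordered by time and treats any sample that does not increase as ending the growth period. How the product actually samples and evaluates the backlog is not stated in the source.

```python
def backlog_growing_too_long(samples, limit_minutes=30):
    """Return True if the backlog grew without interruption for longer
    than `limit_minutes`.

    `samples` is a time-ordered list of (minute, backlog_size) pairs.
    This sampling format is an assumption made for the example.
    """
    growth_start = None      # minute at which the current growth run began
    previous_minute = None
    previous_size = None
    for minute, size in samples:
        if previous_size is not None and size > previous_size:
            if growth_start is None:
                growth_start = previous_minute
            if minute - growth_start > limit_minutes:
                return True
        else:
            # First sample, or the backlog shrank or stayed flat:
            # the growth run is reset.
            growth_start = None
        previous_minute, previous_size = minute, size
    return False
```

With samples taken at minutes 0, 10, 20, and 31 that each show a larger backlog, the function returns True, because the growth run spans 31 minutes. A single sample that shrinks resets the run.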

Event Monitor

Metrics

This module monitors the overall incoming event rate to the management center.

File System Integrity Check

Alert

This module performs a file system integrity check and runs if the system has CC mode or UCAPL mode enabled, or if the system runs an image signed with a DEV key. This module is enabled by default.

Firewall Management Center HA Status

Alert

Monitors management center high availability. This module generates alerts if the HA pairs are not synchronized and if there is a discrepancy in the number of managed devices between the active and standby units.

Hardware Statistics

Metrics

Monitors management center hardware: fan speed, temperature, and power supply. Alerts when values exceed configurable thresholds.

Health Monitor Process

Alert

Monitors the health process itself, and alerts if there have been no health events within a configurable number of minutes.

ISE Connection Monitor

Alert

This module monitors the status of the server connections between the Cisco Identity Services Engine (ISE) and the management center. ISE provides additional user data, device type data, device location data, SGTs (Security Group Tags), and SXP (Security Exchange Protocol) services.

License Monitor

Alert

This module monitors expiration of Classic licenses.

Local Malware Analysis

Alert

This module monitors ClamAV updates for Local Malware Analysis.

Memory Usage

Alert

This module compares memory usage on the appliance to the limits configured for the module and alerts when usage exceeds the levels configured for the module.

When calculating the memory usage, the management center Memory Usage health module monitors and includes the usage of RAM, swap memory, and cache memory.

For appliances with more than 4 GB of memory, the preset alert thresholds are based on a formula that accounts for proportions of available memory likely to cause system problems. On >4 GB appliances, because the interval between Warning and Critical thresholds may be very narrow, we recommend that you manually set the Warning Threshold % value to 50. This helps ensure that you receive memory alerts for your appliance in time to address the issue.

Beginning with Version 6.6.0, the minimum required RAM for management center virtual upgrades to Version 6.6.0+ is 28 GB, and the recommended RAM for management center virtual deployments is 32 GB. We recommend you do not decrease the default settings: 32 GB RAM for most management center virtual instances, 64 GB for the management center virtual 300 (VMware only).

Attention

A critical alert is generated by the health monitor when insufficient RAM is allocated to a management center virtual deployment.

Complex access control policies and rules can command significant resources and negatively affect performance.

MySQL Statistics

Metrics

Monitors the status of the MySQL database, including the database size, number of active connections, and memory use.

Process Status

Alert

Alerts when processes on the appliance exit or terminate outside of the process manager.

If a process is deliberately exited outside of the process manager, the module status changes to Warning and the health event message indicates which process exited, until the module runs again and the process has restarted. If a process terminates abnormally or crashes outside of the process manager, the module status changes to Critical and the health event message indicates the terminated process, until the module runs again and the process has restarted.

RabbitMQ Status

Metrics

Monitors and collects RabbitMQ statistics.

Realm

Alert

Allows you to set a warning threshold for realm or user mismatches, which are:

  • User mismatch: A user is reported to the cloud-delivered Firewall Management Center without being downloaded.

    A typical reason for a user mismatch is that the user belongs to a group you have excluded from being downloaded to the cloud-delivered Firewall Management Center. Review the information discussed in Realm Fields.

  • Realm mismatch: A user logs into a domain that corresponds to a realm not known to the management center.

For more information, see .

This module also displays health alerts when you try to download more users than the maximum number of downloaded users supported per realm. The maximum number of downloaded users for a single realm depends on your management center model.

For more information, see User Limit in the Cisco Secure Firewall Management Center Device Configuration Guide.

RRD Server Process

Alert

Alerts if the round robin data (RRD) server that stores time series data has restarted since the last time it updated. You can configure additional warning and critical thresholds for consecutive restarts.

Security Intelligence

Alert

Alerts if Security Intelligence is in use and the management center cannot update a feed, or feed data is corrupt or contains no recognizable IP addresses.

See also the Threat Data Updates on Devices module.

Smart License Monitor

Alert

Monitors Smart Licensing status and alerts if:

  • There is a communication error between the Smart Licensing Agent (Smart Agent) and the Smart Software Manager.

  • The Product Instance Registration Token has expired.

  • The Smart License usage is out of compliance.

  • The Smart License authorization or evaluation mode has expired.

Threat Data Updates on Devices

Alert

Certain intelligence data and configurations that devices use to detect threats are updated on the management center from the cloud every 30 minutes.

This module alerts you if this information has not been updated on the devices within the time period you have specified.

Monitored updates include:

  • Local URL category and reputation data.

  • Security Intelligence URL lists and feeds, including global Block and Do Not Block lists and URLs from Threat Intelligence Director.

  • Security Intelligence network lists and feeds (IP addresses), including global Block and Do Not Block lists and IP addresses from Threat Intelligence Director.

  • Security Intelligence DNS lists and feeds, including global Block and Do Not Block lists and domains from Threat Intelligence Director.

  • Local malware analysis signatures (from ClamAV).

  • SHA lists from Threat Intelligence Director, as listed on the Objects > Object Management > Security Intelligence > Network Lists and Feeds page.

  • Dynamic analysis settings configured on the Integration > AMP > Dynamic Analysis Connections page.

  • Threat Configuration settings related to expiration of cached URLs, including the Cached URLs Expire setting on the Integration > Other Integrations > Cloud Services page. (Updates to the URL cache are not monitored by this module.)

  • Communication issues with the Cisco cloud for sending events. See the Cisco Cloud box on the Integration > Other Integrations > Cloud Services page.

Note

Threat Intelligence Director updates are included only if TID is configured on your system and you have feeds.

By default, this module sends a warning after 1 hour and a critical alert after 24 hours.

If this module indicates failure on the management center or on any devices, verify that the management center can reach the devices.

Time Series Data (RRD) Monitor

Alert

This module tracks the presence of corrupt files in the directory where time series data (such as correlation event counts) are stored and alerts when files are flagged as corrupt and removed.

Time Synchronization Status

Alert

This module tracks the synchronization of a device clock that obtains time using NTP with the clock on the NTP server and alerts if the difference in the clocks is more than ten seconds.
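The ten-second rule amounts to comparing two clock readings, as in this minimal Python sketch. The function and parameter names are hypothetical, and the readings are assumed to be epoch timestamps in seconds; how the module obtains them is not described in the source.

```python
def clock_out_of_sync(device_epoch, ntp_epoch, max_drift_seconds=10):
    """Return True when the two clocks differ by more than the limit.

    Both arguments are epoch timestamps in seconds (an assumption of
    this sketch). The default limit mirrors the documented ten seconds.
    """
    return abs(device_epoch - ntp_epoch) > max_drift_seconds
```

Because the module alerts only when the difference is more than ten seconds, a drift of exactly ten seconds does not count as out of sync in this sketch.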

Unresolved Groups Monitor

Alert

Monitors Foreign Security Principals (FSPs) that are groups used in policies. Security principals are Active Directory objects, like authenticated user groups, to which security can be applied in access control policies.

This module generates a warning alert for unresolved groups that exist but are not used in policies, and a critical alert for unresolved groups that are used in policies.

URL Filtering Monitor

Alert

Monitors connectivity with the Cisco cloud, which is required for downloading URL filtering data and doing URL filtering lookups.

VPN Tunnel Status

Alert

Alerts when VPN tunnels are down. Supported for both remote access and site-to-site VPN.

Zero-Touch Provisioning

Alert

Alerts if there is a failure when registering a device using the serial number. It also shows errors related to zero-touch provisioning capable management centers in high availability.