KBI 310907 New Feature: Prevent Flood Of Events From Crippling Argent AT Engines

Version

Added to Argent Advanced Technology 3.1A-1401-T6 and later

Date

Wednesday, 9 Apr 2014

Summary

Misconfigured Argent AT monitoring products can generate a flood of Events that severely cripples the Argent AT Engines

A typical example is a customer configuring an Event Log Rule that breaks on every record of the Windows Security Log

Because a typical Domain Controller can generate more than a million Audit Events per day, the Argent Console Engine is quickly overwhelmed if each Event triggers slow, I/O-intensive Alerts such as SMS or email
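
To put the figure above in perspective, here is a rough back-of-the-envelope calculation. The one-second delivery time per Alert and the assumption that Alerts are delivered one at a time are illustrative guesses, not measured Argent AT values

```python
# Rough load estimate for one busy Domain Controller.
EVENTS_PER_DAY = 1_000_000            # figure quoted in the Summary
SECONDS_PER_DAY = 24 * 60 * 60

# Assumption for illustration only: one slow Alert (SMS, email) takes
# one second and Alerts are delivered one at a time.
ASSUMED_SECONDS_PER_ALERT = 1.0

events_per_second = EVENTS_PER_DAY / SECONDS_PER_DAY
alerts_deliverable_per_day = SECONDS_PER_DAY / ASSUMED_SECONDS_PER_ALERT
backlog_after_one_day = EVENTS_PER_DAY - alerts_deliverable_per_day

print(f"{events_per_second:.1f} Events arrive per second")
print(f"{backlog_after_one_day:,.0f} Alerts still queued after one day")
```

Under these assumptions Events arrive more than ten times faster than Alerts can be delivered, so the backlog only grows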

Argent AT 3.1A-1401-T6 introduces a limit on pending Events from the same source (Product, server, Relator and Rule)

A typical Argent AT system could have 1000+ outstanding Events overall, but it should not have several hundred Events from the same source

If it does, Argent AT is not configured properly

Technical Background

Maximum Pending Events To Fire Alert (Global) – When pending Events from all sources exceed this limit, the Event is still recorded in the A1x screens, but no Alert is fired for it

Default value is 14,000

A system with more pending Events than that is very unhealthy

Maximum Events To Fire Alerts Per Hour (Global) – When the Alerts fired for Events from all sources in the past hour exceed this limit, the Event is still recorded in the A1x screens, but no Alert is fired for it

Default value is 500

To disable it, set it to 9,999
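
The two global limits above can be pictured as a gate in front of Alert delivery. The sketch below is a hypothetical model, not Argent AT code: the class name, method name and exact boundary handling (whether the Alert that reaches the limit still fires) are assumptions. Only the defaults of 14,000 and 500 and the disable value of 9,999 come from this article

```python
from collections import deque

MAX_PENDING_EVENTS_GLOBAL = 14_000   # default quoted above
MAX_ALERTS_PER_HOUR_GLOBAL = 500     # default quoted above
DISABLED = 9_999                     # value that switches the hourly limit off
ONE_HOUR = 3600.0                    # seconds


class GlobalAlertGate:
    """Hypothetical model: decides whether an Alert fires for a recorded Event."""

    def __init__(self, max_pending=MAX_PENDING_EVENTS_GLOBAL,
                 max_per_hour=MAX_ALERTS_PER_HOUR_GLOBAL):
        self.max_pending = max_pending
        self.max_per_hour = max_per_hour
        self._fired_at = deque()     # timestamps of Alerts fired in the past hour

    def should_fire(self, pending_events, now):
        """Return True if the Alert fires; the Event is recorded either way."""
        # Forget Alerts that were fired an hour or more ago.
        while self._fired_at and now - self._fired_at[0] >= ONE_HOUR:
            self._fired_at.popleft()
        # Global pending limit: too many outstanding Events overall.
        if pending_events > self.max_pending:
            return False
        # Global hourly limit, unless it has been disabled with 9,999.
        if (self.max_per_hour != DISABLED
                and len(self._fired_at) >= self.max_per_hour):
            return False
        self._fired_at.append(now)
        return True
```

For example, `GlobalAlertGate().should_fire(pending_events=15_000, now=0.0)` returns False in this model because the pending count is above 14,000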

Maximum Pending Events To Process (Same Node, Rule And Relator from One Product) – If pending Events from the same source exceed this limit, the Event request is rejected

Default value is 100

To disable it, set it to 9,999
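
Unlike the global limits, the per-source limit rejects the Event request itself. A minimal sketch of that bookkeeping follows, assuming a source is identified by the combination of Product, node, Rule and Relator as described above. The class and method names are hypothetical, and the boundary handling is an assumption

```python
from collections import Counter

MAX_PENDING_PER_SOURCE = 100   # default quoted above
DISABLED = 9_999               # value that switches the limit off


class PerSourceLimiter:
    """Hypothetical model of the per-source pending Event limit."""

    def __init__(self, limit=MAX_PENDING_PER_SOURCE):
        self.limit = limit
        # (product, node, rule, relator) -> number of pending Events
        self.pending = Counter()

    def submit(self, product, node, rule, relator):
        """Return True if the Event request is accepted, False if rejected."""
        source = (product, node, rule, relator)
        if self.limit != DISABLED and self.pending[source] >= self.limit:
            return False
        self.pending[source] += 1
        return True

    def resolve(self, product, node, rule, relator):
        """Mark one pending Event from this source as no longer pending."""
        source = (product, node, rule, relator)
        if self.pending[source] > 0:
            self.pending[source] -= 1
```

Because the count is kept per source, one noisy Rule on one server is cut off without affecting Events from any other server, Rule or Relator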

A rejected Event request is definitely not a healthy condition

The Argent AT Engine keeps reminding the customer of the condition until it is resolved

A network message is posted the first time an Event from a source is rejected, and once every half hour by default afterward
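
The reminder schedule above amounts to a per-source throttle. The sketch below is only an illustration: the names are hypothetical, and the 30-minute interval is the default mentioned in this article

```python
REMINDER_INTERVAL_SECONDS = 30 * 60   # "once every half hour by default"


class RejectionReminder:
    """Hypothetical throttle for the network message posted on rejection."""

    def __init__(self, interval=REMINDER_INTERVAL_SECONDS):
        self.interval = interval
        self._last_posted = {}        # source -> time the last message was posted

    def should_post(self, source, now):
        """Return True on the first rejection and once per interval after it."""
        last = self._last_posted.get(source)
        if last is None or now - last >= self.interval:
            self._last_posted[source] = now
            return True
        return False
```

Each source has its own timer here, so a second misconfigured Rule still produces its own first message immediately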


If the condition persists when the customer starts the Argent AT main GUI, a popup message is also displayed

As the Argent Console is shared among Argent AT monitoring products, the main GUI shows the prompt even if the condition is caused by another product


Sometimes a customer sets up a heartbeat Relator that breaks a Rule on purpose

Such a Relator can obviously cause its pending Events to exceed the limit

The customer can define exceptions to the limit

Wildcards are supported in the Relator, Rule and server/device names
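
As an illustration of how wildcard exceptions might be matched, the sketch below uses shell-style patterns from Python's `fnmatch` module. The actual wildcard syntax, the case-insensitive matching and every Relator, Rule and server name shown are assumptions made for the example, not Argent AT definitions

```python
from fnmatch import fnmatchcase

# Hypothetical exception entries: (Relator pattern, Rule pattern, node pattern).
EXCEPTIONS = [
    ("REL_HEARTBEAT*", "*", "*"),         # any heartbeat Relator, anywhere
    ("*", "RUL_PING_TEST", "LAB-*"),      # one Rule, lab servers only
]


def is_exempt(relator, rule, node, exceptions=EXCEPTIONS):
    """Return True if any exception entry matches all three names."""
    # Assumption: names are compared case-insensitively.
    names = (relator.upper(), rule.upper(), node.upper())
    return any(
        all(fnmatchcase(name, pattern.upper())
            for name, pattern in zip(names, entry))
        for entry in exceptions
    )
```

With these entries, `is_exempt("REL_HEARTBEAT_5MIN", "RUL_ANY", "DC01")` is True, while `is_exempt("REL_OTHER", "RUL_PING_TEST", "DC01")` is False because the server name does not match `LAB-*`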

Resolution

Upgrade to Argent Advanced Technology 3.1A-1401-T6 or later