KBI 310907 New Feature: Prevent Flood Of Events From Crippling Argent AT Engines
Version
Added to Argent Advanced Technology 3.1A-1401-T6 and later
Date
Wednesday, 9 Apr 2014
Summary
Poorly configured Argent AT monitoring products can generate a flood of Events that severely cripples the Argent AT Engines
A typical example is a customer configuring an Event Log Rule that breaks on every record of the Windows Security Log
A typical Domain Controller can generate more than a million Audit Events per day; if each Event triggers slow, I/O-intensive Alerts such as SMS or email, the Argent Console Engine will simply be driven into the ground
Argent AT 3.1A-1401-T6 introduces a limit on pending Events from the same source (Product, server, Relator and Rule)
A typical Argent AT system could have 1,000+ outstanding Events overall, but it should not have several hundred Events from the same source
If it does, Argent AT is not configured properly
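To put the Domain Controller example in perspective, the arithmetic can be sketched as below. The figure of one million Events per day and the default of 500 Alerts per hour come from this article; the variable names are illustrative only.

```python
# Rough arithmetic for the Domain Controller example in this article.
# Variable names are illustrative; the figures are taken from the article text.
EVENTS_PER_DAY = 1_000_000          # "more than a million Audit Events per day"
DEFAULT_ALERTS_PER_HOUR = 500       # default of the global per-hour limit

events_per_second = EVENTS_PER_DAY / (24 * 60 * 60)
events_per_hour = EVENTS_PER_DAY / 24

# How many times the hourly Event volume exceeds the default Alert limit
overload_factor = events_per_hour / DEFAULT_ALERTS_PER_HOUR

print(f"{events_per_second:.2f} Events per second")
print(f"{events_per_hour:.0f} Events per hour")
print(f"{overload_factor:.1f}x the default per-hour Alert limit")
```

A Rule that breaks on every record would therefore ask for roughly 11 to 12 Alerts every second, far more than slow channels such as SMS or email can absorb.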
Technical Background
Maximum Pending Events To Fire Alert (Global) – When the number of pending Events from all sources exceeds this limit, the Event is still recorded in the A1x screens, but no Alert is fired
Default value is 14,000
A system with more pending Events than that is very unhealthy
Maximum Events To Fire Alerts Per Hour (Global) – When the number of Alerts fired for Events from all sources in the past hour exceeds this limit, the Event is still recorded in the A1x screens, but no Alert is fired
Default value is 500
To disable it, set it to 9,999
Maximum Pending Events To Process (Same Node, Rule And Relator From One Product) – If the number of pending Events from the same source exceeds this limit, the Event request is rejected
Default value is 100
To disable it, set it to 9,999
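The way the three limits interact can be illustrated with a minimal sketch. This is not Argent code: the class, function and return values are hypothetical, and the order in which the limits are checked is an assumption. Only the default values and the 9,999 "disabled" setting come from this article.

```python
from dataclasses import dataclass

DISABLED = 9_999  # per this article, setting a limit to 9,999 disables it


@dataclass
class Limits:
    """Hypothetical container for the three limits, with the article's defaults."""
    max_pending_global: int = 14_000
    max_alerts_per_hour: int = 500
    max_pending_per_source: int = 100


def decide(limits, pending_global, alerts_last_hour, pending_same_source):
    """Return 'reject', 'record_only' or 'record_and_alert' (illustrative names)."""
    # Per-source limit: the Event request itself is rejected.
    if (limits.max_pending_per_source != DISABLED
            and pending_same_source > limits.max_pending_per_source):
        return "reject"
    # Global pending limit: the Event is recorded, but no Alert is fired.
    if pending_global > limits.max_pending_global:
        return "record_only"
    # Global per-hour limit: the Event is recorded, but no Alert is fired.
    if (limits.max_alerts_per_hour != DISABLED
            and alerts_last_hour > limits.max_alerts_per_hour):
        return "record_only"
    return "record_and_alert"
```

For example, `decide(Limits(), 1200, 10, 101)` returns `"reject"` because 101 pending Events from one source exceed the default of 100, while `decide(Limits(), 1200, 501, 5)` returns `"record_only"`.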
A rejected Event request always indicates an unhealthy condition
The Argent AT Engine keeps reminding the customer of the condition until it is resolved
A network message is posted the first time an Event from a source is rejected, and once every half hour thereafter by default
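The reminder behavior described here (one message on the first rejection, then one every half hour) can be sketched as a simple per-source throttle. The class and method names are hypothetical and do not reflect how the Argent AT Engine is actually implemented.

```python
from datetime import datetime, timedelta

# Default reminder interval stated in this article: once every half hour
REMINDER_INTERVAL = timedelta(minutes=30)


class RejectionReminder:
    """Hypothetical throttle: decides when a network message should be posted."""

    def __init__(self, interval=REMINDER_INTERVAL):
        self.interval = interval
        self._last_posted = {}  # source -> time of the last posted message

    def should_post(self, source, now):
        """Return True on the first rejection for a source, then at most once per interval."""
        last = self._last_posted.get(source)
        if last is None or now - last >= self.interval:
            self._last_posted[source] = now
            return True
        return False


# Usage: a source is identified by (Product, server, Relator, Rule); values are made up
reminder = RejectionReminder()
source = ("ExampleProduct", "SERVER01", "REL_EXAMPLE", "RUL_EXAMPLE")
start = datetime(2014, 4, 9, 12, 0)
first = reminder.should_post(source, start)
second = reminder.should_post(source, start + timedelta(minutes=10))
third = reminder.should_post(source, start + timedelta(minutes=30))
```

In this sketch `first` and `third` are `True` and `second` is `False`, so a source that is rejected continuously produces one message at the start and one per half hour afterward.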
When the customer starts the Argent AT main GUI while the condition persists, a popup message is also displayed
As Argent Console is shared among the Argent AT monitoring products, the main GUI will prompt even if the condition is caused by another product
Sometimes a customer sets up a heartbeat Relator that breaks a Rule on purpose
In that case, pending Events from such a Relator can exceed the limit
The customer can define exceptions to the limit
Wildcards are supported in the Relator, Rule and server/device name
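As an illustration of wildcard exceptions, the sketch below matches a source against a list of exception patterns. The wildcard syntax (`*` and `?`), the case-insensitive comparison, and all Relator, Rule and server names are assumptions made for the example; the article does not specify how Argent AT evaluates wildcards.

```python
from fnmatch import fnmatchcase
from typing import NamedTuple


class LimitException(NamedTuple):
    """Hypothetical exception entry: one pattern each for Relator, Rule and node."""
    relator: str
    rule: str
    node: str


def _matches(pattern, value):
    # Assumption: shell-style wildcards, compared case-insensitively
    return fnmatchcase(value.lower(), pattern.lower())


def is_exempt(exceptions, relator, rule, node):
    """Return True if any exception entry matches all three fields of the source."""
    return any(
        _matches(e.relator, relator)
        and _matches(e.rule, rule)
        and _matches(e.node, node)
        for e in exceptions
    )


# Usage: exempt every heartbeat Relator on servers whose names start with "DC"
exceptions = [LimitException(relator="REL_HEARTBEAT*", rule="*", node="DC*")]
heartbeat_exempt = is_exempt(exceptions, "REL_HEARTBEAT_5MIN", "RUL_ALWAYS_BREAK", "dc01")
security_exempt = is_exempt(exceptions, "REL_SECURITY", "RUL_ALWAYS_BREAK", "DC01")
```

Here `heartbeat_exempt` is `True` and `security_exempt` is `False`, so only the intentional heartbeat source would bypass the per-source limit.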
Resolution
Upgrade to Argent Advanced Technology 3.1A-1401-T6 or later