KBI 310635 Escalation and Auto-Correction Alerts Not Working Due To Data Truncation
Version
Argent Advanced Technology — All Versions
Date
Monday, 3 Feb 2014
Summary
This KBI explains the following symptoms (all of which will occur):
- Escalation Alerts not being sent
- Auto-Correction Alerts not being sent
- Alert not being automatically set as ‘Resolved’ after correction
- Automatic Alert Archiving not archiving alerts
The AAC_ENGINE_LOG.TXT under the Argent Console may also show the following error message:
04 Jun 2013 03:16:56.692 ARG123 ARGENT\argadmin ***** ERROR ***** An error condition has been detected by ArgSoft_DAL.cpp (D6220_Open_RS) at line 3484: Could not open record set. Microsoft Error Code = Data truncated..
04 Jun 2013 03:16:56.692 ARG123 ARGENT\argadmin Could not query table ‘ARGSOFT_AAC_ESCALATION’
Technical Background
What Causes This?
This issue is caused by database fields in the Argent Console tables that contain data that is extremely ‘large’
When Argent reads data from a database, Microsoft requires the retrieving object to “allocate” a fixed size for each row of data — naturally, the size should be larger than the amount of data being read
While the majority of the fields in the Argent Console tables are limited based on the VARCHAR length, any “text” or “ntext” field could allow a virtually infinite length
Examples of “text” or “ntext” fields are LONG_DESCRIPTION (the Alert body), or the MEMO field (for Memo Notes)
Argent provides a flexible option to change the allocated size of each row
By default, this is 32 KB
This is determined by the registry HKLM\Software\Argent\ARGENT_CONSOLE\ODBC_TXTBUF_IN_KB
(for Motors, the registry is in the ARGSOFT_AAC_CONFIGURATION table)
Under normal circumstances, 32 KB is plenty of space for a single alert
The issue occurs when Argent inserts data far larger than 32 KB — the most common instance is with Argent for SQL Server
The Deadlock Rules or Runaway Query Rules, for example, will include an informative list ALL of the queries that caused a deadlock, or the actual runaway queries themselves
If a customer had hundreds of deadlocks at the time of execution, the alert “body” could conceivably contain upwards of 800 KB of information
Simply put, the issue is caused by a lack of restrictions when WRITING into the tables, mixed with an stringent restriction when READING from the tables
In an example where the LONG_DESCRIPTION contains 800 KB of data, customers could set the ODBC_TXTBUF_IN_KB value to a larger number, such as 1,000 KB
The only drawback of setting a very large number: each fetched row will have more memory allocated to it when Argent performs read operations on the Argent Console tables
There is no general rule — the best advice is to contact Argent support for recommendations. Some customers go over by a few dozen KB of data, others go over by a few megabytes of data
What Can We Do To Check For This?
The database tables commonly affected are the ARGSOFT_AAC_CONSOLE table (which holds all Alerts), and the ARGSOFT_AAC_ESCALATION table (which holds all Alert Escalations)
To check the size of alerts, customers can run these sample SQL queries, which look for all text or ntext fields with a size larger than 20,000 characters:
ARGSOFT_AAC_CONSOLE
SELECT UUID ,RELATOR ,MACHINE ,RULE_NAME ,DATALENGTH(LONG_DESCRIPTION) AS LONG_DESCRIPTION_LENGTH ,DATALENGTH(SHORT_DESCRIPTION) AS SHORT_DESCRIPTION_LENGTH ,DATALENGTH(CORRELATION) AS CORRELATION_LENGTH ,DATALENGTH(ROOT_CAUSE) AS ROOT_CAUSE_LENGTH ,DATALENGTH(ALERT_MEMO) AS ALERT_MEMO_LENGTH FROM ARGSOFT_AAC_CONSOLE WHERE DATALENGTH(LONG_DESCRIPTION) > 20000 OR DATALENGTH(SHORT_DESCRIPTION) > 20000 OR DATALENGTH(CORRELATION) > 20000 OR DATALENGTH(ROOT_CAUSE) > 20000 OR DATALENGTH(ALERT_MEMO) > 20000
ARGSOFT_AAC_ESCALATION
SELECT UUID ,RELATOR ,MACHINE ,RULE_NAME ,DATALENGTH(LONG_DESCRIPTION) AS LONG_DESCRIPTION_LENGTH ,DATALENGTH(SHORT_DESCRIPTION) AS SHORT_DESCRIPTION_LENGTH ,DATALENGTH(RULE_DESCRIPTION) AS RULE_DESCRIPTION_LENGTH ,DATALENGTH(RULE_DESCRIPTION) AS RULE_DESCRIPTION_LENGTH ,DATALENGTH(ESCALATION_EMAIL) AS ESCALATION_EMAIL_LENGTH ,DATALENGTH(ESCALATION_PAGER) AS ESCALATION_PAGER_LENGTH ,DATALENGTH(ESCALATION_SMS) AS ESCALATION_SMS_LENGTH ,DATALENGTH(ESCALATION_MESSAGE) AS ESCALATION_MESSAGE_LENGTH ,DATALENGTH(NOTIFY_WHEN_OK) AS NOTIFY_WHEN_OK_LENGTH ,DATALENGTH(EXTRA_INFO) AS EXTRA_INFO_LENGTH ,DATALENGTH(CORRELATION) AS CORRELATION_LENGTH ,DATALENGTH(ROOT_CAUSE) AS ROOT_CAUSE_LENGTH FROM ARGSOFT_AAC_ESCALATION WHERE DATALENGTH(LONG_DESCRIPTION) > 20000 OR DATALENGTH(SHORT_DESCRIPTION) > 20000 OR DATALENGTH(RULE_DESCRIPTION) > 20000 OR DATALENGTH(RULE_DESCRIPTION) > 20000 OR DATALENGTH(ESCALATION_EMAIL) > 20000 OR DATALENGTH(ESCALATION_PAGER) > 20000 OR DATALENGTH(ESCALATION_SMS) > 20000 OR DATALENGTH(ESCALATION_MESSAGE) > 20000 OR DATALENGTH(NOTIFY_WHEN_OK) > 20000 OR DATALENGTH(EXTRA_INFO) > 20000 OR DATALENGTH(CORRELATION) > 20000 OR DATALENGTH(ROOT_CAUSE) > 20000
The value of the fields are in bytes — any row that contains an excessive amount of data should be noted
Resolution
Increase the value of HKLM\Software\Argent\ARGENT_CONSOLE\ODBC_TXTBUF_IN_KB
For Motors, the registry is in the ARGSOFT_AAC_CONFIGURATION table
Contact Argent support for recommendations on the new proposed value
Once changed, the Argent Console services should be restarted
Alternatively, the affected alert rows that exceed the limit should be manually deleted
Note that it may take between 1 to 60 minutes for the symptoms to disappear — this is caused by a large queue of “stuck” escalations or correction emails that now need to be processed
This issue has been addressed in Argent AT 3.1A-1401-A
Alerts will be automatically truncated BEFORE writing, if the size of the data is too large
When Alerts are truncated, a message will be inserted into the Alert body so customers to distinguish this from untruncated Alerts