KBI 310635 Escalation and Auto-Correction Alerts Not Working Due To Data Truncation

Version

Argent Advanced Technology — All Versions

Date

Monday, 3 Feb 2014

Summary

This KBI explains the following symptoms (all of which will occur):

  • Escalation Alerts not being sent
  • Auto-Correction Alerts not being sent
  • Alert not being automatically set as ‘Resolved’ after correction
  • Automatic Alert Archiving not archiving alerts

The AAC_ENGINE_LOG.TXT under the Argent Console may also show the following error message:

04 Jun 2013 03:16:56.692 ARG123 ARGENT\argadmin ***** ERROR ***** An error condition has been detected by ArgSoft_DAL.cpp (D6220_Open_RS) at line 3484: Could not open record set. Microsoft Error Code = Data truncated..

04 Jun 2013 03:16:56.692 ARG123 ARGENT\argadmin Could not query table ‘ARGSOFT_AAC_ESCALATION’

Technical Background

What Causes This?

This issue is caused by database fields in the Argent Console tables that contain data that is extremely ‘large’

When Argent reads data from a database, Microsoft requires the retrieving object to “allocate” a fixed size for each row of data — naturally, the size should be larger than the amount of data being read

While the majority of the fields in the Argent Console tables are limited based on the VARCHAR length, any “text” or “ntext” field could allow a virtually infinite length

Examples of “text” or “ntext” fields are LONG_DESCRIPTION (the Alert body), or the MEMO field (for Memo Notes)

Argent provides a flexible option to change the allocated size of each row

By default, this is 32 KB

This is determined by the registry HKLM\Software\Argent\ARGENT_CONSOLE\ODBC_TXTBUF_IN_KB

(for Motors, the registry is in the ARGSOFT_AAC_CONFIGURATION table)

Under normal circumstances, 32 KB is plenty of space for a single alert

The issue occurs when Argent inserts data far larger than 32 KB — the most common instance is with Argent for SQL Server

The Deadlock Rules or Runaway Query Rules, for example, will include an informative list ALL of the queries that caused a deadlock, or the actual runaway queries themselves

If a customer had hundreds of deadlocks at the time of execution, the alert “body” could conceivably contain upwards of 800 KB of information

Simply put, the issue is caused by a lack of restrictions when WRITING into the tables, mixed with an stringent restriction when READING from the tables

In an example where the LONG_DESCRIPTION contains 800 KB of data, customers could set the ODBC_TXTBUF_IN_KB value to a larger number, such as 1,000 KB

The only drawback of setting a very large number: each fetched row will have more memory allocated to it when Argent performs read operations on the Argent Console tables

There is no general rule — the best advice is to contact Argent support for recommendations. Some customers go over by a few dozen KB of data, others go over by a few megabytes of data

What Can We Do To Check For This?

The database tables commonly affected are the ARGSOFT_AAC_CONSOLE table (which holds all Alerts), and the ARGSOFT_AAC_ESCALATION table (which holds all Alert Escalations)

To check the size of alerts, customers can run these sample SQL queries, which look for all text or ntext fields with a size larger than 20,000 characters:

ARGSOFT_AAC_CONSOLE


SELECT	UUID

	,RELATOR

	,MACHINE

	,RULE_NAME

        ,DATALENGTH(LONG_DESCRIPTION) AS LONG_DESCRIPTION_LENGTH

        ,DATALENGTH(SHORT_DESCRIPTION) AS SHORT_DESCRIPTION_LENGTH

        ,DATALENGTH(CORRELATION) AS CORRELATION_LENGTH

        ,DATALENGTH(ROOT_CAUSE) AS ROOT_CAUSE_LENGTH

        ,DATALENGTH(ALERT_MEMO) AS ALERT_MEMO_LENGTH 

            

FROM	ARGSOFT_AAC_CONSOLE



WHERE	DATALENGTH(LONG_DESCRIPTION) > 20000

	OR

	DATALENGTH(SHORT_DESCRIPTION) > 20000

	OR

	DATALENGTH(CORRELATION) > 20000

	OR

	DATALENGTH(ROOT_CAUSE) > 20000

	OR

	DATALENGTH(ALERT_MEMO) > 20000

ARGSOFT_AAC_ESCALATION


SELECT   UUID

        ,RELATOR

        ,MACHINE

        ,RULE_NAME

	,DATALENGTH(LONG_DESCRIPTION)		AS LONG_DESCRIPTION_LENGTH	

	,DATALENGTH(SHORT_DESCRIPTION)		AS SHORT_DESCRIPTION_LENGTH	

	,DATALENGTH(RULE_DESCRIPTION)		AS RULE_DESCRIPTION_LENGTH	

	,DATALENGTH(RULE_DESCRIPTION)		AS RULE_DESCRIPTION_LENGTH	

	,DATALENGTH(ESCALATION_EMAIL)		AS ESCALATION_EMAIL_LENGTH	

	,DATALENGTH(ESCALATION_PAGER)		AS ESCALATION_PAGER_LENGTH	

	,DATALENGTH(ESCALATION_SMS)		AS ESCALATION_SMS_LENGTH	

	,DATALENGTH(ESCALATION_MESSAGE)		AS ESCALATION_MESSAGE_LENGTH	

	,DATALENGTH(NOTIFY_WHEN_OK)		AS NOTIFY_WHEN_OK_LENGTH	

	,DATALENGTH(EXTRA_INFO)			AS EXTRA_INFO_LENGTH		

	,DATALENGTH(CORRELATION)		AS CORRELATION_LENGTH	

	,DATALENGTH(ROOT_CAUSE)			AS ROOT_CAUSE_LENGTH	

            

FROM	ARGSOFT_AAC_ESCALATION



WHERE   DATALENGTH(LONG_DESCRIPTION) > 20000

	OR	

	DATALENGTH(SHORT_DESCRIPTION) > 20000

	OR	

	DATALENGTH(RULE_DESCRIPTION) > 20000

	OR	

	DATALENGTH(RULE_DESCRIPTION) > 20000

	OR	

	DATALENGTH(ESCALATION_EMAIL) > 20000

	OR	

	DATALENGTH(ESCALATION_PAGER) > 20000

	OR	

	DATALENGTH(ESCALATION_SMS) > 20000

	OR	

	DATALENGTH(ESCALATION_MESSAGE) > 20000

	OR	

	DATALENGTH(NOTIFY_WHEN_OK) > 20000

	OR	

	DATALENGTH(EXTRA_INFO) > 20000

	OR		

	DATALENGTH(CORRELATION) > 20000

	OR	

	DATALENGTH(ROOT_CAUSE) > 20000

The value of the fields are in bytes — any row that contains an excessive amount of data should be noted

Resolution

Increase the value of HKLM\Software\Argent\ARGENT_CONSOLE\ODBC_TXTBUF_IN_KB

For Motors, the registry is in the ARGSOFT_AAC_CONFIGURATION table

Contact Argent support for recommendations on the new proposed value

Once changed, the Argent Console services should be restarted

Alternatively, the affected alert rows that exceed the limit should be manually deleted

Note that it may take between 1 to 60 minutes for the symptoms to disappear — this is caused by a large queue of “stuck” escalations or correction emails that now need to be processed

This issue has been addressed in Argent AT 3.1A-1401-A

Alerts will be automatically truncated BEFORE writing, if the size of the data is too large

When Alerts are truncated, a message will be inserted into the Alert body so customers to distinguish this from untruncated Alerts