KBI 310202 Logs Flooded With Accept Request To Get Batch Info…

Version

All

Date

20 Aug 2010

Summary

You may have been trying to troubleshoot why Relators are being skipped, or why tasks are taking a long time to execute.

While investigating a Monitoring Engine log, you may notice lines that resemble this:


18 Aug 2010 12:00:04.206 SVR1 admin Accept request to get batch task info (Relator: REL_DISK_FREE Node: SQLPROD-B1)

...

18 Aug 2010 12:00:14.206 SVR1 admin Accept request to get batch task info (Relator: REL_DISK_FREE Node: SQLPROD-B1)

...

18 Aug 2010 12:00:26.206 SVR1 admin Accept request to get batch task info (Relator: REL_DISK_FREE Node: SQLPROD-B1)

Note how the same Relator/Node pair is being repeated every 10 to 12 seconds.

This behavior typically causes heavy log growth and extra disk I/O, which compounds the original problem.
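A repetition like this can be confirmed with a small script instead of by eye. The following is a minimal sketch in Python, assuming the log lines follow exactly the format shown above. The names LINE_RE and status_check_gaps are hypothetical and are not part of any Argent tool.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches lines in the format shown in this article, for example:
# 18 Aug 2010 12:00:04.206 SVR1 admin Accept request to get batch task info (Relator: REL_DISK_FREE Node: SQLPROD-B1)
LINE_RE = re.compile(
    r"^(?P<ts>\d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2}\.\d{3}) .*"
    r"Accept request to get batch task info "
    r"\(Relator: (?P<relator>\S+) Node: (?P<node>[^)]+)\)"
)


def status_check_gaps(lines):
    """Return {(relator, node): [seconds between consecutive status checks]}."""
    times = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # unrelated log line
        stamp = datetime.strptime(match.group("ts"), "%d %b %Y %H:%M:%S.%f")
        times[(match.group("relator"), match.group("node"))].append(stamp)
    # Differences between neighbouring timestamps, per Relator/Node pair
    return {
        pair: [(later - earlier).total_seconds()
               for earlier, later in zip(stamps, stamps[1:])]
        for pair, stamps in times.items()
    }
```

Feeding the three sample lines above into status_check_gaps yields gaps of 10.0 and 12.0 seconds for the pair (REL_DISK_FREE, SQLPROD-B1). A long list of gaps near the polling interval for a single pair is the pattern to look for.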

Technical Background

The log message corresponds to a real operation: a check on the status of the task.

The status of the task (for example Running, Pending, or Completed) is used to update the History screen in the corresponding product.

The interval between status checks is set by the registry entry:

HKLM\Software\Argent\ArgentManagementConsole\TASK_MONITOR_INTERVAL

The default is 10 seconds — this explains the 10 to 12 second repetition.

However, healthy monitoring tasks typically complete in a matter of seconds, so a healthy task should produce very few of these messages.

If this log message repeats over and over for the same Relator/Node pair, and you also see SKIPPED Relator tasks, then without a doubt the Relator is having trouble executing the task itself.
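The reasoning above can be made concrete with some rough arithmetic. The sketch below assumes, as a simplification, that one status check is logged per interval for as long as a task is still running. The function name is hypothetical.

```python
def expected_status_lines(task_seconds, interval_seconds=10):
    """Approximate number of status-check log lines for one running task.

    Assumes one check is logged each time a full interval elapses
    while the task has not yet finished.
    """
    return int(task_seconds // interval_seconds)


# A healthy task that finishes in 5 seconds, default 10-second interval
healthy = expected_status_lines(5)      # 0 lines
# A task that hangs for a full hour
stuck = expected_status_lines(3600)     # 360 lines
```

Under this assumption a single task stuck for an hour accounts for hundreds of lines by itself, which is why a flooded log points at the task and not at the logging.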

Resolution

Investigate the Relator task itself.

Test the Relator task in isolation, both from the GUI and from an isolated production Relator.

Typical causes may be rights and privileges issues, connectivity issues, firewalls, or an issue with Performance Counters running a shared process.

If these issues are appearing on a Daughter Engine, check whether the Daughter Engine is simply overloaded.

If there are no apparent issues with the Relator tasks themselves, you can at least reduce the logging by changing the registry value of:

HKLM\Software\Argent\ArgentManagementConsole\TASK_MONITOR_INTERVAL

From 10 to 60. This is safe and has no side effects because Daughter Engines are programmed to contact the Mother Engine with updates every 60 seconds anyway, which means the History screen will not suffer any visible delays.
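The change can be made in the Registry Editor, or scripted with the standard Windows reg command. The sketch below only builds the command lines and does not run them. It assumes that TASK_MONITOR_INTERVAL is a value under the ArgentManagementConsole key and that it is stored as REG_DWORD. The article does not state the value type, so confirm it with the query command first. The function names are hypothetical.

```python
KEY = r"HKLM\Software\Argent\ArgentManagementConsole"
VALUE = "TASK_MONITOR_INTERVAL"


def reg_query_command():
    """Arguments for a command that shows the current value and its type."""
    return ["reg", "query", KEY, "/v", VALUE]


def reg_add_command(seconds=60, value_type="REG_DWORD"):
    """Arguments for a command that sets the interval.

    value_type is an ASSUMPTION; check the type reported by the
    query command and pass that type here if it differs.
    """
    return ["reg", "add", KEY, "/v", VALUE,
            "/t", value_type, "/d", str(seconds), "/f"]


# Print the commands so they can be reviewed before being run
print(" ".join(reg_query_command()))
print(" ".join(reg_add_command(60)))
```

The printed commands can be pasted into an elevated command prompt on the server, or the argument lists can be passed to subprocess.run. Export the key before changing it so that the original value can be restored.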