KBI 311090 Issue Addressed: Running System Down Rule With NetRemoteTOD Option For Remote Machine Can Cause High CPU Usage In Windows Server 2012
Version
Argent Advanced Technology 3.1A-1407-A or earlier
Date
Tuesday, 21 Oct 2014
Summary
If Argent AT is installed on Windows Server 2012, running a System Down Rule with the
NetRemoteTOD option against a remote machine with very high network latency can cause very high CPU usage.
The issue has been addressed in Argent AT 3.1A-1407-T6.
Technical Background
There is a timeout associated with the NetRemoteTOD API.
If the remote machine has very high network latency, which can be confirmed with a trace route utility, the API call may time out prematurely.
When many such timeouts occur, they can trigger very high CPU usage, for reasons that are not yet fully understood. This has only been observed on Windows Server 2012.
Preliminary investigation shows that Windows Server 2012 handles context switches in C++/CLI differently from previous Windows versions.
Argent AT 3.1A-1407-T6 addresses the issue by moving the NetRemoteTOD call to a separate worker thread and terminating the thread if a timeout occurs.
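The worker-thread approach can be illustrated with a minimal Python sketch. This is not Argent's implementation, which is C++/CLI and is not shown in this article. The names call_with_timeout and fake_remote_tod are hypothetical, and fake_remote_tod is only a stand-in that sleeps to simulate a slow NetRemoteTOD call. Python cannot forcibly terminate a thread, so the sketch abandons a timed-out daemon worker rather than terminating it.

```python
import threading
import time


def call_with_timeout(func, args=(), timeout_s=5.0):
    """Run func(*args) on a worker thread.

    Returns (True, result) if the call finishes within timeout_s seconds,
    or (False, None) if it does not. The caller is never blocked for
    longer than timeout_s.
    """
    outcome = {}

    def worker():
        # Store either the return value or the exception for the caller.
        try:
            outcome["value"] = func(*args)
        except Exception as exc:
            outcome["error"] = exc

    # daemon=True so an abandoned worker cannot keep the process alive.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)

    if thread.is_alive():
        # Timed out: stop waiting and abandon the worker thread.
        return False, None
    if "error" in outcome:
        raise outcome["error"]
    return True, outcome["value"]


def fake_remote_tod(delay_s):
    """Hypothetical stand-in for a blocking remote time-of-day query."""
    time.sleep(delay_s)
    return "time-of-day"


if __name__ == "__main__":
    # Fast node: the call completes inside the timeout.
    print(call_with_timeout(fake_remote_tod, (0.01,), timeout_s=1.0))
    # High-latency node: the caller gives up after the timeout.
    print(call_with_timeout(fake_remote_tod, (1.0,), timeout_s=0.05))
```

The point of the design is that the monitoring thread only waits on the worker for a bounded time, so a slow remote machine delays one worker instead of the caller.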
Resolution
Upgrade to Argent AT 3.1A-1407-T6 or later
Customers who cannot upgrade immediately can either set a longer timeout for the specific node in License Manager, or use the Relator option ‘Spawn New Monitoring Engine Process’ to isolate the Monitoring Engine process.