KBI 311772 Issue Addressed: SLA Tasks Do Not Run At The Specified Interval On Large Installations With More Than 1,000 Server/Devices
Version
Argent Advanced Technology 5.1A-1904-C and below
Date
Thursday, 22 August 2019
Summary
On large installations with more than 1,000 Server/Devices, if the system is configured to run an SLA or System Down Rule at high frequency (an interval of less than 2 minutes), users may notice that Relators run at increasingly long intervals over time
In other words, monitoring tasks run slower and slower
The issue can be temporarily rectified by truncating the SQL table ARGSOFT_{PRODUCT}_EXECJOBS
But as the table's row count increases again, performance gradually degrades
The issue has been addressed in Argent AT 5.1A-1907-A
Technical Background
The SQL table ARGSOFT_{PRODUCT}_EXECJOBS holds information on each execution instance of a Relator and Server/Device combination
When an SLA or System Down Rule is executed against many Server/Devices at high frequency, the row count can grow very quickly
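As a rough illustration of that growth, the sketch below estimates how many rows one day of monitoring could add. It assumes that every execution of a Relator against one Server/Device adds exactly one row, which is a reading of the description above and not a documented fact; the function name and the sample figures are hypothetical.

```python
def rows_per_day(devices: int, relators: int, interval_seconds: int) -> int:
    """Estimate rows added per day, assuming one row per execution instance."""
    runs_per_day = 86_400 // interval_seconds  # seconds in a day / interval
    return devices * relators * runs_per_day

# 1,000 Server/Devices, one Relator, run every 60 seconds
print(rows_per_day(1000, 1, 60))  # 1440000
```

Under these assumptions a single Relator on a 1,000-device installation would add well over a million rows per day, and shorter intervals or additional Relators multiply that figure.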
The Argent AT Engine updates the SQL table whenever an instance's status changes, e.g. the status changes from ‘Running’ to ‘Ended’ when execution has completed
When the Engine updates the status, it scans for rows of any orphaned instances for the same Relator and Server/Device combination
If the table grows very large, the scan becomes time-consuming
This is the cause of the worsening performance over time
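The effect can be reproduced in miniature with Python's built-in sqlite3 module. This is only a sketch: the table EXECJOBS_DEMO, the STATUS column, the key format and the query are hypothetical stand-ins, SQLite is not the database Argent AT uses, and it is an assumption that the Engine's orphan scan filters on JOBPCKEY.

```python
import sqlite3

# Stand-in table: only the column name JOBPCKEY comes from the article;
# the other columns and all values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EXECJOBS_DEMO ("
    "ID INTEGER PRIMARY KEY, JOBPCKEY TEXT, STATUS TEXT)"
)
conn.executemany(
    "INSERT INTO EXECJOBS_DEMO (JOBPCKEY, STATUS) VALUES (?, ?)",
    [(f"relator-1|device-{i % 1000}", "Ended") for i in range(20000)],
)

def orphan_scan_plan(connection, key):
    """Return SQLite's plan for a lookup of leftover 'Running' rows."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT ID FROM EXECJOBS_DEMO "
        "WHERE JOBPCKEY = ? AND STATUS = 'Running'",
        (key,),
    ).fetchall()
    # The human-readable plan text is the last column of each row.
    return " ".join(row[-1] for row in rows)

# With no index on JOBPCKEY, SQLite reports a scan of the whole table.
print(orphan_scan_plan(conn, "relator-1|device-42"))
```

Because the plan is a full scan, the cost of each status update in this model grows with the total number of rows rather than with the handful of rows that belong to one combination.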
Resolution
Upgrade to Argent Advanced Technology 5.1A-1907-A or above
Customers who cannot upgrade immediately can address the issue by manually adding a SQL index on column ‘JOBPCKEY’ of SQL table ARGSOFT_{PRODUCT}_EXECJOBS
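A minimal sketch of how the workaround statement might be generated is shown below. The helper name, the index name and the product token "XX" are hypothetical; the real table name must be taken from the installation's database, and the resulting statement should be reviewed by a database administrator before it is run.

```python
import re

def jobpckey_index_statement(product: str) -> str:
    """Build a CREATE INDEX statement for the EXECJOBS table of one product."""
    # Accept only letters, digits and underscores so the value is safe
    # to place inside an identifier.
    if not re.fullmatch(r"[A-Za-z0-9_]+", product):
        raise ValueError(f"unexpected product token: {product!r}")
    table = f"ARGSOFT_{product}_EXECJOBS"
    index = f"IX_{table}_JOBPCKEY"  # hypothetical index name
    return f"CREATE INDEX {index} ON {table} (JOBPCKEY)"

# "XX" is a placeholder, not a real Argent product code.
print(jobpckey_index_statement("XX"))
# CREATE INDEX IX_ARGSOFT_XX_EXECJOBS_JOBPCKEY ON ARGSOFT_XX_EXECJOBS (JOBPCKEY)
```

The statement uses only the basic CREATE INDEX form, which most SQL databases accept; any engine-specific options are left out on purpose.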