KBI 311338 Issue Argent for SQL Server Failed To Execute SQL CPU And I/O Usage Rules

Version

Argent Advanced Technology All Versions

Date

Monday, 18 January 2016

Summary

Argent for SQL Server fires exception while executing SQL Server CPU and I/O usage Rules if the monitored server’s CPU or I/O usage value exceeds 49 days of cumulative CPU time

Technical Background

Argent for SQL Server uses a Microsoft system stored procedure named ‘sp_monitor’ to retrieve the CPU and I/O usage statistics of a specific SQL database

This Microsoft system stored procedure ‘sp_monitor’ internally uses two system statistical functions, @@cpu_busy and @@io_busy to calculate the CPU usage time and I/O usage time respectively

Unfortunately there is a bug in the Microsoft sp_monitor stored procedure — it can hold cumulative CPU time only up to 49 days, beyond that an arithmetic overflow exception is fired by Microsoft, which results in the following lines in the Monitoring Engine log

SQL Server SMO exception occurred: ‘Execute with results failed for Database ‘master”

Arithmetic overflow error converting expression to data type int

The following links provide references to @@cpu_busy and @@io_busy functions:

https://msdn.microsoft.com/en-us/library/ms186925.aspx

https://msdn.microsoft.com/en-us/library/ms177599.aspx

Also the ‘sp_monitor’ stored procedure updates the value returned by @@cpu_busy and @@io_busy to the cpu_busy and io_busy attributes of the master.dbo.spt_monitor system table respectively

The data type of these attributes is ‘int’

The overflow error can also occur while updating the values of the above attributes if the value being updated is greater than the maximum value that ‘int’ data type can hold, which is 2,147,483,647 as per Microsoft documentation

The results of updating cpu_busy and io_busy attributes of the master.dbo.spt_monitor system table with a value greater than 2,147,483,647 using SQL Management studio is shown below:

Updating cpu_busy with a value of 2,147,483,648

Updating io_busy with a value of 2,147,483,648

Hence the Arithmetic Overflow Error can occur during the execution of SQL Processor Usage Rule and SQL I/O Usage Rule if @@cpu_busy or @@io_busy returns a very large value

This is possible if the SQL server has been running continuously without shutting down for a long period of time

It has been observed that rebooting the server resets the values of @@cpu_busy and @@io_busy

Resolution

Not available