What Are The Main Features Needed For A Production Job Scheduler?

This section describes the most important features of a job scheduling product. Needless to say, vendors differ in their perception of the features Customers require, but the features listed here are common to all Customers; what varies from one Customer to the next is the relative importance of each feature.

1 – Distributed Client-Server Architecture

While this may seem glib to mention, there are still a few vendors with products so old and superseded that jobs can only be run on a single server. When asked about distributed client-server architecture, one such vendor’s salesman had the gall to say, “Sure it’s distributed – you can distribute this to any number of different machines”. The salesman conveniently did not mention that the machines did not talk to each other – the ‘distribution’ consisted of free-standing, totally independent environments.

So, clearly, a central job scheduler that provides status and control of all jobs throughout the enterprise is essential.

2 – Platform Independent – Windows, UNIX, Linux, iSeries

While the central scheduler can reside on a Windows or iSeries machine, the actual jobs need to be run on a variety of platforms.

There is no benefit – it’s actually counter-productive – in having one vendor’s scheduler for Windows, a second vendor’s scheduler for UNIX, and a third vendor’s scheduler for iSeries. The idea of enterprise-wide scheduling is just that: to schedule across the enterprise, not one sliver of the enterprise with its concomitant fiefdoms.

3 – Powerful Calendars

While this may seem as obvious as the first requirement, it’s essential the vendor’s calendaring be top notch – anything less will lead to kludges, and these kludges inevitably lead to bad practices (that’s a polite way of saying problems, and “problems” and “production job scheduling” are best not combined in the same sentence…)

So what does ‘powerful’ mean? Actually, it’s trivial to answer: what Customers need.

Here are some examples:

Last Tuesday of the quarter if not a holiday; if a holiday, use the next business day
The 1st, 5th and 11th Mondays of the quarter
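
To make these two rules precise, here is a minimal Python sketch of what a scheduler has to compute behind the scenes. The function names and the holiday set are hypothetical and chosen for illustration only; they are not taken from any particular product.

```python
from datetime import date, timedelta

def quarter_end(year, quarter):
    """Last calendar day of the given quarter (1 to 4)."""
    if quarter == 4:
        first_of_next = date(year + 1, 1, 1)
    else:
        first_of_next = date(year, quarter * 3 + 1, 1)
    return first_of_next - timedelta(days=1)

def is_business_day(d, holidays):
    # Monday is 0 ... Friday is 4; weekends and holidays are excluded.
    return d.weekday() < 5 and d not in holidays

def last_tuesday_of_quarter(year, quarter, holidays=frozenset()):
    """Last Tuesday of the quarter; if it is a holiday, the next business day."""
    d = quarter_end(year, quarter)
    while d.weekday() != 1:          # walk back to a Tuesday
        d -= timedelta(days=1)
    while not is_business_day(d, holidays):
        d += timedelta(days=1)       # roll forward past holidays and weekends
    return d

def nth_mondays_of_quarter(year, quarter, ordinals=(1, 5, 11)):
    """The 1st, 5th and 11th (by default) Mondays of the quarter."""
    start = date(year, (quarter - 1) * 3 + 1, 1)
    first_monday = start + timedelta(days=(0 - start.weekday()) % 7)
    end = quarter_end(year, quarter)
    mondays = []
    for n in ordinals:
        d = first_monday + timedelta(weeks=n - 1)
        if d <= end:                 # ignore ordinals that fall outside the quarter
            mondays.append(d)
    return mondays
```

In a real product this logic sits behind the calendar screen; the operator only picks the rule and the holiday list.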

And the calendaring has to be defined from the GUI with point and click – the last thing needed is to learn yet another scripting language, or – even worse – to introduce errors via complex SQL statements; it is the job of the production job scheduler to use point-and-click technology, not arcane scripts.

4 – Complete And Integrated Alerts

Production job scheduling is the company’s or government department’s business – if jobs fail, the company or department starts to fail along with them.

Thus alerting when production jobs fail is of the utmost importance. And it’s not a matter of “if”, but rather of “when” production jobs fail.

It’s important the vendor’s alerting product is complete and no additional third-party products are needed. Sadly, a lot of large vendors today still force Customers to buy a separate SMS product. And this is doubly troubling, as SMS is often a bear to get running reliably, so now there are two vendors in the story, each pointing a finger at the other.

So Alerting has to be:

comprehensive

flexible

centralized

Comprehensive means a single email to a long-dead email account will not hack it – these alerts about failed production jobs are the most important, so a range of alert types is needed: from simple email and pager, to SMS, and even complete SQL queries being run as alerts.

Flexible means the ability to abstract — via macros and other global string facilities — repeated information, so critical information is not duplicated (Duplication = maintenance issues later).

Centralized means the alerts are all presented to a single screen or help desk. Thus the alerts need to be able to be sent to IBM’s Tivoli product, or Argent’s Alert Console, or both.

The single most important aspect of the alerting product selected is that it has alert escalation. Without alert escalation, no effective and reliable alerting can be created for critical production jobs. Alert escalation means that if the first alert is not answered within a specified time, additional alerts are sent to an ever-widening array of people until someone responds to one of the alerts and addresses the issue.
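
The escalation rule described above can be sketched in a few lines of Python. This is an illustration only: the class, the policy, the time thresholds, and the recipient names are all assumptions, not any vendor's actual configuration.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class EscalationStep:
    after_minutes: int       # minutes the alert has gone unanswered
    recipients: List[str]    # hypothetical contacts added at this step

# Hypothetical policy: the circle of people widens every 15 minutes.
POLICY = [
    EscalationStep(0, ["on-call operator"]),
    EscalationStep(15, ["team lead"]),
    EscalationStep(30, ["IT manager"]),
]

def recipients_to_notify(steps, minutes_unanswered, acknowledged):
    """Return everyone who should have been alerted by now.

    Once someone acknowledges the alert, escalation stops and nobody
    else is notified.
    """
    if acknowledged:
        return []
    notified = []
    for step in sorted(steps, key=lambda s: s.after_minutes):
        if minutes_unanswered >= step.after_minutes:
            for person in step.recipients:
                if person not in notified:
                    notified.append(person)
    return notified
```

For example, with this policy an alert that has gone unanswered for 20 minutes reaches the on-call operator and the team lead, but not yet the IT manager.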

5 – Integrated Limits And Warnings

Production jobs come in all shapes and sizes, so there needs to be a wide array of optional tests and limits available.

Some of the more common ones are:

Job takes too much elapsed time
Job consumes too much CPU time
Job does not start by a specified time
Job does not end by a specified time

These are just some of the basic limits, and it must be possible to combine them via a point-and-click interface.
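
As a rough sketch of how the four limits listed above might be evaluated together, consider the following Python. The class and field names are hypothetical; each limit is optional, so any combination can be applied to a job.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

@dataclass
class JobLimits:
    max_elapsed: Optional[timedelta] = None   # job takes too much elapsed time
    max_cpu: Optional[timedelta] = None       # job consumes too much CPU time
    start_by: Optional[datetime] = None       # job must start by this time
    end_by: Optional[datetime] = None         # job must end by this time

@dataclass
class JobState:
    started_at: Optional[datetime]
    ended_at: Optional[datetime]
    cpu_used: timedelta

def violated_limits(limits: JobLimits, state: JobState, now: datetime) -> List[str]:
    """Return a description of every limit the job currently violates."""
    problems = []
    if limits.start_by is not None and (state.started_at or now) > limits.start_by:
        problems.append("not started by deadline")
    if limits.end_by is not None and (state.ended_at or now) > limits.end_by:
        problems.append("not ended by deadline")
    if limits.max_elapsed is not None and state.started_at is not None:
        elapsed = (state.ended_at or now) - state.started_at
        if elapsed > limits.max_elapsed:
            problems.append("elapsed time exceeded")
    if limits.max_cpu is not None and state.cpu_used > limits.max_cpu:
        problems.append("CPU time exceeded")
    return problems
```

Each returned description would typically be turned into a warning or alert by the scheduler.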

6 – Complete Dependencies

Since the first job schedulers were created for mainframes, an essential feature has been dependencies. There are a number of different types of dependencies; the most common are:

Job Dependencies
File Dependencies
FTP Dependencies
ODBC Dependencies
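
The first two dependency types can be illustrated with a short Python sketch. It assumes a job is ready to run only when all of its predecessor jobs have completed and all of its required files exist. The names are hypothetical, and FTP and ODBC dependencies are left out because they would need a live server or database.

```python
import os
from dataclasses import dataclass, field
from typing import Callable, List, Set

@dataclass
class Job:
    name: str
    after_jobs: List[str] = field(default_factory=list)    # job dependencies
    needs_files: List[str] = field(default_factory=list)   # file dependencies

def is_ready(job: Job, completed: Set[str],
             file_exists: Callable[[str], bool] = os.path.exists) -> bool:
    """True when every predecessor has completed and every required file exists."""
    jobs_done = all(name in completed for name in job.after_jobs)
    files_present = all(file_exists(path) for path in job.needs_files)
    return jobs_done and files_present
```

Passing the file check in as a parameter keeps the sketch testable without touching the real file system.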

7 – ODBC

ODBC, or Open Database Connectivity, enables Customers to select and use the database that is best suited to their needs. All the leading databases are ODBC-compliant; these include SQL Server, Oracle, Sybase, MySQL, and the like. It’s essential that the job scheduler does not force you to use its own particular database. Not only does this generally mean more costs, but it also leads to operational problems – if you’re an Oracle shop and you are forced to use SQL Server, who maintains SQL Server and applies the myriad patches?

8 – Auditing

In the ever-thickening fog of regulatory compliance, auditing is key. The most basic aspect of auditing is that for each job that runs, a job log is created and archived. This addresses the first and most common question auditors ask: “Let me see the job log of job 12345 that ran on the 15th of July last year, please.” The second question auditors ask is: “Please show me the history of all the changes to the job template that creates job 12345.” Answering it requires that every change to a job template is recorded as well.

9 – Automatic Report Distribution

To properly manage an enterprise-wide job scheduling environment, a complete and automatic report distribution facility is needed. Crystal Reports should be available as it is the industry standard and provides unprecedented power and flexibility.

10 – Automatic Failover

Failover has to be both transparent and automatic – there is no benefit in having failover that is manual, as manual “failover” is not failover at all…

11 – Macros

Like most IT functions, large areas of scheduling are duplicated. Thus copying a job must be trivial, and a rich array of macros is also needed.

A macro is a shorthand that can be substituted in hundreds of jobs, and when a change is needed, only the one macro entry need be changed. For example, when a job fails, one of the alerts may be to email three support people. The wrong way to do this is to hard code the explicit email addresses into the 100 or 200 batch jobs.

The correct way is to code a single macro. The macro itself contains the email addresses of the three support people. Thus when one of these people goes on vacation, it’s a trivial 30-second edit to update just the single macro and not 200 jobs.

An enterprise-level scheduler needs macros for all the commonly duplicated items, such as the email alert in the above example, as well as job macros, queue macros, and pager macros.

An extremely worthwhile, albeit rare, macro facility is the simple global string macro: a specified macro string is replaced by its definition wherever it is encountered.

For example, if a string macro of %ACCOUNTING_9% is defined as “TEST”, it can be used in a wide variety of locations within the jobs and sundry areas (such as email text messages and pager messages). When the release of the accounting product needs to be converted from test to production, only the single macro definition needs to change from “TEST” to “PRODUCTION”; all uses of the string %ACCOUNTING_9% in all locations in all the job entries will then reflect the new string. Simple, but incredibly powerful.
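
A global string macro of this kind amounts to a single lookup table plus a substitution pass. The Python below is a minimal sketch under that assumption; the %NAME% syntax follows the example above, while the function name and the decision to leave unknown macros untouched are illustrative choices.

```python
import re

# Matches %NAME% where NAME consists of letters, digits and underscores.
_MACRO = re.compile(r"%([A-Za-z0-9_]+)%")

def expand(text, macros):
    """Replace every %NAME% in text with macros[NAME].

    Unknown macro names are left as-is so that a typo stays visible
    instead of silently becoming an empty string.
    """
    return _MACRO.sub(lambda m: macros.get(m.group(1), m.group(0)), text)

# One central definition, used by any number of jobs and alert messages.
macros = {"ACCOUNTING_9": "TEST"}
subject = expand("Job failed in %ACCOUNTING_9% accounting release", macros)

# Moving to production is a single edit to the one definition.
macros["ACCOUNTING_9"] = "PRODUCTION"
subject_after = expand("Job failed in %ACCOUNTING_9% accounting release", macros)
```

The same lookup table could hold the support team’s email addresses, so that the 200 jobs reference one macro rather than three hard-coded addresses.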

12 – Automatic Daily – Or Weekly – Scheduling of All Jobs

In today’s complex enterprise-wide job scheduling environment, it’s essential that all of the day’s regularly scheduled jobs be displayed on the master job template screen.

Only in this way can the operators see all the work for a day.

13 – Forecasting

Associated with the ability to see all of a day’s jobs on one screen, it’s just as important to be able to forecast what jobs will be run and when.

14 – Load Balancing

One of the less obvious aspects of scalability is the ability of the scheduling product to automatically load balance where work is processed. There are two benefits to this. The first is an implicit failover capability – if one of the destination servers is not available, another in the pool will be automatically used. The second is that spare CPU cycles and bandwidth can be used, thus increasing the overall ROI of the hardware.
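
Both benefits fall out of one simple selection rule: skip servers that are unavailable, then choose the least busy of the rest. The Python below sketches that rule. The Server class and the use of CPU load as the only metric are simplifying assumptions; a real product would likely weigh more factors.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Server:
    name: str
    available: bool
    cpu_load: float   # 0.0 = idle, 1.0 = fully busy

def pick_server(pool: List[Server]) -> Optional[Server]:
    """Choose the least-loaded available server, or None if the pool is down."""
    candidates = [s for s in pool if s.available]   # implicit failover
    return min(candidates, key=lambda s: s.cpu_load, default=None)
```

Because unavailable servers are filtered out first, a failed machine is bypassed without any operator action.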

15 – Comprehensive Job Selection Screen

As there is a single, central console that lists all the jobs, this console will get very crowded very quickly if the enterprise scheduler is running 10,000 jobs per day. Thus, from a simple human engineering viewpoint, a wide range of filters is needed to keep this manageable.