KBI 311151 How To Ensure High Availability In Argent Job Scheduler
Version
Argent Job Scheduler all versions
Date
Wednesday, 14 Jan 2015
Summary
This article describes the following:
- The Perfect Installation method and the Best Practices to ensure High Availability in Argent Job Scheduler
- How the Argent Job Scheduler High Availability architecture handles the following server downtime events:
  - Main Engine goes down
  - Main Engine comes back online when Backup Engine runs on Main Database
  - Main Database goes down
  - Switching back to Main Engine manually after Main Database is up
  - One of the Queue Engines goes down
Technical Background
High Availability of Argent Job Scheduler can be ensured by strictly following the Argent Job Scheduler High Availability architecture during the installation and configuration
The explanations assume an environment with one Scheduling Engine and many Queue Engines
Where several Main Scheduling Engines are installed, the architecture prescribed for a single Main Scheduling Engine applies to each of them
Perfect Installation
A perfect installation for Argent Job Scheduler for High Availability should be as follows:
- Main Scheduling Engine on one server
- Main Database (SQL Server) on a second server
- Backup Scheduling Engine on a third server, installed using the same database as the Main Scheduling Engine
- Backup Database on a fourth server
- Two Queue Engines (the second a replica of the first) set up to run the same Job, with the first available server selected for execution
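The layout above can be checked mechanically. The following is a minimal sketch in Python and is not part of Argent Job Scheduler; the `Topology` class, its field names, and `check_topology` are hypothetical names introduced for illustration. It reports any case where the four core roles share a server, or where fewer than two distinct Queue Engines are listed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Topology:
    """Hypothetical description of where each component is installed."""
    main_engine: str
    main_database: str
    backup_engine: str
    backup_database: str
    queue_engines: List[str] = field(default_factory=list)


def check_topology(t: Topology) -> List[str]:
    """Return a list of problems; an empty list means the layout matches the article."""
    problems = []
    core = {
        "Main Scheduling Engine": t.main_engine,
        "Main Database": t.main_database,
        "Backup Scheduling Engine": t.backup_engine,
        "Backup Database": t.backup_database,
    }
    seen = {}  # lower-cased host name -> first role found on that host
    for role, host in core.items():
        key = host.lower()
        if key in seen:
            problems.append(f"{role} shares server {host} with {seen[key]}")
        else:
            seen[key] = role
    if len({h.lower() for h in t.queue_engines}) < 2:
        problems.append("fewer than two distinct Queue Engines")
    return problems
```

An empty result means each core role has its own server and a replica Queue Engine exists; it says nothing about network reachability or service credentials, which still need to be verified separately.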
Best Practices For Setting Up High Availability
Backup Scheduling Engine Setting
- The Backup Scheduling Engine should be installed on a server in the same network and with the same user credentials as the Main Scheduling Engine
This ensures that the Backup Scheduling Engine can reach dependent databases on other servers and dependent files specified using UNC paths
Backup Database Settings
- The Backup Database should be installed on a different server from the Main Database
- No mirroring tools should be configured for the Backup Database, as Argent Job Scheduler itself synchronises the databases automatically
Replica Queue Engine Settings
- The replica Queue Engine should be in the same network as the original Queue Engine, and its Argent Queue Engine service should run under the same user credentials
- Job files specified by local paths on the original Queue Engine should also be available locally on the replica Queue Engine
- If Job files are specified using UNC paths on the original Queue Engine, those paths should also be accessible from the replica Queue Engine
- In the J20C screen of Argent Job Scheduler, both the Queue Engine and its replica should be added, and the first radio option, ‘First Available Server’, should be selected as shown in the screenshot below
How Argent Job Scheduler High Availability Architecture Handles Server Down Times
The sections below explain the behaviour of Argent Job Scheduler under the following scenarios:
- Main Engine goes down
- Main Engine comes back online when Backup Engine runs on Main Database
- Main Database goes down
- Switching back to Main Engine manually after Main Database is up
- One of the Queue Engines goes down
Scenario 1: Main Engine Down
Consider the situation where the Main Scheduling Engine goes down
Backup Engine Takes Over
- The Backup Scheduling Engine detects, within the specified time, that the Main Scheduling Engine is down and checks whether the Main Database is available
- On finding the Main Database up, the Backup Scheduling Engine takes over scheduling from the Main Scheduling Engine, using the Main Database
Scenario 2: Main Engine Back Online While Backup Engine Runs On Main Database
- The Backup Scheduling Engine detects, within the specified time, that the Main Scheduling Engine is up and transfers service back to the Main Scheduling Engine
- The Backup Scheduling Engine switches back to backup mode
Scenario 3: Main Database Down
Consider the situation where the Main Database goes down
Main Engine Shuts Down And Backup Engine Starts Using Backup Database
- The Main Scheduling Engine detects that the Main Database is down and waits 15 minutes for it to come back online
- If the Main Database is still down after 15 minutes, the Main Scheduling Engine service shuts down automatically
- The Backup Scheduling Engine detects that the Main Scheduling Engine is down and checks whether the Main Database is available before taking over
- On finding the Main Database down, the Backup Scheduling Engine takes over scheduling with the Backup Database
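Scenarios 1 to 3 together amount to a small decision procedure. The Python sketch below is one reading of that behaviour, not Argent's implementation; the names `BackupState`, `next_backup_state`, `main_engine_should_stop`, and `DB_GRACE_PERIOD` are hypothetical. The 15-minute wait comes from Scenario 3. How the engines actually probe each other, and the length of the "specified time" used for detection, are not covered by this article and are not modelled.

```python
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional

# Wait stated in Scenario 3 before the Main Scheduling Engine stops itself.
DB_GRACE_PERIOD = timedelta(minutes=15)


class BackupState(Enum):
    STANDBY = "backup mode"
    ACTIVE_ON_MAIN_DB = "scheduling with Main Database"
    ACTIVE_ON_BACKUP_DB = "scheduling with Backup Database"


def main_engine_should_stop(db_down_since: Optional[datetime], now: datetime) -> bool:
    """True once the Main Database has been unreachable for the full grace period."""
    if db_down_since is None:  # database is currently reachable
        return False
    return now - db_down_since >= DB_GRACE_PERIOD


def next_backup_state(state: BackupState, main_engine_up: bool, main_db_up: bool) -> BackupState:
    """One evaluation by the Backup Scheduling Engine."""
    if state is BackupState.STANDBY:
        if main_engine_up:
            return BackupState.STANDBY
        # Scenario 1 (database up) or Scenario 3 (database down).
        if main_db_up:
            return BackupState.ACTIVE_ON_MAIN_DB
        return BackupState.ACTIVE_ON_BACKUP_DB
    if state is BackupState.ACTIVE_ON_MAIN_DB:
        # Scenario 2: hand back automatically when the Main Engine returns.
        # Losing the Main Database in this state is not described in the
        # article, so this sketch simply leaves the state unchanged.
        return BackupState.STANDBY if main_engine_up else state
    # Running on the Backup Database: switching back is manual (Scenario 4).
    return state
```

The last branch reflects the asymmetry in the article: a takeover on the Main Database is reversed automatically, while a takeover on the Backup Database stays in place until an operator intervenes.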
Scenario 4: Switching Back To Main Engine After Main Database Is Up
Note:
Switching back to the Main Scheduling Engine is possible only through user intervention
- Stop the Backup Scheduling Engine service
- Bring up the Main Database
- Run the AJS_FP/REVERSE executable from a command prompt to synchronise the Backup Database with the Main Database
- Bring up the Main Scheduling Engine service
- Bring up the Backup Scheduling Engine service, which then works in backup mode
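Because the order of these steps matters, an operator checklist can guard against skipping one. The Python sketch below is a hypothetical aid, not an Argent tool: `SWITCH_BACK_STEPS` and `next_step` are names introduced here, and the code only tracks progress through the list; it does not stop or start any service or run any executable.

```python
from typing import List, Optional

# The manual switch-back procedure from Scenario 4, in the order given.
SWITCH_BACK_STEPS = [
    "Stop the Backup Scheduling Engine service",
    "Bring up the Main Database",
    "Run AJS_FP/REVERSE from a command prompt",
    "Bring up the Main Scheduling Engine service",
    "Bring up the Backup Scheduling Engine service",
]


def next_step(completed: List[str]) -> Optional[str]:
    """Return the next step to perform, or None when the procedure is finished.

    Raises ValueError if `completed` is not a prefix of SWITCH_BACK_STEPS,
    that is, if a step was skipped or performed out of order.
    """
    if completed != SWITCH_BACK_STEPS[: len(completed)]:
        raise ValueError("steps were skipped or performed out of order")
    if len(completed) == len(SWITCH_BACK_STEPS):
        return None
    return SWITCH_BACK_STEPS[len(completed)]
```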
Scenario 5: Queue Engine Is Down
Consider the scenario where one of the Queue Engines goes down
Job Is Executed On The First Available Queue Engine
As the Job is configured to execute on the first available of the two Queue Engines, it is executed on the second Queue Engine when the first one goes down
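The ‘First Available Server’ behaviour can be pictured as a simple ordered search. The Python sketch below illustrates the idea only; `first_available`, the `is_available` callback, and the engine names are hypothetical, and the article does not describe how Argent Job Scheduler actually tests a Queue Engine for availability.

```python
from typing import Callable, Iterable, Optional


def first_available(engines: Iterable[str], is_available: Callable[[str], bool]) -> Optional[str]:
    """Return the first Queue Engine, in configured order, that reports as available."""
    for engine in engines:
        if is_available(engine):
            return engine
    return None  # no Queue Engine can take the Job


# Hypothetical names: the original Queue Engine and its replica.
engines = ["QUEUE-A", "QUEUE-B"]
down = {"QUEUE-A"}  # simulate the first Queue Engine being down
print(first_available(engines, lambda e: e not in down))  # QUEUE-B
```

With both engines up the Job goes to the first one listed; the replica is used only when the first is unavailable, which is why the replica needs the same Job files and credentials as the original.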
Resolution
N/A