KBI 310582 Issue Addressed: Job Scheduler TCP Retry
Version
Issue addressed in Argent Job Scheduler 1304-B and above
Date
Tuesday, 2 July 2013
Summary
Jobs intermittently fail to start or are reported as ‘Abended’
Technical Background
In rare cases, the Queue Engine intermittently drops or rejects TCP connections. Possible causes include:
- Intermittent latency spikes which may lead to connection timeouts
- TCP configuration settings on the Queue Engine which may prevent new connections being accepted if the server is busy
- TCP handshake errors
- Issues at the physical layer of the network
TCP connections can also be dropped or rejected for various other reasons
In earlier versions, Argent Job Scheduler attempted the connection only once; if that attempt failed, the Queue Engine was marked as unreachable
Resolution
Argent Job Scheduler now retries the TCP connection to the Queue Engine up to five times before concluding that the Queue Engine is unreachable
This makes connections to the Queue Engine more resilient to errors that may occur at the network layer
The issue has been addressed in Argent Job Scheduler 1304-B and above
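The retry behaviour described above can be illustrated with a minimal Python sketch. This is not Argent's implementation, which is not published in this article. The function name connect_with_retry, the timeout and delay values, and the reading of "five times" as five attempts in total are all assumptions made for illustration only.

```python
import socket
import time


def connect_with_retry(host, port, attempts=5, timeout=5.0, delay=1.0,
                       connect=socket.create_connection, sleep=time.sleep):
    """Try to open a TCP connection, retrying after each failure.

    Hypothetical helper: 'attempts', 'timeout' and 'delay' are assumed
    values, not settings documented for Argent Job Scheduler.
    Returns the connected socket, or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            return connect((host, port), timeout=timeout)
        except OSError:
            # OSError covers refused connections, timeouts and
            # unreachable hosts; pause before the next attempt.
            if attempt < attempts:
                sleep(delay)
    # Only now would the caller treat the remote end as unreachable.
    return None
```

A caller would treat a None result as "unreachable" and anything else as a live connection. The short pause between attempts gives a transient condition, such as a latency spike or a full accept queue on a busy server, time to clear before the next attempt.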