Service Rules

Service Rules interact with the Windows Service Control Manager to check the status of services. Service Rules are generally used in conjunction with Service Alerts – the Service Rule finds an issue and the Service Alert automatically takes the action to correct it.

Service Rules not only test services; they can also optionally take corrective actions before sending an Event to the main Argent Console screen. This means an Alert can be avoided, as the examples below show.

In addition, Service Macros can be used to improve your productivity by abstracting groups of related services into a macro.

Creating A Simple Service Rule

The top pane of the main Service Rule screen (G8A) lists the services on the selected machine.

The machine is selected at the top of the G8A screen; in this case it is ARGENT-SER2003.

The status of the services is color-coded: green for running; red for stopped.

After specifying the name of the new Service Rule, the next step is to select one or more services from the top pane. Simply highlight the services in the top pane and click the Add button. (You can hold the Ctrl key down to select multiple services at once.)

By default the tests are all set to Not Running: if a selected service is not running when the Rule runs, the Rule is broken.

This is now a complete and functioning Service Rule.
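
The default test can be pictured as a simple check over the selected services. The sketch below is illustrative only and is not Argent code; the function name broken_services and the sample service states are assumptions made for the example.

```python
# Illustrative model of the default "Not Running" test; not Argent's implementation.

def broken_services(selected, states):
    """Return the selected services that are not in the running state.

    `states` maps a service name to a status string; a service missing
    from the map is treated as not running.
    """
    return [name for name in selected if states.get(name) != "running"]


# Hypothetical snapshot of the services on the monitored machine.
states = {"Print Spooler": "running", "DNS Client": "stopped"}

failed = broken_services(["Print Spooler", "DNS Client"], states)
rule_is_broken = bool(failed)  # the Rule breaks if any selected service fails its test
print(failed, rule_is_broken)
```

Here one stopped service among the selected ones is enough to break the Rule.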

Creating A Service Rule With Attempted Correction

In this example, the Service Rule will attempt to correct the issue.

So far this example appears to be the same as the previous one – if the MailMarshal Controller service is either not running or not responding then the Rule is considered broken and an Alert will be sent to the main Argent Console screen.

Or will it?

Before drawing that conclusion, let’s have a look at the Advanced screen for this Rule.

Here’s what you will see:

Whoa!

Glad we looked…

This Service Rule, not the Service Alert but the actual Rule, is attempting to correct the error condition.

Here is what the Rule’s Advanced tab holds:

STOP MailMarshal Engine
STOP MailMarshal POP3
STOP MailMarshal Receiver
STOP MailMarshal Sender
STOP #BROKEN_SERVICE#
Sleep 15 seconds
START #BROKEN_SERVICE#
Sleep 15 seconds
START MailMarshal Engine
START MailMarshal POP3
START MailMarshal Receiver
START MailMarshal Sender

Let’s examine how this works and what the syntax means – it’s fairly clear, but a few words won’t hurt.

First, this Service Rule unconditionally tries to stop the four MailMarshal services, then unconditionally tries to stop the service that was stopped or not responding, the MailMarshal Controller service.

Then the Service Rule sleeps for 15 seconds (no special significance; anything over 10 seconds should be fine).

Then the Service Rule attempts to restart all five services, one after the other, in the specified order: the MailMarshal Controller service first and, after a second 15-second pause, the four remaining MailMarshal services.

Only if all five services restart correctly will the Rule be considered passed; that is, no Event will be sent.
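
To make the flow concrete, here is a minimal sketch of how such an action list could be walked. It is a model for illustration, not Argent's implementation: the function run_actions, the control callback standing in for the Service Control Manager, and the choice to keep going after a failed START are all assumptions.

```python
import re
import time

SLEEP_RE = re.compile(r"^sleep\s+(\d+)\s+seconds?$", re.IGNORECASE)


def run_actions(lines, broken_service, control, sleep=time.sleep):
    """Walk the action list in order; return True only if every START succeeded.

    `control(action, service)` is a hypothetical callback standing in for the
    Service Control Manager; it returns True when the action succeeds.
    """
    all_started = True
    for raw in lines:
        # Substitute the placeholder with the service that broke the Rule.
        line = raw.strip().replace("#BROKEN_SERVICE#", broken_service)
        if not line:
            continue
        match = SLEEP_RE.match(line)
        if match:
            sleep(int(match.group(1)))
            continue
        action, _, service = line.partition(" ")
        action = action.upper()
        if action == "STOP":
            control("STOP", service)  # stops are attempted unconditionally
        elif action == "START":
            if not control("START", service):
                all_started = False  # assumption: keep going, but the Rule fails
        else:
            raise ValueError(f"unrecognized action line: {raw!r}")
    return all_started


actions = [
    "STOP MailMarshal Engine", "STOP MailMarshal POP3",
    "STOP MailMarshal Receiver", "STOP MailMarshal Sender",
    "STOP #BROKEN_SERVICE#", "Sleep 15 seconds",
    "START #BROKEN_SERVICE#", "Sleep 15 seconds",
    "START MailMarshal Engine", "START MailMarshal POP3",
    "START MailMarshal Receiver", "START MailMarshal Sender",
]

log = []


def fake_control(action, service):
    """Record each request and pretend it succeeded."""
    log.append((action, service))
    return True


# The sleep is replaced with a no-op so the example runs instantly.
passed = run_actions(actions, "MailMarshal Controller", fake_control,
                     sleep=lambda seconds: None)
print(passed, len(log))
```

The log shows the ten stop and start requests in the order they were listed, with #BROKEN_SERVICE# replaced by the MailMarshal Controller service.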

Note:

You can, and generally should, control the order in which the services are started. For many vendors’ products the start order is critical; Argent realizes this and provides this facility.


If you found the above services needed some time for initialization, you could add pauses between the starts in the Rule:

START #BROKEN_SERVICE#
Sleep 30 seconds
START MailMarshal Engine
Sleep 30 seconds
START MailMarshal POP3
Sleep 10 seconds
START MailMarshal Receiver
Sleep 10 seconds
START MailMarshal Sender
Sleep 10 seconds

Here is how the Advanced tab was defined.

This needs to be done for each of the 12 lines, varying the action and the service name.

When The Service That Caused The Rule To Fail is chosen, the explicit service name is replaced with #BROKEN_SERVICE#.

Using A Service Macro

All this is fine, but you’ve likely noticed two minor wrinkles: there are 12 lines to enter, and a single spelling error leaves the Rule itself in error. (Those with very keen eyes will notice that in the first few screenshots the first line had MailMarshal incorrectly spelt…)

Wouldn’t it be better to define this sequence once – and correctly – and then reuse this sequence? That’s precisely what a Service Macro does.

Here’s a Service Macro for MailMarshal:

MailMarshal Engine
MailMarshal POP3
MailMarshal Receiver
MailMarshal Sender

With this Service Macro now defined, we can simplify the Advanced tab of the Service Rule to:

STOP &SM_MMSL1
STOP #BROKEN_SERVICE#
Sleep 15 seconds
START #BROKEN_SERVICE#
Sleep 15 seconds
START &SM_MMSL1

Not only do we abstract the services into a macro that can be reused, but the chance of spelling mistakes is also greatly lessened.
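
Conceptually, a macro reference stands for one action line per service in the macro. The sketch below models that expansion; it is an illustration, not how Argent resolves macros internally, and the function expand_macros and the dictionary keyed by the full &SM_MMSL1 reference are assumptions.

```python
def expand_macros(lines, macros):
    """Expand action lines that reference a macro into one line per service.

    `macros` maps a macro reference such as "&SM_MMSL1" to a list of
    service names. Lines without a macro reference pass through unchanged.
    """
    expanded = []
    for line in lines:
        action, _, target = line.partition(" ")
        if target in macros:
            expanded.extend(f"{action} {service}" for service in macros[target])
        else:
            expanded.append(line)
    return expanded


macros = {
    "&SM_MMSL1": [
        "MailMarshal Engine",
        "MailMarshal POP3",
        "MailMarshal Receiver",
        "MailMarshal Sender",
    ]
}

short = [
    "STOP &SM_MMSL1",
    "STOP #BROKEN_SERVICE#",
    "Sleep 15 seconds",
    "START #BROKEN_SERVICE#",
    "Sleep 15 seconds",
    "START &SM_MMSL1",
]

full = expand_macros(short, macros)
print(len(full))  # 12
```

The six-line version expands back to the original 12 lines, with the service names typed only once, in the macro.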

Note:

Because the MailMarshal services were not installed on this machine, they had to be explicitly typed. Had the services been installed (they need not be running, just installed), they could have been added via the OK button, as is shown below.


Multiple Retries And Unconditional Alerting

To cover all the bases, there are two additional options you can use.

Try Action x Times

Using this option you can retry the corrective sequence a number of times. This is generally used with extremely unstable services; like starting a dodgy car engine – sometimes it starts, sometimes it does not. Not recommended, but you have the option if you need it.

Fire Event Even If The Condition Is Corrected

This option makes more sense. If the Service Rule is doing its job and restarting the service, that’s good. But what happens if the Service Rule is doing this 15 times an hour? Without an Event, nobody would know.

It’s recommended to always put an Event on the main Argent Console screen, even if the Service Rule does correctly restart the stopped or stalled service.
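
The way these two options combine can be sketched as follows. This is a model under stated assumptions, not Argent's implementation: the function evaluate and its check and correct callbacks are hypothetical names standing in for the Rule's test and its corrective sequence.

```python
def evaluate(check, correct, max_tries=1, fire_even_if_corrected=False):
    """Return (passed, fire_event) for one run of a Rule with corrective actions.

    `check()` returns True when the service is healthy; `correct()` runs the
    corrective sequence and returns True when it succeeded. Both are
    hypothetical callbacks.
    """
    if check():
        return True, False  # nothing was wrong, so nothing to report
    for _ in range(max_tries):
        if correct():
            # Corrected: the Rule passes, but an Event may still be wanted.
            return True, fire_even_if_corrected
    return False, True  # every attempt failed, so the Rule is broken


outcomes = iter([False, True])  # first attempt fails, second succeeds
result = evaluate(lambda: False, lambda: next(outcomes),
                  max_tries=3, fire_even_if_corrected=True)
print(result)  # (True, True)
```

With the second option enabled, a successful correction still produces an Event, so a service that is being restarted over and over does not go unnoticed.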


Service Rule Failing If Service Is Running

There are a number of reasons to test for services running when they should not be.

The most common reasons are conflicts, security issues, or malware.

In this example, it’s a conflict test.

The Service Rule breaks if IIS SMTP is running, because the SMTP service is not used in most IIS configurations and can cause inadvertent messaging conflicts.

Trend Analysis And Capacity Planning With Service Rules

To make the most of your resources, you can have the Service Rules add data to the Argent Predictor each time the Rule tests a service or set of services.

The Argent Predictor is the Argent trend analysis and capacity planning product. With the Save Service Uptime For Trend Analysis option enabled, a data point is saved every time the service is checked: not only when the Rule breaks, but each time the Rule runs.
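
Recording a point on every run is what makes an uptime figure possible, since the runs where nothing was wrong supply the denominator. The sketch below shows the arithmetic only; how the Argent Predictor actually stores or aggregates its data is not described here, and the function uptime_percent and the sample history are assumptions.

```python
def uptime_percent(samples):
    """Percentage of Rule runs in which the service was found running.

    `samples` holds one boolean per Rule run (True = running).
    """
    if not samples:
        raise ValueError("no samples recorded")
    return 100.0 * sum(samples) / len(samples)


# Hypothetical history: 24 runs, with the service down on two of them.
samples = [True] * 22 + [False] * 2
print(round(uptime_percent(samples), 1))  # 91.7
```

Had only the two failures been recorded, there would be no way to tell whether they came out of 24 runs or 2,400.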

Auto-Started Services Checking

Some services are started when the operating system starts; they are called, unsurprisingly, Auto-Started services.

Argent checks these through the option Alert If Any Auto-Started Service Is In Stopped Mode.

Note that no trend analysis data is collected: the Save Service Uptime Data For Trend Analysis checkbox is grayed out.
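
The idea behind this check can be sketched as a filter over the machine's services. This is an illustration only, not Argent code; the function stopped_auto_services, the dictionary keys, and the sample services are assumptions made for the example.

```python
def stopped_auto_services(services):
    """Return the names of services set to start automatically that are stopped.

    Each entry is a dict with the hypothetical keys "name", "start_type",
    and "state".
    """
    return [
        s["name"]
        for s in services
        if s["start_type"] == "auto" and s["state"] == "stopped"
    ]


# Hypothetical snapshot of three services on the monitored machine.
services = [
    {"name": "Service A", "start_type": "auto", "state": "running"},
    {"name": "Service B", "start_type": "auto", "state": "stopped"},
    {"name": "Service C", "start_type": "manual", "state": "stopped"},
]

to_alert = stopped_auto_services(services)
print(to_alert)  # ['Service B']
```

A stopped service that is started manually is ignored; only the stopped Auto-Started service triggers the alert.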