Hi,
Today I was at a customer who had a really specific question regarding monitoring of Windows Services with Operations Manager (SCOM).
We had already set up some basic recovery actions which restart the service automatically after it was stopped.
For some other services the customer wanted extra functionality: the recovery action should try to start the service a maximum of 3 times, and if the service still wasn’t running after 3 tries, the customer wanted to receive an email telling them the recovery action had failed. SCOM cannot do this out of the box, so I used PowerShell to accomplish it.
Sidenote: To be able to use PowerShell as a recovery action, you can use the free management pack provided by the community & SquaredUp. It can be downloaded from this website: https://squaredup.com/free-powershell-management-pack/. This management pack adds PowerShell everywhere it is missing in Operations Manager, and it is one of the default management packs I always install at customers.
For a fully functional solution, the following components are needed:
- A monitor that checks the status of the service
  - This monitor can be created from the Authoring pane of the SCOM console using the Windows Service template
- A recovery action for the monitor created previously
  - The recovery action can be created from Health Explorer
- A rule that picks up the event created by the recovery action PowerShell script
  - This is an Alert Generating Rule (NT Event Log). Its configuration must match the event the script logs: the log it is written to, the event source, and the ID of the error event (Application log and event ID 101 in the script below)
- A subscription on the rule to send the email
The PowerShell script:
# Fill in the service name here
$ServiceName = "LPD Service"
# Event source used for logging; the SCOM rule must match on this exact name
$EventSource = "Powershell - Restart Service"
$MaxAttempts = 3
$ServiceStarted = $false
$i = 0

# Create the event log source. ErrorAction Ignore is needed because
# New-EventLog reports an error once the source already exists.
New-EventLog -LogName Application -Source $EventSource -ErrorAction Ignore

do {
    # On the second or third run, wait a minute before trying to start the service
    if ($i -gt 0) { Start-Sleep -Seconds 60 }

    # Try to start the service
    Start-Service $ServiceName
    $Service = Get-Service -Name $ServiceName
    if ($Service.Status -eq "Running") {
        $ServiceStarted = $true
    }
    $i++

    # Compare with -eq / -not here: a single "=" would assign instead of compare
    if (($i -eq $MaxAttempts) -and (-not $ServiceStarted)) {
        $eventmessage = "$ServiceName failed to restart after $i attempts, exiting script"
        # Log an error event (ID 101) in the Application log
        Write-EventLog -LogName Application -Source $EventSource -EntryType Error -EventId 101 -Message $eventmessage
        exit
    }
}
until ($ServiceStarted)

$eventmessage = "$ServiceName restarted after $i attempt(s)"
# Log an information event (ID 100) in the Application log
Write-EventLog -LogName Application -Source $EventSource -EntryType Information -EventId 100 -Message $eventmessage
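To check what the recovery script actually logged, before looking at the SCOM rule and subscription, you can query the Application log on the monitored server. This is only a minimal sketch: the variable names $SourcePattern and $EventIds are my own, and the wildcard in the source name is an assumption so the query matches regardless of which dash character ended up in the source name.

```powershell
# Assumption: the recovery script logs under a source matching this pattern.
# The wildcard tolerates either a plain hyphen or a typographic dash.
$SourcePattern = "Powershell*Restart Service"

# Event IDs written by the script: 100 = service restarted, 101 = restart failed
$EventIds = 100, 101

# Show the five most recent matching events from the Application log
Get-EventLog -LogName Application -Source $SourcePattern -InstanceId $EventIds -Newest 5 |
    Format-List TimeGenerated, EntryType, InstanceId, Message
```

If the script has not logged anything yet, Get-EventLog reports that no matches were found. An entry of type Error with ID 101 is the one the alert generating rule should pick up.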
If you have any difficulties doing this, don’t hesitate to drop a comment below.
If you find this post useful, please consider buying me a virtual beer with a bitcoin donation: 3QhpQ5z5hbPXXRS8x6R5RagWVrRQ5mDEZ1
Best regards,
Bert