SCOM – PowerShell Recovery Action – Stopped Windows Service

By | November 12, 2019

Hi,

Today I was at a customer who had a really specific question regarding monitoring of Windows Services with Operations Manager (SCOM).

We had already set up some basic recovery actions which restart the service automatically after it was stopped.

For some other services the customer wanted extra functionality: the recovery action should try to start the service at most 3 times, and if the service still wasn’t running after 3 tries, the customer wanted to receive an email saying the recovery action had failed. SCOM can’t do this out of the box, so I used PowerShell to accomplish it.

Sidenote: To use PowerShell as a recovery action you can use the free management pack provided by the community & SquaredUp, which can be downloaded from https://squaredup.com/free-powershell-management-pack/. This management pack adds PowerShell everywhere it is missing in Operations Manager. It is one of the default management packs I always install at customers.

For a fully functional setup, several components are needed:

  • A monitor that checks the status of the service
    • This monitor can be created from the Authoring pane of the SCOM console using the Windows Service template
  • A recovery action for the monitor created previously
    • The recovery action can be created from Health Explorer
  • A rule that picks up the event created by the recovery action PowerShell script
    • This is an Alert Generating Rule (NT Event Log); its configuration must match the log, source, and event ID of the failure event the script writes (Application log, event ID 101)
  • A subscription on the rule to send the email.
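Before wiring up the rule and the subscription, it can help to confirm that the failure event is actually present in the log the rule watches. The snippet below is a minimal sketch, not part of the original setup: the helper name `Get-RecoveryFailureEvent` and the variable `$RecoveryEventFilter` are made up for illustration, and the source name is assumed to be the one the recovery script registers (copy it exactly, including the dash character).

```powershell
# Assumed values: they should mirror what the recovery script writes.
# The source name must match the script's -Source string exactly.
$RecoveryEventFilter = @{
    LogName      = 'Application'
    ProviderName = 'Powershell - Restart Service'
    Id           = 101    # failure event; use 100 to look for the success event
}

# Hypothetical helper: returns the most recent matching events, if any.
function Get-RecoveryFailureEvent {
    param([int]$MaxEvents = 5)

    Get-WinEvent -FilterHashtable $RecoveryEventFilter -MaxEvents $MaxEvents -ErrorAction SilentlyContinue |
        Select-Object TimeCreated, Id, LevelDisplayName, Message
}
```

Running `Get-RecoveryFailureEvent` on the monitored server after a failed recovery should list the event the rule is expected to alert on. If nothing comes back, the rule would most likely not fire either.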

The PowerShell script:

# Fill in the service name here
$ServiceName = "LPD Service"

# Event source used for both the success and the failure event
$EventSource = "Powershell - Restart Service"

$ServiceStarted = $false
$i = 0

# Create the event log source. -ErrorAction Ignore is needed because,
# once the source exists, every later run would throw an "already exists" error.
New-EventLog -LogName Application -Source $EventSource -ErrorAction Ignore

Do {
    # On the second or third run, wait a minute before trying to start the service
    if ($i -gt 0) { Start-Sleep -Seconds 60 }

    # Try to start the service
    Start-Service -Name $ServiceName

    # Check whether the service is running now
    $Service = Get-Service -Name $ServiceName
    if ($Service.Status -eq "Running") {
        $ServiceStarted = $true
    }

    $i++

    # After the third failed attempt, log an error event and stop
    if (($i -eq 3) -and ($ServiceStarted -eq $false)) {
        $eventmessage = "$ServiceName failed to restart after $i attempts, exiting script"

        # Log an error event in the Application log
        Write-EventLog -LogName Application -Source $EventSource -EntryType Error -EventId 101 -Message $eventmessage

        exit
    }
}
Until ($ServiceStarted -eq $true)

# The service is running: log an informational event
$eventmessage = "$ServiceName restarted after $i attempt(s)"
Write-EventLog -LogName Application -Source $EventSource -EntryType Information -EventId 100 -Message $eventmessage
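If you need the same retry behaviour for several services, the loop can be pulled into a small function. The following is only a sketch under my own assumptions: the function name `Invoke-WithRetry` and its parameters are hypothetical, and the action is passed in as a script block that returns `$true` on success, which also makes it possible to try the logic without touching a real service.

```powershell
# Hypothetical helper: runs $Action up to $MaxAttempts times,
# waiting $DelaySeconds between attempts, and reports the outcome.
function Invoke-WithRetry {
    param(
        [Parameter(Mandatory)]
        [scriptblock]$Action,       # must return $true when the attempt succeeded
        [int]$MaxAttempts  = 3,
        [int]$DelaySeconds = 60
    )

    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        # Wait before every attempt except the first one
        if ($attempt -gt 1 -and $DelaySeconds -gt 0) {
            Start-Sleep -Seconds $DelaySeconds
        }

        if (& $Action) {
            return [pscustomobject]@{ Succeeded = $true; Attempts = $attempt }
        }
    }

    # All attempts were used without success
    return [pscustomobject]@{ Succeeded = $false; Attempts = $MaxAttempts }
}
```

For a service, the script block would call `Start-Service` and then return the result of comparing `(Get-Service -Name $ServiceName).Status` with `"Running"`. The caller can then write event 100 or 101 depending on the `Succeeded` property of the returned object.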

If you have any difficulties doing this, don’t hesitate to drop a comment below.

If you find this post useful, please consider buying me a virtual beer with a bitcoin donation: 3QhpQ5z5hbPXXRS8x6R5RagWVrRQ5mDEZ1

Best regards,

Bert
