Category: "System Center"

SCOM 2016 UR3 released

SCOM, System Center, SCOM 2016

SCOM 2016 UR3 has been released and contains some long-awaited fixes.

You can find the KB article here:
https://support.microsoft.com/en-za/help/4016126/update-rollup-3-for-system-center-2016-operations-manager

Download location for the files: http://catalog.update.microsoft.com/v7/site/Search.aspx?q=4016126

Some issues that are fixed

  • When you run System Center 2016 Operations Manager in an all French locale (FRA) environment, the Date column in the Custom Event report appears blank.
  • The Enable deep monitoring using HTTP task in the System Center Operations Manager console does not enable WebSphere deep monitoring on Linux systems.
  • When overriding multiple properties on rules that are created by the Azure Management Pack, duplicate override names are created. This issue causes overrides to be lost.
  • When the heartbeat failure monitor is triggered, a "Computer Not Reachable" message is displayed even when the computer is not down.
  • The Get-SCOMOverrideResult PowerShell cmdlet does not return the correct list of effective overrides.
  • When creating a management pack (MP) on a client containing a Service Level (SLA) dashboard and Service Level Objects (SLO), the localized names of objects are not displayed properly if the client's CurrentCulture settings do not match the CurrentUICulture settings. In the case where the localized settings are English English, ENG, or Australian English, ENA, there is an issue when the objects are renamed.
  • The Event ID: 26373 error which may cause high memory consumption, and affect server performance, has been changed to an “Informational” message from a “Critical” message.
  • The Application Performance Monitoring (APM) feature in System Center 2016 Operations Manager Agent causes a crash for the IIS Application Pool that is running under the .NET Framework 2.0 runtime.
  • The UseMIAPI registry subkey prevented collection of processor performance data for RedHat Linux systems. Custom performance collection rules were also impacted by the UseMIAPI setting.
  • Organizational Unit (OU) properties for Active Directory systems were not being discovered or populated.
  • The Microsoft.SystemCenter.Agent.RestartHealthService.HealthServicePerfCounterThreshold recovery task fails to restart the agent and you receive the following error message:
    LaunchRestartHealthService.ps1 cannot be loaded because the execution of scripts is disabled on this system.
    This issue has been resolved to make the recovery task work whenever the agent is consuming too much resources.
  • The DiscoverAgentPatches.ps1 script in Microsoft.SystemCenter.Internal.xml fails and you receive the following exception:
    Exception: Method invocation failed because [System.Object[]] does not contain a method named 'op_Subtraction'.
    At C:\bin\scripts\patch.ps1:37 char:35
    + for($count = 0; ($productList.Count-1); $count++)
    + ~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : InvalidOperation: (op_Subtraction:String) [], RuntimeException
    + FullyQualifiedErrorId : MethodNotFound
  • An execution policy has been added as unrestricted to PowerShell scripts in Inbox management packs.
  • SQL Agent jobs for maintenance schedule use the default database. If the database name is not default, the job fails.
  • This update adds support for OpenSSL 1.0.x on AIX computers. With this change, System Center Operations Manager uses OpenSSL 1.0.x as the default minimum version supported on AIX and we no longer support OpenSSL 0.9.x.

The installation order and other details are documented in the KB article. Otherwise, use the same method as for the previous UR for this version of SCOM.
Good luck!

Savision Live Maps Service Health Index

SCOM, System Center, SCOM 2012, SCOM 2016

Starting with version 8.5 of Savision Live Maps, a new feature called Service Health Index was added.
Let us investigate what it does.

Those who have been using Live Maps the last few years know about the Services monitoring, which basically is a definition of a single application/service/distributed app which gets split up in 3 parts: Infrastructure, Application and User. We place items like Operating System and Disks in the infrastructure part, we can place specific server roles like Web Server, Domain Controller and monitored items such as website, database, windows service in the Application layer. The user checks go in the User side. This way we can display the state of the Service as a whole, but also its main parts and the effect the users might see.

However, if one of the items in one of the 3 main maps goes red, this usually makes that map and the whole service go red as well. Things can be overridden with custom health rollups, but there are still the usual Green-Yellow-Red colors and the rollup to the top. There have been several requests to be able to specify which parts of an application are more important than others. For instance, imagine a web farm. Let's say this farm has 3 web servers and 1 database. Now, if 1 web server goes down this will make the application go red, but the website is still up. The user side website check would still show a green state as well, but the health rollup does not make this distinction and rolls up the Application map to the Service state.

Now imagine the database going down, and assume for a second that any high availability solutions for this database have failed as well. Without the backend database the website will not work. This is again rolled up to a red state for the Application side and up to the total Service health. Depending on how the user side web checks are set up, this could make that check go red as well, as a User impact. However, in both imaginary situations the Service went into a red state and we potentially did not see much difference in how important this red state was to the service.

Bring in the new feature Service Health Index!

Quite simply, we have a list of items we are monitoring in the Infrastructure/Application/User maps and we define how important they are to the working of the Service on a scale from 1 to 5, with 5 being the highest impact (catastrophic).

What does this look like? Let's open up the Savision Live Maps Authoring Console and open up one of the Services. In this case I am opening up the SCOM service. There is now a tab called Health Index.

From this screen you can Enable the Health Index and set it to update its health index indication every x minutes. I set it to 15 minutes at first.
There is the option to set which states have an impact on the Health Index:

So I added Warning in this case as an example.

Next you will see a list of all objects currently added to the 3 maps (Infrastructure/Application/User), each of which is assigned to one of the levels. You can now drag them to the level that matches the impact they would have on your Service.

So over here I have been dragging some components of the SCOM Service up to the higher impact levels.
The SCOM operational database, the main Resource Pool and the Data Access Service in this case were placed in the Catastrophic level (level 5). Next move down and place other components according to the expected impact of those components on the working of the service.
Next, save the result. Then wait the number of minutes you specified so the health index can be calculated for the first time.

If we now go to the All Services Dashboard we see the following:

Luckily the SCOM service is still green. On the other service (Exchange) you can see the Health Index of 4, which means this red is quite red, but not catastrophic yet.

So now we have a combination of the health state rollups of the 3 main components of every Service and an additional Health Index indicating the resulting effect and priority of handling the situation!
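
To make the idea a bit more concrete, here is a minimal sketch of how such an index could be derived. This is not Savision's actual calculation, which is not described in this post. It simply assumes that the Health Index is the highest impact level among the objects that are currently in one of the selected states. The names `Component` and `health_index` and the sample data are hypothetical.

```python
from dataclasses import dataclass


# Hypothetical model: each monitored object has an impact level (1-5)
# and a current health state as seen in SCOM.
@dataclass
class Component:
    name: str
    impact_level: int  # 1 = lowest impact, 5 = catastrophic
    state: str         # "Healthy", "Warning" or "Critical"


def health_index(components, affecting_states=("Critical",)):
    """Return the highest impact level among components in an affecting state.

    Assumption: 0 means nothing is currently affecting the service.
    """
    levels = [c.impact_level for c in components if c.state in affecting_states]
    return max(levels, default=0)


scom_service = [
    Component("Operational database", 5, "Healthy"),
    Component("Data Access Service", 5, "Healthy"),
    Component("Web console", 2, "Warning"),
]

# Only Critical counts by default, so the warning is ignored.
print(health_index(scom_service))
# With Warning added as an affecting state, the web console now counts.
print(health_index(scom_service, affecting_states=("Critical", "Warning")))
```

Adding Warning to the affecting states, as I did in the example above, is what makes lower severity problems show up in the index at all.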

Enjoy your monitoring and pass on the value of monitoring to the whole organization by displaying the state of company services and its impact to all stakeholders!
Bob Cornelissen

NiCE DB2 Management Pack updated

SCOM, System Center, SCOM 2012, SCOM 2016

The new version 4.20 of NiCE DB2 Management Pack has been released!

New with this release
• Feature: Support of DB2 BLU Acceleration
• Feature: Monitoring of InDoubt Transactions
• Security: Support of DB2 restrictive databases
• Security: Support of non-root setup and operation
• Security: DB2 Instance attach extensions for user and password options
• Platform: New platform support for IBM AIX 7.2
• Platform: Support of non-standard paths for both installation path and instance user home directory

If you are interested in learning more you can click on the Nice logo to the right of this screen.

Happy monitoring!
Bob Cornelissen

SCOM Web Console Application Pool crashing every 15 minutes

SCOM, System Center, SCOM 2012, SCOM 2016

Recently I had a customer where the SCOM web console application pool would be crashing every 15 minutes (2 servers in this case). This was on a SCOM 2016 instance on a Windows 2012 R2 server.

The error message we got was (the process id is a different number each time):

A process serving application pool 'OperationsManagerMonitoringView' terminated unexpectedly. The process id was '1111'. The process exit code was '0xc0000005'.

The exit code 0xc0000005 is a generic access violation code, so by itself it does not tell us much.
Looking at the application pool that kept crashing, we saw that it was running under the security context of "ApplicationPoolIdentity".
In this environment there are several policies in effect, and these were probably preventing this generic placeholder account from accessing some registry key or local path.

We changed the application pool identity to LocalSystem: open IIS Manager -> find the application pool -> click Advanced Settings on the right -> find Identity and use the dropdown to select LocalSystem. We could also have used another account that was already used for another application pool on the server, but went with this one first.
Recycle the application pool after this.

The crashes stopped happening from here. The SCOM web console was reachable.

Hope it helps somebody sometime.
Bob Cornelissen

SCOM agent for Linux and root squash

SCOM, System Center, SCOM Tricks, SCOM 2012, SCOM 2016

One of my customers had a problem deploying SCOM agents through a script on Linux servers. They had a number of Red Hat 6 servers and all went well. On the Red Hat 7 servers, however, the agent refused to install. The same happened when pushing the agent from the console. It seemed to stop around the file copy stage, where the rpm file gets copied to the server and is then run for installation.

It turned out to be a feature called "root squash" causing the issue. It maps the root user of an NFS client to an unprivileged anonymous user on NFS shared volumes, so root cannot simply access or run files from such a directory, for instance under /home. When they turned off this feature the agent installed immediately.
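
Root squash is configured per export on the NFS server, usually in /etc/exports, where `root_squash` is the default and `no_root_squash` turns it off. As a small sketch for next time, the code below lists which exports still squash root. The function name and the sample export lines are hypothetical, and the parsing is deliberately simple (no quoted paths, no `all_squash` handling).

```python
import re


def exports_with_root_squash(exports_text):
    """Return (path, client) pairs for which root squash is in effect.

    root_squash is the NFS default, so an export only escapes it when
    no_root_squash is listed explicitly for that client.
    """
    squashed = []
    for raw in exports_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        path, *clients = line.split()
        for client in clients:
            match = re.match(r"([^()]*)\((.*)\)$", client)
            if match:
                host, options = match.group(1), match.group(2).split(",")
            else:
                host, options = client, []
            if "no_root_squash" not in options:
                squashed.append((path, host or "*"))
    return squashed


# Hypothetical example content of /etc/exports on the NFS server.
sample = """
# shared home directories
/home        rhel7-01(rw,sync)
/srv/install rhel7-01(rw,no_root_squash) rhel6-*(ro)
"""

print(exports_with_root_squash(sample))
```

On a real NFS server you would pass in the contents of /etc/exports instead of the sample text.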

Just writing this down because I am sure I will run into this again somewhere.

Happy agent deployment!
Bob Cornelissen

Test your knowledge on SCOM/OMS/Azure and more

SCOM, System Center, SCOM 2012, SCOM 2016, Windows 2016, OMS

Now test your knowledge on SCOM/OMS/Azure and more through this quiz for fun and to win a Band as well :D

You can take the quiz by clicking on the picture or by this link:
Test your knowledge on SCOM/OMS/Azure and more

Have fun!
Bob Cornelissen

Error 500.19 after installing Savision LiveMaps Unity Portal

SCOM, System Center, SCOM Tricks, SCOM 2012, SCOM 2016

Today I was doing a quick installation of the Savision 8.2 Live Maps Unity Portal. Downloaded the self-extracting executable from the website and of course arranged a license key. While running the installer I selected the Express setup which just pushes the web portal onto the machine and not the other components available in the Advanced installation option. The installation ran in 2 minutes on a slow machine, and this is including the extracting of the files and running checks.

After installation the web page automatically opens up and I was greeted with the following error:

HTTP Error 500.19 - Internal Server Error
Module: WindowsAuthenticationModule

In the error description there is talk of a configuration section being locked at parent level.

Screenshot of the error:

What happened is that Windows Authentication is turned off at the server level and that this configuration section is locked for the whole machine. The Live Maps Portal tries to read its own Authentication settings from its configuration file, and because the section is locked at a higher level it throws an error.

How to fix it:

Open IIS Manager
In the left menu select your server name
In the middle of the screen select Configuration Editor

Near the top of the Configuration Editor is a selection box for which section you want to see and edit.
Go to system.webServer/security/authentication/windowsAuthentication

In the right hand menu you will find a link to Unlock Section. Click it to unlock this configuration item.

Now any lower level (Sites or Applications within a site) can have their own configuration for Windows Authentication.
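
The same unlock can also be done from an elevated command prompt with the appcmd.exe tool that ships with IIS (`appcmd unlock config -section:...`). Below is a minimal sketch that builds and runs that command. The function names are hypothetical and the appcmd path assumes the standard location under %windir%\System32\inetsrv.

```python
import os
import subprocess

SECTION = "system.webServer/security/authentication/windowsAuthentication"


def build_unlock_command(section=SECTION):
    """Build the appcmd command line that unlocks one configuration section."""
    # Assumption: appcmd.exe lives in the default IIS folder.
    windir = os.environ.get("windir", r"C:\Windows")
    appcmd = windir + r"\System32\inetsrv\appcmd.exe"
    return [appcmd, "unlock", "config", "-section:" + section]


def unlock_section(section=SECTION):
    """Run the unlock; needs an elevated prompt on the IIS server itself."""
    return subprocess.run(build_unlock_command(section), check=True)


# Show the command without running it.
print(" ".join(build_unlock_command()))
```

Calling `unlock_section()` on the IIS server has the same effect as clicking Unlock Section in the Configuration Editor.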

After refreshing the error page, the Live Maps Unity Portal came up fine!

Happy dashboarding!
Bob Cornelissen

How to make a SCOM implementation project successful

SCOM, System Center, SCOM Tricks, SCOM 2012, SCOM 2016

I thought I would take a different approach to thinking about how to make a SCOM monitoring project a success. It is not about technical details or designs this time, but about a way to bring business and IT together into monitoring business related services and being in control of those processes. In a short blog post below I am touching upon some of those items.

https://www.savision.com/resources/how-to-make-a-scom-implementation-project-successful

Enjoy B)
Bob Cornelissen

WSUS Console not able to connect Handshake failed

System Center, Windows 2012

Last week I installed a fresh WSUS server for a customer of mine and because it needed to download lots of files after the approvals were done we left it for a few days. Today I came in and opened the WSUS console only to notice it refused to connect. Got an error like this one:

The WSUS administration console was unable to connect to the WSUS Server via the remote API.
Verify that the Update Services service, IIS and SQL are running on the server. If the problem persists, try restarting IIS, SQL, and the Update Services Service.
The WSUS administration console has encountered an unexpected error. This may be a transient error; try restarting the administration console. If this error persists, Try removing the persisted preferences for the console by deleting the wsus file under %appdata%\Microsoft\MMC\.
System.IO.IOException -- The handshake failed due to an unexpected packet format.

After checking that the required services were running, the investigation started. There are lots of blog and forum posts, from long ago to recent, all with different solutions.

I came across a post from 6 weeks or so ago which talks about update KB3148812, which causes this behavior and also causes an additional error where clients cannot scan against WSUS.

I could not find this KB patch installed on my system. However, the post mentioned manual steps to be done after applying the hotfix, and those manual steps did indeed solve it. Keep reading.
A little more research showed that KB3148812 has since been withdrawn and replaced by KB3159706.

KB3159706

This article describes what is going on and it contains manual steps to be followed! The first step solved the console not being able to connect. The second step is for HTTP Activation. And if you have SSL turned on there are a few more steps to follow.

Happy updating!
Bob Cornelissen

SCOM 2016 Features - Example - Network Monitoring MP Generator

SCOM, System Center, SCOM 2016

In my previous post which introduced SCOM 2016 Features - Network Monitoring MP Generator I have shown you how to use the command syntax of the tool and why it was created. Now it is time for an example.

The idea:

Have fun monitoring some network device and see how the principles of the input XML file work.

I have also been doing a few presentations with a SCOMosaur theme, so we combine a little SCOM with a little dinosaur madness. You will see a few references to that here and there.

Mind that I am using a simulated device which may not be a perfect fit for this purpose. The reason is that the default simulated devices in the Jalasoft SNMP Device Simulator are all CERTIFIED, and we are of course creating monitoring for non-certified devices. :crazy: The OIDs in the example below are from an APC UPS device, but for now they serve well enough as an example.

Prerequisites

  • First of all I am using SCOM 2016 TP5 here, which is the first version to include this feature.
  • I am using Jalasoft SNMP Device Simulator on another machine to simulate a few network devices of different types.
  • Of course make sure both sides can reach each other with ping (ICMP) and SNMP.
  • I am using iReasoning MIB Browser to browse the SNMP tree on the selected device, to verify that we actually have data there and the right OIDs.

Next on the list is to discover the devices in SCOM by creating a Device Discovery and adding the device IP addresses and SNMP community string to it and letting SCOM discover the devices.

The XML input file

The idea here is roughly the same as a simple management pack setup.

  • A manifest with management pack name and version
  • A Device definition
  • A Device discovery
  • Device Components
  • Device Component Discovery
  • Rules (these are collection rules)
  • Monitors

Starting the Manifest

First we are going to define the start to the input file by the Root tag.
Next we define the Display Name and Version for the management pack.

Name and Version are mandatory and an optional tag is KeyToken.

Device Definition and Discovery

The next thing to do is create an entry for each type of device and to make a device discovery for it.

First we define a name for the device.
Next we jump into a discovery for it.

The discovery covers the SysObjId tag which points to the unique device identifier for the device type.
Next we have to specify a device type. The following types are supported for now: Switch, Router, Firewall, LoadBalancer.
Next fill out the Vendor and Model.

Components and Discovery

Now it is time to look into the components of the device. For example Processors or Fans. After we discover those we can target monitors and rules to those components in order to monitor them.

We are opening the Components tag here, and it will be closed all the way at the end of the story.

Next we define our first component.
There are a few component types supported at this moment: Processor, Memory, Fan, Voltage Sensor, Power Supply, Temperature Sensor.
And we give it a name of course.

Now we define the OIDs we are interested in. These OIDs will have to be there for each instance of the Component we define. One of these will be used in the discovery of the component and the same one and/or others we can use for rules and monitors. At least we have defined all of them here and given them original names.

We do not have to enter the index number of each component instance. For example...

fan1 = 1.3.6.1
fan2 = 1.3.6.2
fan3 = 1.3.6.3

In the very short OID example above you can see the last number is the index number for each fan. So we only need to specify 1.3.6 in this case and the discoveries will find each instance for you.
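
To illustrate the principle only (this is not how the generated management pack does it internally), here is a small sketch that finds the instance index numbers under a base OID from a list of walked OIDs. The function name and the walk results are hypothetical.

```python
def discover_instances(base_oid, walked_oids):
    """Return the instance indexes found directly under base_oid."""
    prefix = base_oid.strip(".") + "."
    indexes = []
    for oid in walked_oids:
        oid = oid.strip(".")
        if oid.startswith(prefix):
            index = oid[len(prefix):]
            if index.isdigit():  # a single trailing number is one instance
                indexes.append(int(index))
    return sorted(indexes)


# Hypothetical SNMP walk result: three fans plus one unrelated OID.
walk = ["1.3.6.1", "1.3.6.2", "1.3.6.3", "1.3.7.1"]
print(discover_instances("1.3.6", walk))  # one index per fan
```

So specifying only the base OID is enough to end up with one component instance per index.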

In this case I named the component the Tricera Environment and gave it a Processor type, just because it needs to conform to the default types at this moment.

The 3 OIDs used are a Temperature OID, a Usage OID (which happens to be the percentage of battery left for the UPS), and an overall state indicator OID for this component.
For the step coming after this, it means we have two performance counters we can collect (but I will collect all three in the example), and also we can create state monitors based on the values.

Lastly the ComponentDiscovery is a pointer to which of the already defined OIDs is a component indicator. In this case I use the state indicator OID. If that one is there (with an index number behind it) an instance of the component will be created or as many as needed.

Monitoring and Rules

Alright now the monitoring needs to start for the component we are still at.

For starters we set the Monitoring tag. We will close that tag later after we have defined all rules and monitors.

Next we start with the rules:

We open the Rules tag and next define the performance collection rules as you see here. I used short names for it and pointed each rule to the name of the OID we defined already. See how easy that part is?

Let's go to the monitors now...

Monitors

First again we start it off with the Monitors tag which we will close off after the last monitor we add.

Alright, first UnitMonitor. We give it a name. In this case Triceratops Environment Status.

It is a two state monitor, so we define two expressions. Both of them point to the name of the OID containing the state indication.
The first expression is for success (green state) and uses 2 or less. The second expression uses anything higher than 2 to set it to an error state.

I repeated that two more times. The second monitor is for the Temperature, with 30 degrees as the maximum acceptable value, otherwise our dino gets sunburn. :lalala:
The third monitor uses the TriEnvUsage OID to determine if it is at 100 or below.
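
The logic of these three two state monitors can be summarized in a few lines. This is only an illustration of the threshold logic and not the input XML of the generator. Only the name TriEnvUsage comes from my input file as described here. The names TriEnvState and TriEnvTemp and the sample values are hypothetical.

```python
# Thresholds as described in the text. "TriEnvState" and "TriEnvTemp" are
# hypothetical names; only "TriEnvUsage" is mentioned in the post.
THRESHOLDS = {
    "TriEnvState": 2,    # 2 or less is healthy
    "TriEnvTemp": 30,    # 30 degrees is the maximum acceptable value
    "TriEnvUsage": 100,  # 100 or below is healthy
}


def evaluate(oid_name, value):
    """Two state check: 'Success' at or below the threshold, else 'Error'."""
    return "Success" if value <= THRESHOLDS[oid_name] else "Error"


# Hypothetical values as they might be read from the device.
sample = {"TriEnvState": 2, "TriEnvTemp": 34, "TriEnvUsage": 87}
for name, value in sample.items():
    print(name, value, evaluate(name, value))
```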

And now as promised we close the whole load of tags off:

The conversion process

Alright we now have an XML input file with all the stuff we need. Now we need to use the Network Monitoring MP Generator tool to convert the input file to a management pack XML file.

Open a command prompt and go to
%ProgramFiles%\Microsoft System Center 2016\Operations Manager\

I placed my input file in the folder C:\SCOMosaur with file name dino.xml and I will allow the output file to be written to that folder as well.

I run the command:
NetMonMPGenerator.exe -InputFile "C:\SCOMosaur\dino.xml" -OutputDir "C:\SCOMosaur"

The program will let you know if there are any errors and it will confirm if it finished creating the management pack file.

From here you simply import the management pack and as usual wait a little bit.

Conclusion

Well it is a lot easier to create this input file with the basics we need to be monitoring the custom device. The total input XML file was about 60 lines if we take away the empty lines. The resulting management pack was 690 lines long.

There will be a complete example coming from the product team very soon now, including comments in the file and such. This is just a quick starter to help you play with this feature.

This feature is meant to bring devices that are NOT CERTIFIED to a more complete monitoring state, as if they were CERTIFIED. As you have seen, the device types and component types are a limited set for the moment.

My expectation is that the possibilities of this feature will expand over time and become more and more flexible. It would also be nice to see a graphical interface to build up the input XML, which could then immediately build the management pack as well. However, those kinds of things take a lot of time to build. I consider the current solution a nice in-between.

Back to the SCOM 2016 Features - Overview post!

Hope you all have fun!
Bob Cornelissen

©2017 by Bob Cornelissen.