Service Principal Names (SPNs) are used to uniquely identify an instance of a service. The SCOM services register their own SPNs as well, and most of the time everything will be just fine, but in some cases you might run into issues with them.
My friend Walter Eikenboom has written up a very good post about SCOM and SPN, how to check it, how to set it etc. Check it out here: http://systemcenterdynamics.wordpress.com/2009/08/26/scom-2007-r2-what-should-my-spn-registrations-look-like/
Also a post by Kevin Holman about the subject:
System Center Operations Manager SDK service failed to register an SPN
http://blogs.technet.com/b/kevinholman/archive/2007/12/13/system-center-operations-manager-sdk-service-failed-to-register-an-spn.aspx
Symptoms: you see an event from the OpsMgr SDK Service with ID 26371.
And another writeup by Jonathan Almquist:
Another case that can be traced back to an SPN issue, by JC Hornbeck:
OpsMgr 2007: Agents stuck in Pending Management with Event ID 21016
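When comparing SPN registrations against what they should look like, it can help to write the expected names down first. Below is a minimal sketch in Python, not an official tool. It assumes the commonly documented service classes MSOMSdkSvc (SDK service) and MSOMHSvc (health service), each registered for both the short name and the FQDN. Check the posts above for what applies to your environment. The function names and the server name are made up for illustration. The list of registered SPNs is something you would paste in yourself, for example from the output of `setspn -L` for the account in question.

```python
def expected_spns(netbios_name, dns_domain):
    """Build the SPNs we would expect for a management server.

    Assumption: the service classes MSOMSdkSvc and MSOMHSvc, each with
    the short name and the fully qualified name of the server.
    """
    fqdn = f"{netbios_name}.{dns_domain}"
    return {
        "sdk": [f"MSOMSdkSvc/{netbios_name}", f"MSOMSdkSvc/{fqdn}"],
        "health": [f"MSOMHSvc/{netbios_name}", f"MSOMHSvc/{fqdn}"],
    }


def missing_spns(expected, registered):
    """Return the expected SPNs that are not in the registered list.

    The comparison ignores case and surrounding whitespace, so lines
    copied from command output can be passed in as they are.
    """
    registered_lower = {spn.strip().lower() for spn in registered}
    return [spn for spn in expected if spn.lower() not in registered_lower]
```

For a hypothetical server OM01 in contoso.local you would call `expected_spns("OM01", "contoso.local")` and pass the "sdk" list, together with the SPNs found on the SDK service account, to `missing_spns`. Anything it returns is worth a closer look.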
In some cases you might require a lot more info from SCOM about what is going on during troubleshooting. In that case you might want to have more diagnostic logging (more verbose). Here is how to use it.
How to use diagnostic tracing in System Center Operations Manager 2007 and in System Center Essentials
In some cases we open the SCOM console, go into a state view and see old entries there: devices that are actually already gone, or items that were somehow discovered before and are not being monitored. Assuming that we do not want these monitored, we would like to get rid of them in our state views.
First of all check out SCOM Trick 10 and see if it is not just the SCOM console playing tricks on you.
Next you can go into the Start menu – All programs – System Center Operations Manager 2007 R2 – Operations Manager Shell. When it is loaded you can type the following command:
That should get rid of a lot of the unmonitored entries.
Many times it can happen that you are looking at stale data in the SCOM console. For instance you click on an alert and it gives you an error, saying that it has already been closed (in more difficult terms). Sometimes you see an entry in a list that should not be there anymore. This could have several reasons, which will be discussed later. But the first things to do when you fear you are looking at data that could have some clutter in it are the following (going one step further every time until you see what you expect to see):
- Press F5 to refresh the screen
- Close the SCOM console and start it using the /clearcache option
Click Start → Run and put the following (in one line) as the command:
"C:\Program Files\System Center Operations Manager 2007\Microsoft.MOM.UI.Console.exe" /clearcache
- Next you can do the same after deleting the following registry key (close the console first):
HKEY_CURRENT_USER\Software\Microsoft\Microsoft Operations Manager\3.0\Console
This gets rid of most of the “ghost” entries of alerts and items in the view caused by the console itself.
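If you find yourself doing the /clearcache step often, you could wrap it in a small script. The following is only a sketch in Python, assuming the default install path shown above. The function names are made up for illustration, and the registry step is left out on purpose, so that one stays a manual action.

```python
import subprocess
from pathlib import PureWindowsPath

# Assumption: the default install folder from the command above.
DEFAULT_INSTALL_DIR = r"C:\Program Files\System Center Operations Manager 2007"


def clearcache_command(install_dir=DEFAULT_INSTALL_DIR):
    """Return the console command line with the /clearcache option."""
    exe = PureWindowsPath(install_dir) / "Microsoft.MOM.UI.Console.exe"
    return [str(exe), "/clearcache"]


def start_console_with_clean_cache(install_dir=DEFAULT_INSTALL_DIR):
    """Start the console with a cleared cache (Windows only).

    Close any running console first, just like in the manual steps.
    """
    return subprocess.Popen(clearcache_command(install_dir))
```

Passing the command as a list means the spaces in the path do not need extra quoting.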
I saw a thread today on the TechNet forums about the SCOM Console crashing when running a task. http://social.technet.microsoft.com/Forums/en-US/operationsmanagergeneral/thread/c339c327-1e7d-412b-9d1a-5ae0b8a2e0f8/#9ae059f9-f881-4fa1-9d42-fceed4c4ef2d
We actually also have one issue exactly like that, so it was a nice opportunity to dive into it a bit more. So what happens is that from the SCOM Console you run a task against some agent and while running that task it crashes and takes the SCOM Console along with it. You can get an error like the following:
Alexey Zhuravlev from opsmgr.ru recently found that Internet Explorer 9 seemed to cause the problem in a case he encountered and in this thread the solution was also to remove IE9 to get it working.
To quote him on that part:
Console calls this:
If I understand the process correct, it uses Mshtml.dll. IE9 replaces this dll (installs it's own version 9.0.*). And it looks like the new version causes an access violation...
A colleague of mine at a customer location had also been having these problems for a while, but only with certain tasks. It turns out to be the tasks that are run against the agent. So for instance a ping will work just fine because it is running locally, but a task like show processes or start a service will first give a popup asking you about the credentials to use, and after you run the task it crashes (and takes the SCOM console down with it). So when you do that you get one of these popups:
We tested by uninstalling IE9 and confirming it was now IE8. Tasks ran fine. So we installed IE9 again, but this time from the Microsoft website and not through some internal updating process. And yes, it crashed again when running a task.
So the current workaround is to uninstall IE9 to work with this. Hope that it will be fixed soon.
Update 9 June 2011:
Lincoln Atkinson answered a thread in the TechNet forums about this issue, with a remark about a future fix:
The product team is aware of this issue and are looking into a fix. We will be fixing for vNext + if possible backporting the fix in a future cumulative update.
As I have said in SCOM Trick 7, the use of maintenance mode is important. But people will not always use the normal SCOM interface (or web interface) to start maintenance mode right before they start working on a machine. To be honest, a lot of the time machines only get placed in maintenance mode once the alerts start flowing in during planned work. In any case, you can actually schedule maintenance mode, or include it in scripting. Here are some resources to get you started.
Maintenance mode history report
Remote maintenance mode mp
Works by running a script on the agent, which writes an event log entry that gets picked up.
MCS maintenance mode mp
Put a group into Maintenance Mode
Maintenance Mode powershell script
Remote Maintenance Mode GUI tool
Cluster and maintenance mode
Stopping maintenance mode
SCOM Maintenance Mode Tool
Schedule a group of URLs (or one) into maintenance mode
One of the questions that get asked after people start using maintenance mode in SCOM, especially in bigger environments, is to provide an overview of who put something into maintenance mode.
Somebody wrote a management pack for this!
Maintenance Mode History Report Management Pack
One of the great features in SCOM is the ability to place a machine/device or part of it in maintenance mode whenever you are working on the machine and you do not want it to generate alerts while you are doing your stuff. For instance during a planned change. This also avoids unnecessary red and yellow health states which affect your SLA availability reports. One more thing is that it tends to not stress out helpdesks and ticketing systems if you try to avoid sending them unneeded alerts (sometimes they do not know you are playing with the machines).
In many cases a whole machine will be placed into maintenance mode. As of SCOM 2007 R2 setting maintenance mode for a machine only has to be done in one place and not in three places like before.
You can enable maintenance mode from any state view by clicking the desired machine/device/website/database and selecting Start Maintenance Mode in the actions pane or by right-clicking and selecting that option. This can also be done right from an alert view, but I always prefer to be clear on where I select it.
From there you can select if it is planned or not and what the reason is. You can start it, stop it or change the duration.
Something NOT to do is place management servers in maintenance mode, unless you know what you are doing.
In some cases you have a health explorer that will not turn green and manually resetting it does not help; sometimes it is only at the rollup stages that it will not turn back to green. You can then try to place the machine in maintenance mode for 15 minutes, after which it will re-calculate the health state.
In an upcoming Trick I will list some of the tools you can use for maintenance mode: scripts, PowerShell, management packs and so on.
Just remember to use maintenance mode.
A question that pops up a lot is if we can monitor VMware with SCOM. There are actually several options that deliver different levels of monitoring.
- Veeam Nworks. A very complete third party SCOM add-on.
- Bridgeways VMware monitoring MP.
- Vizioncore (Quest).
- SNMP and Syslog from SCOM
I am sure there are more possibilities, but these are the most well-known I think.
My personal favorite is still Nworks as the most complete and robust system.
My opinion is still that books are a great resource to learn about a product. Also for SCOM there are a few around:
- System Center Operations Manager 2007 Unleashed
A very complete book about SCOM, written by MVPs
- System Center Operations Manager 2007 R2 Unleashed
A follow-up on the previous book that builds on top of it, adding R2 pieces of information, but also authoring and more
- Mastering Microsoft System Center Operations Manager 2007
- Monitoring Exchange Server 2007 with System Center Operations Manager
My personal favorites (and those of many others) are the two Unleashed books, which are the most complete and clear to read for SCOM admins of all levels.