SQL Server Schedulers To Rule Them All

Last Updated on August 29, 2020 by John Morehouse

In this post I’ll examine how installing SQL Server from the incorrect ISO can lead to CPU throttling.  If you aren’t careful, you could be paying for SQL Server cores that you can’t even use!

Sometimes things just don’t make sense.

One of our production servers has 4×8 (4 sockets, 8 core) processors with hyper threading enabled.  This results in 32 physical cores or 64 logical cores.  Therefore, this particular server should have 4 NUMA nodes available.

bufferHowever, I was looking at the dashboard for this particular server through SentryOne.   I happened to notice that the server was only showing 3 buffers for SQL Server to utilize.  If you look at the  image, the 3 purple blocks on the left represent the number of buffers the server has.  Each block  reflects a one to one ratio of buffer to a NUMA node.  If you have two NUMA nodes, you would have two blocks represented.

As you can see, I’ve got 3 blocks, so 3 buffers are aligned with 3 NUMA nodes, which is not expected.  Where is the fourth buffer?  Did it just not show up for work?

So I started to investigate it.

SQL Server is pretty versatile.  It’s got dynamic management views (DMVs) that will tell you a number of things if you know where to look.  In this case, I needed to look at those that told me about the underlying core structure.  Namely, these:

Let’s take a closer look

numa_nodesThe DMV sys.dm_os_nodes did indeed confirm that the server has 4 NUMA nodes available.  SQL Server sees that there are four of them.  Node ID of 64 represents the Dedicated Administrator Connection (DAC).  This node guarantees that if you need to use the DAC, you’ll have resources available to do the work.

Also notice that node_id of 3 is currently set to “OFFLINE“.  That’s curious.  We will come back to that.

You could also use the sys.dm_os_schedulers DMV to find out how many nodes it has:

SELECT DISTINCT parent_node_id
FROM sys.dm_os_schedulers

I have now confirmed in two different ways that SentryOne should be reflecting 4 memory buffers.  Let’s take a closer look at what this DMV is telling us.  A scheduler is essentially a gatekeeper.  Schedulers coordinate processing time on the processors for worker processes.  There is a one to one correlation between the number of cores and the number of schedulers.  If you have a 32 core system, you would have 32 schedulers to process end user requests.

Just as a side note, while there is a direct link between the schedulers and the processor, this does not mean that each scheduler is limited to that specific processor.  A scheduler assigned to processor #1 could end up work on processor #2 assuming that the affinity mask IS NOT set.

In this DMV, I happened to notice that each scheduler also has an available status.  These statuses are:

  • HIDDEN ONLINE
  • HIDDEN OFFLINE
  • VISIBLE ONLINE
  • VISIBLE OFFLINE
  • VISIBLE ONLINE (DAC)
  • HOT_ADDED

Hidden schedulers are used to process requests that are internal to the engine itself.  Visible schedulers are used to handle end-user requests.  When you run the casual SELECT * query, it will utilize a visible scheduler to process the query.  With this information, if I have a 64 core server and all is well, I should have 64 visible schedulers online to process requests.

offline_schedulersHowever, I discovered that some of the schedulers were set to “VISIBLE OFFLINE”.  This essentially means that those particular schedulers are unavailable to SQL Server for some reason.   How many offline schedulers do I have? A quick query resulted in 24 schedulers currently offline.  24 logical cores means that 12 physical cores are offline.

But why would a scheduler be set to “VISIBLE OFFLINE”?

I wasn’t able to find anything really online as to why SQL Server would only have 40 logical cores online and available.  So, I started to wonder if SQL would dynamically spin these up if it was under CPU pressure.

After my google-fu failed me and I had some conversations with Chris Yates (B|T), we thought it was a good idea to turn to my good friend David Klee (B|T) at HeraFlux Technologies.  David is a longtime friend from Omaha, Nebraska and is wicked smart about this stuff.   As a side note, being in the SQL Server Community and knowing who to call, makes a world of difference.

Later that evening David called me and we discussed what I was seeing.   Within about 15 minutes David asked which ISO we had installed on the server.   Given that the server was installed and running prior to my starting at the company, I wasn’t sure which ISO all I knew was that it was SQL Server 2012 Enterprise Edition.  David continued to tell me that there are two different versions of SQL Server 2012 Enterprise ISO’s.  One was core based which allowed you to utilize all of the cores available to the server.  The other version would limit you to 20 physical cores or 40 logical cores with hyper threading.

Wait.  What?  I’m sure my jaw hit the floor as I have never heard of that.  David gave me some things to check the following morning.

The Truth Will Set You Free!

online_schedulers_1The next morning,  I checked the core counts again, and there it was.  40 cores online.  Things were starting to make some sense.

One of the things that David said to check was the server log.  If you are using the one that limits the number of cores, there will be an entry in the log upon start up.  Given that we had just recently patched the operating system, I could easily check to see if that message there.

error_log
Click to Enlarge Image

Shoot.  The incorrect ISO was installed and we are limited to only 40 logical cores. This means that a portion of our 64 core we can’t even utilize to process queries.  This also potentially means that we could experience other issues due to the fact that the work load is now not balanced across all of the processors.  We haven’t experienced this (yet) but I suspect as time goes on we would start to see some evidence of this.

Now that I had a better understanding of what had happened, I change my google queries and was able to locate some references to this particular “issue”.

https://blogs.msdn.microsoft.com/sql_shep/2012/06/21/sql-server-2012-license-core-limitaion/

From that blog post, it sounds like an easy fix, however we will have to test it some fashion as this is one of our Production SQL Servers.  And it’s clustered.

Summary

If you have SQL Server 2012 Enterprise edition or higher and you have more than 20 physical cores available, you might want to check to ensure the correct ISO was installed.  If not, you could be licensing cores that you can’t even utilize.

Here’s script that will help you determine whether or not you need to investigate further. This should work with SQL Server 2008 and up.

SELECT CASE WHEN COUNT(1) < ( SELECT COUNT(1)
FROM sys.dm_os_schedulers
WHERE [status] IN ( 'VISIBLE OFFLINE',
'VISIBLE ONLINE' )
) THEN 'ERROR - Offline Schedulers Detected'
ELSE 'SUCCESS! No offline schedulers Detected'
END AS 'Status Check'
FROM sys.dm_os_schedulers
WHERE [status] = 'VISIBLE ONLINE'
GO

Double check your SQL Servers for this particular issue.  Nobody wants to throw money down the drain.

 

© 2017 – 2020, John Morehouse. All rights reserved.

8 Replies to “SQL Server Schedulers To Rule Them All”

  1. Or, just run sp_Blitz. 😀 I’ve been warning folks about that for years. Comes up as a high priority urgent warning. (Plus you can have memory nodes go offline due to that same reason, too.) Happened to be working with a client this week with that problem.

  2. I’ve come across this probably a half-dozen times in the wild. I believe you can see the same type of thing with the Standard Edition core limit too.

  3. Thanks John for the wonderful article! I am curious to know what you did to fix the issue as I am in the similar boat right now. 🙂 Only change in my scenario is, I have SQL Server 2014 Enterprise Edition with 72 processors but it is only using 40 of them.

  4. This is still useful – I just had to troubleshoot this for my company’s first SQL Server Enterprise installation. I have to say, though, setting the core limitations to less than the standard edition limits for SQL Server 2016 is a pretty big “gotcha” when upgrading.

  5. Hey Brendt! Thanks for stopping by! Glad you find the post useful! It really caught me off guard when I found it on our production servers that’s for sure.

Hey you! Leave me a comment and start a discussion!

This site uses Akismet to reduce spam. Learn how your comment data is processed.