SQL Server parallelism optimization
The performance of an SQL Server database instance can often be improved by small tweaks. Performance might also be hindered by simple oversights. In fact, some SQL Server parallelism settings have suboptimal default values. Moreover, they have to be re-tuned after an hardware upgrade. Other issues might arise from inadvertent hardware configurations, especially when SQL Server is hosted on a virtual machine (VM).
- maximum degree of parallelism (MaxDoP)
- cost threshold for parallelism (CTFP)
Starting from BigFix 10.0.2, you
can use the /checksqlserverparallelism
BESAdmin command to check if the
MaxDoP and CTFP settings of your database instance are configured appropriately, and to detect
other issues described later.
SQL Server MaxDoP values for best performance
Microsoft recommends MaxDoP settings in Configure the max degree of parallelism Server Configuration Option.
Server configuration | Number of processors | SQL Server 2008-2014 (10.x-12.x) | SQL Server 2016 (13.x) and newer |
Server with single NUMA node | Fewer than or equal to 8 logical processors | Keep MAXDOP at or below the number of logical processor | Keep MAXDOP at or below the number of logical processors |
Server with single NUMA node | More than 8 logical processors | Keep MAXDOP at 8 | Keep MAXDOP at 8 |
Server with multiple* NUMA nodes | Fewer than or equal to 8 logical processors per NUMA node | Keep MAXDOP at or below the number of logical processors per NUMA node | Keep MAXDOP at or below the number of logical processors per NUMA node |
Server with multiple* NUMA nodes | More than 8 logical processors per NUMA node | Keep MAXDOP at 8 | Keep MAXDOP at half the number of logical processors per NUMA node with a MAX value of 16 |
On SQL Server 2016 (13.x) and later versions, the soft-NUMA feature is enabled by default and set to automatically split hardware NUMA nodes with more than 8 logical processors into smaller soft-NUMA nodes.The soft-NUMA feature can be configured so that soft-NUMA nodes are created manually, or it can be completely disabled. For more information, refer to Soft-NUMA (SQL Server).
MaxDoP changes do not require that you restart the SQL Server 2012 or 2019.
SQL Server CTFP values for best performance
Microsoft does not provide recommendations for setting the CTFP.
A common suggestion is setting it to a value between 15 and 50, with the understanding that the best value for it depends on the workload.
For the database workload that the BigFix Server generates, test results show that setting CTFP to 50 yields better performance than leaving it at 5.
CTFP changes do not require that you restart SQL Server 2012 and later versions such as 2019.
Troubleshooting scenario 1: Under-utilization of licensed cores in a VM
Because of a licensing restriction, SQL Server might not be able to use all available CPU cores.
In particular, the license of some SQL Server editions (Express, Web and Standard) is "limited to the lesser of n sockets or m cores."
For example, a database instance of SQL Server 2019 Express Edition is "limited to the lesser of 1 socket or 4 cores" for its maximum compute capacity. See the Scale Limits section of Editions and supported features of SQL Server 2019 (15.x). Because of that, this edition of SQL Server can use only 4 cores on the same socket.
This limitation might lead to unexpected issues when SQL Server is installed on a virtual machine. In fact, using a common VM configuration (that uses many virtual sockets with few cores per socket) can severely limit the number of cores that SQL Server can use, because of the SQL Server license limitations.
In the example, SQL Server 2019 Express can use up to 4 cores, but if it is installed on a VM with 4 cores and 4 sockets (1 core per socket), it can use a single core.
In another example, the SQL Server 2019 Web license lets you use "the lesser of 4 sockets or 16 cores".
If your VM has 16 (virtual) sockets and 1 core per socket, you can only use 4 cores out of 16. That is, you will use 4 sockets with 1 core each.
However, SQL Server can use all cores if you change your VM CPU configuration to use, for example, 4 sockets and 4 cores per socket.
When you create a new VM, carefully choose the number of CPU sockets and the number of total CPU cores.
If you are using the VMware vSphere Client, when creating a VM or editing its settings, you can expand the "CPU" menu of the "Virtual Hardware" tab to configure the number of "cores per socket", which determines the number of sockets.
To detect whether SQL Server is running in a VM, you can run this query and check for the following result:
virtual_machine_type >=
1
.SELECT virtual_machine_type
FROM sys.dm_os_sys_info
select COUNT(*) AS sqlUsedLogicProcs
from sys.dm_os_schedulers
where status = 'VISIBLE ONLINE'
SELECT cpu_count AS LogicalCpuCount
FROM sys.dm_os_sys_info
SELECT SERVERPROPERTY( 'edition' )
Microsoft provides complete list of returned values in SERVERPROPERTY (Transact-SQL). Ignore the Azure values.
Troubleshooting scenario 2: Uneven distribution of used cores
Even when SQL Server can use all licensed cores, performances might not be optimal.
Ideally, SQL Server is licensed to use all cores on the (virtual or physical) hardware of the computer.
If SQL Server cannot use all cores on the computer, the impact on performance is smaller when the cores it can use are evenly distributed among the hardware NUMA nodes of that computer. If SQL Server can use only a small portion of the available cores, which is not ideal, it can use only the cores on the NUMA nodes that host the n sockets that it is licensed to use.
On physical hardware, the number of hardware NUMA nodes usually matches the number of sockets, or, less commonly, is a multiple of it. For example, a physical socket can contain one or more NUMA nodes.
On virtual hardware, the opposite can happen. The number of hardware NUMA nodes can be smaller than the number sockets. That is, multiple sockets can be part of the same hardware NUMA node.
On Windows, you can use Resource monitor (resmon.exe) to check the number of hardware NUMA nodes on your computer.
In the "CPU" tab, the panel on the right shows a graph for each NUMA node and CPU processor.
If the panel shows only CPU graphs, that means there is only one NUMA node that hosts all CPUs.
select COUNT( DISTINCT memory_node_id ) as hwNumaNodes
from sys.dm_os_memory_nodes
where memory_node_id <> 64
In SQL Server 2016 and later, the automatic soft-NUMA feature splits virtual or physical hardware NUMA nodes with more than 8 cores into multiple soft-NUMA nodes. This split does not necessarily solve the performance degradation that unevenly assigned cores to different hardware NUMA nodes causes; it might in fact only mask it.
select COUNT(*) as usedNumaNodes,
MIN(online_scheduler_count) as minUsedLogicProcsPerNumaNode,
MAX(online_scheduler_count) as maxUsedLogicProcsPerNumaNode
from sys.dm_os_nodes
where online_scheduler_count > 0 and node_state_desc not like '%DAC%'
An example scenario is SQL Server Web on a computer with 2 sockets and 20 cores (10 per socket).
In this setup, SQL Server Web can use all the 16 licensed cores out of the total 20 that the system provides.
However, how the 16 used cores are chosen among the total of 20 can make a difference in performance.
Assuming there is a NUMA node for each socket, the used cores might be unevenly distributed, like this example:
- 10 cores in NUMA node 0
- 6 cores in NUMA node 1
This example shows a better distribution of the used cores:
- 8 cores in NUMA node 0
- 8 cores in NUMA node 1
ALTER SERVER
CONFIGURATION SET PROCESS AFFINITY CPU
For more information, refer to the "Setting process affinity" section of this Microsoft article: ALTER SERVER CONFIGURATION (Transact-SQL)
Gathering additional information
To gather additional information, you use BESAdmin and pass the /extrainfo
flag to the /checksqlserverparallelism
command.
Also, you can inspect the SQL Server logs to extract useful details.
If no output is returned, the log lines of interest might be deleted by the log rotation.
SET NOCOUNT ON;
DECLARE @logData TABLE( LogDate DATETIME, ProcInfo NVARCHAR(64), LogText NVARCHAR(1024) );
INSERT INTO @logData
EXEC sys.xp_readerrorlog 0, 1, N'SQL Server detected ', N' socket', null, null, N'DESC';
SELECT TOP 1 [LogText]
FROM @logData;
Example output:
SQL Server detected 1 sockets with 2 cores per socket and 2 logical processors per socket, 2 total logical processors; using 2 logical processors based on SQL Server licensing. This is an informational message; no user action is required.
SET NOCOUNT ON;
DECLARE @logData TABLE( LogDate DATETIME, ProcInfo NVARCHAR(64), LogText NVARCHAR(MAX) );
INSERT INTO @logData
EXEC sys.xp_readerrorlog 0, 1, N'Node configuration: ', N' CPU mask: ', null, null, N'DESC';
SELECT [LogText]
FROM @logData;
LogText
Example output:
Node configuration: node 0: CPU mask: 0x0000000000000003:0 Active CPU mask: 0x0000000000000003:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
If the CPU mask was set manually, you can view it by using SQL Server Management Studio.
Right-click your DB instance, click Properties, and then click Processors.