In my previous blog and this one, there is a lot of talk of trace flags and you can see the life cycle of them. They are implemented and then sometimes become part of the product over time. For a long time, the trace flag 1118 was a common performance improvement trick known only by industry experts. Over the years as a consultant, I have often been asked by clients if they should use trace flags, and generally speaking, as long as they are documented by Microsoft they are safe to use. I would certainly not recommend using undocumented trace flags. They are not supported and therefore not recommended. So always be sure to check the trace flag list before setting trace flags. So of course, as I say that, I have to offer another one.
2016 SP2 CU2 and SQL Server 2017 CU4
Trace Flag 3427
Although technically not just for SQL 2016, this is an important trace flag for 2016 SP2 CU2 and SQL Server 2017 CU4 onward to allow for transaction-level auditing in Common Criteria Compliance (CCC), sometimes referred to as C2 auditing. This can cause overhead in workloads that do extensive inserts and updates in temp tables. The reason this trace flag is important is that the overhead is incurred whether you have CCC enabled or not. By setting the trace code you automatically bypass the CCC code if that feature is disabled. There is no need for the extra overhead for a feature you are not using.
Metadata contention is the contention that occurs on the system objects in tempdb that are used to track temp tables. Since SQL Server 2005 this has been handled by temp table caching. Periodically objects get removed from the cache and the corresponding row from the metadata table is removed. As data gravity has increased, the contention with deletes has grown. With the addition of features like temporal tables and Always Encrypted which required additional metadata, the need to purge the temp table cache was receiving more pressure, either because of deletes or because of memory pressure. The problem was fixed with 3 main changes;
- Synchronous to asynchronous
When a temp table changes within a stored procedure (add an index, column or is explicitly dropped) it used to be deleted immediately. Now it is moved to a deleted list and will be deleted by a memory sweep, the same as other caches in SQL Server. This is for temp tables created within a stored procedure, because SQL Server does not store ad-hoc temp tables.
- Remove more tables on each pass
Originally, the delete list was purged on a 1-1 basis. For every thread that needed to be added to the cache, something had to be deleted from the delete list when the list reached a certain size. This has been adjusted to have only one thread per NUMA node tasked with this, and the helper thread will remove 64 objects at a time rather than 1.
- Optimize latching when scanning meta data
To move records from a page you need exclusive access to that page in memory and not allow anyone else to use it. Prior to 2016 when scanning for meta data, SQL Server was taking an exclusive latch on all pages it searched. As you can imagine it was searching more pages than it was changing so actually taking far more locks than it needed. Now it takes a shared lock only as it searches, and switches to the exclusive lock when it finds the correct page and needs to make the delete. This causes far less locking and blocking.
Please remember this is for temp tables created within a stored procedure, because SQL Server does not store ad-hoc temp tables. This is an interesting feature because it is not available in Azure SQL Database or Azure SQL Managed Instance. For details on how to set it up review the details in Microsoft docs
Concurrent PFS updates
To locate free space for an object, SQL Server used the data on pages called Page Free Space (PFS) pages. Prior to SQL Server 2019 to update these pages required an exclusive lock. However, now they can be updated with a shared latch. Although page latch contention of PFS pages is most commonly associated with tempdb it can all occur when there are concurrent object allocation threads in other databases. This is on by default in all databases as of SQL Server 2019.
Coming in SQL Server 2022 I expect to see continued improvements to Global Allocation Map Updates (GAM) and Concurrent Shared Global Allocations Map Updates (SGAM). Improving concurrent updates and latching when scanning meta data. Taking the next step in fine tuning the tempdb that has been going on in the last few releases.
Even with all these changes, it is still possible to have performance issues. Here are some best practices for tempdb to keep in mind:
- Do not explicitly drop temp tables at the end of a stored procedure. They will get cleaned up when the session that created them ends.
- Do not alter temp tables after they have been created.
- Do not truncate temp tables.
- Move index creation statements on temp tables to the new inline index creation syntax within the create table definition that was introduced in SQL Server 2014