
SQL Server – TempDB How many files?

As explained by Paul Randal; the scripts referenced below are his too.

Step 1: Make sure this information isn’t out of date!

Step 2: Detect whether your server has a problem with tempdb contention by running a script (Paul's post has the full version; a minimal sketch follows these steps).

Step 3: If you have contention: with 8 or fewer cores, ensure you have as many tempdb data files as you have cores; with more than 8 cores, follow a growth formula and test the impact of the change (see Paul's blog above). Note: you do not need more log files.

Step 4: Make sure all tempdb files are the same size.

Step 5: Make sure your tempdb files are sized for your typical operational workload; tempdb is recreated at its configured initial size whenever the instance restarts, so that initial size needs to be right.
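For step 2, Paul's post has the full script; as a minimal sketch of the idea (simplified, and assuming you have VIEW SERVER STATE permission), you can watch for PAGELATCH waits against tempdb (database ID 2), which are the signature of allocation-bitmap contention:

-- Minimal sketch: live PAGELATCH waits on tempdb pages suggest allocation contention.
-- PAGELATCH (not PAGEIOLATCH) means the contention is on in-memory pages,
-- typically the PFS/GAM/SGAM allocation bitmaps.
SELECT
    owt.session_id,
    owt.wait_type,
    owt.wait_duration_ms,
    owt.resource_description   -- dbid:fileid:pageid, e.g. 2:1:1 (a PFS page)
FROM sys.dm_os_waiting_tasks AS owt
WHERE owt.wait_type LIKE N'PAGELATCH%'
  AND owt.resource_description LIKE N'2:%';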

 


SQL Server – Detection of Implicit Conversions

These notes were taken whilst watching the fabulous PluralSight course “Why Physical Database Design Matters: Datatypes and Query Performance” by Kimberly Tripp.

 

TL;DR: Writing SQL statements which compare or join columns of different data types can cause performance problems. To locate when this happens, you can use two scripts: one to find implicit conversions in comparisons/ordering, and one to find probe-residual conversions, typically seen in joins.

Extra content in this blog post has come from Jonathan Kehayias' blog post Implicit Conversions that Cause Index Scans and Rob Farley's blog.

 

Implicit Conversions

When an SQL statement compares char, nvarchar, varchar and some datetime types with each other, SQL Server may be prevented from using an index and may have to convert the data in every row to perform the operation. This chart shows the danger zones in yellow: the combinations where the conversion causes an index scan rather than a more benign outcome (no conversion, an index seek, or an obvious failure). The results can also differ depending on your collation settings. In general: do not allow users to write ad-hoc statements, and make sure statements use the correct datatypes!
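As a hedged illustration (hypothetical table and index names), here is the classic trap: an nvarchar literal compared against an indexed varchar column. Depending on your collation, the column side may be converted to nvarchar, turning an index seek into a scan:

CREATE TABLE Customers ( LastName varchar(40) NOT NULL )
CREATE INDEX ix_LastName ON Customers (LastName)

-- The N'...' literal is nvarchar, so the varchar column may be implicitly
-- converted (collation dependent), preventing an index seek.
SELECT LastName FROM Customers WHERE LastName = N'Middleton'

-- Matching the datatype avoids the conversion entirely.
SELECT LastName FROM Customers WHERE LastName = 'Middleton'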

The blog post Finding Implicit Column Conversions in the Plan Cache details a script to help you locate any conversions that have taken place recently in your plan cache. In testing, I've also seen conversions from int to bigint reported by this script.
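The script in that post inspects the XML of cached plans; a minimal sketch of the approach (simplified from Jonathan's version, so treat it as illustrative rather than production-ready) looks like this:

-- Sketch: find cached plans containing CONVERT_IMPLICIT expressions.
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT cp.usecounts, qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
WHERE qp.query_plan.exist('//ScalarOperator[contains(@ScalarString, "CONVERT_IMPLICIT")]') = 1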

 

Residual conversions

“Probe Residual” is the name for conversions which happen implicitly during join execution (a converted column value is used in the join's residual predicate). This is usually only a problem when foreign-key relationships are not enforced as part of the integrity model (e.g. soft keys) and the converted column's table has a large number of rows.

Sometimes these appear in the SQL Server execution plan as scalar computations immediately before the join. If you hover over the join operator and see “Probe Residual”, you'll know it has happened. However, there's no need to dig into each query: you can just download and execute this script to find any probe-residual conversions – thanks Kimberly! Try joining an int column to a bigint column and you'll get a hit from this script, as in the sketch below.
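Here is a hedged repro of that int/bigint join (hypothetical tables; the exact operator depends on the join strategy the optimizer picks):

CREATE TABLE Orders ( OrderID int NOT NULL PRIMARY KEY )
CREATE TABLE Refunds ( RefundID int NOT NULL PRIMARY KEY, OrderID bigint NOT NULL )

-- The int side must be implicitly converted to bigint to evaluate the join;
-- with a hash join the conversion surfaces as a "Probe Residual" on the operator.
SELECT r.RefundID
FROM Orders AS o
JOIN Refunds AS r ON o.OrderID = r.OrderID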

 

SQL Myth: The NULL bitmap is only added to a record when it’s got nullable columns

In my previous post SQL How To: Dump the contents of a SQL Page, I mentioned right at the end that the record has a null bitmap, even though our table did not contain any nullable columns.

We can prove this further by creating a variant of our original table with a nullable column, and another with the same column non-nullable to see the difference.

CREATE TABLE PersonWithNullable(
FirstName char(40),
LastName char(40),
NickName char(40) NULL,
Age int )

INSERT INTO PersonWithNullable VALUES ('Giles','Middleton', NULL, 40 )

EXEC sp_AllocationMetadata 'PersonWithNullable'

Results in:

PersonWithNullable

And when we inspect the page null bitmap, it yields (our database is called ‘test’):

DBCC TRACEON (3604);
GO

DBCC PAGE('test', 1,179,3)

Results in
PersonWithNullablePageDump

 

The trailing bytes 0400 04 are actually 0x0004 (the column count, 4, stored in little-endian order) followed by the null bitmap 04 (00000100).

The third bit of the bitmap corresponds to column 3 (NickName), indicating that it is NULL.

 

Watching the bits go by

To demonstrate this nullable bit pattern, let us create a table of all-nullable columns and fill it with different patterns:

CREATE TABLE PersonWithAllNullable(
FirstName char(40) NULL,
LastName char(40) NULL,
NickName char(40) NULL,
Age int NULL )

INSERT INTO PersonWithAllNullable VALUES (NULL,'Middleton', 'Gilesey' ,40 )

INSERT INTO PersonWithAllNullable VALUES ('Bob',NULL, 'Bobby' ,41 )

INSERT INTO PersonWithAllNullable VALUES ('Matt','Didd', NULL ,42 )

INSERT INTO PersonWithAllNullable VALUES ('Simon','Hedgeson', 'Hedgey', NULL )

After using our stored procedure and DBCC PAGE we saw…

1FirstNameNull 2LastNameNull 3NickNameNull 4AgeNull

(One page dump per row – each shows a different single bit set in the null bitmap.)

Let’s finish off our test

To make sure I wasn't imagining things, let's get rid of the nullable columns and see if the bitmap byte is still there.

CREATE TABLE PersonWithNoNullable(
FirstName char(40),
LastName char(40),
NickName char(40),
Age int )

INSERT INTO PersonWithNoNullable VALUES ('Giles','Middleton', 'Gilesey', 40 )

After using our stored procedure and DBCC PAGE we saw…

5FinalProofnullbitmap

 

Yep, an extra byte we don’t need!

 

The final twist

A table with 9 nullable columns will cause an extra byte to be added to the row size just for the null bitmap, even if we never store a NULL.

CREATE TABLE NineCols(
c1 tinyint,
c2 tinyint,
c3 tinyint,
c4 tinyint,
c5 tinyint,
c6 tinyint,
c7 tinyint,
c8 tinyint,
c9 tinyint)

INSERT INTO NineCols VALUES (0xff,0xfe,0xaa,0xbb,0xcc,0xdd,0x22,0x33,0xee)

EXEC sp_AllocationMetadata 'NineCols'

Resulted in…

6PeskyByte

 

That’s an extra byte. For our particular record, that’s quite an overhead.

So the lesson there, kids: if you are about to exceed 8, 16, 24, 32 (etc.) columns, you'll add another byte per row for null bitmap storage, even if you do not use NULLs. A quick way to verify this is sketched below.
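A minimal sketch to see the boundary yourself (hypothetical table, same technique as before): an 8-column twin of NineCols should report a record 2 bytes smaller – one byte less data and one byte less null bitmap.

CREATE TABLE EightCols(
c1 tinyint, c2 tinyint, c3 tinyint, c4 tinyint,
c5 tinyint, c6 tinyint, c7 tinyint, c8 tinyint)

INSERT INTO EightCols VALUES (1,2,3,4,5,6,7,8)

-- Expected: 4-byte header + 8 bytes data + 2-byte column count + 1 bitmap byte = 15,
-- versus 17 for NineCols (9 bytes of data + 2 bitmap bytes).
SELECT avg_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(DB_ID(N'test'),
OBJECT_ID(N'EightCols'), NULL, NULL, 'DETAILED')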

http://www.sqlskills.com/blogs/paul/misconceptions-around-null-bitmap-size/

SQL How To: Dump the contents of a SQL Page

This technique uses an excellent stored procedure developed by Paul S. Randal, which I've copied here for safe keeping; below I describe how to use it in my own words. Be sure to check out his articles – they are insightful to say the least.

http://www.sqlskills.com/blogs/paul/inside-the-storage-engine-sp_allocationmetadata-putting-undocumented-system-catalog-views-to-work/

 

After installing the stored procedure below, called sp_AllocationMetadata, and allowing it to be executed from any database by marking it as a system object, you can simply call it for a given table to extract a raw view of the page. Not entirely useful every day, but fun.

CREATE TABLE Person(
FirstName char(40),
LastName char(40) )

INSERT INTO Person VALUES ('Giles','Middleton')

EXEC sp_AllocationMetadata 'Person'

Which results in this bad boy:

AllocationMetaData

The two underlined numbers (the file ID and the page ID) are then fed into the DBCC PAGE tool:

http://blogs.msdn.com/b/sqlserverstorageengine/archive/2006/06/10/625659.aspx

http://support.microsoft.com/kb/83065

Notice that we pass in the database name, not the table name. We also have to enable trace flag 3604 in order to see the output.

DBCC TRACEON (3604);
GO

DBCC PAGE ('test', 1, 175, 3);

Et voilà – a memory dump!

SimplePageOutput

As you can see, there is our data, and some extra bytes before and after it.

The first 4 bytes are record-header metadata, followed by the data, with the record ending in the bytes 02 00 00.

02 00 is little-endian for 0x0002, or 2, which is the number of columns.

The remaining byte is the null bitmap. None of our values are NULL, so it's a byte of zeros.

 

For more information on the data spewed out, see Paul’s other post. http://www.sqlskills.com/blogs/paul/inside-the-storage-engine-anatomy-of-a-page/

And

http://www.sqlskills.com/blogs/paul/inside-the-storage-engine-anatomy-of-a-record/

 

 

Stored procedure – sp_AllocationMetadata

USE master;
GO

 

IF OBJECT_ID ('sp_AllocationMetadata') IS NOT NULL
   DROP PROCEDURE sp_AllocationMetadata;
GO

 

CREATE PROCEDURE sp_AllocationMetadata
(
    @object VARCHAR (128) = NULL
)
AS
-- Decode the 6-byte page pointers in the system views into (file:page) form.
SELECT
    OBJECT_NAME (sp.object_id) AS [Object Name],
    sp.index_id AS [Index ID],
    sa.allocation_unit_id AS [Alloc Unit ID],
    sa.type_desc AS [Alloc Unit Type],
    '(' + CONVERT (VARCHAR (6),
        CONVERT (INT, SUBSTRING (sa.first_page, 6, 1) +
            SUBSTRING (sa.first_page, 5, 1))) +
    ':' + CONVERT (VARCHAR (20),
        CONVERT (INT, SUBSTRING (sa.first_page, 4, 1) +
            SUBSTRING (sa.first_page, 3, 1) +
            SUBSTRING (sa.first_page, 2, 1) +
            SUBSTRING (sa.first_page, 1, 1))) +
    ')' AS [First Page],
    '(' + CONVERT (VARCHAR (6),
        CONVERT (INT, SUBSTRING (sa.root_page, 6, 1) +
            SUBSTRING (sa.root_page, 5, 1))) +
    ':' + CONVERT (VARCHAR (20),
        CONVERT (INT, SUBSTRING (sa.root_page, 4, 1) +
            SUBSTRING (sa.root_page, 3, 1) +
            SUBSTRING (sa.root_page, 2, 1) +
            SUBSTRING (sa.root_page, 1, 1))) +
    ')' AS [Root Page],
    '(' + CONVERT (VARCHAR (6),
        CONVERT (INT, SUBSTRING (sa.first_iam_page, 6, 1) +
            SUBSTRING (sa.first_iam_page, 5, 1))) +
    ':' + CONVERT (VARCHAR (20),
        CONVERT (INT, SUBSTRING (sa.first_iam_page, 4, 1) +
            SUBSTRING (sa.first_iam_page, 3, 1) +
            SUBSTRING (sa.first_iam_page, 2, 1) +
            SUBSTRING (sa.first_iam_page, 1, 1))) +
    ')' AS [First IAM Page]
FROM
    sys.system_internals_allocation_units AS sa,
    sys.partitions AS sp
WHERE
    sa.container_id = sp.partition_id
    AND sp.object_id =
        (CASE WHEN (@object IS NULL)
            THEN sp.object_id
            ELSE OBJECT_ID (@object)
        END);
GO

EXEC sys.sp_MS_marksystemobject sp_AllocationMetadata;
GO

 

 

 

 

SQLServer Myth: NULL columns use no space

TL;DR

A table with NULL columns uses exactly the same amount of space as one with NOT NULL columns. Even indexes on NULL columns are no smaller.

If you really need to save space, the main option is simple: do not let a designer or developer insist on having a column exist in a table ‘just in case' it is needed, especially if that table is likely to store millions of records. Yes, “disk space is cheap”, but when a system is heavily used and critical, memory, backups, CPU, transfer times and network traffic start to matter. The more rows you can fit on a SQL page, the better.

Here’s the proof:
Create a new database called test, and run this SQL:
USE [test]
GO
CREATE TABLE tableNulls17SizeTest (
int1 INT NULL, int2 INT NULL, int3 INT NULL, int4 INT NULL, int5 INT NULL,
int6 INT NULL, int7 INT NULL, int8 INT NULL, int9 INT NULL, int10 INT NULL,
int11 INT NULL, int12 INT NULL, int13 INT NULL, int14 INT NULL, int15 INT NULL,
int16 INT NULL, int17 INT NULL )
GO

SET NOCOUNT ON

INSERT INTO tableNulls17SizeTest VALUES(
NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,
NULL,NULL,NULL,NULL,NULL )
GO 1000000

After 6 minutes….
sp_spaceused 'tableNulls17SizeTest'
SELECT page_count, record_count, avg_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(DB_ID(N'test'),
OBJECT_ID(N'tableNulls17SizeTest'), NULL, NULL, 'DETAILED')
GO

Results in:

NullColumnsUseSpace

And if you check the file system, the .mdf file grows to 84 MB.

Now, to test that the table takes the same space when filled with real data, recreate the blank ‘test' database, then execute this:

USE [test]
GO

CREATE TABLE tableNoNulls17SizeTest (
int1 INT NOT NULL, int2 INT NOT NULL, int3 INT NOT NULL, int4 INT NOT NULL, int5 INT NOT NULL,
int6 INT NOT NULL, int7 INT NOT NULL, int8 INT NOT NULL, int9 INT NOT NULL, int10 INT NOT NULL,
int11 INT NOT NULL, int12 INT NOT NULL, int13 INT NOT NULL, int14 INT NOT NULL, int15 INT NOT NULL,
int16 INT NOT NULL, int17 INT NOT NULL )
GO
SET NOCOUNT ON
INSERT INTO tableNoNulls17SizeTest VALUES(
RAND()*10000000,RAND()*10000000,RAND()*10000000,RAND()*10000000,
RAND()*10000000,RAND()*10000000,RAND()*10000000,RAND()*10000000,
RAND()*10000000,RAND()*10000000,RAND()*10000000,RAND()*10000000,
RAND()*10000000,RAND()*10000000,RAND()*10000000,RAND()*10000000,
RAND()*10000000 )
GO 1000000

sp_spaceused 'tableNoNulls17SizeTest'

SELECT page_count, record_count, avg_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(DB_ID(N'test'),
OBJECT_ID(N'tableNoNulls17SizeTest'), NULL, NULL, 'DETAILED')
GO

And the result….

CollumnsNonNullWithValues

I also tested the size of indexes with respect to nullable columns

CREATE NONCLUSTERED INDEX ix1 ON tableNulls17SizeTest ( int1,int2 )

CREATE NONCLUSTERED INDEX ix1 ON tableNoNulls17SizeTest ( int1,int2 )

And they had the same size on disk once built.

SQL Server – The Basics of Files and File Groups

This post describes the role of Files and File Groups within a SQL Server database. We will look at how they are used within the system and user defined databases.

Files

SQL Server files, much like documents or music files, are logical containers where the operating system stores and retrieves data for the database. The storage subsystem of the operating system then decides how these logical files are physically stored. This distinction is important, as high-performance, fault-tolerant SQL Server configurations may use redundant disk technologies to distribute a file's contents across many hard disks.

A single database consists of one primary file, zero or more secondary files and one or more log files. The file extensions SQL Server conventionally uses are .MDF, .NDF and .LDF respectively. Typically, for a new SQL Server database, a primary .MDF file and a single .LDF file are created for the user, as seen in Figure 1.

 

Figure 1. The default layout of files and file groups for a new user database.

MDF is more commonly known as the primary data file. It is believed that it originally stood for the master database file before the master database concept arrived[1]. It contains a database header and system tables [2], and points to other files in the database [3][4]. User objects and relational data can be stored in the primary data file and/or in many secondary files.

NDF is an instance of a secondary data file, the N denoting ‘1..N'. Some databases do not need or have any secondary files [5]. Separating the primary and N secondary data files is primarily useful to allow systematic growth of the database across multiple disks. It can also increase IO performance by reducing contention for file locks and allowing multi-threaded access to the data. [6]

LDF is the log data file (the transaction log). One or more transaction log files can exist for a database; they support disaster-recovery scenarios and help ensure that operations on the data are ACID. [7]

The file extension suffixes are purely conventional and decorative. They could be called .main, .data1, .data2 and .trans, or all .data, if desired. However, I recommend against such creativity, as it could prevent future teams from understanding the content of the files.
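As a hedged illustration (hypothetical names and paths), a database with one primary file, one secondary file and one log file could be created like this. Note the .ndf lands in the default PRIMARY file group because no other file group is specified:

CREATE DATABASE [test]
ON PRIMARY
( NAME = N'test_data', FILENAME = N'C:\SQLData\test_data.mdf' ),
( NAME = N'test_data2', FILENAME = N'C:\SQLData\test_data2.ndf' )
LOG ON
( NAME = N'test_log', FILENAME = N'C:\SQLData\test_log.ldf' )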

 

Filegroups

Within SQL Server, a number of files can be grouped into a logical container called a file group. This is the logical concept with which tables and indexes bind to storage; objects do not bind to MDF/NDF files directly [8]. (Log files are the exception: as noted below, they do not belong to any file group.)

Although you have access (at the operating-system level) to the data files within a file group, the data will be split between the files in the file group [3], so do not be tempted to back up only certain files within a file group; treat all files within a file group as an atomic entity.

Each database contains at minimum one primary file group and zero or more user-defined file groups.

The Primary File Group contains the primary data file (the MDF mentioned above) and any other files which are not directly assigned to a file group. Any system-table allocations for the database are performed in the primary file group. [3]

User-Defined File Groups can be created and destroyed by a user at will and are useful for dividing logical separations of data and helping to manage growth. They may also be used to group data files which sit on similar disks or RAID arrays, so that you can partition by speed/fault tolerance. A sketch follows.
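A minimal sketch (hypothetical names and paths) of creating a user-defined file group, adding a file to it, and placing a table on it:

ALTER DATABASE [test] ADD FILEGROUP [FG_Archive]

ALTER DATABASE [test] ADD FILE
( NAME = N'test_archive1',
  FILENAME = N'D:\SQLData\test_archive1.ndf',
  SIZE = 64MB, FILEGROWTH = 64MB )
TO FILEGROUP [FG_Archive]

-- Tables and indexes bind to the file group, not to individual files.
CREATE TABLE ArchivedOrders ( OrderID int NOT NULL ) ON [FG_Archive]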

Files and File Group usage with system Databases on a SQL Server Instance

In addition to user-defined databases, SQL Server provides the following databases as part of its core engine. These databases are shared resources, and actions performed by the many user-defined databases will inevitably funnel through these core services.

MSDB is a scheduling and support database, used by features such as SQL Server Agent and Database Mail. MSDB can use multiple file groups and files just like a normal database, but has some restrictions. [9]

TempDB is used to process sorts in queries and to store temporary tables. Its contents are highly transient, and it is recreated every time the instance restarts. Extra NDF and LDF files can be added to its primary file group, but extra file groups cannot [10]. For example:
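A hedged example (hypothetical path; size and growth should be tuned to your workload) of adding an extra tempdb data file:

ALTER DATABASE [tempdb] ADD FILE
( NAME = N'tempdev2',
  FILENAME = N'T:\SQLData\tempdb2.ndf',
  SIZE = 256MB, FILEGROWTH = 64MB )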

Model is the template for all newly created databases, including tempdb. Files and file groups cannot be added to the model database itself; they can only be added to a derived database after it has been created. [11]

Master contains critical system information such as the list of all other databases on the instance, the logon accounts and any linked servers. Files and File Groups cannot be added to the Master database. [12]

Summary of database NDF, LDF and file group capabilities

Database                 | Extra NDF | Extra LDF | File Group Support
MSDB                     | Y         | Y         | Y
TempDB                   | Y         | Y         | N
Model                    | N         | N         | N
Master                   | N         | N         | N
User Created Databases   | Y         | Y         | Y

 

A new installation of SQL Server contains the following system databases and files. All MDF files are created in the PRIMARY file group of each database; log files do not live in file groups.

 

Figure 2 The default configuration of a new SQL Server Instance (SQL Server 2005 used).

Although installing a SQL Server instance and creating a user database with default settings provides a good starting point for beginners, it is far from optimal. As the database grows and performance demands increase, SQL Server needs to be reconfigured to use the storage platform effectively: creating more data files to load-balance the system and manage growth, separating the log files, tempdb and main data files, and even considering which data is mostly read versus read-write.

In a later post I will describe how to use files and file groups to increase server performance. In many cases, configuring files and file groups to provide a speed boost is only sensible when the server is starved of RAM, CPU or disk bandwidth.

Citations

[1] “.mdf or .ndf”. SQL Server Developer Center. Michael Hotek. June 26th 2006. (http://social.msdn.microsoft.com/forums/en-US/transactsql/thread/a4aee15c-0142-43e4-99e2-4c790e04736c/)

[2] “SQL Server 2000 Backup and Restore”. Microsoft Technet. Microsoft. March 1st 2005. (http://technet.microsoft.com/en-us/library/cc966495.aspx)

[3] “Understanding Files and Filegroups (SQL Server 2008 R2)”. Microsoft Developer Network. No date. (http://msdn.microsoft.com/en-us/library/ms189563.aspx)

[4] “Files and Filegroups Architecture. SQL Server 2008 R2”. Microsoft Developer Network. No date. (http://msdn.microsoft.com/en-us/library/ms179316.aspx).

[5] “Physical Database Files and File Groups (SQL Server 2000)”. Microsoft Developer Network. January 2004. (http://msdn.microsoft.com/en-us/library/aa174545(v=sql.80).aspx).

[6] “Microsoft SQL Server 2000 Unleashed”. Ray Rankins, Paul Jensen, Paul Bertucci. Sams Publishing. December 18th 2002.

[7] "ACID". Wikipedia. (http://en.wikipedia.org/wiki/ACID)

[8] "CREATE TABLE (SQL Server 2000)". Microsoft Developer Network. July 2003. (http://msdn.microsoft.com/en-us/library/aa258255(v=sql.80).aspx)

[9] "msdb Database (SQL Server 2008 R2)". Microsoft Developer Network. No date. (http://msdn.microsoft.com/en-us/library/ms187112.aspx)

[10] "tempdb Database (SQL Server 2008 R2)". Microsoft Developer Network. No date. (http://msdn.microsoft.com/en-us/library/ms190768.aspx)

[11] "model Database (SQL Server 2008 R2)". Microsoft Developer Network. No date. (http://msdn.microsoft.com/en-us/library/ms186388.aspx)

[12] "master Database (SQL Server 2008 R2)". Microsoft Developer Network. No date. (http://msdn.microsoft.com/en-us/library/ms187837.aspx)