SQL Server – The Basics of Files and File Groups

This post describes the role of Files and File Groups within a SQL Server database. We will look at how they are used within the system and user defined databases.

Files

SQL Server Files, much like documents or music files, are logical containers where the operating system stores and retrieves data for the database. The storage subsystem of the operating system then decides how these logical files are physically stored. This distinction is important as high-performance fault tolerant SQL Server configurations may use redundant disk technologies to distribute a file’s contents across many hard disks.

A single database consists of one primary file, zero or more secondary files and one or more log files. The file extensions SQL Server uses conventionally are .MDF, .NDF and .LDF respectively. Typically, for a new SQL Server database a primary .MDF file and single .LDF file is created for the user as seen in Figure 1.

 

Figure 1. The default layout of files and file groups for a new user database.

MDF is more commonly known as the primary data file. It is believed that it originally stood for the master database file before the master database concept arrived[1]. It contains a database header and system tables [2], and points to other files in the database [3][4]. User objects and relational data can be stored in the primary data file and/or in many secondary files.

NDF is an instance of a secondary data file, N denoting the canonical term for ‘1..N’. Some databases do not need or have any secondary files [5].  Separation of a primary and N secondary data files is primarily useful to allow systematic growth of the database across multiple disks. It is also very useful to increase IO performance, allowing less contention for file locks and allowing multi-threaded access to the data.[6]

LDF is the Log Data File (the transaction log). One or more transaction log files can exist for a database and are used to help disaster recovery scenarios as well as ensure that operations on the data are ACID. [7]

The file extension suffixes are purely conventional and decorative. They could be called .main, .data1, .data2 and .trans or all .data if desired. However, I recommend against such creativity. It could prevent future teams from understanding the content of files.

 

Filegroups

Within SQL Server, a number of files can be grouped into a logical container called a file group. This is the logical concept with which tables, indexes and transaction logs bind to storage,  objects do not bind to MDF/NDF files directly. [8]

Although you have access (at the operating system level) to the data files within a file group, the data will be split between many files in a file group [3], so do not be tempted to back up certain files within the file group only, treat all files within a file group as an atomic entity.

Each database contains at minimum one primary file group and zero or more user-defined file groups.

The Primary File Group contains the primary data file (MDF as mentioned above) and any other files which are not directly assigned a file group. Any system table allocations for the database are performed in the primary file group. [3]

User-Defined File Groups can be created and destroyed by a user at will and are useful for dividing logical separations of data and to help manage growth. They may also be used to group data files which sit on similar disks, or RAID array so that you can partition by speed/fault tolerance.

 Files and File Group usage with system Databases on a SQL Server Instance

In addition to user-defined databases, SQL Server provides the following databases as part of its core engine. These databases are shared resources and actions performed by the many user defined databases will inevitably bottleneck to these core services.

MSDB is a scheduling database for features such as Mail. MSDB can utilize many file groups and files just like a normal database, but has some restrictions. [9]

TempDB is used to process sorts in queries and store temporary tables. It is a highly temporal database. Extra NDF and LDF Files can be added to the primary file group, but extra File Groups cannot.[10]

Model is the default template for all newly created databases including tempdb. Files and file groups cannot be added to this database. They can only be added after creation. [11]

Master contains critical system information such as the list of all other databases on the instance, the logon accounts and any linked servers. Files and File Groups cannot be added to the Master database. [12]

Summary of database NDF, LDF and file group capabilities

Database Extra NDF Extra LDF FileGroup Support?
MSDB Y Y Y
TempDB Y Y N
Model N N N
Master N N N
User Created Databases Y Y Y

 

A new installation of SQL Server contains the following system databases and files. All mdf files are created in the PRIMARY filegroup for each database. Log files do not live in filegroups.

 

Figure 2 The default configuration of a new SQL Server Instance (SQL Server 2005 used).

Although installing a SQL Server instance and creating a user database with default settings provides a good starting point for beginners, it is far from optimal. As the database and performance demands increase, SQL Server needs to be reconfigured to utilize the storage platform effectively. Specifically, creating more data files to load balance the system and help manage growth, to separate log file, tempdb, main data files, and even to consider which data is mostly read/read write.

In a later post I will describe how to use files and file groups to increase server performance. In many cases, configuring files and file groups to provide a speed boost is only sensible when the server is starved of RAM, CPU or disk bandwidth.

Citations

[1] “.mdf or .ndf”. SQL Server Developer Center. Michael Hotek. June 26th 2006. (http://social.msdn.microsoft.com/forums/en-US/transactsql/thread/a4aee15c-0142-43e4-99e2-4c790e04736c/)

[2] “SQL Server 2000 Backup and Restore”. Microsoft Technet. Microsoft. March 1st 2005. (http://technet.microsoft.com/en-us/library/cc966495.aspx)

[3] “Understanding Files and Filegroups (SQL Server 2008 R2)”. Microsoft Developer Network. No date. (http://msdn.microsoft.com/en-us/library/ms189563.aspx)

[4] “Files and Filegroups Architecture. SQL Server 2008 R2”. Microsoft Developer Network. No date. (http://msdn.microsoft.com/en-us/library/ms179316.aspx).

[5] “Physical Database Files and File Groups (SQL Server 2000)”. Microsoft Developer Network. January 2004. (http://msdn.microsoft.com/en-us/library/aa174545(v=sql.80).aspx).

[6] “Microsoft SQL Server 2000 Unleashed”. Ray Rankins, Paul Jensen, Paul Bertucci. Sams Publishing. December 18th 2002.

[7] ACID topic. Wikipedia. (http://en.wikipedia.org/wiki/ACID).

[8] CREATE TABLE (SQL Server 2000). Microsoft Developer Network. July 2003. (http://msdn.microsoft.com/en-us/library/aa258255(v=sql.80).aspx)

[9] msdb Database (SQL Server 2008 R2). Microsoft Developer Network. No date. (http://msdn.microsoft.com/en-us/library/ms187112.aspx).

[10] tempdb Database (SQL Server 2008 R2). Microsoft Developer Network. No date. (http://msdn.microsoft.com/en-us/library/ms190768.aspx).

[11] model Database (SQL Server 2008 R2). Microsoft Developer Network. No date.

(http://msdn.microsoft.com/en-us/library/ms186388.aspx)

[12] master Database (SQL Server 2008 R2). Microsoft Developer Network. No date.

(http://msdn.microsoft.com/en-us/library/ms187837.aspx)

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s