join the MSSQLTips community

MSSQLTips.com - your daily source for SQL Server tips

Google
 
Web mssqltips.com

 
Fragmentation Station - Stop #1 - Storage basics and Access Methods - Chad Boyd

MSSQLTips

MSSQLTips.com - your daily source for SQL Server tips
Welcome to MSSQLTips Sign in | Join | Help
in Search

Chad Boyd

MSSQLTips - SQL Server Blog

Fragmentation Station - Stop #1 - Storage basics and Access Methods

Lots of times I get customers and non-customers talking about fragmentation - everything from what it is, to how it impacts performance, to what objects can be fragmented, to how to check for fragmentation. Quite often (almost always) the discussion inevitably includes lots of points that are not 100% accurate, or open to interpretation depending on what exactly someone is referring to using a particular terminology.  There are also quite a few misnomers, or myths even, that exist regarding fragmentation and related topics.

So, let's try to understand some of these different things. In this and a few related posts, I'm going to try and discuss some of the following topics:

  1. What is fragmentation, and what are the different types of fragmentation?
  2. What objects can be fragmented, and in what ways can they become fragmented?
  3. How can I avoid fragmentation proactively?
  4. How does fragmentation affect performance and in what scenarios will I see an impact (and what scenarios will I NOT see impact)?
  5. How can I check for fragmentation, including differentiating the types of fragmentation? ?
  6. Do I need to worry about fragmentation, and if so, how should I address it?

Inevitably (must be my word of the day, 2nd time I've used it in this post), along the course of discussing this topic, we'll have to hit on some other topics such as:

  1. Storage characteristics
  2. Clusters, Heaps, B-Trees, etc.
  3. Access Methods
  4. etc.

So, let's get this party started...in this post, we're going to discuss some basics that need to be at least touched on in terms of storage and access methods. aI'm going to provide minimal detail here, since you can easily get much more detailed information elsewhere.

Pages

A page in Sql Server is the primary storage structure for all data (data, indexes, BLOB/CLOB/LOB, row-overflow/SLOB, etc.). Pages are 8k in size and store records in no particular order (row offset is used to find each logical record on the page) - a very basic structure is as follows:

image

Extents

An Extent in Sql Server is the primary allocation structure (i.e. what is allocated to objects when they request more space to store stuff). Extents are made up of 8 contiguous pages (64k) and can be either shared or uniform in nature. Yes, this means anytime new storage space is needed for data, or an index, or a heap, it is allocated a single, full, contiguous extent (8 contiguous pages). The only exceptions to this are when space is allocated for a new object and the 1118 trace flag is not enabled, or when you have the -E startup option enabled, or if space is being allocated for a very small object that is less than ~64k in size.  We'll discuss the -E startup option in a later blog and how it can be helpful, for information on the 1118 trace flag, see this KB article: http://support.microsoft.com/kb/328551. We aren't going to discuss the differences between shared and uniform extents, but a simple read in Books Online will get you information to cover that.

HoBT

HoBT (pronounced "hobbit") stands for "Heap or B-Tree" and is the logical name for how all user data is stored in Sql Server (either in a heap or a B-tree).

Heap

A heap is basically a bunch of pages with no particular structure (aside from that of a page/extent), no order, no linkage. If you have a table that has no Clustered Index defined on it, then the data for that table is stored in a heap. Heaps are logically a flat structure (i.e. they are made up of only data pages, no root/intermediate pages). Given that there is no order, structure, or page linkage in a heap, there is no direct support in a heap for singleton lookups or range scans.

B-Tree

A B-Tree (balanced-tree) is a balanced, structured object, and it is used for all indexes in Sql Server (Clustered, Nonclustered, XML, etc.). Think of a B-Tree as a triangle-shaped structure of pages - it has a single root page at the top of the triangle with pointers to the next level down the triangle (which is wider than the root level by the number of pages that the root page can contain pointers to), each of those pages contain pointers to the next level, and so on (these are called intermediate pages) down the triangle until you get to the bottom of the triangle, which is the widest and referred to as the 'leaf' level of the structure.  In ASCII x marks, a B-tree would look like the following if it contained 2 intermediate levels where each page in the root and intermediate levels could each could point to 3 other pages:

                 x
                xxx
            xxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxx

What resides at the 'leaf' level of the structure depends on what type of index the B-tree supports. If it is a clustered index, then the leaf level of the index contains all the data for all the columns for all the rows of the table; if it is a non-clustered index on a clustered table it contains the non-clustered index keys, the included column values, and the keys for the clustered index (uses those as a pointer to the rest of the data); if it is a non-clustered index on a heap table, it contains the non-clustered index keys, the included column values, and the physical row-id value for the record in the heap (uses this to get a pointer to the rest of the data).

Data at the leaf-level of a B-Tree is logically ordered in order of the index key - this applies to both clustered and non-clustered indexes (ever heard that the data is physically sorted this way? not the case...this is exactly what logical fragmentation is, the logical ordering of pages in an index not matching the physical layout within the file). This logical ordering is achieved via the "row offset"/"slot array" for records on a page, and via a doubly-linked list for pages in an index. A doubly-linked list is basically a pointer in the header of each page that points to the prior logical page and the next logical page in the index. We'll discuss this and how it relates to fragmentation more later...

Seek

A seek is an efficient access method that can be fulfilled using an index structure. Seeking touches fewer pages than scanning, and can only occur on an index of some type. Think of a seek as what you would do with a phone book if I told you to find the phone number of someone with last name 'Boyd' and the first name 'Chad'...

Scan

A scan is basically an access method whereby all data, or some range of data, in an index or heap must be touched or retrieved in order to fulfill a request. Think of a scan as what you would do with a phone book if I told you to find the phone numbers for everyone in the book, or the phone number for all people with the first name 'Chad', or for someone with a phone number of (555) 555-5555...

Singleton Lookup

A singleton lookup is a seek operation whereby a single record/page of data is retrieved. This can occur for example when you perform a query that needs to access only a single record, or a few records from a single page. The navigation through a B-tree in a singleton lookup would be similar to the following diagram:

image

Full Scan - key ordered

A key-ordered full scan is a scan operation whereby all the leaf pages of a given B-tree structure are retrieved by following the page linkage (i.e. starting with the first page in the leaf level and following to the next page via the doubly-linked list, and so on and so on). Note that this type of scan can only be performed on a B-tree structure and NOT against a heap. This occurs for example when you perform a query that must return all the rows and columns in a clustered table, or only a few rows but the filter predicate has no index to seek with, or against a non-clustered index if you want all rows for a table but only values from columns that are covered by the non-clustered index. (navigational diagram follows under Range Scan)...

Range Scan

A range-scan is a scan operation whereby a range of pages are retrieved - think of this as a full-scan with boundaries. The navigation through a B-tree in a range scan would be similar to the following diagram (a full scan operation would be exactly the same, only the entire leaf would be scanned instead of just a portion of it):

image

Full Scan - Allocation Ordered

An allocation ordered full scan is the same as a key ordered full scan, only instead of scanning the data via page linkage, the pages from the leaf of the index (or from a heap) are scanned via page ID values taken from system pages that track which pages are allocated to the given index/heap (pages that track this are called 'IAM' pages). Basically all the physical page ID values are gathered for pages allocated to the given index/heap, and then those pages are retrieved directly - even in a B-tree structure, the root/intermediate pages for the index would not be touched at all (no reason to do so, since the physical page ID values take you directly to the leaf pages needed). This is the ONLY type of operation that is directly supported against a heap (though seeks and range-scans can occur indirectly via non-clustered indexes that are built against the heap table).  The navigation through a B-tree in this type of scan would be similar to the following (notice that the root/intermediate levels aren't touched)...navigation through a heap would be similar as well, only a heap wouldn't include root/intermediate pages at all in the structure:

image 

Read-Ahead

Read-ahead is an access method mechanism that attempts to intelligently pre-fetch data that resides on-disk into the data cache prior to it being needed for use by the CPU - this is done to try and optimize the IO throughput of a system in order to keep the CPU as busy as possible without waiting on the slower IO system to get needed data.  This type of operation is typically reserved for scans of data that fetch medium/large amounts of pages/data, and can issue 1,8,32, or 64 page IOs in a single IO operation - the size of operation that can be performed is very heavily impacted by the contiguity of the data (the more contiguous the data, the bigger the IOs that can performed, the better performance you see).  There are 2 kinds of read-ahead operations, 1 that works with allocation ordered scans, and 1 that works with key-ordered scans.

With allocation ordered scan, the IAM pages in the database list the extents used by a heap or index - the engine will read the IAM and build a sorted list of the physical disk addresses that must be read, allowing the engine to optimize it's IOs as large sequential read performed in sequence based on their location on disk, vs. multiple small, random reads.

With key-ordered scans, the engine uses the information stored in the intermediate index pages 1 level above the leaf level to schedule serial read-aheads for pages that contain the keys found.  If a request is made for example for all keys from 1 to 100, the engine will first read the index page above the leaf page for key 1 (on it's way to traversing to the leaf page); however, instead of simply reading each leaf page in sequence from page 1 to page 100, the engine scans the intermediate level page and builds a list of the leaf pages that must be read to get pages 1 thru 100, then schedules all reads in key order - in addition, the engine will recognize if pages are contiguous and perform a single read to retrieve the adjacent pages in a single operation as opposed to multiple, smaller operations. A similar type of operation is used to pre-fetch data from a base cluster or heap when scanning through a non-clustered index - since the leaf rows of a non-clustered index contain pointers to the data rows in the cluster/heap structure, as the storage engine reads through the leaf of the non-clustered index, it also starts scheduling asynchronous reads for the corresponding data rows whose pointers have already been retrieved. This allows the engine to fetch data efficiently from the underlying cluster/heap before it has completed the scan of the non-clustered index.

The navigation for a read-ahead in a key-ordered scan would look something like the following:

image

 

Ok, that should do it in terms of some of the basics to cover around storage, allocation, and access methods in regards to what we need to understand when talking about Fragmentation. In the next post, we'll discuss what different types of fragmentation there are, what storage structures they apply to, and what access methods are impacted by each. In future posts, we'll delve into much more...

Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.mssqltips.com/disclaimer.asp and http://www.mssqltips.com/copyright.asp.

Comments

 

Christopher Steen said:

ASP.NET First Line of Defense for Web Applications – Part 4 [Via: techjunkie ] ASP.NET Applications...

November 13, 2007 4:45 AM
 

Chad Boyd said:

November 28, 2007 8:27 AM
 

Chadhoc.net said:

Fragmentation Station - Stop #2 - What it is, what types there are

November 28, 2007 8:31 AM
 

aprato said:

Chad, this is the best, most lucid write up I've seen yet on this topic.  Excellent article.

December 1, 2007 10:34 AM
 

Chadhoc.net said:

Fragmentation Station - Stop #2 - What it is, what types there are

January 25, 2008 11:44 AM
 

Nielsjoergen Hvidberg said:

Hi Chad, what happened to the images in your great article :-)

October 13, 2009 2:47 AM

About Chad Boyd

Chad is an Architect, Administrator, and Developer with technologies such as Sql Server (and all related technologies), Windows Server, and Windows Clustering. He currently works as an independent consultant and also spends a significant amount of time writing, talking, presenting and blogging about Sql Server in person and online at http://mssqltips.com. In the past, Chad has worked with companies and organizations such as Microsoft Corporation and The American Red Cross, and provided consulting/support services at companies such as Bank of America, HP, Citigroup, Qualcomm, Scottrade, TJX, SunTrust, and Zurich Financial Services. For over 3 years with Microsoft Corporation Chad was responsible for providing onsite and remote support, guidance, and advice with SQL Server products to some of Microsoft’s foremost enterprise customers running the largest, most complex SQL Server installations and configurations in the world. This included all SQL Server products and versions, including SQL Server 7.0, 2000, 2005, and recently 2008, the SQL Server database engine, Reporting Services, SSIS/DTS, Notification Services, and Analysis Services on both 32 and 64 bit systems. Chad's primary responsibilities today include troubleshooting critical server situations, performance tuning and monitoring, disaster recovery planning and execution, architectural guidance for new Sql Server related deployments, and delivering deep technical workshops/presentations/proof-of-concept sessions covering a variety of technologies and functionality. Chad regularly posts Sql Server related content, tools, and advice with the mssqltips team at http://blogs.mssqltips.com/blogs and http://mssqltips.com. Chad can be contacted via his blog or email at chad dot boyd dot tips at gmail dot com.

This Blog

Syndication