MSSQLTips.com - your daily source for SQL Server tips

Google
 
Web mssqltips.com

ESSENTIALS: Home | Tips | Search | Categories | Top 10 | Products | Authors | Blogs | Forums | Webcasts | Advertise | About
Chad Boyd

in Search

Chad Boyd

  • Installing Clustered SQL Servers - Outline, Checklists, Document Sheets

    A high percentage of my interactions with clients revolve around clustering SQL Server in some manner or another - could be to review a high-availability solution, could be to stand up a new clustered instance of SQL, could be to troubleshoot a system that is having stability issues, etc. In all cases, I start with a set of checklists and informational-gathering sheets and tools to both review the existing state of the system, and also to understand the system we are working with - once we have all the configuration data, we run through the checklists to ensure the system is configured in a manner matching years of best-practices. If we are installing and configuring a new SQL Server instance within a cluster, I usually send the configuration sheets to my clients a few weeks in advance for them to fill-out in preparation for the install event - I've found that in nearly every case where we have all the configuration data prior to the install event, the installation and configuration goes off very smoothly and typically without a hitch...however, whenever the data isn't available, or a client wants to 'wing-it', the percentages of a smooth sail event drop significantly.  Additionally, whenever I arrive onsite to troubleshoot a server issue for a client, using the checklists as a guide to review the configuration generally yields a wide variety of things that are contributing to and/or causing the issue at hand.

    These checklists and informational sheets have been put together by myself over years of experience, with input from a variety of sources, most of which are referenced in the first section within the primary document included with the pages - the Installation and Configuration Outline document. This document is the starting point, and contains a plethora of links, supporting documents, and descriptions / experiences with why each setting or configuration recommended in the checklists is so.

    The checklists are meant as a guide/verification during the actual server cluster and sql cluster configuration/install, or while reviewing an existing system.

    These info-sheet documents are meant to aid in documenting the both the required information for proper/easy Sql cluster configuration (ip’s, network names, user accounts, etc.) as well as your cluster solution as a whole moving forward for reference and supporting documentation purposes.  Each of these documents should be completed in their entirety prior to beginning a cluster/Sql installation/configuration, and as such, can be used also as a guide for required information prior to beginning a configuration.

    So, here are the links to the actual checklists, info sheets, the outline document, and the sample completed info sheets (for a guide on the types of things you may want to consider including in the sheets, or as a guide on how you may want to consider filling them out). Note that included with each page is a downloadable word version of the documents as well:

    1) Installation and Configuration Outline

    This is the primary document and contains a complete outline of what needs to occur and when it needs to occur when preparing for/installing/configuring a cluster solution. At the very top, there are multiple sections which contain reference material for many of the different components of Sql/Windows/Clustering/etc. involved in the process; I’d recommend becoming as familiar with the contents of this document as possible, as it outlines much of the necessary/possible knowledge/best practices/issues/concerns/etc. involved with Sql Server clustering solutions. This document also references all of the other documents included in the attached file. This is supposed to act as an outline/guide for the actual process of preparing for, installing, and configuring a given Sql cluster solution, so typically it is read/followed from the top to the bottom as steps are completed, though you are more than welcome to browse through the entirety of the document(s) and even perform a ‘mock’ walkthrough if you like to become familiar with the process/documents/requirements/etc. 

    2) Info Sheet - Cluster Configuration

    3) Info Sheet - Node Configuration

    4) Info Sheet - Disk Configuration

    5) Info Sheet - SQL Configuration

    6) Checklist #1 - Pre-Cluster Configuration

    7) Checklist #2 - Post-Cluster Configuration

    8) Checklist #3 - Pre-SQL Installation

    9) Checklist #4 - Post-SQL Installation

    10) Sample Completed Info Sheets

     

    Enjoy, send along comments, recommendations, etc.

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.mssqltips.com/disclaimer.asp and http://www.mssqltips.com/copyright.asp.

  • Katmai (Sql 2008) - Transaction Logging Enhancements

    Given some of the "larger" features being introduced with Sql 2008 which are getting lots of coverage and attention (understandably so), there are actually quite a few "smaller" features that are included which will provide a great deal of benefit to SQL Server users everywhere - one of those "smaller" features that hasn't been getting as much attention is improvements made in the database engine for bulk-logging / minimal logging of standard INSERT INTO statements and the new MERGE statement as well. Prior to this functionality, to get minimal-logging for an operation that required pushing data into an existing table with existing data would have required the use of partitioned tables/indexes and a merge/split/switch type operation where the data would have been bulk-loaded from a source into an empty staging table in your server, then switched into an empty partition within your pre-existing table. Naturally, this would necessitate the use of partitioning on the table, and inherently require you use the Enterprise edition of SQL Server (which is the only edition that supports partitioning). If you either didn't want to (or couldn't) partition your existing table, or ran a non-Enterprise version of the server, you really didn't have any options for bulk-loading into existing tables with existing data (baring a partitioned view configuration perhaps). This new enhancement in 2008 will allow bulk-loading / minimally-logged operations for many more scenarios than are possible today.

    Similar to the existing minimally-logged operations, there are some prerequisites for these statements to actually be minimally-logged - you can find a full and detailed list in SQL 2008 Books Online, and also a discussion about the different operations on the SQL Server Storage Engine team's blog.

    Sunil, a PM on the SQL Server Storage Engine Team, has a great 3-part series covering the enhancements, so I won't bother repeating what he has already described extremely well, instead I'll simply point you to each of the posts:

    Part 1

    Part 2

    Part 3

    I'll also leave a very simple sample script you can run to see some of the performance differences between the fully-logged operation in 2008 and the same statement in a minimally-logged execution...Sunil's posts referenced above have additional samples that go into much greater detail and cover a wide-variety of possible scenarios.

    Enjoy!

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.mssqltips.com/disclaimer.asp and http://www.mssqltips.com/copyright.asp.

     

    -------------------------------------------------------------------------------------------------------------
    CODE ONLY BELOW
    -------------------------------------------------------------------------------------------------------------

    use AdventureWorks;
    go

    -- Ensure full recovery...
    alter database AdventureWorks set recovery full;
    go

    -- Create a simple table...
    if object_id('dbo.insertLoadTest') > 0
        drop table dbo.insertLoadTest;
    go
    create table dbo.insertLoadTest (id int, charval char(36), filler char(250));
    go

    -- Fully logged insert...
    use AdventureWorks;
    go
    truncate table dbo.insertLoadTest;
    go
    declare @d datetime2;
    select @d = sysdatetime();
    insert    dbo.insertLoadTest with(tablock) (id, charval, filler)
    select    top 500000
            row_number() over (order by a.object_id), newid(), 'filler'
    from    sys.columns a with(tablock)
    cross join sys.columns b with(tablock);
    -- Get the time difference...
    select datediff(millisecond, @d, sysdatetime());
    go

    -- Minimally logged insert...
    use master;
    go
    -- Using simple vs. bulk-logged simply to ease the fact that I'd have to
    -- perform log backups with bulk-logged...this makes it obviously easier...
    alter database AdventureWorks set recovery bulk_logged;
    go

    -- Rerun the same tests as above again...should notice a significant
    -- improvement in not only run-time, but also a large difference in
    -- log-space usage as well...
    use AdventureWorks;
    go
    truncate table dbo.insertLoadTest;
    go
    declare @d datetime2;
    select @d = sysdatetime();
    insert    dbo.insertLoadTest with(tablock) (id, charval, filler)
    select    top 500000
            row_number() over (order by a.object_id), newid(), 'filler'
    from    sys.columns a with(tablock)
    cross join sys.columns b with(tablock);
    -- Get the time difference...
    select datediff(millisecond, @d, sysdatetime());
    go

  • SSD and SQL - Fragmentation Impact

    In the last post on Solid State and it's impact on SQL Server operations, we looked at a variety of different IO patterns and sizes on multiple systems to see where and when SSD would help out and when it wouldn't. If you read the post and analyzed the data, you could clearly see the huge gains Solid State provides with random read IO over traditional spindles - writing and large sequential reads seem to still favor traditional spindles. So, assuming you are in a system that performs lots of random read IO, SSDs will provide you tons of benefits, and will be much more resilient to the impact of fragmentation on the system (since fragmentation leads to heavier random io as a general rule). Since we recently had a series about fragmentation, and in that series we had a post showing the impact of fragmentation in different types of IO and patterns, let's see how those same tests are impacted with a SSD vs. a traditional drive.

    I re-ran the same exact tests that I had run previously to test the different levels of fragmentation and their impact on types of and patterns of IO and then folded them into the same spreadsheet I posted previously alongside my desktop results. See my prior SSD post for specs on my desktop and laptop machines, and see this post in the fragmentation series for the original sheet with fragmentation impact results on my desktop only.

    Given what we know about SSD and my laptop, we'd expect to see the laptop be much more resilient to fragmentation in general, outperform the desktop for cold-cache random IO read operations, and likely perform better comparatively in single-threaded operations vs. multi-threaded operations (since my desktop is a quad-core xeon and my laptop is a dual-core centrino). Additionally, we don't expect the SSD to out-perform my desktop in any warm-cache operations, since, well, they'd never touch the drive in either scenario (and again, my desktop has 16gb of cache, and my laptop has 4gb of cache).

    I've attached the spreadsheet in it's entirety to the post, so you can take a peak at the total results if you like, it's actually quite interesting. As expected, the SSD is much, much, much more resilient to fragmentation impact vs. traditional spindles - for heavily fragmented data, the traditional spindle system degraded from 300% up to 2,500%, whereas the SSD system for the same tests ranged in degradation of 10% up to 580%. My laptop/SSD system outperformed my desktop by as much as 300% under fragmented conditions. In nearly all cold-cache tests the SSD system destroyed the traditional spindle system with the few exceptions coming on large scans of contiguous data (which would take advantage of serialized large sequential IOs with the read-ahead manager), which we already know traditional spindles are still better at.

    I've attached the full spreadsheet with detailed results, enjoy!

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.mssqltips.com/disclaimer.asp and http://www.mssqltips.com/copyright.asp.

  • Partitioning Data for Query Performance - Where's the benefit?

    I spend a lot of time interfacing with SQL engineers of all types (developers, dba's, architects, ETL engineers, etc.) and something I hear very frequently is "I want to horizontally partition my data to improve my query response times to customers". This usually makes me follow up the statement(s) with questions of my own such as "what types of queries are you trying to improve response times on by partitioning your data?" or "what systems are you thinking of employing this type of strategy on ?".  Too often I get responses that in turn send us down a path of discussing why partitioning data isn't a viable strategy for improving many types of queries/workloads (such as typical OLTP workloads, seeks of data - particularly seeks that don't filter/join on partitioning keys, etc.). Sometimes engineers and others are downright shocked to see some of results of different query types against partitioned data - usually, by the end of a 1/2 hour demo and discussion and walking through some scripts (which I've attached as well), everything makes good sense and there's a good understanding of why things work the way they do. It's frequently a misconception that using the horizontal partitioning features built-in to the engine in SQL 2005 will lead to large benefits for OLTP-like workloads/queries/etc. (i.e. seeks of data, singleton lookups, etc.) when the reality is that horizontal partitioning is really more about improving performance for things like administrative tasks, bulk data loads/switches, and scans of data when the scanning can be limited by the partition ranges.

    This post is not about switching data with partitions, rolling window scenarios, administrative tasks with partitioning, bulk loading/moving data with partitioning, etc. - there is plenty of that in other places (but if someone would like to see some examples, by all means comment or contact me via email and I'll be happy to oblige). Instead, for the remainder of this post we are going to walk through query execution differences for a range of access methods (i.e. seeks, scans, singleton lookups, etc.) against a partitioned structure and a non-partitioned structure that are exactly the same with the exception of one having all indexes partitioned and one not being partitioned at all.

    The schema for the partitioned table is as follows:

    -- Create a sample table that will mimic the main table...

    create table dbo.Fund_Dtl (

    Fund_Dtl_ID int identity(1,1) not null,

    fillerColumn char(150) not null,

    Fund_Summary_ID int not null,

    Txn_Dt datetime not null,

    Partition_Column char(5) not null default ('xyz')

    ) on [PartitionPSFD] ([Partition_Column]);

    go

    -- Cluster...

    create clustered index Fund_Dtl_Fund_Summary_ID_CI on dbo.Fund_Dtl (

    Fund_Summary_ID,

    Partition_Column

    ) on [PartitionPSFD] ([Partition_Column]);

    go

    -- PK...

    alter table dbo.Fund_Dtl with check add

    constraint Fund_Dtl_pk primary key nonclustered

    (Fund_Dtl_ID,Partition_Column)

    on [PartitionPSFD] ([Partition_Column]);

    go

    -- Nonclustered...

    create nonclustered index ix_TxnDt on dbo.Fund_Dtl (

    Txn_Dt

    ) include (

    Partition_Column,Fund_Summary_ID

    ) on [PartitionPSFD] ([Partition_Column]);

    go

    As you can see, we include a clustered index and 2 nonclustered indexes (1 of which is the primary key) - the clustered index includes the partitioning column as the 2nd key column of the index, as does the primary key. Notice however that the 2nd nonclustered index doesn't include the partitioning column as a key column of the index, but instead only as a covered column (as an included column).  The script then proceeds to load 602,112 records into the partitioned table, filling the Partition_Column column with 7 distinct values (7 partitions) each with 86,016 records.

    Ok, then we create the non-partitioned table as follows:

    -- Create another table from the data in partitioned table that isn't partitioned...

    select *

    into dbo.Fund_Dtl_NoPartition

    from dbo.Fund_Dtl with(nolock)

    go

    -- Index it just like the main table, with no partitioning...

    -- Cluster...

    create clustered index c2 on dbo.Fund_Dtl_NoPartition (

    Fund_Summary_ID,

    Partition_Column

    );

    go

    -- PK...

    alter table dbo.Fund_Dtl_NoPartition with check add

    constraint Fund_Dtl_np_pk primary key nonclustered

    (Fund_Dtl_ID,Partition_Column);

    go

    -- Nonclustered...

    create nonclustered index ix_TxnDt_np on dbo.Fund_Dtl_NoPartition (

    Txn_Dt

    ) include (

    Partition_Column,Fund_Summary_ID

    );

    go

    As you can see, the non-partitioned table is identical to the partitioned table in every way with the exception that it isn't partitioned at all - the columns are the same, the indexes are the same, the data is the same, etc.

    Ok, at this point the script continues to run 6 different identical tests against the partitioned and non-partitioned tables, and commenting on the performance differences on my machine - I'd encourage you to download the attached script and run through it on your own system to become familiar with the setup, results, etc. (it is heavily commented). If anyone has additional tests they'd like to try out, or if anyone sees any discrepancies, by all means, let me know. I've included a brief summary of the tests included in the script and the results from my machine below for those who don't have access to anything for using the attached script (or mobile readers :-)). Enjoy!

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.mssqltips.com/disclaimer.asp and http://www.mssqltips.com/copyright.asp.

    TESTS:

    1. SEEK (small range-scan) of data on the Fund_Dtl_ID column. This query does not filter on the partitioning column in any way, and we iterate 30,000 executions of the query. In this test, the run against the partitioned table finishes consistently in about 3.7 seconds on my machine, performs 8 scans (partition constant scans) and 14 logical reads. The non-partitioned table finishes consistently in about 2.1 seconds (1.6 seconds faster), performs a single scan and 3 logical reads. Obviously this scenario is won by the non-partitioned table.
    2. Same as test #1, only we also filter on the partitioning column with an equality against a constant value. The run against the partitioned table improves to 3.4 seconds with a single scan (partition elimination due to the constant equality) and 2 logical reads. The non-partitioned run drops to about the same number, finishing in 3.3 seconds with 3 logical reads and a single scan. The non-partitioned query lost some time due to a residual check of the data for the partitioning column. Call this scenario a tie.
    3. Same as test #2, only we create 2 additional indexes with the key columns reversed (i.e. Partition_Column then Fund_Dtl_ID). Both runs improve by a second or so in duration, but the results end in a tie again.
    4. Same as test #2, only instead of an equality against the partitioning column, we use a non-equality filter (>=). The run against the partitioned table finishes in about 5 seconds with 7 scans and 14 logical reads. The non-partitioned column wins significantly again, finishing in about 3.45 seconds with a single scan and 3 logical reads.
    5. This test introduces a SCAN operation, but for this test there is no filtering for the partitioning column, just a standard scan of the clustered index for each table, and performing only 30 iterations instead of 30,000. The partitioned table run finishes in just over 2.6 seconds consistently with 8 scans and 14,112 logical reads. The non-partitioned test finishes in nearly the same time (just over 2.5 seconds) performing a single scan and slightly less reads with 14,052. Another tie.
    6. The final test is again a SCAN operation, this time including an additional equality filter on the partitioning column and extending the iterations from 30 to 75. Here is the scenario where we expect partitioning to help immensely with a query, and it doesn't disappoint us for sure.  The partitioned table test run finishes in well under a second consistently (850 milliseconds or so) with a single scan and 2,016 logical reads. The non-partitioned table tests take just over 6 seconds consistently, performing a single scan and again performing 14,052 logical reads (just like the last test).
  • SSD and SQL - SQLIO performance

    Let's start the SSD blogs off with a comparison of the SQLIO tool running on my laptop and compare the results to some other traditional spindle based systems.

    First, let's outline the systems we'll be comparing throughout the SSD blogs:

    • First, my laptop. This is a loaded Dell XPS M1330 laptop running Windows Vista Ultimate x64. Specs include a 2.2Ghz Intel Core2Duo T7500 processor (dual-core, 800MHz FSB, 4MB L2 cache), 4gb DDR2 SDRAM @ 667Mhz, and a single 64gb Solid State Drive (Samsung)
    • Next, my desktop.  This is a Dell Precision Workstation T5400 running Windows Vista Business x64. Specs include a 2.0Ghz Intel Xeon E5405 (quad-core, 1333MHz FSB, 12MB L2 cache), 16gb DDR2 SDRAM @ 667Mhz, a single SATA II 10k RPM 146gb boot-drive (system, swap), and a single SATA II 7.2k RPM 750gb additional drive.
    • Server #1. This server is a Dell PowerEdge 2950 running Windows Server Enterprise 2003 R2 x64. Specs include 2 x Intel Xeon 5160 3.0Ghz processors (dual-core, 1333MHz FSB, 4MB L2 cache each (8MB total)), 16gb RAM (8 x 2 gb Dual-Rank), dual PERC 5/E adapters, a RAID-1 boot-drive (system,swap) with 2xSATA II 7.2k RPM 250gb drives, and 2 LUNs attached via an external Dell MD-1000 DAS enclosure. Each external LUN is made up of 7 SATA 7.2k RPM 500gb spindles each in a RAID-5 configuration. For tests, we'll only make use of the 2 externally-attached LUNs.
    • Server #2. This server is an HP DL380 running Windows Server Enterprise 64-bit Edition attached to an HP EVA 6100 SAN. Specs include 2 x Intel Xeon X5450 3.0GHz processors (quad-core, 1333MHz FSB, 12MB L2 cache each (24MB total)), 32gb RAM, tri emulex LP1050 HBA's, 2 internal 15k SAS drives RAID 1 for boot/swap, external log LUN with 8 SAS 15k 146gb drives RAID10 configured, 2 external data LUNs each with 20 SAS 15k 146gb drives RAID 10 configured for data files. SAN is connected via 2Gb FC fabric and dual Brocade 4gb silkworm fiber switches.

    As you can see, we have quite a range of systems from my laptop, a beefy desktop, a mid-size server/storage configuration, and a larger-size server/storage configuration. This should give us a good range of tests throughout the different scenarios. So, let's get this party started, first with some SQL IO comparisons.

    For our SQL IO test, we'll use the following parameters:

    • All tests are run back-to-back in a serial manner - each test hits either the LOG file or the DATA files (so any tests that have the log file on the same LUN as a data file are never mixing IO operations)
    • All tests use 8 processing threads, run for 10 minutes each, are enabled for multi-io, and allow up to between 16 and 64 outstanding IOs each
    • Files used include 2 data files and 1 log file
      • On my laptop, all files reside on the solid state drive and are sized at 15gb each.
      • On my desktop the log file and one of the data files resides on the 10k boot drive - the other data file resides on the 7.2k drive. All files are sized at 75gb each.
      • On Server #1, the log file and one of the data files resides on the first LUN from the DAS - the other data file resides on the 2nd LUN from the DAS. All files are sized at 100gb each.
      • On Server #2, the log file resides on the log lun, and the 2 data files each reside on a separate data LUN each. All files are sized at 100gb each.
    • The tests include each of the following:
      • 8kb random writes to the data files (pattern for checkpoints, tempdb, etc.)
      • 8kb random reads to the data files (pattern for random data reads, singleton seeks, etc.)
      • 64kb sequential writes to the log file (bulk log writes)
      • 8kb random reads to the log file (rollbacks, log reader, etc.)
      • 1kb sequential writes to the log file (small log writes)
      • 64kb sequential writes to the data files (checkpoints, reindex, bulk inserts)
      • 64kb sequential reads to the data files (read-ahead, reindex, checkdb)
      • 128kb sequential reads to the data files (read-ahead, reindex, checkdb)
      • 128kb sequential writes to the data files (bulk inserts, reindex)
      • 256kb sequential reads to the data files (read-ahead, reindex)
      • 1MB sequential reads to the data files (backups)

    So, let's take a look at the results - in each test we've captured the number of IOs performed per second, the number of MBs per second, the total amount of data transferred in GBs, and the Cost per GB where I took the cost of each system and correlated it against the total GBs transferred.

    First, here are the results for operations against the simulated DATA files/luns:

    image

    And secondly, here are the results for the tests against the simulated LOG files/luns:

    image

    So, where did the SSD perform well? Clearly, it is king in the random read scenarios, nearly keeping pace with Server #2 on each LUN - and, if you include costing in the figures, it's a landslide in favor of the SSD. These benefits for random read patterns are great for day-to-day computing, which is why I see such improvements on my laptop for day-to-day use. In the SQL Server world, this would be beneficial for traditional OLTP-like read patterns (i.e. small, singleton type reads of data) and also for something else we've looked at in detail recently - reads against heavily fragmented structures.

    For writes, it's a different story - the SSD lagged significantly behind all other systems in nearly every write scenario.  The same holds true for sequential read patters as well, which aligns with findings elsewhere. On a side note, I did perform some sequential read/write tests with smaller IO sizes (1kb and 4kb) for the data files on each of the systems except the Server #2 system and the SSD performs quite a bit better with these smaller sized IOs than the larger sized IOs that are performed typically in a SQL Server system - seems to follow findings from other reviews I've found and researched a bit on the web. Of course, we really care about how it performs with SQL for these posts, and as you can see, it really is beneficial for smaller, random IO patterns - with the larger and sequential type patterns, it looks like for now most traditional spindle systems will outperform them for now (at least the commodity versions that you can get in a laptop for example, I'm sure some of the solid-state systems built for enterprise deployments are a bit more valid for server scenario type tests, you can find these types of systems by most major storage manufacturers today like EMC, HP, Hitachi, etc.).

    Good start to our SSD posts - next I think will be the impact of these drives on fragmentation (since it's fresh in our minds and all).

    Enjoy!

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.mssqltips.com/disclaimer.asp and http://www.mssqltips.com/copyright.asp.

  • SSD and Me

    I recently purchased 2 laptops with Solid State Drives in them (my loaded Dell XPS 1330 and my wife's Apple MacBook Air), and I have to say, I will never, ever again own a computer that doesn't have a Solid State Drive in it (at least not until they make something even better). These suckers are great, well worth the money IMHO (of course, I am a techie and find huge value in having the latest and greatest of technology to work with, so that may impact things I'm sure). Of things to note:

    1. Obviously, the performance - for boot times, running multiple concurrent applications, paging operations, and anything else that requires large amounts of random IO, these can't be beat today
    2. Power consumption - I've run my laptop for up to 6 hours with a full-brightness on the display - I do have a 9-cell battery, but still, 6 hours of battery time on my laptop is awesome.
    3. Heat - more accurately, the lack of - there is none...the fan never runs, it never gets warm on my lap, etc.
    4. Noise - again, more accurately the lack of - the laptops don't make any noise hardly at all. It's gotten to the point now where I think my keys making that clicky noise as I type is annoying.

    I used to leave my systems running all the time, 24*7, just so I wouldn't have to wait for a boot-up and re-launching all my apps, etc. (granted, this is a little over-the-top if I can't wait a few minutes and click 10 or 12 icons and wait another minute or so for them all to launch and load, but hey, you get used to things you have I guess) - now, I freely shut-down my laptop every night and boot-up in the AM with all my 7 or 8 apps that I launch in my startup group. Boot time on this thing is lightning fast (I have a work PC that has a 10k Raptor drive for it's boot system with 16gb of RAM and my laptop smokes it), and starting up applications is a breeze compared to traditional spindles.

    I also have to plug a little for the MacBook Air - IMHO again, this is by far the best laptop on the market today for home use. My M1330 is small for certain, but the MacBook Air is TINY - my wife tried using my XPS once and actually taunted me about it being heavy (it weighs just under 4 pounds - no heavyweight for sure, but again, all what you are used to). I love the XPS, don't get me wrong - but if I'm at home on the couch surfing the web, emailing, blogging, navigating pics/videos/etc., etc., I'll go for the Air every time. It's not just the size, but the usability, the gestures, touch integration, etc. makes it so enjoyable to use.

    Ok, so back to this blog. So given my pleasure with the performance of the drives, I started thinking it was high time to put them through some tests with regards to SQL Server operations and measure some of the impact they might have on different areas in the engine and otherwise. Over the next few blog posts, I'll be providing some analysis of the performance of different workloads/sql operations on my new laptop and SSD compared to the same operations on some other machines (like my work machine, and hopefully a production-like system or 2).  Things I plan to address are:

    1. SQLIO benchmarks for each system
    2. Checkpoint operations
    3. Backup/Restore comparisons
    4. Bulk load operations
    5. Heavy Sequential and Random read/write workloads
    6. Impact on Fragmentation

    I'm open to recommendations of other things I should plan to test against, feel free to email or comment on the post and I'll try to include what I can.

    Given that we know the SSD's are much better at Random IO operations than the traditional spindles, I'm expecting any heavy random IO operation to see some significant benefits over the traditional systems. Sequential write IO operations are typically better today on traditional spindles, so we'll verify if that impacts these operations in SQL Server and if so how.

    Look for the first in the series to come shortly!

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.mssqltips.com/disclaimer.asp and http://www.mssqltips.com/copyright.asp.

  • Custom Index Defrag / Rebuild Procedures

    I often get asked for custom procedures to help with index maintenance, including things  such as reorganizing, rebuilding, stats updates, etc. There are quite a variety of these out there today (a simple google search will get you a bunch), and all have their pros/cons. The big thing I generally stress to customers when asked about this type of thing is to do what is best for their environment - a custom procedure that works perfectly for me in my environment(s) might be completely wrong for your environment.  Of course, there are some general rules of thumb that you can use in most any environment, but you get the idea - there is usually never a solution that is perfect for all environments (though that doesn't mean we can't keep trying to make one :-)).

    Some general observations that I've seen in multiple custom procedures in the past include:

    1. Remember that an INDEX REBUILD (or DBREINDEX) also updates statistics - there is no need at all to update statistics for the index after rebuilding (or before). In reality, you could end up with a worse case than you had - when performing an index rebuild, stats are updated using data gathered from a look at all the data in the index (after all, all the data had to be touched to rebuild anyhow), similar to if you had performed a stats update with a full-scan. If you then update stats on the index after the rebuild and use anything less than a full-scan option, you'll most likely end up with stats that aren't as 'accurate' as the full-scan version.
    2. Do not simply blindly rebuild/reorganize all indexes for all tables in all databases - I see this far too often.  Doing so not only wastes resources and processing time that can/could be used for other things, it also typically fails to give you any insight/feedback on the state of fragmentation in your system, or how fragmentation impacts your workload (if at all).
    3. Generally you should include some simple logic to weed-out small indexes (say for example, indexes with less than 5,000 or 10,000 pages). In many, many database schemas, there are 100's or 1000's of 'small' tables with multiple indexes - spending time rebuilding these can add up as the number grows. Use that time instead on the 10's or 100's of tables that really make a difference.
    4. If your custom procedure samples fragmentation of indexes, consider storing that data somewhere for analysis - you've done the bulk of the work by having the engine retrieve the stats, why not store them and report/analyze it so you can gain insight into the fragmentation levels of your data over time?
    5. Consider placing a time limit on the operation - it won't be an exact limit, but something that stops the execution of the procedure after a certain amount of time has passed and the last full operation completed would suffice in most cases - too many times I've had people call about sluggish/slow responding servers early in the morning when it boiled down to large nightly maintenance operations running long overnight.
    6. When using methods to determine fragmentation levels of an index to determine if it should be rebuilt or not, I typically see people use the "avg_fragmentation_in_percent" value as the hard guideline - you may also want to consider page density as a contributor, since that is logical fragmentation also (and impacts operations in cache as well as IO). If you're already looking for a level of avg_fragmentation_in_percent, consider using the same value as a guideline for "100 - avg_page_space_used_in_percent" as well.

    I'm sure there are more, but that's my starting list anyhow.  So, the next question I usually get is "well, what procedure do you use?" - that's a loaded question :-). I do have a generic procedure that I'll typically start with, and if the workload or environment requires it, I will modify it to be effective in that environment. I really can't stress this enough again - it ALWAYS depends on the environment - you should know your environment enough to be able to determine what is best for it, I typically won't be able to tell you what's best without any knowledge, or even a little knowledge.  So, to this post I've attached my generic version of a custom procedure that allows you to specify some options, and protects against a few things, like:

    1. You can specify a fragmentation threshold (%) that will be the low-water mark for considering which indexes to bother with (by using both the avg_fragmentation_in_percent and (100-avg_page_space_used_in_percent) values both)
    2. You can specify a date/time that the operation should stop if it passes
    3. Given the date/time specification, indexes are operated on in order of highest fragmentation levels to lowest, so the time spent is hopefully most beneficial
    4. You're allowed to define online/offline rebuilds, reorganizations, stats updates, partition level operations, etc.
    5. You can specify a location where you'd like to store the results of the stats check off indexes
    6. Stats will not be updated if you perform a rebuild
    7. Only indexes with > 10,000 pages will be touched

    It's a fairly simple procedure for general use.  If you'd like to get a look at a custom procedure that is a bit more specialized, offers additional options, etc., take a look at Ola Hallengren's version. This version allows you to specify a multitude of options and operations, can iterate over multiple databases, additional logging operations, parameterized values for things like page counts and fragmentation levels, etc. (it's also documented quite well).

    Between the 2 solutions, I'm sure you could likely cover a wide variety of scenarios and workloads to find what works best for you.  Enjoy!

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.mssqltips.com/disclaimer.asp and http://www.mssqltips.com/copyright.asp.

  • Fragmentation Station - Stop #8 - Hands On Experience

    In our final stop for the fragmentation series we're going to walk through a complete script that will cover almost everything we've discussed in the series, including:

    • New object creation and allocation (heaps, clustered, non-clustered indexes)
    • Impact of parallel operations vs. serial, sort_in_tempdb, etc.
    • Full scan, range scan, and singleton seek performance tests against cold and warm caches with:
      • No fragmentation
      • Logical fragmentation (unordered pages with high page density and low page density)
      • Extent fragmentation (from concurrent allocation operations and multiple files for a single index)
    • Comparison of Defrag vs. Rebuild for multiple types of structures with multiple types of fragmentation
    • Output of full comparison for performance related measures (physical reads, logical reads, test durations, etc.) of each of the above for multiple tests

    This script should allow you to see and understand first-hand the impact of different types of operations against different types of structures in a variety of fragmentation levels and validate many of the things we discussed in terms of fragmentation's performance impact on operations, what causes fragmentation, understanding the output of detection for fragmentation, the different types of fragmentation and the impact on operations, and the different ways to address fragmentation once you have it and need to fix it. I'm going to walk through the script and what happens at a very high level in the remainder of this post, but to get details you'll want to run it and view the output for yourself in your own test environment.

    First, some logistics:

    • The script is built to be executed all at once and spits out a bunch of information and notes for you to consume once complete - alternatively, you can simply walk down the script manually if you like as well, following step-by-step
    • This script uses both the built-in SQL methods for detecting fragmentation as well as the custom procedure for viewing page linkage and summary information
    • The script needs to be run in SQLCMD mode, as there are 5 different variables you need to set for customizing to your environment. This is a script that will work with SQL 2005 only.
    • The script will create 2 databases (named as you specify and creating files in the location you specify) and also make use of Database Snapshots in the final portion of the script (defraging/rebuilding comparisons), so to successfully run that portion of the script it needs to be tested on either the Enterprise or Developer edition of SQL 2005...the databases will not be dropped at the end of the script, so be sure to do that if you want to remove them when finished testing
    • The output of the script will include detailed notes about preceding/following statistics if you set the "printComments" flag to 1 at the top of the script. Notes about results and operations are delimited by strings of dashes ("-") printed across the screen, then the comments, then an ending delimiter of dashes again.
    • Major 'sections' of the output are delimited by large headers that are equal signs (i.e. "=") printed across the screen, then the comment about the section, then an ending equal signs delimiter

    Ok, that should cover the logistics, below is an analysis of the results from the test run on my desktop which have been attached to this post in a spreadsheet called "desktopResults.xlsx". Additionally, I've attached to this post the output of the script on my desktop, as well as the script itself.

    My desktop machine is a Dell Precision Workstation T5400 64-bit machine, quad-core Xeon E5410 (2.33Ghz, 12Mb L2 cache, 1333 FSB), 16 GB DDR2 SDRAM, a 160GB 10k 16mb cache SATA boot drive, and a 500GB 7.2k 16mb cache SATA data drive. For these tests, the log files for the databases resided on the data drive and the data files resided on the faster boot drive. Multiple test runs were performed for both MAXDOP=1 and MAXDOP=0 configurations and I am reporting the results of one test from each that fell into the 'middle' of the test results for the given configuration.

    Some key points to takeaway from and inspect in the output include:

    • As expected, "cold cache" tests are significantly more impacted by fragmentation than "war cache" tests - the slowdown in warm-cache tests is not from improperly ordered page chains or interleaved pages, but instead from the page density of each page in cache being lower and requiring more logical reads (i.e. reading more pages) to return all the data. Notice the warm-cache tests for our logically fragmented structure with a higher page density is significantly less impacted than the same tests with a low page density.
    • Also as expected, scan operations are significantly impacted by higher fragmentation levels, whereas seek operations are hardly impacted at all (both warm and cold cache tests).
    • Comparing run times for parallel vs. sequential plans (i.e. maxdop = 0 vs maxdop = 1), you should notice that warm-cache tests benefited significantly from a parallel operation (both fragmented and non-fragmented data), cold-cache tests with non-fragmented data were often impacted negatively, and cold-cache tests with fragmented data benefited marginally. This is as expected, since parallelizing a plan is all about helping performance with added CPU power, not for improving IO throughput (the read-ahead manager is responsible for that). Warm-cache operations are all about churning through cached pages as fast as possible, so parallel plans would help this significantly. Non-fragmented data in a cold-cache scan operation would be optimized by the read-ahead manager feeding a single sequential operator as much as possible - adding additional threads to read additional data wouldn't help this type of operation much at all. With fragmented data, additional threads requesting different data pages may help marginally since the read-ahead manager can't do it's job optimally (non-contiguous pages), however again, it will be marginal at best typically. Since the disk system and IO operations are where fragmentation introduces the most pain, optimizing IO throughput is where benefits will be seen, not in adding CPU power.
    • A clustered object typically inherently contains a bit more "fragmentation" than a heap, particularly with insert-heavy type workloads. A big difference between a heap and a clustered table is that in a cluster, the storage structure is a B-Tree (balanced tree) where the leaf pages are the data and above the leaf there are intermediate pages and a single root page. These non-leaf levels of the B-tree are maintained as data is modified within the index. A heap does not have these non-leaf pages, since a heap is not a B-tree, it is simply a bunch-o-pages that are allocated as data is modified in the table.So, as data is inserted into a heap, the engine will simply try to get a free page somewhere, if it does not exist, it will allocate a new extent, and then continue using pages, there is no maintenance for non-leaf pages here, since they don't exist in a heap - in this way, no matter how much data you continue to add, so long as the extents that are next in the file haven't been taken up by something else, you'll get relatively contiguous data pages.With a cluster, as data is inserted, even if its ever-increasing data like an identity or date/time value, as the non-leaf pages are filled up, additional non-leaf pages will have to be allocated for storing the data for the upper-levels of the index.
    • Using a serial build (maxdop = 1) and using the sort_in_tempdb option can help significantly with building indexes that are more contiguous when initially built. When SORT_IN_TEMPDB is set to OFF, the default, the sort runs are stored in the destination filegroup. During the first phase of creating the index, the alternating reads of the base table pages and writes of the sort runs move the disk read/write heads from one area of the disk to another. The heads are in the data page area as the data pages are scanned. They move to an area of free space when the sort buffers fill and the current sort run has to be written to disk, and then move back to the data page area as the table page scan is resumed. The read/write head movement is greater in the second phase. At that time the sort process is typically alternating reads from each sort run area. Both the sort runs and the new index pages are built in the destination filegroup. This means that at the same time the Database Engine is spreading reads across the sort runs, it has to periodically jump to the index extents to write new index pages as they are filled.  If the SORT_IN_TEMPDB option is set to ON and tempdb is on a separate set of disks from the destination filegroup, during the first phase, the reads of the data pages occur on a different disk from the writes to the sort work area in tempdb. This means the disk reads of the data keys generally continue more serially across the disk, and the writes to the tempdb disk also are generally serial, as do the writes to build the final index. Even if other users are using the database and accessing separate disk addresses, the overall pattern of reads and writes are more efficient when SORT_IN_TEMPDB is specified than when it is not.  The SORT_IN_TEMPDB option may improve the contiguity of index extents, especially if the CREATE INDEX operation is not being processed in parallel. The sort work area extents are freed on a somewhat random basis with regard to their location in the database. If the sort work areas are contained in the destination filegroup, as the sort work extents are freed, they can be acquired by the requests for extents to hold the index structure as it is built. This can randomize the locations of the index extents to a degree. If the sort extents are held separately in tempdb, the sequence in which they are freed has no effect on the location of the index extents. Also, when the intermediate sort runs are stored in tempdb instead of the destination filegroup, there is more space available in the destination filegroup. This increases the chances that index extents will be contiguous.
    • Note that the performance of a multi-file index on the same spindles is more like the extent fragmented performance than the contiguous performance. Of course, this is my desktop system and not a high-end many-spindle SAN/DAS system or anything, which could obviously produce different results. Key thing to understand is that multi-file systems introduce some level of extent fragmentation into a system for the added benefit of IO throughput and space (assuming you have a properly configured IO system with multiple spindles and IO paths and a BUS that isn't overloaded, etc., etc.). If you'd like to see what a difference it would make, rerun this same test on the same database after adding the -T1118 and -E startup switches and restarting the server. You would notice that the fragment sizes go back up to the 30-40 range, and the fragment count is back down to near single-file-database numbers. This shows what type of benefit the -E and -T1118 startup switches can provide when used appropriately.
    • While running the tests, consider monitoring the impact on your IO system and cache. If you look at the buffer pool cache size and IO sizes during the above tests, you should notice that when performing contiguous IO operations, the avg. IO sizes are consistently above 64k in size, up to 256k in size maybe even. When performing non-contiguous IO operations, the avg. IO sizes are most likely consistently lower than 64k - the more those sizes approach 8k, the more IO operations that are necessary to transfer the same number of data, and hence the longer the run times.
    • Pay close attention to the impact on defragmenting structures that a defrag vs. a rebuild has and correlate that back to what you know about performance - might make you rethink your current de-fragmentation plan.

    Enjoy, post or email any questions!

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.mssqltips.com/disclaimer.asp and http://www.mssqltips.com/copyright.asp.

  • Fragmentation Station - Stop #7 - How to address it

    In our 2nd to last post in the Fragmentation series, we'll discuss our options on addressing and removing/correcting fragmentation. In our next and final post in the series, we'll end with a full-fledged SQL script that will walk you through all the different things we've talked about in the series to give you some up close and personal experience with what we've gone through.

    Once you've detected that you have fragmentation, understand what it impacts in your workload, and understand the different types of fragmentation and the type you're trying to correct, you have a couple of options to address it - realistically, you have 3:

    • DEFRAG / REORGANIZE - A defrag operation (called a reorganize in 2005) will address logical fragmentation by trying to both increase page density in an index(es) and also by re-ordering pages in-place to create properly ordered pages as best as possible. A defrag will not however create contiguous pages if the existing pages are not contiguous, as a defrag does not allocate new pages during the operation, it only shuffles existing allocated pages in place trying to reorder them into proper logical order. Therefore, a defrag/reorg will not address extent fragmentation at all, only logical fragmentation - however this is traditionally the most intrusive type of fragmentation, so many times a defrag/reorg is the best option.
    • REINDEX / REBUILD - A reindex/rebuild operation will do everything that a defrag/reorganize does and also try and build fully contiguous pages as well by allocating new pages to the index where appropriate. A rebuild is a completely atomic operation (i.e. it is all or nothing), and is an 'offline' operation as well (though 2005 Enterprise Edition allows a new online rebuild option as well). One other side-benefit to a rebuild is that statistics related to the index are rebuilt in addition to the index itself (unlike a defrag)
    • DROP and CREATE - Typically there is no difference between a full rebuild and a drop/create operation. One benefit to a drop/create is that you can also change the index keys and included columns (sql 2005).

    The most frequently used options are the 1st and 2nd listed - here's a matrix comparing some of the key functional differences between the rebuild and reorganize options:

    Functionality REBUILD / DBREINDEX REORGANIZE / DEFRAG
    Online / Offline Offline; Online possible with 2005 Enterprise Edition Online
    Faster when logical fragmentation is High Low
    Parallel Processing Yes No
    Compacts Pages Yes Yes
    Can be stopped/started without losing completed work to that point No Yes
    Able to untangle interleaved pages Yes No
    Additional free space required in data file for process Yes - 1.2x - 2x existing size No
    Faster on larger indexes Yes No
    Rebuilds Statistics Yes No
    Log Space Used Full Recovery - High; Bulk/Simple Recovery - Low; Depends on work performed
    May skip pages on busy system No Yes
    Can specify additional options (fillfactor, etc.) Yes No

    Of course, one other option to 'addressing' fragmentation could be to choose to not fix it. Hopefully you're not the type of engineer who simply rebuilds/defrags indexes every day just for the sake of doing so or because it makes you feel better - instead, try to work out a plan to understand which indexes in your schema actually matter the most, are impacted by the different kinds of fragmentation, etc. and those that aren't.

    Let's take a look at the internals for defrag and reorganizing - reorganizing an index is performed in basically 2 stages:

    1. The page compaction stage - this stage is meant to try and make pages have a 'fullness' near the original fillfactor specification for the index. This operates at the leaf-level of the index only, and pages are compacted by shuffling rows towards the left-side of the B-tree and dropping ghosted records and freeing pages made empty.
    2. Page defrag-ing stage - this stage is meant to make the logical order of pages match the allocation order. Again this operates at the leaf-level only, by performing a logical-ordered scan and an allocation-ordered scan in lockstep with one another, re-establishing a scan position after every page. This is a totally online operation, aside from an eXclusive lock on each page for the duration of each given re-ordering. Remember that no new pages are allocated during a defrag operation (aside from the single page for temporary space), which means that logical re-ordering is achieved by shuffling pages that are already allocated to the index only - this is why a defrag operation can't solve interleaved pages, since only existing pages are shuffled.

    Let's take a look at some visual examples on how these stages work - the defrag-ing stage is the same for both 2000 and 2005, but the page compaction stage uses a different and improved in 2005 using a new sliding-window algorithm.

    SQL 2000 Defrag Compaction (left-to-right, top-to-bottom)

    clip_image002[5]clip_image004[8]clip_image006[8]

    clip_image008[7]clip_image010clip_image012

    clip_image014clip_image016clip_image018

    clip_image020

    SQL 2005 Reorganize Compaction (left-to-right, top-to-bottom)

    clip_image002clip_image004

    clip_image006clip_image008

    Page Reordering Stage (left-to-right, top-to-bottom)

    clip_image002[7]clip_image004[10]clip_image006[10]

    clip_image008[9]clip_image010[5]clip_image012[5]

    clip_image014[5]clip_image016[5]

    That wraps up our 2nd to last post in the series, in our next and final post we'll wrap it up with an all-in-one script for everyone to use in a test environment of your own to verify everything we've discussed.

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.mssqltips.com/disclaimer.asp and http://www.mssqltips.com/copyright.asp.

  • Fragmentation Station - Stop #6b - Custom Detection Procedure, Page Linkage

    In post #6 of the series, I mentioned that I often get asked for alternative methods of looking at fragmentation, a way to view the page chain or linkage, and ways to get insight into which pages are out of order in a given structure. There is nothing super graceful to be honest, but you can make use of a DBCC statement (DBCC IND()) to get page linkage and order data, then using some relatively complex querying against the data captured you can see some of this type of information.

    This post has an attachment (customFragInfo.sql) that contains the following procedures:

    • zcpReinitDbccInd - This procedure takes a single parameter (@tableName) which defines which table to get page-level information for. The procedure will then create a single table called 'ztblDbccInd' (if it exists it will drop it) which is used to store the results from DBCC IND() into. This will produce a table that contains a single record for every allocated page for all indexes in the given table. We then strip out all pages except data pages, add a few columns to the table, and update them to include indicators for the appropriate logical and physical positions of each page (based on the page linkage information present on the pages).  This is all done with a fairly complex recursive CTE and update statement. We then build some indexes and call it good. This procedure has to be called first on the table you want to analyze with the following procedures.
    • zcpShowPageLinkage - This will show a record for each page at the leaf-level of each index (or all data pages in a heap) sorted by the logical ordinal position of each page - the logical ordinal position meaning the order the page should fall in if you were to scan the data from front-to-back using an ordered-index scan of the data (doesn't apply for heaps, since they don't hold any logical order). If you see a value other than '0' in the physicalPageDiff column, this tells you that this page is that # of pages away from the prior page in the logical order, which indicates that there are other pages of data between the 2 pages - if the value is negative, then that page is earlier in the file; if positive, that page is later in the file (which comes into play when performing a scan of data).
    • zcpShowFragSummary - This will show a summary of the fragmentation for the given table's index(es) and optionally an additional 2 result sets if you set the single parameter (@showFragPages) to 1. The 1st is a summary of the count of logically fragmented pages, file-level fragmented pages, non-contiguous pages, non-leaf pages, and the total # of pages for each index/heap in a given table. The 2ndshows each page that is logically fragmented, and earlier physically within the file than the prior page, as well as the previous and next page logically ordered for comparison purposes (each logically unordered page will show a 'position' value of '00', whereas the previous page will have a position value of '-1', and the next page will show a position value of '+1'). The 3rd result set is the same as the2nd, except it shows logically unordered pages that are later physically in the file than the prior page.

    That's all of them, we'll be using these in the final post in the series where we get a full hands-on script, enjoy!

    Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.mssqltips.com/disclaimer.asp and http://www.mssqltips.com/copyright.asp.

  • Fragmentation Station - Stop #6 - How to detect it

    In our 6th post in the fragmentation series (I now know we are going to have 8 total) we are going to talk about the different ways to determine if you have fragmentation, and what type of fragmentation you have. This type of analysis will help you determine if you are being impacted by it for your given types of workloads and try to determine if you need to address it.  In the final 2 posts coming up in the series we'll talk about how to address it and then finally provide an all-in-one script that you can use to validate everything we've talked about in the series for yourself and get a first-hand look at code in action.

    SQL Server provides 1 primary tool for detecting fragmentation, and that tool is different for SQL 2000 vs. 2005. For SQL 2000, it's the DBCC SHOWCONTIG statement, and for 2005 it's the sys.dm_db_index_physical_stats DMF. The prior statement still works in 2005, however it's strongly recommended that you start using the new and improved DMF, which contains algorithmic improvements for more accurately determining fragmentation, uses only IS level locks (vs. full Shared locks in default scanning mode for DBCC SHOWCONTIG), outputs information on things like forwarded records and ghosted records (for a heap), and includes more detailed filtering capabilities. One fairly significant improvement to the algorithm for SQL 2005 is that it will accurately report extent level fragmentation across multiple files - with the SHOWCONTIG statement in 2000, if you have an index that spans multiple files, the extent fragmentation numbers reported would be completely useless.  These tools will each report on logical fragmentation, page densities, and extent level fragmentation - nothing exists in SQL Server for detecting file-level fragmentation, but you can use any standard OS-level defraging utility to check this if necessary (most likely it's not - if you do attempt this, be sure that SQL Server is offline).

    Each of the statements include options for different scanning modes - the 2005 version includes 3 modes, SHOWCONTIG only has 2:

    • FAST/LIMITED - This mode (fast in 2000, limited in 2005) will scan only the parent pages of the leaf-level pages for the given index(es) (i.e. the final intermediate level of the index before the leaf). The advantages to this mode are that only a fraction of pages will be touched during the scan compared to a scan of the full leaf-level of the index (a fraction of the pages that would be equal to reducing the number of pages scanned by a factor of the fanout of the index). The down-side to this mode is that the engine can't report on some data such as page density, record and page counts, ghost/forwarded record counts, record sizes data, etc.
    • SAMPLED - Only available in the sys.dm_db_index_physical_stats DMF, not available with SHOWCONTIG. This mode performs just like the DETAILED mode only on a subset of the leaf pages instead of all the pages of the leaf. If the index/heap has < 10,000 pages, the DETAILED mode will be used, but if there are >= 10,000 pages, only a 1% sample of the pages will be scanned and inspected. Obviously, the statistics reported will only be accurate for the pages scanned (i.e page density, etc.), but should be a pretty good indicator as a whole.
    • DETAILED - This is the non-fast mode for SHOWCONTIG. In this mode, all pages in all levels of the index/heap are scanned and inspected.  The most thorough and accurate of all modes, and also the most intrusive and longest running mode.

    Given that the most intrusive type of fragmentation in most scenarios is logical fragmentation, using the FAST/LIMITED modes can be very beneficial for gauging the amount of logical fragmentation you have on an index without impacting the SQL buffer cache of holding long S/IS locks on the structure. If you have a very large index/heap, the SAMPLED mode will most also likely get you a very good estimate of the amount of all types of fragmentation you may have. To give you an idea on the effects of the FAST/LIMITED options on run-times for a SHOWCONTIG / sys.dm_db_index_physical_stats runtime, take a gander at the following:

    OPTION Page Count Run Time
    <default> 1,702,889 5.02 minutes
    FAST "" 0.90 minutes
        5.6x faster
    <default> 111,626,354 382.35 minutes
    FAST "" 48.73 minutes
        7.8x faster

     

    As you can see, using the FAST/LIMITED options can have quite a drastic impact on the run-times of the statement, so be sure to use the appropriate options for the appropriate data/environment.

    So, what are the key metrics to look for in the 2 statements and how do they each correlate to the types of fragmentation and what we already know about fragmentation? Let's hit some fields from each statement that are key metrics to understand (in the bullets, I'll list the field name for SHOWCONTIG first, then the sys.dm_db_index_physical_stats column 2nd):

    • "Logical Scan Fragmentation" / "avg_fragmentation_in_percent" - This is the primary indicator for the amount of logical fragmentation you have for the given structure (more accurately, it's the percentage of non-properly order pages, since it reports nothing to do with page density). The lower the number the better - as this number approaches 100% the more pages you have in the given index that are not properly ordered. For heaps, this value is actually the percentage of extent fragmentation and not logical fragmentation.
    • "Avg. Page Density" / "avg_page_space_used_in_percent" - This is the primary indicator for how dense the pages in your index are, i.e. on average how full each page in the index is. The higher the number the better speaking in terms of fragmentation and read-performance. As this number approaches 0% the less space on each page in the index that is actually used to store data, and the more pages it takes to store the same amount of data (and hence the more pages that need to be read to get all records, the more memory space that is needed in the buffer pool, etc.)
    • "Extent Scan Fragmentation" / "fragment_count" and "avg_fragment_size_in_pages" - These are the primary indicators for extent fragmentation in the given structure. For 2000, it's a measure of out-of-order extents in the leaf-level of an index (not valid for heaps in 2000). For 2000, the lower the number the better.  For 2005, a fragment is a group of physically consecutive leaf pages in the same file for a given structure - each structure has at least 1 fragment (best-case scenario) and at most the same number of fragments as there are pages in the structure - so, the lower this number the better. The "avg_fragment_size_in_pages" value is an indication of the average size of each fragment for the given structure, and the higher the value the better (the higher the value, the greater the number of physically consecutive pages on average throughout the structure).

    Those are the best indicators from the internal tools for determin