SQL Server 2008 R2 : Row Estimation and Index Selection (part 2) - Estimating Access Path Cost

9/10/2011 4:16:30 PM

Estimating Access Path Cost

After the selectivity of each of the SARGs, OR clauses, and join clauses is determined, the next phase of optimization is estimating the access path cost of the query. The Query Optimizer attempts to identify the total cost of various access paths to the data and determine which path results in the lowest cost to return the matching rows for an expression.

The primary cost of an access path, especially for single-table queries, is the number of logical I/Os required to retrieve the data. Using the available statistics and the information stored in SQL Server regarding the average number of rows per page and the number of pages in the table, the Query Optimizer estimates the number of logical page reads necessary to retrieve the estimated number of rows using a table scan or any of the candidate indexes. It then ranks the candidate indexes to determine which access path would retrieve the matching data rows with the lowest cost, typically the access path that requires the fewest number of logical and physical I/Os.

Note

A logical I/O occurs every time a page is accessed. If the page is not in cache, a physical I/O is first performed to bring the page into cache memory, and then a logical I/O is performed against the page. The Query Optimizer has no way of knowing whether a page will be in memory when the query actually is executed, so it always assumes a cold cache, that the first read of a page will be from disk. In a very few cases (for example, small OLTP queries), this assumption may result in a slightly slower plan being chosen that optimizes for the number of initial I/Os required to process the query. However, the cold cache assumption is a minor factor in the query plan costing, and it’s actually the total number of logical I/Os that is the primary factor in determining the cost of the access path.

Clustered Index Cost

Clustered indexes are efficient for lookups because the rows that match the SARGs are clustered on the same page or over a range of adjacent pages. SQL Server needs only to find its way to the first page and then read the rows from that page and any subsequent pages in the page chain until no more matching rows are found.

Therefore, the I/O cost estimate for a clustered index is calculated as follows:

	Number of index levels in the clustered index
+	Number of pages to scan within the range of values

The number of pages to scan is based on the estimated number of matching rows divided by the number of rows per page. For example, if SQL Server can store 250 rows per page for a table, and 600 rows are within the range of values being searched, SQL Server would estimate that it requires at least three page reads to find the qualifying rows. If the index is three levels deep, the logical I/O cost would be as follows:

	3	(index levels to find the first row)
+	3	(data pages: 600 rows divided by 250 rows per page)
=	6	logical page I/Os

For a unique clustered index and an equality operator, the logical I/O cost estimate is one data page plus the number of index levels that need to be traversed to access the data page.

When a clustered index is used to retrieve the data rows, you see a query plan similar to the one shown in Figure 1.

Figure 1. An execution plan for a clustered index seek.

Nonclustered Index Cost

When searching for values using a nonclustered index, SQL Server reads the index key values at the leaf level of the index and uses the bookmark to locate and read the data row. SQL Server has no way of knowing if matching search values will be on the same data page until it has read the bookmark. It is possible that while retrieving the rows, SQL Server might find all data rows on different data pages, or it might revisit the same data page multiple times. Either way, a separate logical I/O is required each time it visits the data page.

The I/O cost is therefore based on the depth of the index tree, the number of index leaf rows that need to be scanned to find the matching key values, and the number of matching rows. The cost of retrieving each matching row depends on whether the table is clustered or is a heap table (that is, a table with no clustered index defined on it). For a heap table, the nonclustered row bookmark is the page and row pointer (the row ID [RID]) to the actual data row. A single I/O is required to retrieve the data row. Therefore, the worst-case logical I/O cost for a heap table can be estimated as follows:

	Number of nonclustered index levels
+	Number of leaf pages to be scanned
+	Number of qualifying rows (each row represents a separate data page read)

Note

This estimate assumes that the data rows have not been forwarded. In a heap table, when a row has been forwarded, the original row location contains a pointer to the new location of the data row; therefore, an additional page read is required to retrieve the actual data row. The actual I/O cost would be one page greater per row than the estimated I/O cost for any rows that have been forwarded.

When a nonclustered index is used to retrieve the data rows from a heap table with a clustered index, you see a query plan similar to the one shown in Figure 2 . Notice that in SQL Server 2008, the bookmark lookup operator is replaced by a RID lookup, essentially as a join with the RIDs returned by the nonclustered index seek.

Figure 2. An execution plan for a nonclustered index seek against a heap table.

If the table is clustered, the row bookmark is the clustered key for the data row. The number of I/Os to retrieve the data row depends on the depth of the clustered index tree because SQL Server has to use the clustered index to find each row. The logical I/O cost of finding a row using the nonclustered index on a clustered table is therefore as follows:

	Number of nonclustered index levels
+	Number of leaf pages to be scanned
+	Number of qualifying rows × Number of page reads to find a single row via the clustered index

For example, consider a heap table with a nonclustered index on last name. Assume that the index holds 800 rows per page (they’re really big last names!), and 1,700 names are within the range you are looking for. If the index is three levels deep, the estimated logical I/O cost for the nonclustered index would be as follows:

3	(index levels)
+ 3	(leaf pages: 1,700 leaf rows/800 rows per page)
+ 1,700	(data page reads)
= 1,706	total logical I/Os

Now, assume that the table has a clustered index on it, and the size of the nonclustered index is the same. If the clustered index is three levels deep, including the data page, the estimated logical I/O cost of using the nonclustered index would be as follows:

3	(nonclustered index levels)
+ 3	(leaf pages: 1,700 leaf rows/800 rows per page)
+ 5,100	(1,700 rows × 3 clustered page reads per row)
= 5,106	(total logical I/Os)

Note

Although the I/O cost is greater for bookmark lookups in a nonclustered index when a clustered index exists on the table, the cost savings during row inserts, updates, and deletes using the clustered index as the bookmark are substantial, whereas the couple extra logical I/Os per row during retrieval do not substantially impact query performance.

For a unique nonclustered index using an equality operator, the I/O cost is estimated as the number of index levels traversed to access the bookmark plus the number of I/Os required to access the data page via the bookmark.

When a nonclustered index is used to retrieve the data rows on a table with a clustered index, you see a query plan similar to the one shown in Figure 3 . Notice that in SQL Server 2008, the bookmark lookup operator is replaced by a clustered index seek, essentially as a join between the clustered index and the clustered index keys returned by the nonclustered index seek.

Figure 3. An execution plan for a nonclustered index seek against a table with a clustered index.

Covering Nonclustered Index Cost

When analyzing a query, the Query Optimizer considers any possibility to take advantage of index covering. Index covering is a method of using the leaf level of a nonclustered index to resolve a query when all the columns referenced in the query (in both the column list and WHERE clause, as well as any GROUP BY columns) are included in the index leaf row as either index key columns or included columns.

Index covering can save a significant amount of I/O because the query doesn’t have to access the data page to return the requested information. In most cases, a nonclustered index that covers a query is faster than a similarly defined clustered index on the table because of the greater number of rows per page in the index leaf level compared to the number of rows per page in the table itself. (As the nonclustered leaf row size approaches the data row size, the I/O cost savings are minimal, if any.)

If index covering can take place in a query, the Query Optimizer considers it and estimates the I/O cost of using the nonclustered index to cover the query. The estimated I/O cost of index covering is as follows:

	Number of index levels
+	Number of leaf level index pages to scan

The number of leaf-level pages to scan is based on the estimated number of matching rows divided by the number of leaf index rows per page. For example, if index covering could be used on the nonclustered index on title_id for the query in the previous example, the I/O cost would be the following:

3	(nonclustered index levels)
+ 3	(leaf pages: 1,700 leaf rows/800 rows per page)
= 6	total logical I/Os

When index covering is used to retrieve the data rows, you might see a query plan similar to the one shown in Figure 4 . If the entire leaf level of the index is searched, it displays as an index scan, as shown in this example.

Figure 4. An execution plan for a covered index scan without limits on the search.

Other times, if the index keys can be searched to limit the range, you might see an index seek used, as shown in Figure 5 . Note that the difference here from a normal index lookup is the lack of the RID or clustered index lookup because SQL Server does not need to go to the data row to find the needed information.

Figure 5. An execution plan for a covered index seek with limits on the search.

Table Scan Cost

If no usable index exists that can be matched with a SARG or a join clause, the Query Optimizer’s only option is to perform a table scan. The estimate of the total I/O cost is simply the number of pages in the table, which is stored in the system catalogs and can be viewed by querying the used_page_count column of the sys.dm_db_partition_stats dynamic management view (DMV):

select used_page_count
   from sys.dm_db_partition_stats
   where object_id = object_id('sales_noclust')
     and (index_id = 0     -- data pages for heap table
          or index_id = 1) -- data pages for clustered table
go

used_page_count
--------------------
1244

Keep in mind that there are instances (for example, large range retrievals on a nonclustered index column) in which a table scan might be cheaper than a candidate index in terms of total logical I/O. For example, in the previous nonclustered index example, if the index does not cover the query, it costs between 1,706 and 5,106 logical I/Os to retrieve the matching rows using the nonclustered index, depending on whether a clustered index exists on the table. If the total number of pages in the table is less than either of these values, a table scan would be more efficient in terms of total logical I/Os than using a nonclustered index.

When a table scan is used to retrieve the data rows from a heap table, you see a query plan similar to the one shown in Figure 6.

Figure 6. A table scan on a heap table.

When a table scan is used to retrieve the data rows from a clustered table, you see a query plan similar to the one shown in Figure 7 . Notice that it displays as a clustered index scan because the table is the leaf level of the clustered index.