SQL Server 2008 R2 : Row Estimation and Index Selection (part 4) - Optimizing with Indexed Views & Optimizing with Filtered Indexes

9/10/2011 4:20:22 PM

Optimizing with Indexed Views

In SQL Server 2008, when you create a unique clustered index on a view, the result set for the view is materialized and stored in the database with the same structure as a table that has a clustered index. Changes made to the data in the underlying tables of the view are automatically reflected in the view the same way as changes to a table are reflected in its indexes. In the Developer and Enterprise Editions of SQL Server 2008, the Query Optimizer automatically considers using the index on the view to speed up access for queries run directly against the view. The Query Optimizer in the Developer and Enterprise Editions of SQL Server 2008 also looks at and considers using the indexed view for searches against the underlying base table, when appropriate.

Note

Although indexed views can be created in any edition of SQL Server 2008, they are considered for query optimization only in the Developer and Enterprise Editions of SQL Server 2008. In other editions of SQL Server 2008, indexed views are not used to optimize the query unless the view is explicitly referenced in the query and the NOEXPAND Query Optimizer hint is specified. For example, to force the Query Optimizer to consider using the sales_Qty_Rollup indexed view in the Standard Edition of SQL Server 2008, you execute the query as follows:

select * from sales_Qty_Rollup WITH (NOEXPAND)
   where stor_id between 'B914' and 'B999'

The NOEXPAND hint is allowed only in SELECT statements, and the indexed view must be referenced directly in the query. (Only the Developer and Enterprise Editions consider using an indexed view that is not directly referenced in the query.) As always, you should use Query Optimizer hints with care. When the NOEXPAND hint is included in the query, the Query Optimizer cannot consider other alternatives for optimizing the query.

Consider the following example, which creates an indexed view on the sales table, containing stor_id and sum(qty) grouped by stor_id:

set quoted_identifier on
go

if object_id('sales_Qty_Rollup') is not null
    drop view sales_Qty_Rollup
go
create view sales_qty_rollup
with schemabinding
as
   select stor_id, sum(qty) as total_qty, count_big(*) as id
      from dbo.sales
      group by stor_id
go

create unique clustered index idx1 on sales_Qty_Rollup (stor_id)
go

The creation of the clustered index on the view essentially creates a clustered table in the database with the three columns stor_id, total_qty, and id. As you would expect, the following query on the view itself uses a clustered index seek on the view to retrieve the result rows from the view instead of having to scan or search the sales table itself:

select * from sales_Qty_Rollup
   where stor_id between 'B914' and 'B999'

However, the following query on the sales table uses the indexed view sales_qty_rollup to retrieve the result set as well:

select stor_id,  sum(qty)
   from sales
   where stor_id between 'B914' and 'B999'
   group by stor_id

Essentially, the Query Optimizer recognizes the indexed view essentially as another index on the sales table that covers the query. The execution plan in Figure 14 shows the indexed view being searched in place of the table.

Figure 14. An execution plan showing an indexed view being searched to satisfy a query on a base table.

Note

In addition to the seven required SET options that need to be set appropriately when the indexed view is created, they must also be set the same way for a session to be able to use the indexed view in queries. The required SET option settings are as follows:

SET ARITHABORT ON
SET CONCAT_NULL_YIELDS_NULL ON
SET QUOTED_IDENTIFIER ON
SET ANSI_NULLS ON
SET ANSI_PADDING ON
SET ANSI_WARNINGS ON
SET NUMERIC_ROUNDABORT OFF

If these SET options are not set appropriately for the session running a query that could make use of an indexed view, the indexed view is not used, and the table is searched instead.

You might find rare situations when using the indexed view in the Enterprise, Datacenter, or Developer Editions of SQL Server 2008 leads to poor query performance, and you might want to avoid having the Query Optimizer use the indexed view. To force the Query Optimizer to ignore the indexed view(s) and optimize the query using the indexes on the underlying base tables, you specify the EXPAND VIEWS query option, as follows:

select * from sales_Qty_Rollup
   where stor_id between 'B914' and 'B999'
   OPTION (EXPAND VIEWS)

Optimizing with Filtered Indexes

SQL Server 2008 introduces the capability to define filtered indexes and statistics on a subset of rows rather than on the entire rowset in a table. This is done by specifying simple predicates in the index create statement to restrict the set of rows included in the index. Filtered statistics help solve a common problem in estimating the number of matching rows when the estimates become skewed due to a large number of duplicate values (or NULLs) in an index or due to data correlation between columns. Filtered indexes provide query optimization benefits when you frequently query specific subsets of your data rows.

If a filtered index exists on a table, the optimizer recognizes when a search predicate is compatible with the filtered index; it considers using the filtered index to optimize the query if the selectivity is good.

For example, the titles table in the bigpubs2008 database contains a large percentage of rows where ytd_sales is 0. A nonclustered index typically doesn’t help for searches in which ytd_sales is 0 because the selectivity isn’t adequate, and a table scan would be performed. An advantageous approach then is to create a filtered index on ytd_sales without including the values of 0 to reduce the size of the index and make it more efficient.

For example, first create an unfiltered index on ytd_sales on the titles table:

create index ytd_sales_unfiltered on titles (ytd_sales)

Then, execute the following two queries:

select * from titles where ytd_sales = 0
select * from titles where ytd_sales = 10

As you can see by the query plan displayed in Figure 15, a query where ytd_sales = 0 still uses a table scan instead of the index because the selectivity is poor, whereas it uses the index for ytd_sales = 10.

Figure 15. An execution plan showing index not being used due to poor selectivity.

Now, drop the unfiltered index and re-create a filtered index that excludes values of 0:

drop index titles.ytd_sales_unfiltered
go
create index ytd_sales_filtered on titles (ytd_sales)
where ytd_sales <> 0

Re-run the queries and examine the query plan again. Figure 16 shows that the query where ytd_sales = 0 still uses a table scan as before, but the query where ytd_sales = 10 is able to use the filtered index.

Figure 16. An execution plan showing the filtered index being used.

In this case, it may be beneficial to define the filtered index instead of a normal index on ytd_sales because the filtered index will require less space and be a more efficient index by excluding all the rows with ytd_sales values of 0, especially if the majority of the queries against the table are searching for ytd_sales values that are nonzero.