Because the Query Optimizer might sometimes make poor
decisions as to how to best process a query, you need to know how and
when you may need to override the Query Optimizer and force SQL Server
to process a query in a specific manner.
How often does SQL Server
require manual intervention to execute a query optimally? Considering
the overwhelming number of query types and circumstances in which those
queries are run, SQL Server does a surprisingly effective job of query
optimization in most instances. For all but the most grueling, complex
query operations, experience has shown that SQL Server’s Query Optimizer
is quite clever—and very, very good at wringing the best performance
out of any hardware platform.
Before indiscriminately
applying the techniques discussed in this section, remember one very
important point: use of these features can effectively hide serious
fundamental design or coding flaws in your database, application, or
queries. In fact, if you’re tempted to use these features (with a few
moderate exceptions), it should serve as an indicator that the problems
might lie elsewhere in the application or queries.
If you are satisfied that no
such flaws exist and that SQL Server is choosing the wrong plan to
optimize your query, you can use the methods discussed in this section
to override two of the three most important decisions the Query
Optimizer makes:
Choosing which index, if any, to resolve the query
Choosing the join strategy to apply in a multitable query
Throughout this and following sections, one point must remain clear in your mind: these options should be used only in exception cases
to cope with specific optimization problems in specific queries in
specific applications. There are therefore no standard or global rules
to follow because the application of these features, by definition,
means that normal SQL Server behavior isn’t taking place.
Tip
As a general rule,
Query Optimizer and table hints should be used only as a last resort,
when all other methods to get the Query Optimizer to generate a more
efficient query plan have failed. Always try to find other ways to
rewrite the queries to encourage the Query Optimizer to choose a better
plan. This includes adding additional SARGs, substituting unknown values
for known values in SARGS or trying to replace unknown values with
known values, breaking up queries, converting subqueries to joins or
joins to subqueries, and so on. Essentially, you should try other coding
variations on the query itself to get the same result in a different
way and try to see if one of the variations ends up using the more
efficient query plan that you expect it to.
In reality, about the only
time you should use these hints is when you’re testing the performance
of a query and want to see if the Query Optimizer is actually choosing
the best execution plan. You can enable the various query analysis
options, such as STATISTICS PROFILE and STATISTICS IO,
and then see how the query plan and statistics change as you apply
various hints to the query. You can examine the output to determine
whether the I/O cost and/or runtime improves or gets worse if you force
one index over another or if you force a specific join strategy or join
order.
The problem with
hard-coding table and Query Optimizer hints into application queries is
that the hints prevent the Query Optimizer from modifying the query plan
as the data in the tables changes over time. Also, if subsequent
service packs or releases of SQL Server incorporate improved
optimization algorithms or strategies, the queries with hard-coded hints
will not be able to take advantage of them.
If
you find that you must incorporate any of these hints to solve query
performance problems, you should be sure to document which queries and
stored procedures contain Query Optimizer and table hints. It’s a good
idea to periodically go back and test the queries to determine whether
the hints are still appropriate. You might find that, over time, as the
data values in the table change, the forced query plan generated because
of the hints is no longer the most efficient query plan, and the Query
Optimizer now generates a more efficient query plan on its own.
Optimizer Hints
You can specify three types of hints in a query to override the decisions made by the Query Optimizer:
Table hints
Join hints
Query hints
The following sections examine and describe each type of table hint.
Forcing Index Selection with Table Hints
In addition to locking
hints that can be specified for each table in a query, SQL Server 2008
allows you to provide table-level hints that enable you to specify the
index SQL Server should use for accessing the table. The syntax for
specifying an index hint is as follows:
SELECT column_list FROM tablename WITH (INDEX (indid | index_name [, ...]) )
This syntax allows you to
specify multiple indexes. You can specify an index by name or by ID. It
is recommended that you specify indexes by name as the IDs for
nonclustered indexes can change if they are dropped and re-created in a
different order than that in which they were created originally. You can
specify an index ID of 0, or the table name itself, to force a table scan.
When you specify
multiple indexes in the hint list, all the indexes listed are used to
retrieve the rows from the table, forcing an index intersection or index
covering via an index join. If the collection of indexes listed does
not cover the query, a regular row fetch is performed after all the
indexed columns are retrieved.
To get a list of indexes on a table, you can use sp_helpindex.
However, the stored procedure doesn’t display the index ID. To get a
list of all user-defined tables and the names of the indexes defined on
them, you can execute a query against the sys.indexes catalog view similar to the one shown in Listing 1, which was run against the bigpubs2008 database.
Listing 1. Query Against sys.indexes Catalog View to Get Index Names and IDs
select 'Table name' = convert(char(20), object_name(object_id)), 'Index name' = convert(char(30), name), 'Index ID' = index_id, 'Index Type' = convert(char(15), type_desc) from sys.indexes where object_id > 99 —only system tables have id less than 99 and index_id between 1 and 254 /* do not include rows for text columns or tables without a clustered index*/ /* do not include auto statistics */ and is_hypothetical = 0 and objectproperty(object_id, 'IsUserTable') = 1 order by 1, 3 go
Table name Index name Index ID Index Type ----------------------------------------------------------------------------- authors UPKCL_auidind 1 CLUSTERED authors aunmind 2 NONCLUSTERED employee employee_ind 1 CLUSTERED employee PK_emp_id 2 NONCLUSTERED jobs PK__jobs__job_id__25319086 1 CLUSTERED PARTS PK__PARTS__09746778 1 CLUSTERED PARTS UQ__PARTS__0A688BB1 2 NONCLUSTERED pub_info UPKCL_pubinfo 1 CLUSTERED publishers UPKCL_pubind 1 CLUSTERED roysched titleidind 2 NONCLUSTERED sales UPKCL_sales 1 CLUSTERED sales titleidind 2 NONCLUSTERED sales ord_date_idx 7 NONCLUSTERED sales qty_idx 8 NONCLUSTERED sales_big ci_sales_big 1 CLUSTERED sales_big idx1 2 NONCLUSTERED sales_noclust idx1 2 NONCLUSTERED sales_noclust ord_date_idx 3 NONCLUSTERED sales_noclust qty_idx 4 NONCLUSTERED stores UPK_storeid 1 CLUSTERED stores nc1_stores 2 NONCLUSTERED titleauthor UPKCL_taind 1 CLUSTERED titleauthor auidind 2 NONCLUSTERED titleauthor titleidind 3 NONCLUSTERED titles UPKCL_titleidind 1 CLUSTERED titles titleind 2 NONCLUSTERED titles ytd_sales_filtered 11 NONCLUSTERED
|
SQL Server 2008 introduces the new FORCESEEK
table hint, which provides an additional query optimization option.
This hint specifies that the query optimizer use only an index seek
operation as the access path to the data in the table or view referenced
in the query rather than a index or table scan. If a query plan
contains table or index scan operators, forcing an index seek operation
may yield better query performance. This is especially true when inaccurate cardinality or cost estimations cause the optimizer to favor scan operations at plan compilation time.
Before using the FORCESEEK
table hint, you should make sure that statistics on the table are
current and accurate. Also, you should evaluate the query for items that
can cause poor cardinality or cost estimates and remove these items if
possible. For example, replace local variables with parameters or
literals and limit the use of multistatement table-valued functions and
table variables in the query.
Also, be aware that if you specify the FORCESEEK hint in addition to an index hint, the FORCESEEK hint can cause the optimizer to use an index other than one specified in the index hint.
Forcing Join Strategies with Join Hints
Join hints let you force the
type of join that should be used between two tables. The join hints
correspond with the three types of join strategies:
You can specify join hints only when you use the ANSI-style join syntax—that is, when you actually use the keyword JOIN in the query. The hint is specified between the type of join and the keyword JOIN, which means you can’t leave out the keyword INNER for an inner join. Thus, the syntax for the FROM clause when using join hints is as follows:
FROM table1 {INNER | OUTER} [LOOP | MERGE | HASH} JOIN table2
The following example forces SQL Server to use a hash join:
select st.stor_name, ord_date, qty
from stores st INNER HASH JOIN sales s on st.stor_id = s.stor_id
where st.stor_id between 'B100' and 'B599'
You can also specify a global join hint for all joins in a query by using a query processing hint.
Specifying Query Processing Hints
SQL Server 2008 enables you to
specify additional query hints to control how your queries are optimized
and processed. You specify query hints at the end of a query by using
the OPTION keyword. There can be only one OPTION clause per query, but you can specify multiple hints in an OPTION clause, as shown in the following syntax:
OPTION (hint1 [, ...hintn])
Query hints are grouped into four categories: GROUP BY, UNION, join, and miscellaneous.
GROUP BY Hints GROUP BY hints specify how GROUP BY or COMPUTE operations should be performed. The following GROUP BY hints can be specified:
HASH GROUP— This option forces the Query Optimizer to use a hashing function to perform the GROUP BY operation.
ORDER GROUP— This option forces the Query Optimizer to use a sorting operation to perform the GROUP BY operation.
Only one GROUP BY hint can be specified at a time.
UNION Hints The UNION hints specify how UNION operations should be performed. The following UNION hints can be specified:
MERGE UNION— This option forces the Query Optimizer to use a merge operation to perform the UNION operation.
HASH UNION— This option forces the Query Optimizer to use a hash operation to perform the UNION operation.
CONCAT UNION— This option forces the Query Optimizer to use the concatenation method to perform the UNION operation.
Only one UNION hint can be specified at a time, and it must come after the last query in the UNION. The following is an example of forcing concatenation for a UNION:
select stor_id from sales where stor_id like 'B19%'
UNION
select title_id from titles where title_id like 'C19%'
OPTION (CONCAT UNION)
Join Hints
The join hint specified in the OPTION
clause specifies that all join operations in the query are performed as
the type of join specified in the hint. The join hints that can be
specified in the query hints are the same as the table hints:
LOOP JOIN
MERGE JOIN
HASH JOIN
If you also specify a join
hint for a specific pair of tables, the table-level hints specified must
be compatible with the query-level join hint.
Miscellaneous Hints
The following miscellaneous hints can be used to override various query operations:
FORCE ORDER— This option tells the Query Optimizer to join the tables in the order in which they are listed in the FROM clause and not to determine the optimal join order.
FAST n— This hint instructs SQL Server to optimize the query to return the first n
rows as quickly as possible, even if the overall throughput is reduced.
In other words, it improves response time at the expense of total query
execution time. This option generally influences the Query Optimizer to
retrieve data using a nonclustered index that matches the ORDER BY
clause of a query instead of using a different access method that would
require a sort operation first to return rows in the specified order.
After n number of rows have been returned, the query continues execution normally to produce its full result set.
ROBUST PLAN—
This option forces the Query Optimizer to attempt a plan that works for
the maximum potential row size, even if it means degrading performance.
If you have very wide VARCHAR
columns, some types of query plans might create intermediate tables, and
if any of the internal operations need to store and process rows in
these intermediate tables, some rows might exceed SQL Server’s row size
limit. If this happens, SQL Server generates an error during query
execution. When the ROBUST PLAN hint is specified, the Query Optimizer does not consider any plans that might encounter this problem.
MAXDOP number— This hint overrides the server-level configuration setting for max degree of parallelism for the current query in which the hint is specified.
KEEP PLAN—
When this hint is specified, it forces the Query Optimizer to relax the
estimated recompile threshold for a query. The estimated recompile
threshold is the point at which a query is automatically recompiled when
the estimated number of indexed column changes have been made to a
table by updates, inserts, or deletes. Specifying KEEP PLAN
ensures that the query is not recompiled as frequently when there are
multiple updates to a table. This option is useful primarily for queries
whose execution plan stays in memory, such as for stored procedures.
You might want to specify this option for a stored procedure that does a
lot of work with temporary tables, which can lead to frequent
recompilations of the execution plan for the stored procedure.
KEEPFIXED PLAN—
This query hint tells the Query Optimizer not to recompile the query
plan when there are changes in statistics or modifications to indexed
columns used by the query via updates, deletes, or inserts. When this
option is specified, the query is recompiled only if the schema of the
underlying tables is changed or sp_recompile is executed against those tables.
EXPAND VIEWS—
This hint tells the Query Optimizer not to consider any indexed view as
a substitute for any part of the query and to force the view to be
expanded into its underlying query. This hint essentially prevents
direct use of indexed views in the query plan.
MAXRECURSION number— This hint specifies the maximum number of recursions allowed for the common table expression query, where number is an integer between 0 and 32767. When 0 is specified, no limit is applied. If this option is not specified, the default limit for the server is 100.
RECOMPILE—
This hint forces SQL Server not to keep the execution plan generated
for the query in the plan cache after it executes. This forces a new
plan to be generated the next time the same or a similar query plan is
executed. RECOMPILE is useful for
queries with variable values that vary widely each time they are
compiled and executed. This hint can be used for individual statements
within a stored procedure in place of the global WITH RECOMPILE option when you want only a subset of queries inside the stored procedure to be recompiled rather than all of them.
OPTIMIZE FOR ( @variable_name = literal_constant [ , ...n ] )—
This hint instructs SQL Server to use a specified value to optimize the
SARGs for a local variable that is otherwise unknown when the query is
compiled and optimized. The value is used only during query optimization
and not during query execution. OPTIMIZE FOR
can help improve optimization by allowing the Query Optimizer to use
the statistics histogram rather than index densities to estimate the
rows that match the local variable, or can be used when you create plan
guides.
OPTIMIZE FOR UNKNOWN—
This hint instructs the query optimizer to use statistical data instead
of the initial values for local variables when the query is compiled
and optimized, including parameters created with forced
parameterization.
TABLE HINT (object_name [ , table_hint [ [, ]...n ] ] )— New in SQL Server 2008, you can now specify table hints in the Query Hint OPTION clause. It is recommended that the TABLE HINT
clause be used only in the context of a plan guide. For all other ad
hoc queries, it is recommend that normal table hints be used.
USE PLAN N'xml_plan'— This hint instructs SQL Server to use an existing query plan for a query as specified by the designated xml_plan. The USE PLAN query hint can be used for queries whose plans result in slow execution times but for which you know better plans exist.
Note
Optimizer hints are not always executed. For example, the Query Optimizer is likely to ignore a HASH UNION hint for a query using the UNION ALL statement. Because UNION ALL
means to return all rows whether or not there are duplicates, you don’t
need to hash these values to determine uniqueness and remove
duplicates, so the normal concatenation is likely to still take place.