The job of the Query Optimizer is incredibly complex: it can consider literally thousands of options when determining the optimal execution plan. Statistics are simply one of the tools the Query Optimizer can use to help in the decision-making process.
In addition to examining
the statistics to determine the most efficient access paths for SARGs
and join clauses, the Query Optimizer must consider the optimum order in
which to access the tables, the appropriate join algorithms to use, the
appropriate sorting algorithms, and many other details too numerous to
list here. The goal of the Query Optimizer during join selection is to
determine the most efficient join strategy.
Join Processing Strategies
If you are familiar with SQL,
you have probably used joins between tables many times when
creating SQL queries. A join occurs any time the SQL Server Query
Optimizer has to compare two inputs to determine an output. The join can
occur between one table and another table, between an index and a
table, or between an index and another index.
The SQL Server Query
Optimizer uses three primary types of join strategies when it must
compare two inputs: nested loops joins, merge joins, and hash joins. The
Query Optimizer must consider each one of these algorithms to determine
the most appropriate and efficient algorithm for a given situation.
Each of the three supported
join algorithms could be used for any join operation. The Query
Optimizer examines all the possible alternatives, assigns costs to each,
and chooses the least expensive join algorithm for a given situation.
Merge and hash joins often greatly improve the query processing performance for very large data tables and data warehouses.
Nested Loops Joins
The nested loops join
algorithm is by far the simplest of the three join algorithms. The
nested loops join uses one input as the “outer” loop and the other input
as the “inner” loop. As you might expect, SQL Server processes the
outer input one row at a time. For each row in the outer input, the
inner input is searched for matching rows.
Figure 1 illustrates a query that uses a nested loops join.
Note that in the graphical
execution plan, the outer loop is represented as the top input table,
and the inner loop is represented as the bottom input table. In most
instances, the Query Optimizer chooses the input table with the fewest
qualifying rows to be the outer loop, to limit the number of
iterative lookups against the inner table. However, the Query Optimizer
may choose the input table with the greater number of qualifying rows as
the outer table if the I/O cost of searching that table first and then
performing the iterative loops on the other table is lower than the
alternative.
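To make the mechanism concrete, here is a minimal Python sketch of the iteration pattern. It is an illustration only, not SQL Server's internal code; inner_index is a hypothetical dictionary standing in for an index on the inner table's join column, and the sample rows are made up (loosely modeled on the pubs sales and stores tables).

def nested_loops_join(outer_rows, inner_index, key):
    # One pass over the outer input; for each qualifying row,
    # probe the inner input for matches (ideally an index seek)
    for outer in outer_rows:
        for inner in inner_index.get(key(outer), ()):
            yield outer, inner

stores_by_id = {"S1": [{"stor_id": "S1", "stor_name": "Bookbeat"}]}
sales = [{"stor_id": "S1", "qty": 5}]
for sale, store in nested_loops_join(sales, stores_by_id, key=lambda r: r["stor_id"]):
    print(sale["qty"], store["stor_name"])   # prints: 5 Bookbeat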
The nested loops join is the
easiest join strategy for which to estimate the I/O cost. The cost of
the nested loops join is calculated as follows:
    Number of I/Os to read in outer input
  + (Number of matching rows × Number of I/Os per lookup on inner input)
  = Total logical I/O cost for query
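For example, using made-up numbers: if reading the qualifying rows from the outer input costs 10 page I/Os and yields 50 matching rows, and each index lookup on the inner input costs 3 I/Os, the estimated cost is 10 + (50 × 3) = 160 logical I/Os.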
The Query Optimizer evaluates the
I/O costs for the various possible join orders as well as the various
possible access paths and indexes available to determine the most
efficient join
order. The nested loops join is efficient for queries that typically
affect only a small number of rows. As the number of rows in the outer
loop increases, the effectiveness of the nested loops join strategy
diminishes. The reason is the increased number of logical I/Os required
as the number of qualifying rows increases.
Also, if there are no useful
indexes on the join columns, the nested loops join is not an efficient
join strategy because it requires a full scan of the inner table
for each row in the outer table. Lacking useful indexes for the join,
the Query Optimizer often opts to perform a merge or hash join instead.
Merge Joins
The merge join algorithm
is much more effective than the nested loops join for dealing with large
data volumes or when the lack of limiting SARGs or useful indexes on
SARGs leads to a table scan of one or both tables involved in the join. A
merge join works by retrieving one row from each input and comparing
them, matching on the join column(s). Figure 2 illustrates a query that uses a merge join.
A merge join requires that both inputs be sorted on the merge columns—that is, the columns specified in the equality (ON) clauses of the join predicate. A merge join cannot proceed unless both inputs are sorted. In the query shown in Figure 2, both tables have a clustered index on stor_id, so the merge column (stor_id)
is already sorted for each table. If the merge columns are not already
sorted, a separate sort operation may be required before the merge join
operation. Once the inputs are sorted, the merge join operation retrieves a
row from each input and compares them, returning the rows if they are
equal. If the rows are not equal, the lower-valued row is discarded,
and another row is obtained from that input. This process repeats until
all rows have been processed.
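As with the nested loops sketch, the following Python fragment only illustrates the algorithm just described. It assumes both inputs arrive sorted and that join keys are unique; the real operator handles duplicate keys by rewinding one input, which is omitted here.

def merge_join(left, right, key):
    # Both inputs must already be sorted on the merge column(s)
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = key(left[i]), key(right[j])
        if lk == rk:
            yield left[i], right[j]
            i += 1
            j += 1
        elif lk < rk:
            i += 1          # discard the lower-valued row
        else:
            j += 1          # discard the lower-valued row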
Usually, the Query Optimizer
chooses a merge join strategy, as in this example, when the data volume
is large and both columns are contained in an existing presorted index,
such as a clustered primary key. If either of the inputs is not already
sorted, the Query Optimizer has to perform an explicit sort before the
join. Figure 3 shows an example of a sort being performed before the merge join is performed.
In the query in Figure 3, the titles table is already sorted on the primary key on title_id, but the rows being returned from the sales table are being returned initially in stor_id order. (stor_id is the leading column in the clustered primary key on sales.) The resulting rows matching the search criteria on ord_date via the clustered index scan on the sales table are then re-sorted by title_id, and then the merge join is performed with the rows retrieved from the titles table.
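Reusing the merge_join sketch above with made-up rows, the extra sort step looks like this:

# titles is already in title_id order; sales arrives in stor_id order,
# so it must be re-sorted on title_id before the merge join
titles = [{"title_id": "T1"}, {"title_id": "T2"}]
sales = [{"title_id": "T2", "stor_id": "S1"}, {"title_id": "T1", "stor_id": "S2"}]
sales_sorted = sorted(sales, key=lambda r: r["title_id"])
for t, s in merge_join(titles, sales_sorted, key=lambda r: r["title_id"]):
    print(t["title_id"], s["stor_id"])   # prints: T1 S2, then T2 S1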
If one or more of the inputs to
the merge join is not sorted, and the additional sorting causes the
merge join to be too expensive to perform, the Query Optimizer may
consider using the hash join strategy instead.
Hash Joins
The final—and most
complicated—join algorithm is the hash join. The hash join is an
effective join strategy for dealing with large data volumes where the
inputs might not be sorted and when no useful indexes exist on your
tables for performing the join. Figure 4 illustrates a query that uses a hash join.
The basic hash join
algorithm involves separating the two inputs into a build input and a
probe input. The Query Optimizer usually attempts to assign the smaller
input as the build input. The hash join scans the build input and
creates a hash table: each row from the build input is inserted into the
hash table based on a computed hash key value. The probe
input is then scanned, one row at a time; a hash key value is computed
for each probe row, and the hash table is scanned for matches.
In a hash join, the keys
that are common between the two tables are hashed into a hash bucket,
using the same hash function. This bucket usually starts out in memory
and then moves to disk, as needed. The type of hashing that occurs
depends on the amount of memory required. Hashing is commonly used for
inner and outer joins, intersections, unions, and differences. The Query
Optimizer often uses hashing for intermediate processing.
Expressed as runnable Python-style pseudocode, a simple hash join might look like this:
from collections import defaultdict

def hash_join(build_rows, probe_rows, key):
    # Build phase: hash each row of the smaller (build) input into a bucket
    buckets = defaultdict(list)
    for row in build_rows:
        buckets[hash(key(row))].append(row)
    # Probe phase: scan the larger (probe) input one row at a time
    for row in probe_rows:
        for match in buckets.get(hash(key(row)), ()):
            if key(match) == key(row):   # distinct keys can share a hash value
                yield match, row
    # the hash table is dropped when the function returns
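For example, with tiny made-up inputs (stores, the smaller input, plays the build role):

stores = [{"stor_id": "S1"}, {"stor_id": "S2"}]
sales = [{"stor_id": "S1", "qty": 5}, {"stor_id": "S3", "qty": 9}]
for store, sale in hash_join(stores, sales, key=lambda r: r["stor_id"]):
    print(store["stor_id"], sale["qty"])   # prints: S1 5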
Although hashing is useful when
there are no useful indexes on the tables being joined, the Query Optimizer
still might not choose it as the join strategy if it has a high cost in
terms of memory required. If the entire hash table doesn’t fit in
memory, SQL Server has to split both the build and probe inputs into
partitions, each containing a different set of hash keys, and write
those partitions out to disk. As each partition is needed, it is brought
into memory. This increases the amount of I/O and general processing
time for the query.
If you want to use
the hashing strategy efficiently, it is best if the smaller input is
used as the build input. If, during execution, SQL Server discovers that
the build input is actually larger than the probe input, it might
switch the roles of the build and probe input midstream. The Query
Optimizer usually doesn’t have a problem determining which input is
smaller if the statistics on the columns involved in the query are
current. Column-level statistics can also help the Query Optimizer
determine the estimated number of rows matching a SARG, even if no
actual index will be used.
Grace Hash Joins
If the two inputs are too large to fit into memory for a normal hash join, SQL Server might use a modified method, called the grace hash join. This method partitions the smaller input table (also referred to as the build input)
into a number of buckets. The total number of buckets is calculated by
determining the bucket size that will fit in memory and dividing
the number of rows in the table by that size. The larger table (also referred to as
the probe input)
is then also partitioned into the same number of buckets. Each bucket
from each input can then be read into memory and the matches made.
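A rough Python sketch of the partitioning idea, reusing the hash_join function above (the partition lists stand in for the disk writes of the real algorithm):

from collections import defaultdict

def grace_hash_join(build_rows, probe_rows, key, n_partitions):
    # Phase 1: partition both inputs with the same hash function so that
    # matching keys always land in the same pair of partitions; in the
    # real algorithm, each partition is written out to disk
    build_parts, probe_parts = defaultdict(list), defaultdict(list)
    for row in build_rows:
        build_parts[hash(key(row)) % n_partitions].append(row)
    for row in probe_rows:
        probe_parts[hash(key(row)) % n_partitions].append(row)
    # Phase 2: bring each partition pair into memory and join it using
    # the in-memory hash join sketched earlier
    for p in range(n_partitions):
        yield from hash_join(build_parts[p], probe_parts[p], key)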
A hybrid join is a join method that uses elements of both the simple in-memory hash join and the grace hash join.
Note
Hash and merge join
strategies can be applied only when the join is an equijoin—that is,
when the join condition compares columns from two inputs with the
equality (=) operator. If the join is not based on an equality (for example, a join using a BETWEEN clause), the nested loops join is the only strategy that can be employed.