The job of the Query Optimizer is incredibly complex: it can consider literally thousands of options when determining the optimal execution plan. Statistics are simply one of the tools the Query Optimizer can use to help in the decision-making process.
In addition to examining
the statistics to determine the most efficient access paths for SARGs
and join clauses, the Query Optimizer must consider the optimum order in
which to access the tables, the appropriate join algorithms to use, the
appropriate sorting algorithms, and many other details too numerous to
list here. The goal of the Query Optimizer during join selection is to
determine the most efficient join strategy.
Join Processing Strategies
If you are familiar with SQL,
you have probably used joins between tables many times when
creating SQL queries. A join occurs any time the SQL Server Query
Optimizer has to compare two inputs to determine an output. The join can
occur between one table and another table, between an index and a
table, or between an index and another index.
The SQL Server Query
Optimizer uses three primary types of join strategies when it must
compare two inputs: nested loops joins, merge joins, and hash joins. The
Query Optimizer must consider each one of these algorithms to determine
the most appropriate and efficient algorithm for a given situation.
Each of the three supported
join algorithms could be used for any join operation. The Query
Optimizer examines all the possible alternatives, assigns costs to each,
and chooses the least expensive join algorithm for a given situation.
Merge and hash joins often greatly improve the query processing performance for very large data tables and data warehouses.
Nested Loops Joins
The nested loops join
algorithm is by far the simplest of the three join algorithms. The
nested loops join uses one input as the “outer” loop and the other input
as the “inner” loop. As you might expect, SQL Server processes the
outer input one row at a time. For each row in the outer input, the
inner input is searched for matching rows.
Figure 1 illustrates a query that uses a nested loops join.
Note that in the graphical
execution plan, the outer loop is represented as the top input table,
and the inner loop is represented as the bottom input table. In most
instances, the Query Optimizer chooses the input table with the fewest
qualifying rows to be the outer loop, to limit the number of
iterative lookups against the inner table. However, the Query Optimizer
may choose the input table with the greater number of qualifying rows as
the outer table if the I/O cost of searching that table first and then
performing the iterative loops on the other table is lower than the
alternative.
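To make the mechanism concrete, here is a minimal Python sketch of the iteration pattern. It is an illustration only, not SQL Server's internal code; inner_index is a hypothetical dictionary standing in for an index on the inner table's join column, and the sample rows are made up (loosely modeled on the pubs sales and stores tables).

def nested_loops_join(outer_rows, inner_index, key):
    # One pass over the outer input; for each qualifying row,
    # probe the inner input for matches (ideally an index seek)
    for outer in outer_rows:
        for inner in inner_index.get(key(outer), ()):
            yield outer, inner

stores_by_id = {"S1": [{"stor_id": "S1", "stor_name": "Bookbeat"}]}
sales = [{"stor_id": "S1", "qty": 5}]
for sale, store in nested_loops_join(sales, stores_by_id, key=lambda r: r["stor_id"]):
    print(sale["qty"], store["stor_name"])   # prints: 5 Bookbeat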
The nested loops join is the
easiest join strategy for which to estimate the I/O cost. The cost of
the nested loops join is calculated as follows:
    Number of I/Os to read in outer input
  + (Number of matching rows × Number of I/Os per lookup on inner input)
  = Total logical I/O cost for query
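For example, using made-up numbers: if reading the qualifying rows from the outer input costs 10 page I/Os and yields 50 matching rows, and each index lookup on the inner input costs 3 I/Os, the estimated cost is 10 + (50 × 3) = 160 logical I/Os.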
The Query Optimizer evaluates the
I/O costs for the various possible join orders as well as the various
possible access paths and indexes available to determine the most
efficient join
order. The nested loops join is efficient for queries that typically
affect only a small number of rows. As the number of rows in the outer
loop increases, the effectiveness of the nested loops join strategy
diminishes. The reason is the increased number of logical I/Os required
as the number of qualifying rows increases.
Also, if there are no useful
indexes on the join columns, the nested loops join is not an efficient
join strategy because it requires a full scan of the inner table
for each row in the outer table. Lacking useful indexes for the join,
the Query Optimizer often opts to perform a merge or hash join instead.
Merge Joins
The merge join algorithm
is much more effective than the nested loops join for dealing with large
data volumes or when the lack of limiting SARGs or useful indexes on
SARGs leads to a table scan of one or both tables involved in the join. A
merge join works by retrieving one row from each input and comparing
them, matching on the join column(s). Figure 2 illustrates a query that uses a merge join.
A merge join requires that both inputs be sorted on the merge columns—that is, the columns specified in the equality (ON) clauses of the join predicate. A merge join cannot proceed unless both inputs are sorted. In the query shown in Figure 2, both tables have a clustered index on stor_id, so the merge column (stor_id)
is already sorted for each table. If the merge columns are not already
sorted, a separate sort operation may be required before the merge join
operation. Once the inputs are sorted, the merge join operation retrieves a
row from each input and compares them, returning the rows if they are
equal. If the rows are not equal, the lower-valued row is discarded,
and another row is obtained from that input. This process repeats until
all rows have been processed.
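As with the nested loops sketch, the following Python fragment only illustrates the algorithm just described. It assumes both inputs arrive sorted and that join keys are unique; the real operator handles duplicate keys by rewinding one input, which is omitted here.

def merge_join(left, right, key):
    # Both inputs must already be sorted on the merge column(s)
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = key(left[i]), key(right[j])
        if lk == rk:
            yield left[i], right[j]
            i += 1
            j += 1
        elif lk < rk:
            i += 1          # discard the lower-valued row
        else:
            j += 1          # discard the lower-valued row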
Usually, the Query Optimizer
chooses a merge join strategy, as in this example, when the data volume
is large and both columns are contained in an existing presorted index,
such as a clustered primary key. If either of the inputs is not already
sorted, the Query Optimizer has to perform an explicit sort before the
join. Figure 3 shows an example of a sort being performed before the merge join is performed.
In the query in Figure 3, the titles table is already sorted on the primary key on title_id, but the rows being returned from the sales table are being returned initially in stor_id order. (stor_id is the leading column in the clustered primary key on sales.) The resulting rows matching the search criteria on ord_date via the clustered index scan on the sales table are then re-sorted by title_id, and then the merge join is performed with the rows retrieved from the titles table.
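Reusing the merge_join sketch above with made-up rows, the extra sort step looks like this:

# titles is already in title_id order; sales arrives in stor_id order,
# so it must be re-sorted on title_id before the merge join
titles = [{"title_id": "T1"}, {"title_id": "T2"}]
sales = [{"title_id": "T2", "stor_id": "S1"}, {"title_id": "T1", "stor_id": "S2"}]
sales_sorted = sorted(sales, key=lambda r: r["title_id"])
for t, s in merge_join(titles, sales_sorted, key=lambda r: r["title_id"]):
    print(t["title_id"], s["stor_id"])   # prints: T1 S2, then T2 S1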
If one or more of the inputs to
the merge join is not sorted, and the additional sorting causes the
merge join to be too expensive to perform, the Query Optimizer may
consider using the hash join strategy instead.
Hash Joins
The final—and most
complicated—join algorithm is the hash join. The hash join is an
effective join strategy for dealing with large data volumes where the
inputs might not be sorted and when no useful indexes exist on your
tables for performing the join. Figure 4 illustrates a query that uses a hash join.
The basic hash join
algorithm involves separating the two inputs into a build input and a
probe input. The Query Optimizer usually attempts to assign the smaller
input as the build input. The hash join scans the build input and
creates a hash table: each row from the build input is inserted into the
hash table based on a computed hash key value. The probe
input is then scanned, one row at a time; a hash key value is computed
for each probe row, and the hash table is scanned for matches.
In a hash join, the keys
that are common between the two tables are hashed into a hash bucket,
using the same hash function. This bucket usually starts out in memory
and then moves to disk, as needed. The type of hashing that occurs
depends on the amount of memory required. Hashing is commonly used for
inner and outer joins, intersections, unions, and differences. The Query
Optimizer often uses hashing for intermediate processing.
Expressed as runnable Python-style pseudocode, a simple hash join might look like this:
from collections import defaultdict

def hash_join(build_rows, probe_rows, key):
    # Build phase: hash each row of the smaller (build) input into a bucket
    buckets = defaultdict(list)
    for row in build_rows:
        buckets[hash(key(row))].append(row)
    # Probe phase: scan the larger (probe) input one row at a time
    for row in probe_rows:
        for match in buckets.get(hash(key(row)), ()):
            if key(match) == key(row):   # distinct keys can share a hash value
                yield match, row
    # the hash table is dropped when the function returns
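For example, with tiny made-up inputs (stores, the smaller input, plays the build role):

stores = [{"stor_id": "S1"}, {"stor_id": "S2"}]
sales = [{"stor_id": "S1", "qty": 5}, {"stor_id": "S3", "qty": 9}]
for store, sale in hash_join(stores, sales, key=lambda r: r["stor_id"]):
    print(store["stor_id"], sale["qty"])   # prints: S1 5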
Although hashing is useful when
there are no useful indexes on the tables being joined, the Query Optimizer
still might not choose it as the join strategy if it has a high cost in
terms of memory required. If the entire hash table doesn’t fit in
memory, SQL Server has to split both the build and probe inputs into
partitions, each containing a different set of hash keys, and write
those partitions out to disk. As each partition is needed, it is brought
into memory. This increases the amount of I/O and general processing
time for the query.
If you want to use
the hashing strategy efficiently, it is best if the smaller input is
used as the build input. If, during execution, SQL Server discovers that
the build input is actually larger than the probe input, it might
switch the roles of the build and probe input midstream. The Query
Optimizer usually doesn’t have a problem determining which input is
smaller if the statistics on the columns involved in the query are
current. Column-level statistics can also help the Query Optimizer
determine the estimated number of rows matching a SARG, even if no
actual index will be used.
Grace Hash Joins
If the two inputs are too large to fit into memory for a normal hash join, SQL Server might use a modified method, called the grace hash join. This method partitions the smaller input table (also referred to as the build input)
into a number of buckets. The total number of buckets is calculated by
determining the bucket size that will fit in memory and dividing
the number of rows in the table by that size. The larger table (also referred to as
the probe input)
is then also partitioned into the same number of buckets. Each bucket
from each input can then be read into memory and the matches made.
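A rough Python sketch of the partitioning idea, reusing the hash_join function above (the partition lists stand in for the disk writes of the real algorithm):

from collections import defaultdict

def grace_hash_join(build_rows, probe_rows, key, n_partitions):
    # Phase 1: partition both inputs with the same hash function so that
    # matching keys always land in the same pair of partitions; in the
    # real algorithm, each partition is written out to disk
    build_parts, probe_parts = defaultdict(list), defaultdict(list)
    for row in build_rows:
        build_parts[hash(key(row)) % n_partitions].append(row)
    for row in probe_rows:
        probe_parts[hash(key(row)) % n_partitions].append(row)
    # Phase 2: bring each partition pair into memory and join it using
    # the in-memory hash join sketched earlier
    for p in range(n_partitions):
        yield from hash_join(build_parts[p], probe_parts[p], key)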
A hybrid join is a join method that uses elements of both the simple in-memory hash join and the grace hash join.
Note
Hash and merge join
strategies can be applied only when the join is an equijoin—that is,
when the join condition compares columns from two inputs with the
equality (=) operator. If the join is not based on an equality (for example, a join using a BETWEEN clause), the nested loops join is the only strategy that can be employed.