Microsoft SQL Server 2012 : Knowing Tempdb - Troubleshooting Common Issues (part 3) - Troubleshooting Space Issues

12/21/2014 8:58:19 PM

Monitoring Tempdb I/O Performance

Troubleshooting SQL Server implies a reactive activity; an issue has occurred that now needs to be fixed. That may be true but one of the differences that separates an average SQL Server professional from a good one is knowing about a problem before it has an impact on a live system.

You should be aware by now of tempdb’s importance to the overall health of an entire instance, so it shouldn’t be a hard sell to realize the benefits of being proactive and monitoring tempdb to get early warning of potential problems before they affect a production system. This section covers the specifics of monitoring tempdb I/O: What you should be looking at and what thresholds should prompt you to do something.

The speed at which requests to store and retrieve data are processed against tempdb is important to the overall performance of any SQL Server instance and can even be critical where tempdb is either heavily used or part of an important business process.

Whether you have tempdb on local storage or a SAN (storage area network), on a RAID10 volume or RAID1, the simplest way to check I/O system performance is to look at the latency of I/O requests.

There are two methods for measuring disk latency: using Performance Monitor and using SQL Server DMVs. Which one you should choose depends on how you want to monitor performance and how accurate you need it to be.

Performance Monitor

The PerfMon counters that you should be interested in are as follows:

Avg. Disk sec/Transfer
Avg. Disk sec/Read
Avg. Disk sec/Write

You’ll find these grouped under Logical Disk, which shows the logical drives and drive letters presented in Windows as you would see them in Explorer; and Physical Disk, which shows the drives as Windows sees them internally. Which group you get the counters from won’t matter in most cases — I tend to use the Logical Disk counters because it’s easier to work with drive letters.

The counters themselves all provide the average latency in milliseconds for I/O requests. “Avg. Disk sec/Transfer” is the combined average for both reads and writes to a drive. This counter provides the simplest measurement for regular long-term monitoring.

“Avg. Disk sec/Read” and “Avg. Disk sec/Write” separate the requests into read and write measurements, respectively, that can be useful for determining how to configure disk controller cache . For example, if you’re seeing poor read performance and excellent write performance, you might want to optimize the cache for reads.

SQL Server DMVs

Monitoring the performance of a disk volume using Performance Monitor is a useful indicator of a potential storage performance issue, but you can get a further level of granularity from SQL Server itself. The following script (code file: Ch8_5ReadAndWriteLatency.sql) uses the sys.dm_io_virtual_file_stats DMV to calculate the read and write latency for all database files that have been used since the SQL Server service was last started.

SELECT  DB_NAME(database_id) AS 'Database Name',
        file_id,
        io_stall_read_ms / num_of_reads AS 'Avg Read Transfer/ms',
        io_stall_write_ms / num_of_writes AS 'Avg Write Transfer/ms'
FROM    sys.dm_io_virtual_file_stats(-1, -1)
WHERE   num_of_reads > 0
        AND num_of_writes > 0 ;

You can see part of the output from running the script on a busy production SQL Server in Figure 4. Tempdb has four data files with file_id’s 1, 3, 4, and 5, and a transaction log with file_id 2. All the data files have the same read and write latency, which is a positive indicator that the I/O is balanced across all the files, and all the results indicate good performance from tempdb.

FIGURE 4

Thresholds

Microsoft suggests the following performance thresholds for disk latency on drives containing SQL Server database files:

Database data files:

Target: <10ms
Acceptable: 10–20ms
Unacceptable: >20ms

Database log files:

Target: <5ms
Acceptable: 5–15ms
Unacceptable: >15ms

You should use these thresholds for guidance only because some systems will never be able to achieve the target latency. If you don’t have any performance issues with your application and you’re seeing latency of 20ms, then it’s not so important.

Using 20ms is a good rule of thumb target on most systems unless SQL Server is spending a lot of time waiting for I/O requests.

Troubleshooting Space Issues

All the code in this section uses the Ch8_6TempDBSpaceIssues.sql code file.

sys.dm_db_file_space_usage

This DMV provides a view of the number and types of pages that are allocated in tempdb by file, allowing you to see the distribution of pages across your data files.

You can also use this DMV to total the values across all the files to get a single view of the breakdown of tempdb usage, which can help you narrow down the scope of the problem in the event of unexpected usage. Here is an example script for this:

SELECT  SUM(total_page_count)*8/1024 AS 'tempdb size (MB)',
        SUM(total_page_count) AS 'tempdb pages',
        SUM(allocated_extent_page_count) AS 'in use pages',
        SUM(user_object_reserved_page_count) AS 'user object pages',
        SUM(internal_object_reserved_page_count) AS 'internal object pages',
        SUM(mixed_extent_page_count) AS 'Total Mixed Extent Pages'
FROM    sys.dm_db_file_space_usage ;

Example results from the preceding script are shown in Figure 5. Note that user, internal and version pages are from uniform extents only.

FIGURE 5

sys.dm_db_task_space_usage

This DMV provides details of tempdb usage for currently running tasks. The values are set to 0 at the start of the task and deleted when the task completes, so it’s useful for troubleshooting live issues with currently executing tasks. For example, the following script will give you the top five sessions currently using space in tempdb, ordered by the total amount of space in use:

SELECT TOP 5 *
FROM    sys.dm_db_task_space_usage
WHERE   session_id > 50
ORDER BY user_objects_alloc_page_count + internal_objects_alloc_page_count

sys.dm_db_session_space_usage

When a task completes, the values from sys.dm_db_task_usage are aggregated by session, and these aggregated values are viewable using sys.dm_db_session_space_usage.

The following example code demonstrates how to use this DMV, showing you all the sessions in order of total tempdb usage:

SELECT  *
FROM    sys.dm_db_session_space_usage
WHERE   session_id > 50
ORDER BY user_objects_alloc_page_count + internal_objects_alloc_page_count DESC ;

The output won’t include any currently executing tasks, so it’s not very useful for a live issue; but you can look up the session_id in sys.dm_exec_requests to gather information about who’s using that session_id, such as their login details; the server they’re connecting from; and the application they are using.

Related -----------------

- Microsoft SQL Server 2012 : Knowing Tempdb - Troubleshooting Common Issues (part 2) - Latch Contention - ikelihood of it happening. Multiple Tempdb Data Files

- Microsoft SQL Server 2012 : Knowing Tempdb - Troubleshooting Common Issues (part 1) - Latch Contention - Allocation Page Contention

Other -----------------

- Microsoft SQL Server 2012 : Knowing Tempdb - Overview and Usage (part 2) - The Version Store

- Microsoft SQL Server 2012 : Knowing Tempdb - Overview and Usage (part 1) - User Temporary Objects

- Microsoft Lync Server 2013 : Windows Client - Navigating in the Client - Useful Lync Client Shortcuts

- Microsoft Lync Server 2013 : Windows Client - Navigating in the Client - Managing Contacts, Managing Groups, Status View

- Microsoft Lync Server 2013 : Windows Client - Navigating in the Client - Configuring Basic Options

- Microsoft Lync Server 2013 : Windows Client - Installing the Client

- Microsoft Exchange Server 2013: Configuring mailbox delivery restrictions, permissions, and storage limits (part 4) - Setting storage restrictions on mailbox and archives

- Microsoft Exchange Server 2013: Configuring mailbox delivery restrictions, permissions, and storage limits (part 3) - Permitting others to access a mailbox

- Microsoft Exchange Server 2013: Configuring mailbox delivery restrictions, permissions, and storage limits (part 2) - Setting message send and receive restrictions on individual mailboxes

- Microsoft Exchange Server 2013: Configuring mailbox delivery restrictions, permissions, and storage limits (part 1) - Setting message size restrictions on delivery to and from individual mailboxes