SQL Server 2005 : Testing Database Routines - Introduction to Black Box and White Box Testing

8/3/2011 11:13:07 AM
What defines a great developer? Is it the ability to quickly code complex routines? The ability to implement business requirements correctly, within budget, and on schedule? Or perhaps it can be defined by how quickly the developer can track down and fix bugs in the application—or the inverse, the lack of bugs in the developer's code?

All of these are certainly attributes of a great developer, but in most cases they don't manifest themselves merely due to raw skill. The hallmark of a truly great developer, and what allows these qualities to shine through, is a thorough understanding of the importance of testing.

By creating unit tests early on in the development process, developers can continuously validate interfaces and test for exceptions and regressions. Carefully designed functional tests ensure compliance with business requirements. And performance testing—the kind of testing that always seems to get the most attention—can be used to find out whether the application can actually handle the anticipated amount of traffic.

Unfortunately, like various other development practices that are better established in the object-oriented community, testing hasn't yet caught on much on the database side. Although some shops performance test stored procedures and other database code, it's rare to see database developers writing data-specific unit tests.

There is no good reason that database developers should not write just as many—or more—tests than their application developer counterparts. It makes little sense to test a data-dependent application without validating the data pieces that drive the application components!

Introduction to Black Box and White Box Testing

A number of testing methodologies are defined within the world of quality assurance, but by and large the types of tests that can be done can be split into two groups. Black box testing refers to tests that make assumptions only about inputs and outputs, and as such do not validate intermediate conditions. White box testing, on the other hand, includes any test in which the internal implementation of the routine or function being tested is known and validated by the tester.

Although these terms split all types of testing into two camps, the majority of what we as software developers think of as "tests" are actually black box tests. The black box variety includes unit tests, most types of functional and security tests, and basic performance testing. As testing progresses and issues are identified, the tests become more pinpointed and tend to shift from black box to white box.

From a database development perspective, if data access is properly encapsulated, virtually all tests necessary to thoroughly analyze a database will be black box tests. The only exceptions to this will be times when data validation is necessary, or when performance tuning requires thorough knowledge of the access methods. For instance, retrieving (and reviewing) query plans during a performance test is an example of white box testing against a stored procedure.

Unit and Functional Testing

Developing software with a specific concentration on the data tier can have a benefit when it comes to testing: there aren't too many types of tests that you need to be familiar with. The most important type of test, and the kind you'll find yourself writing constantly, is the unit test.

Unit tests are black box tests that verify the contracts exposed by interfaces. For instance, a unit test of a stored procedure should validate that given a certain set of inputs, the stored procedure returns the correct set of output results, as defined by the interface of the stored procedure being tested. The term "correct" as used here is important to carefully define. It means correct only insofar as what is defined as the contract for the stored procedure; the actual data returned is not important. So as long as the results are in the correct format and of the correct data types given the interface's contract, a unit test should pass. Phrased another way, unit tests test the ability of interfaces to communicate with the outside world exactly as their contracts say they will.

On the other hand, just as its name implies, a functional test verifies the functionality of whatever is being tested. In testing nomenclature, the term "functional test" has a much vaguer meaning than "unit test." It can mean any kind of test, at any level of an application, which tests whether that piece of the application works properly. But at the database layer, this can mean only one thing: is the stored procedure returning the correct data? Again, I will carefully define the term "correct." This time, correct means both the kind of validation done for a unit test (data must be in the correct format), as well as a deeper validation of the accuracy of the actual values returned. The logic required for this kind of validation means that a functional test is a white box test in the database world, compared to the black box of unit testing.

Let's take a look at an example to make these ideas a bit clearer. Consider the following stored procedure, which might be used for a banking application:

CREATE PROCEDURE GetAggregateTransactionHistory
    @CustomerId INT
AS
BEGIN
    SET NOCOUNT ON

    SELECT
        SUM
        (
            CASE TransactionType
                WHEN 'Deposit' THEN Amount
                ELSE 0
            END
        ) AS TotalDeposits,
        SUM
        (
            CASE TransactionType
                WHEN 'Withdrawal' THEN Amount
                ELSE 0
            END
        ) AS TotalWithdrawals
    FROM TransactionHistory
    WHERE
        CustomerId = @CustomerId
END

This stored procedure's implied contract states that given the input of a customer ID into the @CustomerId parameter, a result set of two columns and zero or one rows will be output (the contract does not imply anything about invalid customer IDs or customers who've made no transactions). The column names in the output result set will be TotalDeposits and TotalWithdrawals, and the data types of the columns will be whatever the data type of the Amount column is (we'll assume it's DECIMAL).

WHAT IF THE CUSTOMER DOESN'T EXIST?

The output of the GetAggregateTransactionHistory stored procedure will be the same whether you pass in a valid customer ID for a customer that happens to have had no transactions, or an invalid customer ID. Either way, the procedure will return a single row of NULLs, since a scalar aggregate produces one row even when no rows match the WHERE clause. Depending on the requirements of a particular situation, it might make sense to make the interface richer by changing the rules a bit, only returning no rows if an invalid customer ID is passed in. That way, the caller will be able to identify invalid data and give the user an appropriate error message rather than implying that the nonexistent customer made no transactions. To fix the interface, use an outer join to the Customers table and group on the customer ID (the GROUP BY is what allows an empty result, since grouped aggregates return no rows when no groups exist), as the following modified version of the procedure does:

CREATE PROCEDURE GetAggregateTransactionHistory
    @CustomerId INT
AS
BEGIN
    SET NOCOUNT ON

    SELECT
        SUM
        (
            CASE TH.TransactionType
                WHEN 'Deposit' THEN TH.Amount
                ELSE 0
            END
        ) AS TotalDeposits,
        SUM
        (
            CASE TH.TransactionType
                WHEN 'Withdrawal' THEN TH.Amount
                ELSE 0
            END
        ) AS TotalWithdrawals
    FROM Customers AS C
    LEFT JOIN TransactionHistory AS TH ON C.CustomerId = TH.CustomerId
    WHERE
        C.CustomerId = @CustomerId
    GROUP BY C.CustomerId
END



A unit test against this stored procedure should do nothing more than validate the interface. A customer ID should be passed in, and the unit test should interrogate the output result set (or lack thereof) and ensure that there are two columns of the correct name and data type and zero or one rows. No verification of data is necessary; it would be out of scope, for instance, to find out whether the aggregate information was valid or not—that would be the job of a functional test.
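To make the distinction concrete, the interface checks described above can be sketched generically. The sketch below is plain Python, with a hypothetical in-memory structure (a list of result sets, each a pair of column definitions and rows) standing in for a real ADO.NET result; it validates only the shape promised by the contract, never the values:

```python
from decimal import Decimal

def check_interface(result_sets):
    """Validate only the contract of GetAggregateTransactionHistory:
    one result set, two DECIMAL columns named TotalDeposits and
    TotalWithdrawals, and zero or one rows. Values are ignored."""
    assert len(result_sets) == 1, "Result set count != 1"
    columns, rows = result_sets[0]
    names = [name for name, _ in columns]
    assert names == ["TotalDeposits", "TotalWithdrawals"], \
        "Column names do not match the contract"
    assert all(col_type is Decimal for _, col_type in columns), \
        "Column data type is incorrect"
    assert len(rows) <= 1, "Too many rows returned"

# A shape-conforming result passes, regardless of the actual values
check_interface([
    (
        [("TotalDeposits", Decimal), ("TotalWithdrawals", Decimal)],
        [(Decimal("500.00"), Decimal("250.00"))],
    )
])
```

Note that the sample amounts here are arbitrary: the check passes for any values, which is exactly what makes this a unit test rather than a functional test.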

The reason that we draw such a distinction between unit tests and functional tests is that when testing pure interface compliance, we want to put ourselves in the position of someone programming against the interface from a higher layer. Is the interface working as documented, providing the appropriate level of encapsulation and returning data in the correct format?

Each interface in the system will need one or more of these tests, so they need to be kept focused and lightweight. Programming full white box tests against every interface may not be feasible, and it might be simpler to test the validity of data at a higher layer, such as via the user interface itself. In the case of the GetAggregateTransactionHistory stored procedure, writing a functional test would essentially entail rewriting the entire stored procedure again—hardly a good use of developer time.

Unit Testing Frameworks

Unit testing is made easier through the use of unit testing frameworks, which provide structured programming interfaces designed to assist with quickly testing software. These frameworks generally make use of debug assertions, which allow the developer to specify what conditions make a test true or false.

A debug assertion is a special kind of macro that is turned on only when a piece of software is compiled in debug mode. It accepts an expression as input and throws an exception if the expression is false; otherwise, it returns true (or void, in some languages). For instance, the following assertion would always throw an exception:

Assert(1 == 0);

Assertions allow a developer to self-document assumptions made by the code of a routine. If a routine expects that a variable is in a certain state at a certain time, an assertion can be used in order to help make sure that assumption is enforced as the code matures. If, at any time in the future, a change in the code invalidates that assumption, an exception will be thrown should the developer making the change hit the assertion during testing or debugging.
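In Python, for example, plain assert statements play the same self-documenting role; they are stripped entirely when the interpreter runs with the -O flag, much as debug assertions vanish from release builds. A minimal sketch, using a hypothetical withdrawal routine:

```python
def apply_withdrawal(balance, amount):
    # Document the routine's assumptions: amounts are positive and
    # never exceed the available balance. A future code change that
    # violates either assumption will trip the assertion during
    # testing or debugging.
    assert amount > 0, "withdrawal amount must be positive"
    assert amount <= balance, "withdrawal cannot exceed balance"
    return balance - amount

print(apply_withdrawal(100, 40))   # prints 60
```

Run normally, a call such as apply_withdrawal(10, 40) raises AssertionError immediately, pointing straight at the violated assumption rather than at some downstream symptom.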

In unit testing, assertions serve much the same purpose. They allow the tester to control what conditions make the unit test return true or false. If any assertion throws an exception in a unit test, the entire test is considered to have failed.

Unit testing frameworks exist for virtually every language and platform, including T-SQL. A project called TSQLUnit is available on the SourceForge open source project web site. Personally, I find unit testing in T-SQL to be cumbersome compared to other languages, and prefer to write my tests in a .NET language using NUnit, a popular .NET unit testing framework.

Given that unit testing stored procedures is still somewhat of a mystery to many developers, I will provide a few hints. When writing stored procedure unit tests in NUnit, the following basic steps can be followed:

  1. First, determine what assumptions should be made about the stored procedure's interface. What are the result sets that will be returned? What are the data types of the columns, and how many columns will there be? Does the contract make any guarantees about a certain number of rows?

  2. Next, write code using ADO.NET to execute the stored procedure. I find that using the stored procedure to fill a DataSet is generally the easiest way of exposing its output for interrogation. Be careful at this stage; you want to test the stored procedure, not your ADO.NET data access framework. You might be tempted at this point to call the stored procedure using the same method that the application uses to call it. However, this would be a mistake, as you would end up testing both the stored procedure and that method. Given that you only need to fill a DataSet, recoding the data access in the unit test should not be a major burden, and will keep you from testing parts of the code that you don't intend to.

  3. Finally, use one assertion for each assumption you're making about the stored procedure. That means one assertion per column name, one per column data type, one for the row count if necessary, etc. Err on the side of using too many assertions—it's better to have to remove an assumption later because it turns out to be incorrect, than to not have had an assumption there to begin with and had your unit test pass when the interface was actually not working correctly.

Following is an example of what an NUnit test of the GetAggregateTransactionHistory stored procedure might look like:

[Test]
public void TestAggregateTransactionHistory()
{
    //Set up a command object
    SqlCommand comm = new SqlCommand();

    //Set up the connection
    comm.Connection = new SqlConnection(
        @"server=serverName; trusted_connection=true;");

    //Define the procedure call
    comm.CommandText = "GetAggregateTransactionHistory";
    comm.CommandType = CommandType.StoredProcedure;

    comm.Parameters.AddWithValue("@CustomerId", 123);

    //Create a DataSet for the results
    DataSet ds = new DataSet();

    //Define a DataAdapter to fill a DataSet
    SqlDataAdapter adapter = new SqlDataAdapter();
    adapter.SelectCommand = comm;

    try
    {
        //Fill the dataset
        adapter.Fill(ds);
    }
    catch
    {
        Assert.Fail("Exception occurred!");
    }

    //Now we have the results -- validate them...

    //There must be exactly one returned result set
    Assert.IsTrue(
        ds.Tables.Count == 1,
        "Result set count != 1");

    DataTable dt = ds.Tables[0];

    //There must be exactly two columns returned
    Assert.IsTrue(
        dt.Columns.Count == 2,
        "Column count != 2");

    //There must be columns called TotalDeposits and TotalWithdrawals
    Assert.IsTrue(
        dt.Columns.IndexOf("TotalDeposits") > -1,
        "Column TotalDeposits does not exist");

    Assert.IsTrue(
        dt.Columns.IndexOf("TotalWithdrawals") > -1,
        "Column TotalWithdrawals does not exist");

    //Both columns must be decimal
    Assert.IsTrue(
        dt.Columns["TotalDeposits"].DataType == typeof(decimal),
        "TotalDeposits data type is incorrect");

    Assert.IsTrue(
        dt.Columns["TotalWithdrawals"].DataType == typeof(decimal),
        "TotalWithdrawals data type is incorrect");

    //There must be zero or one rows returned
    Assert.IsTrue(
        dt.Rows.Count <= 1,
        "Too many rows returned");
}

Although it might be disturbing to note that the unit test is over twice as long as the stored procedure it is testing, keep in mind that most of this code can be easily turned into a template for quick reuse. As noted before, you might be tempted to refactor common unit test code into a data access library, but be careful lest you end up testing your test framework instead of the actual routine you're attempting to test. Many hours can be wasted debugging working code trying to figure out why the unit test is failing, when it's actually the fault of some code the unit test is relying on to do its job.

Unit tests allow for quick, automated verification of interfaces. In essence, they help you as a developer to guarantee that in making changes to a system you didn't break anything obvious. In that way, they are invaluable. Developing against a system with a well-established set of unit tests is a joy, as each developer no longer needs to worry about breaking some other component due to an interface change. The unit tests will complain if anything needs to be fixed.

The Importance of Regression Testing

As you build up a set of unit tests for a particular application, the tests will eventually come to serve as a regression suite, which will help to guard against regression bugs—bugs that occur when a developer breaks functionality that used to work. Any change to an interface—intentional or not—will cause unit tests to fail. For the intentional changes, the solution is to rewrite the unit test accordingly. But it is these unintentional changes for which we create unit tests, and which regression testing targets.

Experience has shown that fixing bugs in an application often introduces other bugs. It can be difficult to substantiate how often this happens in real development scenarios, but figures as high as 50% are likely in some cases.[] By building a regression suite, the cost of fixing these "side-effect" bugs is greatly reduced. They can be discovered and mended during the development phase, instead of being reported by end users once the application has already been deployed.

[] Frederick P. Brooks, The Mythical Man-Month (Boston, MA: Addison-Wesley, 1995), p. 122.

Regression testing is also the key to newer software development methodologies, such as Agile Development and eXtreme Programming. As these methodologies increase in popularity, it can be expected that database developers will begin to adopt some of these techniques more readily.