When you’re reading, talking, or thinking about statistics and the word mean
comes up, it refers to the total divided by the count. The total of the
heights of everyone in your family divided by the number of people in
your family. The total price per gallon of gasoline at all the gas
stations in your city, divided by the number of gas stations. The total
number of a baseball player’s hits divided by the number of at bats.
In the context of statistics, it’s very convenient, and more precise, to use the word mean this way. It avoids the vagueness of the word average, which—as just discussed—can refer to the mean, to the median, or to the mode.
So it’s sort of a shame that Excel uses the function name AVERAGE() instead of MEAN(). Nevertheless, Figure 1 gives an example of how you get a mean using Excel.
Understanding the elements that Excel’s worksheet
functions have in common with one another is important to using them
properly, and of course you can’t do good statistical analysis in Excel
without using the statistical functions properly. There are more
statistical worksheet functions in Excel, about one hundred, than any
other function category. So I propose to spend some ink here on the
elements of worksheet functions in general and statistical functions in
particular. A good place to start is with the calculation of the mean,
shown in Figure 1.
Understanding Functions, Arguments, and Results
The function that’s depicted in Figure 1, AVERAGE(), is a typical example of statistical worksheet functions.
Defining a Worksheet Function
An Excel worksheet function—more briefly, a function—is just a formula that someone at Microsoft wrote to save you time, effort, and mistakes.
Note
Formally, a formula
in Excel is an expression in a worksheet cell that begins with an equal
sign (=); for example, =3+4 is a formula. Formulas often employ
functions such as AVERAGE() and an example is =AVERAGE(A1:A20) + 5,
where the AVERAGE() function has been used in the formula.
Nevertheless, a worksheet function is itself a formula; you just use
its name and arguments without having to deal with the way it goes
about calculating its results.
Suppose that Excel had no AVERAGE() function. In that case, to get the result shown in cell B13 of Figure 1, you would have to enter something like this in B13:
=(B2+B3+B4+B5+B6+B7+B8+B9+B10+B11) / 10
Or, if Excel had a SUM() and a COUNT() function but no AVERAGE(), you could use this:
=SUM(B2:B11)/COUNT(B2:B11)
But you don’t need to bother with those because Excel has an AVERAGE() function, and in this case you use it as follows:
=AVERAGE(B2:B11)
So—at least in the cases of Excel’s statistical, mathematical, and financial functions—all the term worksheet function means is a prewritten formula. The function results in a summary value that’s usually based on other, individual values.
Defining Arguments
More terminology: Those “other, individual values” are called arguments.
That’s a highfalutin name for the values that you hand off to the
function—or, put another way, that you plug into the prewritten
formula. In the instance of the function
=AVERAGE(B2:B11)
the range of cells represented by B2:B11 is the function’s argument. The arguments always appear in parentheses following the function.
A single range of cells is regarded as one argument,
even though the single range B2:B11 contains ten values.
AVERAGE(B2:B11,C2:C11) contains two arguments: one range of ten values
in column B and one range of ten values in column C. (Excel has a few
functions, such as PI(), which take no arguments but you have to supply
the parentheses anyway.)
Note
Excel 2010 enables you to specify as many as 255
arguments to a function. (Earlier versions, such as Excel 2003, allowed
you to specify only 30 arguments.) But this doesn’t mean that you can
pass a maximum of 255 values to a function. Even AVERAGE(A1:A1048576),
which calculates the mean of the values in over a million cells, has
only one argument.
Many statistical and mathematical functions in Excel
take the contents of worksheet cells as their arguments—for example,
SUM(A2:A10). Some functions have additional arguments that you use to
fine-tune the analysis.
=FREQUENCY(B2:B11,E2:E6)
In this example, suppose that you wanted to categorize the price per gallon data in Figure 1
into five groups: less than $1, between $1 and $2, between $2 and $3
and so on. You could define the limits of those groups by entering the
value at the upper limit of the range—that
is, $1, $2, $3, $4, and so on—in cells E2:E6. The FREQUENCY() function
expects that you will use its first argument to tell it where the
individual observations are (here, they’re in B2:B11, called the data array
by Excel) and that you’ll use its second argument to tell it where to
find the boundaries of the groups (here, E2:E6, called the bins array).
So in the case of the FREQUENCY() function, the
arguments have different purposes: The data array argument contains the
range address of the values that you want to group, and the bins array
argument contains the range address of the boundaries you want to use
for the bins.
Contrast that with something such as =SUM(A1, A2,
A3), where the SUM() function expects each of its arguments to
contribute to the total. To use worksheet functions properly, you must
be aware of the purpose of each one of a function’s arguments.
Excel gives you an assist with that. When you start
to enter a function into a cell in an Excel worksheet, Excel responds
by prompting you for the remaining arguments. See Figure 2,
where the user has just begun entering the FREQUENCY() function. Excel
displays the names of the arguments in a small pop-up window.
Excel is often finicky about the order in which you
supply the arguments. In the prior example, for instance, you get a
very different (and very wrong) result if you incorrectly give the bins
array address first:
=FREQUENCY(E2:E6,B2:B11)
The order matters if the arguments serve different
purposes, as they do in the FREQUENCY() function. If they all serve the
same purpose, the order doesn’t matter. For example,
=SUM(A2:A10,B2:B10) is equivalent to =SUM(B2:B10,A2:A10) because the
only arguments to the SUM() function are its addends.
Defining Return
One final bit of terminology used in
functions: When a function calculates its result using the arguments
you have supplied, it displays the result in the cell where you entered
the function. This process is termed returning the result. For example, the AVERAGE() function returns the mean of the values you supply.