Setting Up the Worksheet for Solver
With the actual observations in A2:A11, as shown in Figure 4, continue by taking these steps:
1. Enter any number in cell G2. It is 0 in Figure 4,
but you could use 10 or 1066 or 3.1416 if you prefer. When you’re
through with these steps, you’ll find that the mean of the values in
A2:A11 has replaced the value you began with in cell G2.
2. In cell B2, enter this formula:
=$G$2
3. Copy and paste the formula in B2 into B3:B11. Because the dollar
signs in the cell address make it an absolute (fixed) reference, you
will find that each cell in B2:B11 contains the same formula. And
because the formulas point to cell G2, whatever number is there also
appears in B2:B11.
4. In cell C2, enter this formula:
=A2-B2
5. Copy and paste the formula in C2 into C3:C11. The range C2:C11 now
contains the differences between each individual observation and
whatever value you chose to put in cell G2.
6. In cell D2, enter the following formula, which uses the caret as an
exponentiation operator to return the square of the value in cell C2:
=C2^2
7. Copy and paste the formula in D2 into D3:D11. The range D2:D11 now
contains the squared differences between each individual observation
and whatever number you entered in cell G2.
8. To get the sum of the squared differences, enter this formula in cell D13:
=SUM(D2:D11)
9. Now start Solver. With cell D13 selected, click the Data tab and
locate the Analysis group. Click Solver to bring up the dialog box
shown in Figure 5.
10. You want to minimize the sum of the squared differences, so choose the Min radio button.
11. Because D13 was the active cell when you started Solver, it is the
address that appears in the Set Objective field. Click in the By
Changing Variable Cells box and then click in cell G2. This establishes
the cell whose value Solver will modify.
12. Click Solve.
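For readers who like to see the worksheet logic outside Excel, the layout built in steps 1 through 8 can be sketched in Python. This is only an illustration: the observations are hypothetical stand-ins (the Figure 4 data is not reproduced here), and the function name is invented for this sketch.

```python
# Hypothetical observations standing in for A2:A11 (not the Figure 4 data).
observations = [62, 75, 58, 71, 66, 80, 69, 73, 64, 72]

def sum_of_squared_differences(values, trial):
    """Mirror columns B:D and cell D13 for a trial value placed in G2."""
    column_b = [trial for _ in values]                     # =$G$2 copied down
    column_c = [a - b for a, b in zip(values, column_b)]   # =A2-B2 copied down
    column_d = [c ** 2 for c in column_c]                  # =C2^2 copied down
    return sum(column_d)                                   # =SUM(D2:D11)

# Try a few starting values for G2 and watch the total in D13 change.
for trial in (0, 10, 69, 1066):
    print(trial, sum_of_squared_differences(observations, trial))
```

With these made-up numbers the total is smallest when the trial value is 69, which is their mean.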
Solver now iterates through a sequence of values for
cell G2. It stops when its internal decision-making rules tell it that
it has found a minimum value for cell D13 and that testing more values
in cell G2 won’t help.
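To get a feel for what iterating toward a minimum looks like, here is a minimal sketch that uses a golden-section search. This is a swapped-in textbook technique, not Solver's own algorithm, and the data and function names are hypothetical.

```python
import math

# Hypothetical observations standing in for A2:A11 (not the Figure 4 data).
observations = [62, 75, 58, 71, 66, 80, 69, 73, 64, 72]

def objective(trial):
    """Sum of squared differences from the trial value (the role of D13)."""
    return sum((x - trial) ** 2 for x in observations)

def golden_section_minimize(f, low, high, tolerance=1e-6):
    """Shrink the interval [low, high] around the minimum of a unimodal f."""
    ratio = (math.sqrt(5) - 1) / 2  # about 0.618
    left = high - ratio * (high - low)
    right = low + ratio * (high - low)
    while high - low > tolerance:
        if f(left) < f(right):
            # The minimum lies in [low, right]; reuse left as the new right.
            high, right = right, left
            left = high - ratio * (high - low)
        else:
            # The minimum lies in [left, high]; reuse right as the new left.
            low, left = left, right
            right = low + ratio * (high - low)
    return (low + high) / 2

best = golden_section_minimize(objective, min(observations), max(observations))
mean = sum(observations) / len(observations)
print(best, mean)  # the two values should agree closely
```

The search stops when the interval is narrower than the tolerance, which plays roughly the same part as Solver deciding that testing more values won't help.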
Using the data given in Figure 4, Solver finishes with a value of 68.8 in cell G2 (see Figure 6).
Because of the way that the worksheet was set up, that’s the value that
now appears in cells B2:B11, and it’s the basis for the differences in
C2:C11 and the squared differences in D2:D11. The sum of the squared
differences in D13 is minimized, and the value in cell G2 that’s
responsible for the minimum sum of the squared differences—or, in more
typical statistical jargon, least squares—is the mean of the values in A2:A11.
Tip
If you take another look at Figure 6, you’ll see a bar at the bottom of the Excel window with the word Ready at its left. This bar is called the status bar.
You can arrange for it to display the mean of the values in selected
cells. Right-click anywhere on the status bar to display a Customize
Status Bar window. Select or deselect any of these to display or
suppress them on the status bar: Average, Count, Numeric Count,
Minimum, Maximum, and Sum. The Count statistic displays a count of all
values in the selected range; the Numeric Count displays a count of
only the numeric values in the range.
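The difference between Count and Numeric Count is easy to see with a small sketch. The selection below is hypothetical, and treating `None` as an empty cell is an assumption made only for this illustration.

```python
# Hypothetical selection: three numbers, one text entry, one empty cell (None).
selection = [72, "n/a", 65, None, 80]

# Count: every cell that holds a value of any kind.
non_empty = [v for v in selection if v is not None]

# Numeric Count: only the cells that hold numbers.
numeric = [v for v in non_empty
           if isinstance(v, (int, float)) and not isinstance(v, bool)]

summary = {
    "Average": sum(numeric) / len(numeric),
    "Count": len(non_empty),
    "Numeric Count": len(numeric),
    "Minimum": min(numeric),
    "Maximum": max(numeric),
    "Sum": sum(numeric),
}
for name, value in summary.items():
    print(name, value)
```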
A few comments on this demonstration:
It works with any set of real numbers, and
any size set. Supply some numbers, total their squared differences from
some other number, and then tell Solver to minimize that sum. The
result will always be the mean of the original set.
This is a demonstration, not a proof. The proof that the squared
differences from the mean sum to a smaller total than the squared
differences from any other number is not complex, and it can be found
in a variety of sources.
This discussion uses the terms differences and squared differences. You’ll find that it’s more common in statistical analysis to speak and write in terms of deviations and squared deviations.
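The proof mentioned in the comments above is short enough to sketch. The notation is introduced here for illustration and does not appear elsewhere in this discussion: the observations are $x_1, \dots, x_n$, their mean is $\bar{x}$, and $c$ is any other candidate value (the number in cell G2).

```latex
\begin{aligned}
\sum_{i=1}^{n} (x_i - c)^2
  &= \sum_{i=1}^{n} \bigl[(x_i - \bar{x}) + (\bar{x} - c)\bigr]^2 \\
  &= \sum_{i=1}^{n} (x_i - \bar{x})^2
     + 2(\bar{x} - c)\sum_{i=1}^{n} (x_i - \bar{x})
     + n(\bar{x} - c)^2 \\
  &= \sum_{i=1}^{n} (x_i - \bar{x})^2 + n(\bar{x} - c)^2 .
\end{aligned}
```

The middle term drops out because deviations from the mean always sum to zero. The remaining term $n(\bar{x} - c)^2$ is never negative and equals zero only when $c = \bar{x}$, so the sum of squared deviations is smallest when they are taken from the mean.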
This has to be the most roundabout way of
calculating a mean ever devised. The AVERAGE() function, for example,
is far simpler. But the exercise using Solver in this section is
important for two reasons:
Understanding other concepts, including
correlation, regression, and the general linear model, will come much
more easily if you have a good feel for the relationship between the
mean of a set of scores and the concept of minimizing squared deviations.
If you have not yet used Excel’s Solver, you have now had a glimpse of
it, although in the context of a problem solved much more quickly using
other tools.
I have used a very simple statistical function,
AVERAGE(), as a context to discuss some basics of functions and
formulas in Excel. These basics apply to all of Excel’s mathematical
and statistical functions, and to many functions in other categories as
well.