ilar to the normal (Z) distribution, with t being slightly more spread out.

The equation for the t-distribution is:

$t = \dfrac{b}{s / \sqrt{\sum_i x_i^2}}$   (2-6)

where the denominator is the standard error of b, commonly denoted as sb (the standard error of a is sa).

Since β is unobservable, we have to make an assumption about it in order to calculate a t-distribution for it. The usual procedure is to test for the probability that, regardless of the regression's estimate of β (which is our b), the true β is really zero. In statistics, this is known as the "null hypothesis." The magnitude of the t-statistic is indicative of our ability to reject the null hypothesis for an individual variable in the regression equation. When we reject the null hypothesis, we are saying that our regression estimate of β is statistically significant.

We can construct 95% confidence intervals around our estimate, b, of the unknown β. This means that we are 95% sure the correct value of β is in the interval described in equation (2-7).

$b \pm t_{0.025}\, s_b$   (2-7)

Formula for the 95% confidence interval for the slope
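As a minimal sketch of equations (2-6) and (2-7), the snippet below computes the t-statistic and the 95% confidence interval from a slope estimate and its standard error. The values of b and sb here are hypothetical illustration numbers, not the chapter's regression output; the critical value 2.306 is t0.025 for 8 degrees of freedom, taken from Table 2-3.

```python
# Sketch of equations (2-6) and (2-7).
# b and s_b are hypothetical illustration values.
b = 0.80        # regression's estimate of the unknown slope beta
s_b = 0.015     # standard error of b, reported by the regression
t_crit = 2.306  # t_{0.025} for 8 degrees of freedom (Table 2-3)

t_stat = b / s_b                 # equation (2-6): t-statistic for the slope
ci_low = b - t_crit * s_b        # equation (2-7): lower bound of the 95% CI
ci_high = b + t_crit * s_b       # equation (2-7): upper bound of the 95% CI

print(f"t = {t_stat:.2f}, 95% CI = ({ci_low:.5f}, {ci_high:.5f})")
```

A t-statistic this far from zero lets us reject the null hypothesis that the true β is zero.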

Figure 2-2 shows a graph of the confidence interval. The graph is a t-distribution, with its center at b, our regression estimate of β. The markings on the x-axis are the number of standard errors below or above b. As mentioned before, we denote the standard error of b as sb. The lower boundary of the 95% confidence interval is b - t0.025 sb, and the upper boundary of the 95% confidence interval is b + t0.025 sb. The

CHAPTER 2 Using Regression Analysis 33


F I G U R E 2-1

Z-Distribution vs. t-Distribution

[Figure: probability density curves (0 to 0.4) for the Z- and t-distributions, plotted from -6 to 6 on the x-axis. For Z, the x-axis measures standard deviations from the mean; for t, standard errors from the mean. The t curve is slightly more spread out than the Z curve.]

F I G U R E 2-2

t-Distribution of β around the Estimate b

[Figure: t-distribution density centered at β = b, with β measured in standard errors away from b. The 2.5% tail areas lie below β = b - t0.025 sb on the left and above β = b + t0.025 sb on the right.]


area under the curve for any given interval is the probability that β will be in that interval.

The t-distribution values are found in standard tables in most statistics books. It is very important to use the 0.025 probability column in the tables for a 95% confidence interval, not the 0.05 column. The 0.025 column tells us that for the given degrees of freedom there is a 2½% probability that the true and unobservable β is higher than the upper end of the 95% confidence interval and a 2½% probability that the true and unobservable β is lower than the lower end of the 95% confidence interval (see Figure 2-2). The degrees of freedom is equal to n - k - 1, where n is the number of observations and k is the number of independent variables.

Table 2-3 is an excerpt from a t-distribution table. We use the 0.025 column for a 95% confidence interval. To select the appropriate row in the table, we need to know the number of degrees of freedom. Assuming n = 10 observations and k = one independent variable, there are eight degrees of freedom (10 - 1 - 1). The t-statistic in Table 2-3 is 2.306 (C7). That means that we must go 2.306 standard errors below and above our regression estimate to achieve a 95% confidence interval for β. The regression itself will provide us with the standard error of β. As n, the number of observations, goes to infinity, the t-distribution becomes a Z-distribution. When n is large (over 100), the t-distribution is very close to a standardized normal distribution. You can see this in Table 2-3 in that the standard errors in Row 9 are very close to those in Row 10, the latter of which is equivalent to a standardized normal distribution.
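The degrees-of-freedom arithmetic and the table lookup can be sketched as follows, hard-coding the 0.025 column of the Table 2-3 excerpt:

```python
# t_{0.025} critical values keyed by degrees of freedom, copied from the
# 0.025 column of the Table 2-3 excerpt. float("inf") plays the role of
# the table's "Infinity" row, which equals the Z-distribution value.
T_CRIT_025 = {3: 3.182, 8: 2.306, 12: 2.179, 120: 1.980, float("inf"): 1.960}

n, k = 10, 1        # 10 observations, one independent variable
df = n - k - 1      # degrees of freedom = 8
t_crit = T_CRIT_025[df]   # 2.306, as in cell C7

print(df, t_crit)
```

Note how close the df = 120 row (1.980) already is to the infinity row (1.960), illustrating the convergence of t to Z.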

The t-statistics for our regression in Table 2-1B are 3.82 (D33) and 56.94 (D34). The P-value, also known as the probability (or prob) value, represents the level at which we can reject the null hypothesis. One minus the P-value is the level of statistical significance of the y-intercept and independent variable(s). The P-values of 0.005 (E33) and 10⁻¹¹ (E34) mean that the y-intercept and slope coefficients are significant at the 99.5% and 99.9% levels, respectively, which means we are 99.5% sure that the true y-intercept is not zero and 99.9% sure that the true slope is not zero.10
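For large samples, a two-sided P-value can be approximated from the normal (Z) distribution using only the standard library, since math.erf gives the standard normal CDF. This is only a sketch, not the chapter's method: at just 8 degrees of freedom the exact t-based P-value (such as the 0.005 above) is larger than this normal approximation suggests.

```python
import math

def two_sided_p_normal(t_stat):
    # Standard normal CDF via math.erf, then the two-sided tail area.
    # Good approximation to the t-based P-value only when n is large.
    phi = 0.5 * (1.0 + math.erf(abs(t_stat) / math.sqrt(2.0)))
    return 2.0 * (1.0 - phi)

p_intercept = two_sided_p_normal(3.82)   # small: reject the null hypothesis
p_slope = two_sided_p_normal(56.94)      # effectively zero
```

Either way, both t-statistics are so large that the null hypothesis of a zero coefficient is rejected decisively.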

T A B L E 2-3

Abbreviated Table of t-Statistics

          A          B        C        D
 4   Selected t-Statistics
 5   d.f.\Pr.     0.050    0.025    0.010
 6       3        2.353    3.182    4.541
 7       8        1.860    2.306    2.896
 8      12        1.782    2.179    2.681
 9     120        1.658    1.980    2.358
10   Infinity     1.645    1.960    2.326

10. For spreadsheets that do not provide P-values, another way of calculating the statistical significance is to look up the t-statistics in a Student's t-distribution table and find the level of statistical significance that corresponds to the t-statistic obtained in the regression.

PART 1 Forecasting Cash Flows


The F test is another method of testing the null hypothesis. In multivariable regressions, the F-statistic measures whether the independent variables as a group explain a statistically significant portion of the variation in Y.
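As a sketch, the F-statistic for a regression can be computed from R-squared using the standard identity F = (R²/k) / ((1 - R²)/(n - k - 1)). The R² value below is hypothetical, not taken from Table 2-1B.

```python
def f_statistic(r_squared, n, k):
    """F-statistic for the joint significance of k independent variables
    in a regression with n observations, via the standard R-squared
    identity: F = (R^2 / k) / ((1 - R^2) / (n - k - 1))."""
    return (r_squared / k) / ((1.0 - r_squared) / (n - k - 1))

f = f_statistic(0.95, n=10, k=1)   # hypothetical R^2 of 0.95
print(f)
```

A large F leads us to reject the null hypothesis that all slope coefficients are jointly zero.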

We interpret the confidence intervals as follows: there is a 95% probability that true fixed costs (the y-intercept) fall between $22,496 (F33) and $91,045 (G33); similarly, there is a 95% probability that the true variable cost (the slope coefficient) falls between $0.77 (F34) and $0.84 (G34).

The denominator of equation (2-6) is called the standard error of b, or sb. The standard error of the Y-estimate, which is defined as

$s = \sqrt{\dfrac{1}{n-2} \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}$

is $16,014 (B23). The larger the amount of scatter of the points around the regression line, the greater the standard error.11
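The definition above can be sketched directly: sum the squared residuals and divide by n - 2 before taking the square root. The residuals below are hypothetical illustration values; the chapter's actual regression reports s = $16,014 (B23).

```python
import math

# Hypothetical residuals (Y_i - Y_hat_i), in dollars. More scatter around
# the regression line means larger residuals and a larger standard error.
residuals = [12000, -8000, 15000, -20000, 5000,
             -11000, 18000, -6000, 9000, -14000]

n = len(residuals)
# Standard error of the Y-estimate: sqrt of (sum of squared residuals / (n - 2)).
s = math.sqrt(sum(r * r for r in residuals) / (n - 2))
print(round(s))
```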

Precise Confidence Intervals12

Earlier in the chapter, we estimated 95% confidence intervals by subtracting and adding two standard errors of the y-estimate around the regression estimate. In this section, we demonstrate how to calculate precise 95% confidence intervals around the regression estimate using the equations:

$t_{0.025}\, s\, \sqrt{\dfrac{1}{n} + \dfrac{x_o^2}{\sum_i x_i^2}}$   (2-8)

95% confidence interval for the mean forecast

$t_{0.025}\, s\, \sqrt{1 + \dfrac{1}{n} + \dfrac{x_o^2}{\sum_i x_i^2}}$   (2-9)

95% confidence interval for a specific year's forecast

In the context of forecasting adjusted costs as a function of sales, equation (2-8) is the formula for the 95% confidence interval for the mean adjusted cost, while equation (2-9) is the 95% confidence interval for the costs in a particular year. We will explain what that means at the end of this section, after we present some material that illustrates this in Table 2-1B, page 2.
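A sketch of equations (2-8) and (2-9), computing the half-widths of the two intervals: s and t0.025 are the chapter's values (B23 and Table 2-3), while the squared deviations x_o² and Σx_i² are hypothetical illustration numbers.

```python
import math

s = 16014.0        # standard error of the Y-estimate (B23)
t_crit = 2.306     # t_{0.025}, 8 degrees of freedom (Table 2-3)
n = 10             # number of observations
x_o_sq = 4.0e10    # hypothetical: squared deviation of the forecast x from the mean
sum_x_sq = 2.0e11  # hypothetical: sum of squared deviations of the observations

# Equation (2-8): half-width of the 95% CI for the mean forecast.
half_mean = t_crit * s * math.sqrt(1.0 / n + x_o_sq / sum_x_sq)
# Equation (2-9): half-width for a specific year's forecast; the extra 1
# under the radical makes this interval wider.
half_year = t_crit * s * math.sqrt(1.0 + 1.0 / n + x_o_sq / sum_x_sq)
```

The single-year interval (2-9) is always wider than the mean-forecast interval (2-8), reflecting the additional uncertainty in any one year's costs.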

Note that these confidence intervals are different from those in equation (2-7), which was around the forecast slope only, i.e., b. In this section,

11. This standard error of the Y-estimate applies to the mean of our estimate of costs, i.e., the average error if we estimate adjusted costs and expenses many times. This is appropriate in valuation, as a valuation is a forecast of net income and/or cash flows for an infinite number of years. The standard error (and hence the 95% confidence interval) for a single year's costs is higher.

12. This section is optional, as the material is somewhat advanced, and it is not necessary to understand this in order to be able to use regression analysis in business valuation. Nevertheless, it will enhance your understanding should you choose to read it.


we are calculating confidence intervals around the entire regression forecast.

The first 15 rows of Table 2-1B, page 2, are identical to the first page and require no explanation. The $989,032 in B16 is the average of the 10 years of sales in B6–B15.

Column D is the deviation of each observation from the mean, which is the sales in Column B minus the mean sales in B16. For example, D6