AP Statistics / Mr. Hansen
10/20/2004 [rev. 11/9/2004]

Name: _________________________

Partial Answer Key to Test Excerpt

Note: This answer key is provided to help you check a few of the answers from the October 2000 test. Some of the details are omitted here. Remember that you must show full work for full credit. In general, full work consists of the formula, plug-ins, and answer (circled, with correct units such as dollars or years).

16.

. . . the difference is too large to be plausibly explained by chance alone.

 

 

17.

The fallacy is “post hoc, ergo propter hoc.” The mere fact that crime rate reductions followed the change in office is no proof that Sheriff Jimmy was the cause. There could be a lurking variable, such as an improved economy, an aging population of young males (who commit most of the crimes), or a change in the methodology used for gathering and reporting crime statistics.

 

 

18.

The sum of residuals for any LSRL equals 0. Here, there is one medium-sized positive residual, which is no match for the four medium-sized negative residuals. The residual plot as shown has a sum of residuals that is negative, not 0.
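The zero-sum property can be checked numerically. Below is a minimal sketch using made-up data (any data set with an intercept-included LSRL would illustrate the same point):

```python
import numpy as np

# Made-up data -- any (x, y) pairs would illustrate the point
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Least-squares regression line (LSRL): yhat = a + b*x
b, a = np.polyfit(x, y, 1)          # polyfit returns the slope first
residuals = y - (a + b * x)

# For any LSRL that includes an intercept, the residuals sum to 0
print(abs(residuals.sum()) < 1e-10)  # True (0 up to floating-point error)
```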

 

 

19.(a)

explanatory: speed (mph)
response: wake horsepower (hp)

 

 

(b)

Linear, exponential, and power fits all give good r values. (Desirable: Show a scatterplot with all three overlaid.) However, the residual plots for the linear and exponential fits show a greater lack of randomness than the power fit does. Although the power fit produces the best r value of the lot, that fact by itself is not sufficient to prove that the power fit is superior. You must look at the residual plot. [Graphs are omitted here to save space. However, both a scatterplot and one or more residual plots are required.] The chosen model is yhat = .000000011001365(x^6.968158443).
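A power fit of the form yhat = a·x^b is found by regressing log y on log x. A minimal sketch with made-up data (the actual wake data are not reproduced here), assuming an exact power law so the constants are recovered cleanly:

```python
import numpy as np

# Made-up data following an exact power law y = 2 * x^3
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x**3

# Regress log y on log x; the slope is the power b, the intercept is log a
b, log_a = np.polyfit(np.log(x), np.log(y), 1)
a = np.exp(log_a)

print(round(b, 6), round(a, 6))  # recovers b = 3, a = 2
```

With real data you would also plot the residuals of this transformed fit to check for randomness, as the answer above requires.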

 

 

(c)

yhat = .000000011001365(x^6.968158443)
           = .000000011001365(30^6.968158443)
           = 215.904 hp
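The plug-in can be verified directly, using the coefficients as reported above:

```python
# Power model from part (b); coefficients copied from the key
a = 0.000000011001365
b = 6.968158443

yhat = a * 30**b        # prediction at x = 30 mph
print(round(yhat, 3))   # approximately 215.904 hp
```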

 

 

(d)

To say that there is a 7th-degree fit could mean a 7th-degree polynomial with arbitrary coefficients. However, we have no realistic way of computing such a thing. Instead, let us assume that y ≈ f(x), where f is a simple function of degree 7, essentially some slight modification of the function that raises a number to the 7th power. That means that f^–1 is essentially a 7th-root function, and we can obtain a straight line by composing f^–1 with f.

(This is exactly the same sort of thing we did when performing exponential regression. We began by taking the logarithm of both sides of the equation y ≈ ab^x, since logarithms are the inverse of exponentials.)

Note that the inverse of raising to the 7th power is to apply a 7th-root function. We will ignore the constants for now, since the LSRL will compute constants for us that ensure a good fit. We have

y ≈ f(x)
f^–1(y) ≈ f^–1(f(x)) = some linear function of x
y^(1/7) ≈ some linear function of x
yhat^(1/7) = a + bx, where a and b are determined by LSRL fit
yhat^(1/7) = .0054198625 + .0716784499x, with r = .999996994 and a random resid. plot (show it)
yhat = (.0054198625 + .0716784499x)^7

This is certainly a 7th-degree function, though not a power function. The fit is extremely close, with a suitably random residual plot (show it) and no residual larger than about .002 in absolute value.
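The transform-then-fit procedure can be sketched as follows, using synthetic data generated from a known 7th-degree model (the actual test data are not reproduced here):

```python
import numpy as np

# Synthetic data from a known model y = (0.005 + 0.07*x)^7
x = np.array([10.0, 15.0, 20.0, 25.0, 30.0])
y = (0.005 + 0.07 * x)**7

# Straighten the data with a 7th-root transform, then fit the LSRL
b, a = np.polyfit(x, y**(1/7), 1)

# The fit recovers the constants of the underlying model
print(round(a, 6), round(b, 6))  # approximately 0.005 and 0.07

# Back-transform to obtain the 7th-degree model: yhat = (a + b*x)^7
yhat = (a + b * x)**7
```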

Comment 1: The r value reported above, namely r = .999996994, is for the linear fit between x and f^–1(y), not for a fit between x and y directly. This is reminiscent of the r value reported in an exponential regression, which signifies the strength of the correlation between x and log y, not between x and y.

Comment 2: For what it’s worth, if we repeat part (c) with our new model, we obtain
yhat = (.0054198625 + .0716784499x)^7
           = (.0054198625 + .0716784499 · 30)^7
           = 216.382 hp, a slightly different result
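This plug-in can likewise be verified directly from the reported constants:

```python
# 7th-degree model from part (d); coefficients copied from the key
a = 0.0054198625
b = 0.0716784499

yhat = (a + b * 30)**7   # prediction at x = 30 mph
print(round(yhat, 3))    # 216.382 hp
```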