STAtistics / Mr. Hansen
4/30/2008

Name: _________________________

“Everything Test” I

 

Part I: Problems (20 pts. for #1, 30 pts. for #2)

1.

In a class of 9 students, the probability that Smokey selects any given student on any given trial should be 1/12, or about 0.0833. Because of the *TRUE VOLUNTEER* and **TRUE VOLUNTEER** entries, volunteers should be twice as likely, and Mr. Hansen is also a possibility. Suppose that we have the following data from a random set of 80 Smokey selections:

 

 

 

Student #     Observed likelihood     Expected probability
1             0.0750
2             0.0875
3             0.0875
4             0.1125
5             0.0500
6             0.0750
7             0.0750
8             0.0500
9             0.1125
Mr. Hansen    0.1000
volunteer     0.1750

 

 

 

 

 

Set up and execute a suitable statistical test to determine if there is evidence of a lack of randomness. Continue your work on the reverse side. Show adequate justification for your conclusion, including showing that you know the assumptions and have checked them. It is not necessary to show every little nitnoid detail. For example, when computing the contributions to chi-square, you may show one or two of the calculations and let your calculator do the rest.

 


 

2.

Here is part of the Minitab output from a study of 22 data points in a scatterplot of foreign vs. domestic investment returns.

 

 

 

Predictor    Coef         Stdev        t-ratio      p
Constant     4.777        5.477        *            0.393
US return    0.8130       0.2628       **           0.006

s = 20.08    R-sq = 32.4%

 

 

 

Unfortunately, the t-ratios have been obscured by a coffee spill.

 

 

(a)

Which of the t-ratios (* or **) is essentially of no interest to us?

 

 

(b)

What is the t statistic of the slope? Circle the appropriate value, or if computation is required, show how you computed the t statistic of the slope.

 

 

 

 

 

 

(c)

What is the standard error of the slope? Double-circle the appropriate value, or if computation is required, show how you computed the standard error of the slope. Note: If you cannot get this, raise your hand to purchase it, because you need it for some of the later work.

 

 

 

 

 

 

(d)

How can you tell that the value of the linear correlation coefficient (statistic) is positive?

 

 

 

 

(e)

What is the value of the linear correlation coefficient (statistic)? Triple-circle the appropriate value, or if computation is required, show how you computed it.

 

 

 

 

 

 

(f)

Compute a 95% confidence interval for the true value of the slope. You may assume that all appropriate assumptions have been met. Show work.

 

 

 

 

 

 

 

 

 

 

(g)

Interpret the slope in the context of the problem. The explanatory variable is the percentage rate of return on US investments, and the response variable is the percentage rate of return on overseas investments.

 

 

 

 

 

 

 

 

(h)

Predict the overseas return associated with a US return of 8.2%. Show a wee bit of work.

 


Solutions:

 

1.

Let p1 = true probability of selecting student 1, . . . ,
      p9 = true probability of selecting student 9,
      pMr. H = true probability of selecting Mr. Hansen,
      pvol = true probability of selecting a volunteer

H0: All true probabilities are as claimed (p1 = 1/12, . . . , pvol = 1/6)
Ha: Not all true probabilities are as claimed

Assumptions for $\chi^2$ goodness-of-fit test [must identify name of test!]:
      SRS? Yes, can treat as SRS of all possible Smokey selections.
      All exp. counts $\geq$ 5? Yes, in fact, all are 6 or greater.

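Expected counts (mult. through by n = 80):
$80 \cdot \frac{1}{12} \approx 6.667$ for each of students 1 through 9 and for Mr. Hansen; $80 \cdot \frac{1}{6} \approx 13.333$ for volunteer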

Observed counts deduced from data:
6, 7, 7, 9, 4, 6, 6, 4, 9, 8, 14

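Test statistic:
$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(6 - 6.667)^2}{6.667} + \frac{(7 - 6.667)^2}{6.667} + \cdots + \frac{(14 - 13.333)^2}{13.333} = 4.3$
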
P-value = 0.9328

Conclusion: There is no evidence ($\chi^2$ = 4.3, df = 10, P = 0.9328) that the true selection probabilities are other than the ones claimed.
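
The arithmetic in solution 1 can be checked with a short script. This is only a sketch, not part of the original key: it assumes the claimed model is 12 equally likely entries (one per student, one for Mr. Hansen, two for volunteers), and the helper name `chi2_upper_tail_even_df` is made up for illustration. It uses only the standard library, relying on a closed form for the chi-square tail that is valid when df is even.

```python
import math

# Observed counts from the solution (80 Smokey selections):
# students 1-9, then Mr. Hansen, then "volunteer".
observed = [6, 7, 7, 9, 4, 6, 6, 4, 9, 8, 14]

# Assumed claimed probabilities: 12 equally likely entries, two of them volunteer.
claimed = [1 / 12] * 10 + [2 / 12]

n = sum(observed)                       # 80 selections
expected = [n * p for p in claimed]     # expected counts under H0

# Sum of (O - E)^2 / E over all 11 categories.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1


def chi2_upper_tail_even_df(x, df):
    """P(chi-square with even df exceeds x), via the closed-form Poisson sum."""
    if df % 2 != 0:
        raise ValueError("this closed form requires an even df")
    half = x / 2
    terms = (half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * sum(terms)


p_value = chi2_upper_tail_even_df(chi_square, df)

print(f"smallest expected count = {min(expected):.3f}")
print(f"chi-square = {chi_square:.4f}, df = {df}, P = {p_value:.4f}")
```

Run as written, this should agree with the values quoted in the conclusion (chi-square of 4.3 with df = 10 and P of about 0.9328) and confirm that every expected count exceeds 5.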

 

 

 

Scoring:
     4 pts. for hypotheses
     4 pts. for assumptions and identifying name of test
     4 pts. for test statistic (–1 if there was no attempt to explain how computed)
     4 pts. for P-value
     4 pts. for conclusion in context

 

 

2.

In the original version, your all-too-human instructor accidentally forgot to specify the number of data points. Some people assumed n = 22 and df = n – 2 = 20, which was a correct guess. As a result of the mistake, scoring for part (f) will need to be extra-lenient.

 

 

(a)

* [refers to t-ratio for the intercept, which is not on the AP syllabus]

 

 

(b)

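Since $t = \frac{b_1 - \beta_1}{s_{b_1}}$ with $\beta_1 = 0$ under the null hypothesis,

algebra (or knowledge of how standardized statistics are computed) gives
$t = \frac{0.8130}{0.2628}$. Answer: 3.0936.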

 

 

(c)

$s_{b_1} = 0.2628$ (given)

 

 

(d)

b1 = 0.8130 > 0

 

 

(e)

Since $r^2 = 0.324$ and the slope is positive, $r = +\sqrt{0.324} \approx 0.5692$.
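
Parts (b) and (e) are both one-line computations from the Minitab output, and a quick sketch can confirm them. The variable names are illustrative, and the sign of r is taken from the sign of the slope, as part (d) argues.

```python
import math

b1 = 0.8130        # slope ("US return" row, Coef column)
se_b1 = 0.2628     # standard error of the slope (Stdev column)
r_squared = 0.324  # R-sq = 32.4%

# t statistic for the slope under H0: beta1 = 0.
t_slope = b1 / se_b1

# r has the same sign as the slope, so take the matching square root.
r = math.copysign(math.sqrt(r_squared), b1)

print(f"t = {t_slope:.4f}, r = {r:.4f}")
```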

 

 

(f)

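m.o.e. = $t^* \cdot s_{b_1} = (2.086)(0.2628) \approx 0.548$, using $t^*$ for df = 20

We are 95% confident that the true LSRL slope ($\beta_1$) is 0.8130 $\pm$ 0.548.

Alternate format: 95% C.I. for true LSRL slope ($\beta_1$) is (0.265, 1.361).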
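
The interval in part (f) can be reproduced as follows. This is a sketch under one stated assumption: the critical value 2.086 is typed in from a t table for df = 20 rather than computed, since the standard library has no t quantile function.

```python
b1 = 0.8130      # slope from the Minitab output
se_b1 = 0.2628   # standard error of the slope
n = 22           # number of data points
df = n - 2       # degrees of freedom for regression slope inference

# Assumption: t* taken from a t table (95% confidence, df = 20), not computed here.
t_star = 2.086

margin = t_star * se_b1
lower, upper = b1 - margin, b1 + margin

print(f"df = {df}, m.o.e. = {margin:.3f}")
print(f"95% C.I. for the slope: ({lower:.3f}, {upper:.3f})")
```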

 

 

(g)

For every percentage point by which the US investment rate of return increases, the model predicts the overseas rate of return to increase by 0.813 percentage point.

 

 

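(h)

$\hat{y} = 4.777 + 0.8130(8.2) \approx 11.44$, so the model predicts an overseas return of about 11.44%.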
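
The prediction in part (h) comes from plugging the US return into the least-squares line from the Minitab output. A minimal sketch, with a hypothetical function name chosen for illustration:

```python
b0 = 4.777   # intercept ("Constant" row, Coef column)
b1 = 0.8130  # slope ("US return" row, Coef column)


def predict_overseas_return(us_return_pct):
    """Predicted overseas return (%) for a given US return (%), per the LSRL."""
    return b0 + b1 * us_return_pct


y_hat = predict_overseas_return(8.2)
print(f"predicted overseas return = {y_hat:.2f}%")
```

With s = 20.08 and R-sq of only 32.4%, any single prediction from this line carries considerable uncertainty.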

 

 

 

Scoring:
     3 pts. ea. for all parts except (f) and (g)
     6 pts. for (f), graded leniently as explained above
     6 pts. for (g), with 2 pts. deducted if student forgot “the model predicts”