AP Statistics / Mr. Hansen
Test #8 (due by 3:30 p.m. on 4/28/1999—see note below)

Name: _________________________________

 

Total time required to complete this test (for information only; will not affect grade): ______

Note: The due date and time for this test are not affected by Diversity Day activities on Wednesday, April 28. Even though F period will not be meeting that day, this test is due no later than 3:30 p.m. on April 28 for everyone. Please deliver the test to me in person or slide it under my office door (look for the door with my name on it in Steuart 007). Do not leave your test in my mailbox.

General Instructions: Treat each problem as if it were an AP exam free-response question. In other words, write all of the necessary parts, show all of the necessary assumptions/conditions, show work (even if your calculator can do it in an instant!), and write a conclusion if needed. Do not write calculator notation. For formulas, you must follow the standard 3-step process ("Show me the formula, show me the plugged-in formula, show me the result"). Mark an "X" through any portion that you wish to be ignored during grading. If you have any question about what is expected, please ask. Collaboration with other STAtistics students is permitted if you document the names of the people with whom you worked.

1.

Consider a standard 52-card deck that is thoroughly shuffled between draws.

(a)

What is the probability of drawing a face card (defined here as jack, queen, or king) on a single draw?

(b)

What is the probability that the first face card appears on the fourth draw?

(c)

What is the probability that the first face card appears before the fourth draw?

(d)

What is the probability that the first face card appears on or after the fourth draw?

(e)

Compute the expected number of draws needed in order to obtain a face card. You may use the shortcut derived in class on Friday, April 23, provided that you document the required conditions.

(f)

What is the expected number of draws needed in order to obtain a heart? Is this greater than or less than your answer to part (e)?

(g)

Which has the greater probability in a single random draw: drawing a face card, or drawing a heart? In light of this, does your answer to part (f) make sense?

(h)

If the deck were not thoroughly shuffled between draws, explain why a geometric probability distribution would not be an appropriate model for the number of draws required.

2.

Let p be the probability of success and q the probability of failure for any one trial in a binomial setting.

(a)

Prove that for any positive integer n, the probability that the first success occurs on or after the nth trial is $q^{n-1}$.

Hint: There are various ways to prove this. One method (not necessarily the easiest) is to mimic the algebraic approach we took on Friday, April 23, when we proved that any geometric random variable X satisfies the amazing formula $\mu_X = 1/p$. If you take the algebraic approach, you may need to borrow an Algebra II or Precalculus textbook from an underclassman to help with one of the sums involved.
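Illustration (a simulation is not a proof): before attempting the proof, it can help to see the claim hold numerically. The sketch below simulates waiting times for the first success and compares the observed fraction of runs in which the first success falls on or after trial n with $q^{n-1}$. The values p = 0.3 and n = 4, the seed, and the function names are arbitrary assumptions chosen only for this illustration; they do not come from any problem on this test.

```python
import random


def first_success_trial(p, rng):
    """Return the trial number on which the first success occurs."""
    trial = 1
    # rng.random() < p counts as a success; keep going while we fail.
    while rng.random() >= p:
        trial += 1
    return trial


def estimate_tail(p, n, reps=100_000, seed=1):
    """Estimate P(first success occurs on or after trial n) by simulation."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(first_success_trial(p, rng) >= n for _ in range(reps))
    return hits / reps


# Hypothetical values, chosen only for illustration.
p_example = 0.3
n_example = 4

estimate = estimate_tail(p_example, n_example)
claimed = (1 - p_example) ** (n_example - 1)  # q^(n-1)

print(f"simulated: {estimate:.4f}   q^(n-1): {claimed:.4f}")
```

Close agreement between the two numbers is consistent with the formula but does not establish it; the proof is still required.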

(b)

Use the result of part (a), even if you couldn’t prove it, to double-check your answer to #1(d).

(c)

Suppose that April Fools (i.e., people born in the month of April) constitute 8.5% of the U.S. population and are randomly dispersed in the country. We will select an SRS of 1200 Americans and give each subject a sequential ID number (0001 through 1200). As we contemplate performing this process and scanning down the list from 0001 to 1200, what is the probability that the first April Fool on the list will have an ID number of 0018 or greater?

(d)

Reread the portion before part (a). Explain why the SRS described in part (c) violates the assumptions of the model.

(e)

Explain why it is legitimate to use the formula in part (a) to solve part (c), even though part (c) violates the assumptions of the model.

3.

Consider a game in which a freshman will roll a fair die. (This is not an STA freshman, I hasten to add, since promoting any sort of gambling would be wrong. This is an imaginary freshman from an imaginary school on an imaginary planet.) The freshman will roll the die repeatedly until a 2 appears. Let X denote the number of rolls required in order to obtain the first roll of 2.

(a)

Explain why the geometric probability distribution is an appropriate model for the distribution of X.

(b)

Using the same ground rules as in #1(e), compute $\mu_X$.

(c)

Show, using any appropriate method, that the median value for X is less than 6.

(d)

If you were to bet the freshman (even money) that the first roll of 2 would occur in fewer than 6 rolls, would this be a fair bet? (An "even money" bet means that each of you risks the same amount. In other words, neither of you is pledging to pay "odds" of 2-to-1 or whatever.)
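Illustration of the terminology: a bet is "fair" when the expected net winnings are zero. The sketch below computes expected net winnings for a generic two-outcome bet. The function name and the example probabilities are hypothetical (a fair coin flip, not the die game above), so this shows only the bookkeeping, not the answer to any part of this problem.

```python
def expected_net_winnings(p_win, amount_won, amount_lost):
    """Expected net gain for a bet that pays amount_won with probability
    p_win and costs amount_lost with probability 1 - p_win."""
    return p_win * amount_won - (1 - p_win) * amount_lost


# Hypothetical example: even-money bet of one dollar on a fair coin flip.
even_money = expected_net_winnings(0.5, 1.0, 1.0)

# Same coin, but now the opponent risks two dollars against your one.
uneven = expected_net_winnings(0.5, 2.0, 1.0)

print(even_money)  # zero expected gain: a fair bet
print(uneven)      # positive expected gain: the bet favors you
```

With even money the two amounts are equal, so the bet is fair only when the win probability is exactly one half.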

(e)

Explain why your answer to (d) is consistent with (b) and (c).

(f)

Let Y be your net winnings when playing this game against the freshman using a one-dollar wager. Compute $\mu_Y$ (or, if you believe $\mu_Y = 0$, explain why that is so).

(g)

What odds would you have to offer in order to make this a fair game? (If it already is a fair game, the answer is 1-to-1.)

4.

On Tuesday, April 13, you received a number of sample test questions. There was one set dated 3/18/98 and another dated 4/16/98, and both sets were stapled together (total of 4 sheets of paper). The answer key for most of these questions is posted on our class Web page (go to the "Test #7" heading and click on "Answer key for sample test problems").

In your hard-copy handout (not the Web page), look at the junk food table and Minitab printout that cover the front and back of the last sheet, and answer the following questions:

(a)

How many grams of carbohydrate would we expect to be associated with a fast food item containing 22 g of protein?

(b)

Is there a causal link between grams of protein and carbohydrate? (In other words, does adding protein to a food cause its carbohydrate content to increase?)

(c)

Does the residual plot reveal any reason to question the appropriateness of a linear fit model?

(d)

Which foods are regression outliers? Which foods are influential observations? Speculate briefly on the real-world meaning of what you have just said and write two or three additional sentences.

(e)

The printout tells you that SST for carbohydrate is 9103.8. (This is the sum of the squares of the deviations between the carbohydrate values and their mean, i.e., the total SS reflecting contributions both from the regression line and from the residuals.) It is an algebraic fact that SST for carbohydrate equals $\sum (y_i - \bar{y})^2$. Even though the printout does not report SST for protein (since protein is an explanatory variable and it doesn’t make much sense here to do ANOVA for an explanatory variable), you could compute the corresponding SST for protein, i.e., $\sum (x_i - \bar{x})^2$, using 1-Var Stats and some list manipulations. Please do this.
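Illustration of the computation: the sketch below mirrors the list manipulation described above (subtract the mean from each value, square, and add) and checks it against the equivalent shortcut $(n-1)s^2$, where s is the sample standard deviation that 1-Var Stats reports. The five data values and the function name are made up for this illustration; they are not the protein values from the junk food table.

```python
from statistics import mean, stdev


def sum_of_squared_deviations(values):
    """Return the sum of (value - mean)^2 over the list."""
    center = mean(values)
    return sum((v - center) ** 2 for v in values)


# Hypothetical data, NOT taken from the handout.
example_values = [10.0, 12.0, 15.0, 20.0, 23.0]

# Direct "list manipulation" approach.
sst_direct = sum_of_squared_deviations(example_values)

# Shortcut using the sample standard deviation s: SST = (n - 1) * s^2.
n = len(example_values)
sst_shortcut = (n - 1) * stdev(example_values) ** 2

print(sst_direct, sst_shortcut)
```

The two results agree up to rounding, so either route may be used as a check on the other.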

(f)

Using both your answer to part (e) and values that appear on the Minitab printout, but not any list manipulations, compute the standard error of the slope. Use proper notation for the standard error of the slope, and circle the values from the printout that you are using.

(g)

Does the standard error of the slope appear anywhere in the printout? If so, what is it labeled? If not, is there a way that you can quickly calculate it (i.e., more easily than in part (f)) by using only values that do appear on the printout? Explain.

(h)

Give a 95% confidence interval for the true slope of the regression line.

(i)

Assess the statistical significance of the regression. Is there evidence that protein is positively associated with carbohydrate?