Solutions to Geometric Distribution Problems (Spring 1999)

AP Statistics / Mr. Hansen

Solutions to problems 2 and 3 from spring 1999 test on geometric distributions

2.(a)	P(X ≥ n) = P(first n – 1 trials are all failures) = q^{n – 1}.
	There are harder ways to do this, of course. Here’s one that only a math nerd could love: P(X ≥ n) = 1 – P(X ≤ n – 1) = 1 – (p + qp + q²p + q³p + . . . + q^{n – 2}p), where the sum in parentheses consists of the first n – 1 terms of a geometric series with first term p and common ratio q. In Algebra II, you learned that the sum of the first k terms of a geometric sequence with first term a and common ratio r is given by the formula a(1 – r^k)/(1 – r). Letting k = n – 1, a = p, and r = q, we get 1 – (p + qp + q²p + q³p + . . . + q^{n – 2}p) = 1 – a(1 – r^k)/(1 – r) = 1 – p(1 – q^{n – 1})/(1 – q) = 1 – p(1 – q^{n – 1})/p = 1 – (1 – q^{n – 1}) = q^{n – 1}.
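As a quick numerical check of part (a), the following sketch compares the short formula with the term-by-term series. The function names and the value q = 5/6 are chosen here purely for illustration and are not part of the test.

```python
def tail_by_formula(q: float, n: int) -> float:
    """P(X >= n) as q^(n-1): the first n-1 trials are all failures."""
    return q ** (n - 1)


def tail_by_series(q: float, n: int) -> float:
    """1 - P(X <= n-1), adding p + qp + ... + q^(n-2)p one term at a time."""
    p = 1.0 - q
    return 1.0 - sum(q ** k * p for k in range(n - 1))


q = 5 / 6  # illustrative value only (an assumption, not from the problem)
for n in (1, 4, 10):
    # The two columns should agree up to floating-point rounding.
    print(n, tail_by_formula(q, n), tail_by_series(q, n))
```

Both methods should agree for any q strictly between 0 and 1, which is all the algebra above claims.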
2.(b)	By part (a), P(X ≥ 4) = q^{n – 1} = (40/52)^{4 – 1} ≈ 0.455. This agrees with the answer we obtained for #1(d) in class on Monday, 2/14/00.
2.(c)	Let X denote the number of trials needed to find an April Fool. By part (a), P(X ≥ 18) = q^{n – 1} = (1 – 0.085)^{18 – 1} ≈ 0.221.
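The arithmetic in parts (b) and (c) can be reproduced in a few lines. This is only a sketch: the helper name geometric_tail is made up here, and the values of q are the ones used in the solutions above.

```python
def geometric_tail(q: float, n: int) -> float:
    """P(X >= n) = q^(n-1), from part (a)."""
    return q ** (n - 1)


part_b = geometric_tail(40 / 52, 4)      # q = 40/52, n = 4
part_c = geometric_tail(1 - 0.085, 18)   # q = 1 - 0.085, n = 18

print(round(part_b, 3))  # 0.455
print(round(part_c, 3))  # 0.221
```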
2.(d)	Background (not part of what you would write in your solution): The so-called "binomial setting" (i.e., only 2 possible outcomes, constant p, and independent trials) applies to both binomial distributions and geometric distributions. The only difference is the random variable of interest, often denoted X. For binomial distributions, X = # of successes in n trials. For geometric distributions, X = # of trials needed for first success.
	Solution writeup: An SRS constitutes sampling without replacement and therefore does not give independent trials.
2.(e)	Because the U.S. population is large compared to the 1200-person SRS, the SRS is an acceptable approximation of independent trials.
3.(a)	✓ Only 2 possible outcomes (2, or not a 2)
	✓ Constant p (namely, 1/6)
	✓ Independent trials
	✓ Variable of interest (X) is # of trials needed to obtain first success
	∴ X follows a geometric distribution.
3.(b)	If you did part (a) to justify using geometric distribution formulas, μ_X = 1/p = 6. If you didn’t get part (a), you could still compute μ_X by the "pixi" method, as follows: μ_X = 1(P(X = 1)) + 2(P(X = 2)) + 3(P(X = 3)) + . . . = 1(1/6) + 2(5/6 · 1/6) + 3(5/6 · 5/6 · 1/6) + . . . = 1/6 + 10/36 + 75/216 + 500/1296 + 3125/7776 + . . . = 6. Unfortunately, it takes 50 or so terms before the result gets really close to 6, so you’d have to indicate that you had used your calculator to do this. (Note: If you’re not up to speed with the seq (sequence) function, don’t even bother; just use the 1/p formula and hope for the best.)
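For readers without the calculator's seq function handy, here is a rough Python equivalent of the partial-sum approach in part (b). The function name partial_mean and the particular cutoffs are illustrative assumptions.

```python
p = 1 / 6      # probability of rolling a 2
q = 1 - p      # probability of any other face


def partial_mean(terms: int) -> float:
    """Sum of k * P(X = k) for k = 1..terms, with P(X = k) = q^(k-1) * p."""
    return sum(k * p * q ** (k - 1) for k in range(1, terms + 1))


# Watch the partial sums creep toward 1/p = 6 as more terms are included.
for terms in (5, 20, 50, 100):
    print(terms, round(partial_mean(terms), 4))
```

The five terms written out above add up to only about 1.58, which shows why so many terms are needed before the sum looks like 6.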
3.(c)	Method 1: Since X follows a geometric distribution, P(X ≥ 6) = 1 – P(X < 6) = 1 – 0.598 = 0.402. Since less than half the area is at or above X = 6, the median must lie to the left of 6.
	Method 2: Since X follows a geometric distribution, P(X < 6) = P(X ≤ 5) = 0.598 (by calc.). Since more than half the area is below X = 6, the median must lie to the left of 6.
	Method 3 (most elegant): All geometric distributions are skew right.* Therefore, the median lies to the left of the mean, which is 6 by part (b).
	* Explanation (optional): With p and q both strictly between 0 and 1, the sequence of probabilities {p, qp, q²p, q³p, . . .} is decreasing. Therefore, all geometric distributions are skew right.
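The 0.598 figure quoted "by calc." can be checked directly, and the same loop locates the median. This sketch relies on P(X ≤ n) = 1 – q^n, which follows from part 2(a); the names cdf, below_six, and median are chosen here for illustration.

```python
from itertools import count

q = 5 / 6  # probability of not rolling a 2


def cdf(n: int) -> float:
    """P(X <= n) = 1 - P(X >= n + 1) = 1 - q^n, using part 2(a)."""
    return 1 - q ** n


below_six = cdf(5)  # P(X < 6) = P(X <= 5)

# Smallest n whose cumulative probability reaches one half.
median = next(n for n in count(1) if cdf(n) >= 0.5)

print(round(below_six, 3))  # 0.598
print(median)               # 4
```

A median of 4 is consistent with all three methods: it lies to the left of the mean of 6.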
3.(d)	No, since P(X < 6) = 0.598 (by calc.). If you are betting the freshman that the first 2 will occur in fewer than 6 rolls, you will win almost 60% of the time.
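If the "almost 60%" claim seems hard to believe, a simulation may help. The sketch below treats a roll of 2 as the success, as in part 3(a); the function names, the seed, and the number of trials are arbitrary choices made for illustration.

```python
import random


def rolls_until_first_two(rng: random.Random) -> int:
    """Roll a fair die until a 2 appears; return how many rolls it took."""
    rolls = 1
    while rng.randint(1, 6) != 2:
        rolls += 1
    return rolls


def fraction_under_six(trials: int, seed: int = 0) -> float:
    """Fraction of simulated games in which the first 2 comes in fewer than 6 rolls."""
    rng = random.Random(seed)
    wins = sum(rolls_until_first_two(rng) < 6 for _ in range(trials))
    return wins / trials


# Should land close to the theoretical value of about 0.598.
print(fraction_under_six(100_000))
```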
3.(e)	The expected occurrence (mean) of the first successful roll is 6, but this is a weighted average that counts large values more heavily. For example, it is very unlikely that 20 rolls would be needed to obtain the first success, but that probability has a weighting factor of 20 that pulls the mean to the right. (Or, you could appeal to right skewness of the distribution.) The distribution of X "balances" at 6 based on this weighted average. This fact does not contradict the conclusion of part (c) that more than half the area lies below 6.
3.(f)	Clearly μ_Y > 0 since we have an advantage over the freshman. Our 2 possible outcomes are to win $1 (almost 60% likely) or to lose $1. Applying the "pixi" rule, we have μ_Y = 0.598($1) + 0.402(–$1) = $0.196.
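The expected winnings in part (f) can be recomputed without rounding the probabilities first. This is only a sketch; the variable names are invented here.

```python
p_win = 1 - (5 / 6) ** 5   # P(X < 6), about 0.598
p_lose = 1 - p_win         # P(X >= 6), about 0.402

# Weighted average of the two payoffs: +$1 for a win, -$1 for a loss.
expected_winnings = p_win * 1 + p_lose * (-1)

print(round(expected_winnings, 3))  # 0.196
```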
3.(g)	Assume that we offer the freshman odds of "k to 1." In other words, if the freshman wins, we will pay him k dollars (k > 1) to compensate for the lopsidedness of the game. If we win, however, the freshman needs to pay only $1. For this to be a fair game, μ_Y = 0.598($1) + 0.402(–$k) must equal 0. Solving for k gives k = 0.598/0.402 ≈ 1.488. Answer: We must offer odds of 1.488 to 1 to make this a fair game.
	Note: The average person on the street may find the outcome of this problem baffling. While most people can imagine that the "expected" wait time for a success is 6 rolls, few have the sophistication to realize that this does not mean that the first success occurs before or after roll #6 with equal probability. Some of you are perhaps already imagining a barroom bet in which you lead your victim to buy into the notion that success is equally likely before or after roll #6. You flip a coin to see who gets the "before 6" and who gets the "after 6"; using an unfair coin or a hustling strategy, you make sure that you have the "before 6" to yourself when the wager is large. You either ignore the case of "exactly 6" or generously give it away to your victim. I strenuously discourage you from performing any gambling or deception of this type. I offer this problem only as an example of how a good knowledge of probability and statistics may someday prevent you from being swindled. After all, you will be a freshman yourself next year, n’est-ce pas?
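Returning to the fair-odds calculation in part (g), the value of k can be checked by solving the fairness equation and then plugging k back in. The names below are illustrative assumptions, not part of the original solution.

```python
p_win = 1 - (5 / 6) ** 5   # chance we win $1, about 0.598
p_lose = 1 - p_win         # chance we pay out $k, about 0.402

# Fair game: p_win * 1 + p_lose * (-k) = 0, so k = p_win / p_lose.
fair_k = p_win / p_lose

# Expected winnings when paying out fair_k dollars on a loss (should be 0).
expected_at_fair_k = p_win * 1 + p_lose * (-fair_k)

print(round(fair_k, 3))  # 1.488
```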