STAtistics / Mr. Hansen
2/13/2013

Name: _________________________
Bonus (for Mr. Hansen’s use only): ________

Test through Chapter 9 (Calculator Required)

 

Rules

  • You may not write calculator notation anywhere unless you cross it out. For example, normalcdf(A,B,C,D) is not allowed; write a sketch with shaded area instead.
  • Adequate justification and notation are required unless otherwise stated.
  • All final answers in free-response portions should be circled or boxed.

1.

The following cornstalk heights (in meters) were measured from an SRS of Farmer Brown’s summer 2012 crop just before harvest:

2.65, 2.7, 2.9, 2.95, 2.65, 2.6, 2.7, 2.8, 2.8, 2.85, 2.78, 2.83, 2.82, 2.8, 2.81, 2.75

(a)

Write some words (approximately a sentence and a half) to finish the following thought in a way that shows that you really know what you are talking about. The true mean height of Farmer Brown’s cornstalks at harvest time in 2012 is . . .

 

 

 

______________________________________________________________________

 

 

 

______________________________________________________________________

 

 

(b)

Give a point estimate for the number described in part (a). Be sure to use correct notation.

 

 

 

 

(c)

Is it reasonable to assume that the population from which Farmer Brown’s SRS was drawn is normal? ____ Explain briefly.

 

 

 

______________________________________________________________________

 

 

(d)

For the sample as shown above, find the t-critical value associated with a 90% confidence interval for the true mean height of Farmer Brown’s cornstalks. Be sure to use correct notation.

 

 

 

 

(e)

Compute the margin of error for a 90% confidence interval for the true mean height of Farmer Brown’s cornstalks. Show your work.


(f)

Fill in the blanks (no work required). We are 90% confident that the true mean height of Farmer Brown’s cornstalks at harvest time 2012 was between ____________ and ____________ . We can rewrite this interval in the “estimate ± m.o.e.” format as ____________ ± ____________ . Units for all values are ____________ .

 

 

(g)

Approximately how large a sample would be needed in order to reduce the m.o.e. to 2 cm? This question is super-difficult and is generally not asked on the AP exam, because the sample size feeds back into the t-critical value used in the computation. Ugh! That is why you will be allowed to make any sort of wild estimate you wish, so long as it is reasonable. Answer: ____________ cornstalks.
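
For the curious, the feedback loop described in part (g) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the inputs (s = 0.10 m, target m.o.e. 0.05 m) are made up and are not Farmer Brown’s data, and the helper names t_pdf, t_cdf, t_critical, z_first_guess, and sample_size_for_moe are hypothetical, written just for this sketch. The t-critical value is found numerically (Simpson’s rule plus bisection) so that only the standard library is needed.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=400):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(confidence, df):
    """Value t* leaving central area `confidence` between -t* and t*."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 100.0
    for _ in range(40):  # bisection on the CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def z_first_guess(s, target_moe, confidence=0.90):
    """Sample size if we pretend t* equals z* (ignores the feedback)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z_star * s / target_moe) ** 2)


def sample_size_for_moe(s, target_moe, confidence=0.90):
    """Smallest n >= 2 with t*(n - 1) * s / sqrt(n) <= target_moe."""
    n = 2
    while t_critical(confidence, n - 1) * s / math.sqrt(n) > target_moe:
        n += 1  # t* shrinks as n grows, so recompute it each time
    return n


# Made-up inputs for illustration only (not the cornstalk data).
print(z_first_guess(0.10, 0.05))        # z-based first guess
print(sample_size_for_moe(0.10, 0.05))  # t-based answer
```

With these made-up inputs the z-based first guess is 11, while the search that recomputes t* at every step settles on 13. The gap is the “feedback” that makes the question awkward by hand. The sketch also treats s as fixed, which is itself an assumption, since a new sample would produce a new s.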


 

2.

Fill in the blanks.
The second semester of our course is primarily concerned with ________________ statistics (hint: starts with an I), and our activities/problems include estimating ________________ , computing ________________ intervals, writing null and alternative ________________ , and determining ________________ significance or lack thereof. Regarding the latter, significance is often deemed to occur when the ________________-value of a statistical test is below ________________ . However, lower values are sometimes required (for example, in a ________________  of law or in any situation where the H0 being refuted is well established or widely accepted). It goes without saying that any experiment designed to show that the speed of light is greater than the commonly accepted value would require an exceedingly ________________ value of P in order to get published. Our mechanism for computing P is to look at the tail probability in an imaginary distribution called the ________________ distribution, specifically the distribution of values for the ________________ of interest GIVEN THAT ________________ IS ASSUMED TO BE TRUE. If the statistic we compute from our study is far out in the tail (i.e., has a low ________________-value), then we conclude that the ________________ hypothesis is exceedingly unlikely to explain our observed value, and we reject the ________________ hypothesis in favor of a one- or ________________-tailed ________________ hypothesis. (The choice of which type of test to use depends on the wording of the original research question. If we think that the true ________________ value is greater than the hypothesized value, we would use a ________________-tailed test; if less than the hypothesized value, a ________________-tailed test. If we think the true value is simply different from the hypothesized value, with no direction specified, we would use a ________________-tailed test.)

 

There are 3 main categories of ________________ distributions that we study in AP Statistics: the ________________ model (used for tests or C.I.’s involving ________________ or differences of ________________), the ________________ model (used for tests or C.I.’s involving ________________ or differences of ________________), and the ________________ model, which we have not studied yet. The latter is for statistical significance tests involving 2 or more (usually 3 or more) ________________ .

 

We studied random ________________ before we studied ________________ distributions for a very logical reason, namely this: A ________________ computed from a random sample really is a random ________________ . Thus everything we learned about computing probabilities with r.v.’s carries over directly into the world of P-values.

 

When computing P-values, it is important to remember that the hypotheses should really be made ________________ the data are gathered. P-values computed ________________ the data are gathered are subject to the data-mining fallacy, a.k.a. the Texas ________________ fallacy. After all, if we test a vast pool of data for statistically significant correlations at the 0.05 significance level, then about ________________ % of the pairings will show “significance” even if ________________ alone is the only force at work. Thus we must exercise great caution in sifting through data and pronouncing a pattern to be “significant.” We can hunt for patterns, yes, but when we think we have found one, we should make some hypotheses and run a ________________ experiment to see if we can find strong evidence to refute H0. If we find strong evidence to refute H0, and if our finding is interesting to the scientific community, we probably have something we can _______________ in a scientific journal. If we fail to find evidence to refute H0, can we conclude that H0 is true? _______________

 

Homoscedasticity and heteroscedasticity refer to assumptions of “equal variances” and “unequal variances,” respectively, when dealing with 2-sample situations. Homoscedasticity (equal variance) means that we “pool” the data to come up with a single shared value for the _______________ _______________ of the sampling distribution. Do we ever do this when working with proportions? _______________ If so, briefly describe a situation where we would do this:

___________________________________________________________________________
Do we ever make the homoscedasticity (equal variance) assumption when working with means?
_______________ If so, briefly describe a situation where we would do this:

___________________________________________________________________________

 

 

 

 

3.

We learned 3 rules of thumb in a certain context. One of the rules was that

State the other 2 rules of thumb and the purpose of the rules (i.e., the rules are to determine when such-and-such is a valid approximation for the such-and-such).

 

Rule #2: __________________________________

Rule #3: __________________________________

Purpose of the rules of thumb: to determine . . .

____________________________________________________________________

____________________________________________________________________

 

 

4.

Suppose that The Independent wishes to determine what percentage of the STA community (defined as the collection of current students, current parents, current grandparents, current faculty and staff, emeritus faculty and staff, alumni, and parents of alumni) agrees with the following statement: “St. Albans is one of the 20 quirkiest places on the East Coast.” A pilot study reveals that roughly half of the people polled informally seem to agree with the statement.

 

 

(a)

Let p̂ denote the sample proportion of those from the community who agree with the statement. What is the true sampling distribution of sample proportions, if people are selected at random, with replacement, from the STA community? Is it z, t, or something else? Explain briefly.

 

 

 

 

 

 

 

 

 

 

(b)

Suppose that we gather an SRS of 250 people from the STA community. Are the rules of thumb met? Check and verify all three of the rules of thumb. Show your work. Don’t just say, “Yeah, whatevs, they all check out.”

 

 

 

 

 

 

 

 

 

 

 

 

 

 


 

(c)

How large a sample would be needed to guarantee that the m.o.e. (at 95% confidence) for the true proportion of people who agree with the statement will be less than 4 percentage points? Show work.


(d)

Compute a 95% confidence interval for p, given that 138 of the 250 people polled agree with the statement. Work is optional. You read correctly! It’s optional. IMPORTANT: Write your answer as a complete sentence in the context of this problem.