The "Must-Pass" Quiz

STAtistics / Mr. Hansen
5/5/2003 [rev. 3/4/04, 4/19/05, 4/15/06, 5/10/07, 4/21/08, 4/16/10, 5/8/21]

Name: _________________________

(Doubles as a review for the AP examination. A partial answer key and tote board are available.)

	Instructions: Please learn how to answer each of the following questions in your own words. “Canned” answers will not earn full credit. A random sampling of questions will be used. Everyone must pass this quiz on or before the last day of classes. I know what you’re thinking: “Mr. Hansen, if I fail the first time, may I take the quiz again without penalty?” The answer is yes, but please try to do well on your first try, because the quiz becomes one question longer with each additional attempt. You will be given an SRS of questions: 10 on the first try, 11 on the second try, and so on. Starred questions (*) indicate the “20/20 set”: that is, 20 things I would hope you still remember when I see you in 20 years! You are allowed to miss up to 2 questions without penalty, except for the starred questions, which cannot be missed. (Missing a starred question is an instant scratch.)
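The drawing procedure described above can be sketched in a few lines of Python. This is only an illustration, not the actual method used to generate quizzes: the names quiz_length, draw_quiz, and is_scratch are hypothetical, and the starred set is simply copied from the asterisks on this sheet.

```python
import random

NUM_QUESTIONS = 100  # questions 1-100 on this sheet

# The "20/20 set": question numbers marked with an asterisk on this sheet.
STARRED = frozenset({1, 2, 3, 15, 36, 39, 43, 49, 51, 57,
                     58, 81, 82, 87, 88, 89, 90, 93, 94, 97})


def quiz_length(attempt):
    """10 questions on the first try, 11 on the second, and so on."""
    if attempt < 1:
        raise ValueError("attempt numbers start at 1")
    return 9 + attempt


def draw_quiz(attempt, rng=random):
    """Draw an SRS (sampling without replacement) of question numbers."""
    population = range(1, NUM_QUESTIONS + 1)
    return sorted(rng.sample(population, quiz_length(attempt)))


def is_scratch(missed):
    """True if any missed question is starred (an instant scratch)."""
    return any(q in STARRED for q in missed)
```

For example, draw_quiz(2) returns 11 distinct question numbers, and is_scratch([4, 43]) is True because question 43 is starred.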

1.*	What is a statistic? Give several examples.

2.*	What is a parameter? Give several examples.

3.*	What alternate meaning does the word parameter have in other mathematical disciplines?

4.	What are the parameters of a uniform distribution? a normal distribution? a binomial distribution? a geometric distribution? a t distribution? a χ² distribution?

5.	Describe how to recognize uniform, normal, binomial, geometric, t, and χ² distributions.

6.	Define range and describe how to find it.

7.	Define IQR and describe how to find it.

8.	Describe how to find outliers
(a)	in a column of data;

(b)	in a regression setting.

9.	In regression, what names are given to the x and y variables?

10.	What does MSE mean? Is it a synonym for variance?

11.	What does s.d. measure, and how is it computed?

12.	What special geometric meaning does s.d. have in a normal distribution?

13.	What is skewness? Give two examples of different ways to detect skewness.

14.	How does one recognize lack of normality?

15.*	What is the most common type of regression?

16.	Which is usually of greater interest, the LSRL slope or the LSRL y-intercept? Why?

17.	What name do we give to r? What does r mean? How do we compute r?

18.	What name do we give to r²? What does r² mean?

19.	Is r affected by choice of units (e.g., mm, cm, inches, feet, light-years)? How about a and b?

20.	Is r affected by choice of which variable is x and which is y? How about a and b?

21.	How do we typically compute a and b? What other ways are there?

22.	Describe a few interesting properties of the LSRL.

23.	What is a residual? How does one make a residual plot? If a residual plot for a LSRL model has residuals on the y axis, what variable goes on the x axis?

24.	Give several examples of “good” and “bad” residual plots and what they should be telling us.

25.	Tell whether the following regression-related terms are synonyms: ____________ outlier and ____________ observation. If not, why not?

26.	Interpret a and b for a layperson.

27.	What do the letters r.v. mean? Give two examples, one that is ____________ and another that is ____________ .

28.	If X is a(n) ____________ , then μ_X is calculated by ____________ and is known by two names: ____________ or ____________ .

29.	If X is a(n) ____________ , then ____________ is calculated as probability-weighted MSE and is indicated by either of two possible notations: ____________ or ____________. The ____________ ____________ of ____________ equals s.d., denoted ____________ .

30.	The mean of a ____________ equals the ____________ of the ____________ . Is this always true? What about for differences?

31.	The variance of a ____________ equals the ____________ of the ____________ . Is this always true? What about for differences?

32.	The s.d. of a ____________ multiple of X equals the ____________ times ____________ . Is this always true?

33.	Describe how each of the following is affected by linear transformations: r, mean, s.d., IQR, range.

34.	What is the purpose of a z score? Under what circumstances may one compute a z score? Describe how to compute it and what it means.

35.	In probability theory, a Venn diagram showing no overlap indicates that two ____________ are ____________ ____________ . Is this term a synonym for ____________ ? If not, explain the difference.

36.*	Why do we care about probability? Is it merely of interest to casinos and misguided people who waste their money on state lotteries?

37.	Explain what a ____________ distribution is. Give three examples, using the three test statistics that we care most about in AP Statistics.

38.	What is the estimated s.d. of a statistic called? What is its abbreviation?

39.*	What does LOLN stand for? State it correctly and in one of the many ways in which people misconstrue it.

40.	What does CLT stand for? State it correctly and in one of the many ways in which people misconstrue it.

41.	In experiments, probability arises at the end in the form of a ____________ computed from the ____________ statistic. Describe the three ______ __ ___ ________ _______ and briefly describe how you would implement them when designing an experiment of possible interest to you personally.

42-51	In your own words, define each of the following and describe how it is determined or computed.

42.	test statistic

43.*	P-value

44.	level

45.	P(Type I error)

46.	P(Type II error)

47.	power

48.	df

49.*	sampling error

50.	critical value

51.*	m.o.e.

52.	Explain the difference between confidence level and confidence interval.

53.	Which is usually preferred: a one-tailed test or a two-tailed test? When should the decision be made regarding the type of test? What is the relevant question to consider in determining whether to use a one-tailed or two-tailed test?

54.	Why is it usually a very bad idea to use the word probability in any sentence involving confidence intervals? Is it possible to make a true statement that combines these terms?

55.	Can H₀ ever be proved? Why or why not?

56.	Can H_a ever be proved? Why or why not?

57.*	What is meant by statistical significance?

58.*	The purpose of ____________ statistics is to ___ ____________ ___ ____________ ____________ . (This is a much more difficult and sophisticated skill than descriptive statistics, in which we assume that any reasonably intelligent person should be able to read a table or a graph, compute s.d., add a LSRL trend line, etc. Be sure you explain this to people if they pooh-pooh your having spent a year studying statistics. There is much more to the subject than learning about means, modes, and medians!)

59.	Describe each step in the PHA(S)TPC process.

60.	Explain what blocking is, what it does, and why we care. What is “blocking to the max” called? Finally, complete this analogy: BLOCKING : EXPERIMENTS : : ___________ : SURVEYS.

61.	The AP formula sheet gives two versions of the s.e. for a 2-prop. z situation (difference of ____________). Explain how to tell which one to use.

62.	True or false: If there are two columns of data in an experiment, then the situation calls for use of 2-sample procedures. Explain your answer.

63.	Define the term bias and give several examples of types of bias.

64.	It can be proved, after a page or so of messy algebra, that s² is an unbiased estimator of σ². (Curiously, though, s is not an unbiased estimator of σ.) Describe the two other unbiased estimators we learned about during the year.

65.	Describe your thought process when deciding upon the type of statistical test (or interval) to use in various problems: 1-sample t, 2-prop. z, g.o.f., etc.

66-74	Describe how you would check assumptions in each of the following situations:

66.	1-sample z (STAT TESTS 1, 7)

67.	1-sample t (STAT TESTS 2, 8)

68.	2-sample z (STAT TESTS 3, 9)

69.	2-sample t (STAT TESTS 4, 0)

70.	1-prop. z (STAT TESTS 5, A)

71.	2-prop. z (STAT TESTS 6, B)

72.	g.o.f. (CSDELUXE)

73.	2-way (CSDELUXE or STAT TESTS C)

74.	LSRL t-test (STAT TESTS E)

75.	Give the “approved wording” for a conclusion to a statistical test that shows significance.

76.	Give the “approved wording” for a conclusion to a statistical test that does not show significance.

77.	Give the “approved wording” for a conclusion to a confidence interval problem.

78.	Describe how to transform an “interval format” C.I. into an “estimate ± m.o.e.” format.

79.	Describe, in general terms, how the t statistic is calculated.

80.	Describe how to use the result of #79 to get a formula for the s.e. of b that is much simpler than the one given on the AP formula sheet.

81.*	Data from a small sample, from a person’s own experience, or from a ____________ sample should usually be dismissed on the grounds that they are ____________ . However, data from large samples (for example, responses to on-line surveys or magazine subscriber surveys) are also often worthless. Why?

82.*	Does the m.o.e. of a statistic depend on the size of the population? Explain briefly, giving an example if possible.

83.	Is the binomial parameter p the same as the P-value of a test? What symbol is commonly used as an equivalent for 1 – p? Would the AP graders understand this without further explanation?

84.	What do the letters SRS stand for, and what is an SRS?

85.	Which assumption is more important, normality (if applicable) or the assumption that data come from an SRS? Why?

86.	Explain marginal and conditional probabilities. With what data (quantitative or categorical) are marginal and conditional probabilities usually computed?

87.*	What is meant by the saying, “Statistical significance is not the same as practical significance”?

88.*	There is a popular saying involving correlation (more generally, association) and causation. What is the saying, and what does it mean?

89.*	How does one prove causation?

90.*	Explain what is meant by double blinding, and why it is so important in clinical trials.

91.	There are four types of employees at XYZ Corp., whom we will call pitchers, catchers, infielders, and outfielders for lack of a more creative idea. All categories of employees have recently had large cuts in their mean salaries, and yet total payroll costs have risen. Is such a thing possible? Explain.

92.	There are four types of employees at XYZ Corp., whom we will call pitchers, catchers, infielders, and outfielders for lack of a more creative idea. All categories of employees have recently had large cuts in their mean salaries, and yet the overall mean salary per employee has risen. Is such a thing possible? Explain.

93.*	Give several examples of ways in which people lie with statistics.

94.*	Give several examples of questions you should always ask when hearing or reading a statistic for the first time.

95.	It has been said that 79.4% of all statistics are made up on the spot, that 5 out of every 3 Americans are weak at mathematics, that smoking is the leading cause of statistics, and that a statistician is someone who follows an unwarranted assumption to a foregone conclusion. Which of these flippant remarks is most unfair?

96.	Who coined the saying, “There are three kinds of lies: lies, d_____d lies, and statistics”?

97.*	Explain how odds work. In particular, given a probability P(A) expressed as a fraction, explain how to compute the odds in favor of the event as well as the odds against the event. Explain why “casino odds” never equal the mathematical odds.

98.	Explain the following paradox: For a gambler to return from a casino as a winner is not rare, yet casinos are reliably profitable.

99.	Explain what is meant by confounding, and give an example from your own life. (This is sometimes referred to as a lurking-variable situation.)

100.	Is poker a game of chance?

Happy Fact	You may delete Chebyshev’s Theorem from your brain. You will never see it again unless you study more advanced statistics.