AP Statistics / Mr. Hansen
4/30/2002 [rev. 4/17/2009, 4/27/2019]

Name: ________________________

Study Guide for Formula Sheet Quiz

Objective: Learn how to mark up your formula sheet, transforming it from a confusing lump of paper into a useful tool that will help you on the AP examination.

On the quiz, as on the real exam, you cannot use written notes. You really need to know this material cold, because if you have to punch buttons on your calculator to remind yourself of how things work, you will be too slow.


There are 29 formulas and one table modification, plus an additional cluster of 3 formulas that you should insert after the conditional probability formula. This handout shows exactly what you need to know. Write the “short title” and perform the “other action” related to each formula. Information from the “additional knowledge” column is also worth knowing.

Page numbers shown below (1, 2, 3, and 6) refer to the first, second, third, and sixth pages of the AP Statistics formula sheet. Those page numbers are marked as 12, 13, 14, and 17 at
https://secure-media.collegeboard.org/digitalServices/pdf/ap/ap-statistics-course-description.pdf, which is the official AP course description published by the College Board. However, if you print that PDF file from your Web browser, you will need to add 4 to each page number when using the File Print command. For example, if you wish to print page 14, you must enter 18 in the “Pages” field of the File Print dialog.

Your Barron’s review book, near the end of the book, also has the same formulas in the same format.

Page and Formula #

Short Title

Other Action

Additional Knowledge

p. 1 #1

sample mean

X out.

Surely you have known since you were a child how to compute a sample mean.

p. 1 #2

sample s.d.

X out.

Instead, use STAT CALC 1 to get s.

We divide by n – 1 instead of n because we want E(s²) to equal σ², and algebraically, dividing by n – 1 is the only way to do it. (If you are a glutton for punishment, you can actually prove this.) Curiously, E(s) does not equal σ.
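If you would rather see the n – 1 claim than prove it, here is a small Python sketch (not a calculator procedure). It assumes a made-up population of three equally likely values and lists every possible sample of size 2, so the expected values are exact averages rather than simulations.

```python
from itertools import product
from math import sqrt

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def sample_variance(sample):
    """s squared: sum of squared deviations divided by n - 1."""
    xbar = mean(sample)
    return sum((x - xbar) ** 2 for x in sample) / (len(sample) - 1)

# Hypothetical population: the values 1, 2, 3, each equally likely.
population = [1, 2, 3]
mu = mean(population)
sigma_squared = mean((x - mu) ** 2 for x in population)   # population variance

# Every possible ordered sample of size 2, drawn with replacement (9 in all).
samples = list(product(population, repeat=2))

expected_s_squared = mean(sample_variance(s) for s in samples)
expected_s = mean(sqrt(sample_variance(s)) for s in samples)

print(sigma_squared, expected_s_squared)   # these two agree
print(sqrt(sigma_squared), expected_s)     # these two do not
```

The second line of output is the "curious" part: averaging s over all samples gives a value smaller than the population s.d.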

p. 1 #3

pooled estimator of sample s.d.

X out.

In a pooled 2-sample t test, which we never use, one would use sp instead of σ in the formula for s.e. (p. 3 #4). Ignore both of these formulas.

p. 1 #4

LSRL: ŷ as predictor of y

Circle lightly.

Not useful by itself, but serves as “gateway formula” (term invented by former student Matt) for the LSRL formulas that follow.

p. 1 #5

b1 = LSRL slope

X out.

p. 1 #6

b0 = LSRL y-intercept

Circle it.

Remember: LSRL passes through (x̄, ȳ).

p. 1 #7

linear correlation coefficient

X out.

Use STAT CALC 8 instead. Make sure your 2nd CATALOG Diagnostic is always set to “On.”

Remember: r² (the coefficient of determination) equals the portion of variation in y that can be explained by variation in x. For example, the correlation between SAT scores and first-year college GPA is approx. r = 0.42, which means that about 17.6% of the variation in the GPA of college freshmen can be explained by the variation in their SAT scores. (Not a very strong predictor, is it?) Alternate wording: 17.6% of the variation in GPA is explained by the linear relationship between SAT scores and GPA.

p. 1 #8

LSRL slope again

Circle it.

You must know that sx and sy are the s.d.’s of the x and y values considered separately. (Use STAT CALC 2.)
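To see how the two circled LSRL formulas (p. 1 #6 and #8) fit together, here is a Python sketch that builds the slope from r, sx, and sy, then gets the intercept from the fact that the line passes through the point of means. The five data points are invented for illustration only.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    """Sample s.d. (divides by n - 1), i.e. the sx or sy your calculator reports."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    """Linear correlation coefficient r, built from z-scores."""
    xbar, ybar, sx, sy = mean(xs), mean(ys), sample_sd(xs), sample_sd(ys)
    n = len(xs)
    return sum(((x - xbar) / sx) * ((y - ybar) / sy) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data, chosen only so the arithmetic is easy to follow.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
b1 = r * sample_sd(ys) / sample_sd(xs)   # p. 1 #8: slope
b0 = mean(ys) - b1 * mean(xs)            # p. 1 #6: y-intercept

print(round(b1, 4), round(b0, 4), round(r ** 2, 4))
```

Plugging the mean of x into the fitted line returns the mean of y, which is the "passes through the point of means" reminder in action.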

p. 1 #9

s.e. of the LSRL slope

X out and replace.

A much better formula is s_b1 = b1 / t. In English, this says that s_b1 (the standard error of the slope) equals the slope divided by the LSRL t statistic.

Note: If you know any two quantities from among s_b1, b1, and t, use s_b1 = b1 / t to solve for the third.

Do not confuse s_b1 with s or se, which your textbook uses to mean (approximately) the s.d. of the residuals. You do not need to know s or se for the AP exam; you need only know s_b1, and you would use s_b1 = b1 / t to get it, not the messy formula shown on p. 1 of the formula sheet.
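The "know two, solve for the third" idea can be sketched in Python as three one-line rearrangements of the same relationship (standard error of slope = slope / t). This assumes the usual LSRL t test, in which H0 asserts a slope of 0. The function names and the numbers are hypothetical.

```python
def se_of_slope(b1, t):
    """Standard error of the LSRL slope: slope divided by t."""
    return b1 / t

def t_from_slope(b1, se):
    """Rearranged: t = slope / standard error."""
    return b1 / se

def slope_from_t(se, t):
    """Rearranged: slope = t * standard error."""
    return t * se

# Hypothetical regression output: slope 0.6 with t = 2.4.
se = se_of_slope(0.6, 2.4)
print(round(se, 4))
print(round(t_from_slope(0.6, se), 4), round(slope_from_t(se, 2.4), 4))
```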

p. 2 #1

General Union Rule—always true

Circle and draw a Venn diagram.

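Here is a good example:

[Venn diagram: overlapping events A and B. Regions: A only = .1, A and B = .2, B only = .3, outside both = .4.]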

In this example, P(A) = .3, P(B) = .5, P(A ∩ B) = .2, and P(A ∪ B) = .3 + .5 – .2 = .6. You can see that the formula works and makes sense.

The formula works even if the events are disjoint (i.e., with no overlap):

[Venn diagram: disjoint events C and D in the same universe.]

If C and D are in the same universe and are disjoint as shown, the sum of their probabilities cannot exceed 1. P(C or D) = P(C ∪ D) = .9 by inspection and by formula.
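Both union examples can be checked with a few lines of Python. This is only a sketch: the function name is made up, and the individual values P(C) = .4 and P(D) = .5 are assumptions chosen to match the stated total of .9.

```python
from fractions import Fraction

def union(p_a, p_b, p_both):
    """General Union Rule (p. 2 #1): P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both

tenth = Fraction(1, 10)   # exact tenths avoid floating-point round-off

# Overlapping events A and B from the first example.
p_a_or_b = union(3 * tenth, 5 * tenth, 2 * tenth)

# Disjoint events: P(C) = .4 and P(D) = .5 are assumed values; P(C and D) = 0.
p_c_or_d = union(4 * tenth, 5 * tenth, 0)

print(float(p_a_or_b), float(p_c_or_d))   # 0.6 0.9
```

For disjoint events the subtracted term is 0, so the general rule reduces to simple addition.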

p. 2 #2

conditional probability formula: always true if P(B) ≠ 0

Circle it.

In the first example above (with events A and B), do you see that P(A | B) = .2/.5 = .4 and P(B | A) = .2/.3 ≈ .667? You can compute these with common sense, but you should also verify that the conditional probability formula works, since there are times when you will need it.

p. 2 #2½ (insert these)

ways of checking for independence

Insert 3 checks.

Write the following:

A, B (non-null) are independent
          iff P(A ∩ B) = P(A) P(B)
          iff P(A | B) = P(A)
          iff P(B | A) = P(B)

Explanation: Two non-null events A and B are independent iff the probability that both occur equals the product of the probabilities, as shown above in mathematical notation. The other two checks, which are equivalent, come from substituting the first equation into the conditional probability formula.

Note that independence is not satisfied in either of the diagrams above. In the first, P(A ∩ B) = .2, not .15, and in the second, P(C ∩ D) = 0, not .2. There is no quick way to “see” independence in a Venn diagram. Do not confuse independence with disjointness.

One of the most common student errors is to try to find P(A ∩ B) by multiplying P(A) with P(B). Students often “learn” this false formula in 7th or 8th grade, when textbook examples are all easy. The formula P(A ∩ B) = P(A) P(B) is true only if A and B are independent events. The formula is not true in general. When in doubt, use the general intersection rule. The general intersection rule (which can be easily derived from p. 2 #2) says

          P(A ∩ B) = P(A) · P(B | A)
and     P(A ∩ B) = P(B) · P(A | B).

Below is a Venn diagram that satisfies independence, but you can’t “see” independence merely by glancing. Do not confuse independence with disjointness. The independence in the diagram below is a property of how the numbers work out:

[Venn diagram: overlapping events A and B. Regions: A only = .4, A and B = .4, B only = .1, outside both = .1.]

You should verify that P(A | B) = .4/(.4+.1) = .8 = P(A), and that P(B | A) = .4/(.4+.4) = .5 = P(B). Independence means that P(A) is not affected by whether B occurs or not, and P(B) is not affected by whether A occurs or not. The unconditional probability of A is .8 (since .8 is the total probability shown within ring A), which is the same as the conditional probability of A given B. The unconditional probability of B is .5 (as you can clearly see within ring B), which is the same as the conditional probability of B given A.
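The three independence checks can be run on the diagrams' numbers with a short Python sketch. The helper names are made up for illustration, and exact fractions are used so that the equality test is not spoiled by round-off.

```python
from fractions import Fraction

def conditional(p_both, p_given):
    """P(A | B) = P(A and B) / P(B); undefined when P(B) = 0."""
    if p_given == 0:
        raise ValueError("cannot condition on an event with probability 0")
    return p_both / p_given

def is_independent(p_a, p_b, p_both):
    """First check: P(A and B) = P(A) P(B)."""
    return p_both == p_a * p_b

tenth = Fraction(1, 10)

# First diagram: P(A) = .3, P(B) = .5, P(A and B) = .2.
first = is_independent(3 * tenth, 5 * tenth, 2 * tenth)

# Last diagram: P(A) = .8, P(B) = .5, P(A and B) = .4.
last = is_independent(8 * tenth, 5 * tenth, 4 * tenth)
p_a_given_b = conditional(4 * tenth, 5 * tenth)
p_b_given_a = conditional(4 * tenth, 8 * tenth)

print(first, last, float(p_a_given_b), float(p_b_given_a))   # False True 0.8 0.5
```

In the last diagram the conditional probabilities match the unconditional ones, so all three checks agree.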

p. 2 #3

expected value (mean) of r.v. X

Circle it.

This is what we called the “sum of the pixies.” AP writes each term as x_i·p_i instead of p_i·x_i, that’s all.

p. 2 #4

variance of r.v. X

Circle lightly.

Note that the notation σ²_X (with capital X), which we used in class, is more logically correct than σ²_x (with lowercase x), since we are talking about the variance of random variable X, not of a single x value. Both notations are correct. If you prefer, simply write var(X) as shown in the formula.

The formula itself is useful only if you are asked to show the computation of var(X) manually, which is exceedingly unlikely. You should normally use your TI-83 or TI-84 with 1-Var Stats instead. If you have forgotten how, here is a quick refresher course:

1. Enter payoff values (xi) in L1.
2. Enter probabilities (pi) in L2.
3. Punch STAT CALC 1 L1,L2 ENTER.

Since the frequencies are not integers, your calculator correctly infers that this is a probability distribution of a random variable. When you do STAT CALC 1, your calculator computes σx and leaves sx blank, since there is no sample involved.
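If you are ever asked to show the computation by hand, the arithmetic behind those calculator steps looks like the Python sketch below. The payoff table is hypothetical, and the two lists play the roles of L1 and L2.

```python
from math import sqrt

def mean_of_rv(values, probs):
    """p. 2 #3: sum of x_i * p_i."""
    return sum(x * p for x, p in zip(values, probs))

def variance_of_rv(values, probs):
    """p. 2 #4: sum of (x_i - mean)^2 * p_i."""
    mu = mean_of_rv(values, probs)
    return sum((x - mu) ** 2 * p for x, p in zip(values, probs))

# Hypothetical payoff table (the same numbers you would type into L1 and L2).
payoffs = [0, 1, 5]
probs = [0.5, 0.3, 0.2]

mu = mean_of_rv(payoffs, probs)
var = variance_of_rv(payoffs, probs)
print(round(mu, 4), round(var, 4), round(sqrt(var), 4))
```

The square root of the variance is the population-style s.d., since a probability distribution involves no sample.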

p. 2 #5

binompdf(n,p,k)

Circle it.

You may need this when showing work (and remember, you can’t write “binompdf” since that is considered illegal calculator notation). A binomcdf (cumulative probability) calculation will contain several terms of this type when you show work.
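For practice at showing work, here is a Python sketch of the binomial probability formula and of a cumulative probability written as a sum of such terms. The function names are made up; the argument order simply mirrors binompdf(n,p,k).

```python
from math import comb

def binom_pdf(n, p, k):
    """P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_cdf(n, p, k):
    """P(X <= k): add the pdf terms for 0, 1, ..., k."""
    return sum(binom_pdf(n, p, j) for j in range(k + 1))

# Hypothetical example: 5 fair coin flips.
print(binom_pdf(5, 0.5, 2))   # P(exactly 2 heads)
print(binom_cdf(5, 0.5, 2))   # P(at most 2 heads) = three terms added together
```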

p. 2 #6

expected value of binomial r.v.

Circle it.

Common sense: multiply # of trials times probability of success on each trial. We expect an 80% free-throw shooter to hit 29.6 shots in 37 tries. Note that the expected value is often not an integer.

p. 2 #7

s.d. of binomial r.v.

Circle as √(npq).
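The free-throw example (80% shooter, 37 tries) can be checked for both the expected value and the s.d. with this Python sketch. The function names are hypothetical.

```python
from math import sqrt

def binomial_mean(n, p):
    """Expected value of a binomial r.v.: n * p."""
    return n * p

def binomial_sd(n, p):
    """S.d. of a binomial r.v.: sqrt(n * p * q), where q = 1 - p."""
    return sqrt(n * p * (1 - p))

mean_hits = binomial_mean(37, 0.8)
sd_hits = binomial_sd(37, 0.8)
print(round(mean_hits, 1), round(sd_hits, 3))   # 29.6 2.433
```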

p. 2 #8

expected value of sample proportion

Learn the concept.

Formula is not useful, but the concept needs to be learned:

“The sample proportion, p̂, is an unbiased estimator of the population proportion, p.” In other words, the expected value (a.k.a. mean) of p̂ equals p.

p. 2 #9

s.e. of sample proportion

Circle as √(pq/n).

This formula is repeated on p. 3 in a different context. However, the formula is true even if you are not running a 1-prop. z test.

p. 2 #10

expected value of sample mean

Learn the concept.

Formula is not useful, but the concept needs to be learned:

“The sample mean, x̄, is an unbiased estimator of the population mean, μ.” In other words, the expected value of x̄ equals μ.

p. 2 #11

s.e. of sample mean

Circle it.

This formula is repeated on p. 3 in a different context. However, the formula is true even if you are not running a 1-sample t test.
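The two standard-error formulas (p. 2 #9 for a sample proportion, p. 2 #11 for a sample mean) are short enough to sketch in Python. The function names and inputs are hypothetical. In practice you plug in your best available values, such as s in place of an unknown population s.d.

```python
from math import sqrt

def se_proportion(p, n):
    """s.e. of a sample proportion: sqrt(p * q / n), where q = 1 - p."""
    return sqrt(p * (1 - p) / n)

def se_mean(sd, n):
    """s.e. of a sample mean: s.d. divided by sqrt(n)."""
    return sd / sqrt(n)

print(se_proportion(0.5, 100))   # hypothetical: p = .5, n = 100
print(se_mean(12, 36))           # hypothetical: s.d. = 12, n = 36
```

Quadrupling n cuts either standard error in half, because n sits under a square root.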

p. 2 #12

extension of the z = (x – μ)/σ concept we learned in the fall

Circle it.

The “statistic” in the numerator is our measured outcome (from 1 or 2 samples), and the “parameter” in the numerator is the value asserted by H0 (which is either a given, in the case of 1-sample and 1-prop. tests, or 0, in the case of 2-sample and 2-prop. tests). The parameter is also 0 in the case of a matched-pairs t test (which is really a 1-sample test on the column of differences) or a LSRL t test. Therefore, we can summarize all of this by writing
z or t test statistic = (statistic – parameter) / s.e.

p. 2 #13

C.I. = est. ± (z* or t*) × s.e.

Circle it.

Also use a bracket to indicate that the second term, namely

(critical value) × (standard deviation of statistic),

equals the m.o.e.
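Here is a Python sketch of the confidence interval structure, with the margin of error computed as its own step. The sample values are hypothetical, and the critical value 1.96 is typed in as if read from the table for 95% confidence.

```python
from math import sqrt

def confidence_interval(estimate, critical_value, standard_error):
    """estimate plus or minus (critical value) * (s.e.)."""
    margin_of_error = critical_value * standard_error
    return estimate - margin_of_error, estimate + margin_of_error

# Hypothetical 1-prop. z interval: p-hat = .6 from n = 150.
p_hat, n = 0.6, 150
se = sqrt(p_hat * (1 - p_hat) / n)
low, high = confidence_interval(p_hat, 1.96, se)
print(round(low, 4), round(high, 4))
```

Half the width of the interval is the margin of error, which is the bracketed second term.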

p. 3 #1

STAT TESTS 2, 8

Write “2, 8.”

Use s instead of σ, since σ is unknown. Remember, we do not use STAT TESTS 1 or STAT TESTS 7.

p. 3 #2

STAT TESTS 5, A

Write “5, A.”

When running a 1-prop. z test (STAT TESTS 5), use p0 and q0 (hypothesized values) for p and q. When running a 1-prop. z interval (STAT TESTS A), use your best information, namely p̂ and q̂, for p and q. Your calculator does all of this for you automatically. The only reason you would need to know this is if you are showing your work (recommended if time permits).
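The difference between the test and the interval is only in which proportion goes inside the standard error. The Python sketch below uses made-up data (66 successes in 100 trials, with H0 asserting a proportion of .5).

```python
from math import sqrt

def se_proportion(p, n):
    """sqrt(p * q / n), where q = 1 - p."""
    return sqrt(p * (1 - p) / n)

# Hypothetical data: 66 successes in 100 trials; H0 asserts p0 = .5.
successes, n, p0 = 66, 100, 0.5
p_hat = successes / n

# 1-prop. z test: the s.e. uses the hypothesized values p0 and q0.
se_test = se_proportion(p0, n)
z = (p_hat - p0) / se_test

# 1-prop. z interval: the s.e. uses the sample values p-hat and q-hat.
se_interval = se_proportion(p_hat, n)

print(round(se_test, 4), round(z, 4), round(se_interval, 4))
```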

p. 3 #3

STAT TESTS 4, 0

Write “4, 0.”

Use s1 and s2 instead of σ1 and σ2, since the true standard deviations are unknown. Remember, we do not use STAT TESTS 3 or STAT TESTS 9.

p. 3 #4

Do not use.

X out.

This is where you would use sp (p. 1 #3) if you were doing a “pooled” 2-sample t test or a “pooled” 2-sample t interval, but we never do.

p. 3 #5

STAT TESTS B

Write “B.”

For p1 and p2, use the best information you have, namely p̂1 and p̂2.

p. 3 #6

STAT TESTS 6

Write “6.”

This assumes that H0 asserts p1 = p2, which is almost always the case in a 2-prop. z test. In the extremely unlikely event that your H0 asserted something different, say, that p1 = p2 + 0.03, you would use p. 3 #5 instead. That should never occur on the AP exam.

In the formula, p and q are to be approximated by the weighted sample proportions, where

p ≈ (n1·p̂1 + n2·p̂2) / (n1 + n2) and q = 1 – p.

The formula for p may look confusing, so let’s use common sense and a concrete example. In a 2-prop. z test to see whether the proportion of children with asthma is different for African-Americans vs. whites, n1 + n2 = total # of children in study, and that is the denominator. The numerator adds up all the children with asthma: n1·p̂1 = no. of African-American children with asthma in the study, n2·p̂2 = no. of white children with asthma in the study. In practice, we simply use counts: p = total number with asthma divided by total number in study.
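The "simply use counts" shortcut can be sketched in Python as follows. The counts are invented for illustration and do not come from any real asthma study.

```python
from math import sqrt

# Hypothetical counts: group 1 has 30 cases out of 200, group 2 has 20 out of 300.
x1, n1 = 30, 200
x2, n2 = 20, 300

p1_hat = x1 / n1
p2_hat = x2 / n2

# Pooled proportion: total number of cases divided by total number in study.
p_pooled = (x1 + x2) / (n1 + n2)
q_pooled = 1 - p_pooled

# s.e. for the 2-prop. z test when H0 asserts p1 = p2 (p. 3 #6).
se = sqrt(p_pooled * q_pooled) * sqrt(1 / n1 + 1 / n2)
z = (p1_hat - p2_hat - 0) / se   # the parameter asserted by H0 is 0

print(round(p_pooled, 4), round(se, 4), round(z, 2))
```

The weighted-proportion version of the pooled p gives the same number as the count-based version, since n1 times p̂1 is just the count x1.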

p. 3 #7

STAT TESTS C, STAT TESTS D

Write “C, D.”

You should show your work (several terms) if performing a χ² test. Use this formula for all 3 types of χ²: goodness of fit, independence, and homogeneity of proportions.
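Showing "several terms" means writing out a few of the (observed – expected)²/expected pieces. The Python sketch below does this for a goodness-of-fit setup and for a two-way table; all counts and helper names are hypothetical.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def expected_counts(table):
    """Two-way table: expected = (row total)(column total) / (grand total)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]

# Goodness of fit: hypothetical observed counts vs. expected counts.
gof = chi_square([30, 50, 20], [25, 50, 25])

# Independence or homogeneity: hypothetical 2 x 2 table of observed counts.
table = [[10, 20], [30, 40]]
exp_table = expected_counts(table)
flat_obs = [cell for row in table for cell in row]
flat_exp = [cell for row in exp_table for cell in row]
two_way = chi_square(flat_obs, flat_exp)

print(gof, round(two_way, 4))
```

The same summation serves all three types of test; only the way the expected counts are obtained differs.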

p. 6

[adjustment]

z*

Write z* on the row marked for df = ∞.
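The entries you are labeling as z* can be reproduced with Python's standard library, as sketched below. The helper name is made up, and the code assumes the usual two-sided setup in which the leftover area is split evenly between the two tails.

```python
from statistics import NormalDist

def z_star(confidence_level):
    """Critical value that leaves (1 - C)/2 in each tail of the standard normal curve."""
    tail = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - tail)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_star(level), 3))
```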