AP Statistics / Mr. Hansen |
Name: _________________________ |
TI-83/84
STAT TESTS Summary
# |
Common name |
Assumptions |
Comments |
|
|
|
·
Rarely used,
since a 1-sample t test is more
realistic and accurate. (We never know ·
CLT lets us
relax normality assumption for larger samples (see p.251
of Barron’s, p.606 of Yates textbook, or bottom of
tan boxes on p.553 and p.595
of Peck/Olsen/Devore textbook). |
2 |
1-sample t test for mean (or for mean diff. of matched pairs) |
1. SRS |
·
CLT lets us
relax normality assumption for larger samples (see p.251
of Barron’s, p.606 of Yates textbook, or bottom of
tan boxes on p.553 and p.595
of Peck/Olsen/Devore textbook). |
|
|
|
[rare; use 2-sample t procedures instead] |
4 |
2-sample t test for difference of means |
1. two indep. SRS’s |
·
CLT lets us
relax normality assumptions when n1 and n2 are both ≥ 30, if no extreme skewness
or extreme outliers are evident. See p.251 of
Barron’s, p.606 of Yates textbook, or bottom of tan
boxes on p.553 and p.595
of Peck/Olsen/Devore textbook for details. Some textbooks say you can add
sample sizes together to reach your target, but play it safe and check that both
sample sizes are at least 30. ·
Test is especially
robust when ·
When prompted by calculator for “Pooled,” always
answer “No.” The pooled (“equal
variances”) method was formerly listed on the AP formula sheet, but since the
College Board has dropped that formula, you should never use it. The pooled
method unrealistically assumes |
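For anyone curious what answering "No" to "Pooled" means computationally, here is a minimal Python sketch of the unpooled 2-sample t statistic. It assumes the usual Welch-Satterthwaite approximation for df, which is why the calculator reports a df that is not a whole number. The function name and the summary statistics in the example are made up for illustration.

```python
import math

def two_sample_t_unpooled(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for the unpooled two-sample t procedure (hypothetical helper)."""
    v1 = sd1 ** 2 / n1          # estimated variance of the first sample mean
    v2 = sd2 ** 2 / n2          # estimated variance of the second sample mean
    se = math.sqrt(v1 + v2)     # s.e. of the difference of sample means
    t = (mean1 - mean2) / se    # assumes H0: mu1 - mu2 = 0
    # Welch-Satterthwaite approximation for degrees of freedom (assumed here)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Illustrative summary statistics only (not from any real data set)
t, df = two_sample_t_unpooled(52.0, 8.0, 35, 48.0, 10.0, 40)
print(round(t, 3), round(df, 1))
```

The df always lands between the smaller of n1 – 1 and n2 – 1 and the pooled value n1 + n2 – 2.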
5 |
1-prop. z test (used for testing a single proportion against a benchmark) |
1. SRS |
·
Need large pop.
so that SRS resembles indep.
trials (sampling w/ replacement) to justify binomial
model. ·
Need np and nq (expected # of successes and failures) to be at least 10 as a
rule of thumb so that the z approx. to binomial is reasonable. ·
We use binomial
model as an approx., then z model as
an approx. of that. Thus there are 2 approximations of the true sampling
distribution. ·
Use
hypothesized proportions (p0 and q0)
to estimate p and q when checking assumptions 3-4 and
when computing s.e. by the
AP formula. |
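The checks and the s.e. computation described for #5 can be sketched in Python as follows. This is only an illustration, not the calculator's internal code; the function name and the sample numbers are assumptions, and the p-value shown is for a two-sided alternative.

```python
import math

def one_prop_z_test(successes, n, p0):
    """Sketch of a 1-prop. z test using the hypothesized p0 (hypothetical helper)."""
    q0 = 1 - p0
    # Rule of thumb: expected successes and failures both at least 10
    conditions_ok = n * p0 >= 10 and n * q0 >= 10
    p_hat = successes / n
    se = math.sqrt(p0 * q0 / n)       # s.e. uses p0 and q0, not p-hat
    z = (p_hat - p0) / se
    # Two-sided p-value from the standard normal: 2 * P(Z > |z|)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return conditions_ok, z, p_value

# Illustrative numbers only: 58 successes in 100 trials, H0: p = 0.5
ok, z, p_value = one_prop_z_test(58, 100, 0.5)
print(ok, round(z, 2), round(p_value, 4))
```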
6 |
2-prop. z test (test for difference between 2 props.) |
1. two indep. SRS’s |
·
Comments are similar
to those for #5. Some textbooks give 5 as the number of expected successes
and failures in each sample, but 10 is what our textbook uses. These are merely
rules of thumb to ensure that the z
approx. is reasonable. Important:
We nearly always have H0: p1 = p2.
In that case, do not use |
|
|
|
[rare; use 1-sample t procedures instead] |
8 |
1-sample t C.I. for mean (or for mean diff. of matched pairs) |
Same as #2. |
|
|
|
|
[rare; use 2-sample t procedures instead] |
0 |
2-sample t C.I. for diff. of means |
Same as #4. |
|
A |
1-prop. z C.I. for p |
Same as #5, except as noted at right. |
·
This time, do
not use the hypothesized values of p and q when checking
assumptions and calculating s.e. Instead, use the
observed values of |
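To contrast with the test in #5, here is a minimal Python sketch of the 1-prop. z interval, where the observed proportion replaces the hypothesized one both in the checks and in the s.e. The function name, the default critical value of 1.96 (for 95% confidence), and the sample numbers are assumptions made for illustration.

```python
import math

def one_prop_z_interval(successes, n, z_star=1.96):
    """Sketch of a 1-prop. z C.I. using observed proportions (hypothetical helper)."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    # With observed values, n*p_hat and n*q_hat are just the observed
    # numbers of successes and failures
    conditions_ok = successes >= 10 and (n - successes) >= 10
    se = math.sqrt(p_hat * q_hat / n)   # s.e. uses p-hat and q-hat
    margin = z_star * se
    return conditions_ok, p_hat - margin, p_hat + margin

# Illustrative numbers only: 58 successes in 100 trials
ok, low, high = one_prop_z_interval(58, 100)
print(ok, round(low, 4), round(high, 4))
```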
B |
2-prop. z C.I. for p1 – p2 |
Same as #6, except as noted at right. |
·
This time,
always use the first s.e. formula on AP sheet, namely |
C |
|
1. SRS or census (for independence test); multiple SRS’s
(one per column) for homogeneity of proportions test
2. all exp. counts |
·
Note that all
cells must be counts even
though H0
is stated in terms of probabilities. ·
Mechanics are
exactly the same for both matrix tests. Only the hypotheses change. ·
TI-83/84
computes each expected count by this formula: cell = (row total × column total)
/ grand total. ·
Use df = (rows – 1)(cols – 1). ·
Expected counts
are usually not integers. ·
Some textbooks
have more complicated assumptions regarding expected counts. ·
Look at the
cells whose contributions to χ² are
the biggest when making inference about a rejected H0. |
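The mechanics described for the matrix tests (expected counts, df, and cell-by-cell contributions) can be sketched in Python as shown here. This is a rough illustration rather than the calculator's actual routine; the function name and the observed counts are made up.

```python
def chi_square_matrix(observed):
    """Sketch of the chi-square matrix test mechanics (hypothetical helper)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected count for each cell = row total * column total / grand total
    expected = [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]
    # Each cell's contribution to chi-square: (observed - expected)^2 / expected
    contributions = [
        [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(observed, expected)
    ]
    chi_sq = sum(sum(row) for row in contributions)
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return expected, contributions, chi_sq, df

# Illustrative counts only (2 rows, 3 columns)
observed = [[12, 18, 10], [8, 22, 30]]
expected, contributions, chi_sq, df = chi_square_matrix(observed)
print(expected, chi_sq, df)
```

Scanning `contributions` for the largest entries shows which cells drive a rejected H0.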
D |
1. SRS |
·
Note that all
cells must be counts even
though H0
is stated in terms of probabilities. ·
Use df = (# of bins – 1). ·
Expected counts
are usually not integers. ·
Some textbooks
have more complicated assumptions regarding expected counts. ·
Look at the
cells whose contributions to χ² are
the biggest when making inference about a rejected H0. |
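The point that cells hold counts while H0 is stated in probabilities can be seen in this minimal Python sketch of the goodness-of-fit mechanics. The function name, the observed counts, and the hypothesized probabilities are assumptions chosen for illustration.

```python
def chi_square_gof(observed_counts, hypothesized_probs):
    """Sketch of goodness-of-fit mechanics (hypothetical helper)."""
    n = sum(observed_counts)
    # H0 gives probabilities; multiply by n to convert them to expected counts
    expected = [n * p for p in hypothesized_probs]
    contributions = [
        (o - e) ** 2 / e for o, e in zip(observed_counts, expected)
    ]
    chi_sq = sum(contributions)
    df = len(observed_counts) - 1   # number of bins minus 1
    return expected, contributions, chi_sq, df

# Illustrative data only: 4 bins, H0 says each bin has probability 0.25
expected, contributions, chi_sq, df = chi_square_gof(
    [18, 22, 20, 40], [0.25, 0.25, 0.25, 0.25]
)
print(expected, chi_sq, df)
```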
|
|
|
|
[beyond the scope of our course] |
F,G |
linear regression t test, confidence interval for LSRL slope |
Memory aid: LINER. |
·
Use df = n – 2, where n
= number of data points. ·
If the r² value is reasonably close to 1, you could
theoretically satisfy both L (linear) and E (equal variance/homoscedasticity)
assumptions by stating the r² value, commenting
that it is good since it is close to 1, and sketching the scatterplot and resid.
plot. A good resid. plot will show that the LSRL fit is valid and that the fit
does not get noticeably worse as x changes. (Be sure to write
these observations so that the AP graders know that you know what you are
talking about.) However, beginning approximately 2021, the AP graders want to
see the E (equal variance/homoscedasticity) assumption split out separately. ·
The N
(normality) assumption cannot be verified, since you are never going to have
an acceptable number of residuals for
each x. Instead, we have to settle for looking at the residuals as a
group to see if they seem normally distributed. On the AP exam, all you have
to do to satisfy the N (normality) assumption is to show a histogram or stemplot of the residuals and assert that no gross
departure from normality is visible. Or, you can sketch the NPP of the
residuals (6th pictograph on 2nd STATPLOT Plot 1 menu). If the NPP shows a
relatively straight line, you know that the residuals are approximately
normal. A bend to the right (as you move your finger from left to right)
means right skewness, and a bend to the left means left skewness. Sketch your
NPP, and write a description of what the NPP is telling you. ·
The E (equal
variances) assumption should be listed separately, beginning approximately
2021. This is a College Board requirement, even though
satisfying the L (linearity) assumption and showing the resid.
plot already covers the E step. The only good news
is that in a 4/27/2022 Zoom call involving College Board–experienced
administrators and AP teachers from around the country, the consensus was that
any question on LSRL t-test or C.I.
for LSRL slope will probably state, “Assume that the assumptions for
inference have been met.” ·
Note the R
(randomness) assumption. As always, some degree of random selection is
required for valid inference. If the data points came from an experiment, you
would not choose an SRS; you would use all of your data and say that the
randomness assumption was satisfied by the design of the experiment, since if
the experiment was any good, it surely used random assignment of treatments. |
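For reference, the slope t statistic with df = n – 2, along with the residuals you would plot when checking assumptions, can be sketched in Python as follows. This is a minimal illustration that assumes H0: slope = 0; the function name and the data points are made up.

```python
import math

def slope_t_test(xs, ys):
    """Sketch of the LSRL slope t statistic, assuming H0: slope = 0 (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx                    # LSRL slope
    a = y_bar - b * x_bar            # LSRL intercept
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    df = n - 2
    s = math.sqrt(sum(r ** 2 for r in residuals) / df)   # s.d. of the residuals
    se_b = s / math.sqrt(sxx)        # s.e. of the slope
    t = b / se_b
    return b, se_b, t, df, residuals

# Illustrative data only
b, se_b, t, df, residuals = slope_t_test([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(b, 3), round(se_b, 4), round(t, 4), df)
```

The `residuals` list is what you would put in a histogram, stemplot, or NPP when addressing the N assumption, and plot against x when addressing L and E.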
|
|
|
·
This is an optional
topic to cover after the AP exam. ·
ANOVA provides a
way to check for a statistically significant difference among several means. |