Monthly Schedule

(STAtistics, Period B)

T 3/1/011

HW due: As described in class. For people who missed class, or perhaps did not hear all of the details, some information is furnished below.

Group 1: Nick R.-S., Justin, Dominique
Group 2: Alex, Andrew, Phineas
Group 3: Daniel, Chick, Preston
Group 4: Jamie, Julien, Brennan
Group 5: Edward, Zeke, Jordan
Group 6: Nick S., Tip, Andrei, Ousmane

Group leaders (underlined above) are required to submit an experiment proposal, approximately one page handwritten or half a page typed. A hard copy is required. As always, if the group leader is absent for any reason, he must deputize someone else in the group to bring the required hard copy submission to class.

1. State your research question.

2. Write your proposed methodology. Human subjects are not required, but if you use human subjects, you must provide a consent form. All proposals will be reviewed and discussed before being approved. Mr. Hansen will serve as your IRB. If you use blinding (good) or double blinding (even better), be sure to specify how you will execute the logistical details.

3. Specify the statistical test(s) you intend to use (1-prop. z, 2-sample t, χ² g.o.f., etc.).

4. Predict the result you think you may see (the “effect size”).

5. Estimate the sample size you will need in order to demonstrate statistical significance. Do not worry about calculating this accurately; we will discuss how to do this in class.

An example is provided below.

1. Does Axe brand body spray change the acceptability of STA Third Formers to NCS 9th graders?

2. A panel of volunteer NCS freshmen will sit on the second tier of a set of bleachers. One by one, volunteer STA Third Formers will walk past, approaching within about 18 inches of each NCS panelist. After passing all of them, the STA student will turn around once at a designated mark on the floor, flash a thumbs-up sign, make eye contact with the panel, and then walk past them a second time and exit. Nothing shall be said. STA students will be sent at equally spaced intervals, and half of them (randomly determined) will have been spritzed with a measured amount of Axe brand body spray immediately before their strut. Panelists shall rate each STA volunteer on a Likert scale of 0 to 10 for each of 3 categories: neatness, athletic condition, and overall attractiveness.

3. Three 2-sample t tests will be used in order to compare the mean ratings of the control and treatment STA students for each of the categories: neatness, athletic condition, and overall attractiveness.

4. We predict that the ratings for athletic condition will show no statistically significant difference. We predict that the mean scores for neatness and overall attractiveness will be lower for the Axe-wearing students, with a mean effect size of −1.5 points on the Likert scale used.

5. A panel of 5 NCS students and 24 STA volunteers should be large enough to establish the predicted outcomes.

 

W 3/2/011

HW due: Answer the Group 2 power problem below. Then log your points.

Group 2 (Alex, Phineas, and Andrew) proposed yesterday to give quizzes to 2 groups of 17 volunteers, randomly divided into treatment and control. The treatment group will be informed in advance of the format and nature of the quiz, while the control group will take the quiz “cold” in a room in Marriott Hall. Scores will be computed based on time, with penalty time assessed for wrong answers. Alex and his group believe that the effect size (abbreviation: ES) that they will see will be a mean difference of about 2 seconds faster for the treatment group. For both groups, s = 6 seconds is a reasonable estimate of the s.d. of the scores.

In class, we decided that a 2-sample t test would be appropriate for testing the hypotheses

H0: True means are equal
Ha: Treatment group has a smaller mean.

Sketch the sampling distribution of the difference of means, subject to the H0 assumption. Then, determine (a) the probability of Type I error if α = 0.05 and (b) the probability of rejecting H0 if the true ES (not the ES that will necessarily be seen, but the true ES) is 2 seconds.

Probability (b) is called power and equals 1 minus the probability of Type II error. Sketches and supporting work are required.

Student question submitted by e-mail:

I really have no idea how to figure out what the [Type II] error value is, especially given that we do not actually have any data.  Could you point me in the right direction?

Answer:

If you have sketched the sampling distribution of the difference of sample means, you should have a bell-shaped (or actually t-shaped) curve centered on 0, with s.e. = $\sqrt{s_1^2/n_1 + s_2^2/n_2} = \sqrt{6^2/17 + 6^2/17} \approx 2.058$ seconds.

I presume you can sketch that. The s.e. formula comes from #2 in the tan box on p. 585.

If α = 0.05, we will reject H0 whenever the t statistic (based on control mean minus treatment mean) exceeds 1.7. (Where does the 1.7 come from? Look at the t* table at the very last back cover page of your textbook. With df = 32, which is what your calculator tells you when you run a 2-sample t test, you need to look in the .90 column, since that column gives you a right-tail area of 0.05. The t* value from the table is 1.7.)

OK, so now we know that whenever the t statistic is 1.7 or greater, we will reject H0. Since the s.e. is 2.058 seconds, a t* value of 1.7 means that we have to be (1.7)(2.058) = 3.4986 seconds above zero for our difference of sample means before our 2-sample t test will tell us that we have achieved statistical significance.

Remember that number: 3.4986. The question is, if the true ES, the true difference of means, equals 2 seconds, then how often will we achieve a difference of sample means of 3.4986 seconds or greater? Remember, that’s the goal, since if we do not surpass 3.4986, then we will not have enough evidence to reject H0.

I won’t do the whole problem for you, but at this point you are almost finished. You can use algebra to figure out how far (in t units) 3.4986 is above 2, and then you can use tcdf on your calculator to figure out the area under the curve to the right of whatever that answer was. The tcdf function gives you the answer to part (b), which we call power (statistical power of the test). If you want the probability of Type II error, which was not requested, you would subtract the power from 1.

 

Th 3/3/011

HW due: Read the supplementary material below; finish yesterday’s assignment; answer the problem below. Then log your points.

Supplementary reading:

Remember that the t statistic tells you the direction (positive or negative) of the ES, in standard error units. This is exactly the same as the situation we saw with the z test statistic, and going back as far as last fall, with the z score for univariate data.

Remember, if your teacher tells you that you scored z = −0.75 on a test whose mean was 78 and whose s.d. was 8, what does that mean? It means that you were three-quarters of a standard deviation unit below the mean. Since three-fourths of 8 is 6, your raw score must have been 72.

In a similar way, yesterday’s problem involved a t* score of 1.7. That value can be converted to seconds by multiplying by the s.e., namely 2.058, and adding the center value of the t distribution, namely 0. The result, in seconds, is 3.4986, and it is 3.4986 seconds that marks the “big fat red boundary line” that separates rejecting H0 from failing to reject H0.

To convert from seconds back into t units, simply apply the process in reverse: t = (difference in seconds)/s.e.
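
The two conversions described above can be sketched in a few lines of Python. This is only an illustration: the function names `to_raw` and `to_standard` are made up for this sketch, and the numbers are the ones quoted in the z-score example and in the Group 2 problem (s = 6, n = 17 per group, t* = 1.7).

```python
import math

def to_raw(standard_score, center, spread):
    """Convert a z or t score into raw units: center + score * spread."""
    return center + standard_score * spread

def to_standard(raw_value, center, spread):
    """Convert a raw value back into z or t units."""
    return (raw_value - center) / spread

# z-score example: test mean 78, s.d. 8, z = -0.75
raw_score = to_raw(-0.75, 78, 8)

# Group 2 problem: s.e. of the difference of sample means
se = math.sqrt(6**2 / 17 + 6**2 / 17)

# The t* value of 1.7, converted to seconds (the H0 curve is centered on 0)
cutoff_seconds = to_raw(1.7, 0, se)

print(raw_score)                  # 72.0
print(round(se, 3))               # 2.058
print(round(cutoff_seconds, 4))   # 3.4986
print(round(to_standard(cutoff_seconds, 0, se), 1))  # back to 1.7 t units
```

The same pair of functions serves for z and for t, since both scores simply count standard deviation (or standard error) units away from a center.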

Supplementary problem:

(c) Show that the power of the test, the answer to yesterday’s part (b), is 0.236. Note that df = 32 (thanks to Daniel K. for the typo correction!).

(d) Compute the probability of Type II error, assuming that the true ES is 2 seconds.

(e) Explain why the power would more than triple if the true ES were 5 seconds. A carefully drawn sketch will suffice. Computations are welcome but are not required.

(f) Would the probability of Type II error increase or decrease if the true ES were increased from 2 seconds to 5 seconds? Explain briefly (no work required).

(g) The standard notation for the probability of Type I error is α, and the standard notation for the probability of Type II error is β. Explain why neither power nor β can be computed at all unless we first specify a particular instance of Ha, i.e., a particular value for the ES.

(h) Assume that Group 2 has re-thought the problem and has decided that the true ES is probably closer to 4 seconds. Also assume that Mr. Hansen has re-estimated the value of s to be 5 seconds. Now, compute the power of the test if Group 2 is willing to use 26 volunteers in each group, instead of 17. Is the power acceptable now? What do you think? Hint: Since df = 50 now, you can safely average the t* entries for df = 40 and df = 60 in order to obtain the new t* value.
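
For anyone who wants to check the tcdf work in parts (c), (e), and (h) away from the calculator, here is a rough Python sketch. It follows the approach used in class (slide a t-shaped curve so that it is centered on the true ES), not the noncentral t distribution that statistical software would use, so treat the results as approximations. The function names are invented for this sketch, the right-tail area is found by Simpson's rule instead of a library call, and the t* values (1.7, and the average of the table entries 1.684 and 1.671) are taken from the t* table.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return const * (1 + x * x / df) ** (-(df + 1) / 2)

def t_right_tail(t, df, steps=2000):
    """Area to the right of t, found with Simpson's rule on [0, |t|]."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3          # area between 0 and |t|
    return 0.5 - central if t >= 0 else 0.5 + central

def power_2_sample_t(es, s, n, t_star):
    """Approximate power of a one-tailed 2-sample t test, n subjects per group."""
    df = 2 * n - 2                   # gives the df = 32 and df = 50 quoted in the problem
    se = math.sqrt(s**2 / n + s**2 / n)
    cutoff = t_star * se             # boundary line, in seconds
    t_shift = (cutoff - es) / se     # boundary as seen from the Ha curve, in t units
    return t_right_tail(t_shift, df)

power_c = power_2_sample_t(es=2, s=6, n=17, t_star=1.7)
power_e = power_2_sample_t(es=5, s=6, n=17, t_star=1.7)
power_h = power_2_sample_t(es=4, s=5, n=26, t_star=(1.684 + 1.671) / 2)

print(round(power_c, 3))        # should be close to the 0.236 quoted in part (c)
print(round(1 - power_c, 3))    # probability of Type II error, part (d)
print(round(power_e / power_c, 2))  # ratio of powers for part (e)
print(round(power_h, 3))        # part (h)
```

Your sketches and supporting work are still required; the code only confirms the arithmetic.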

 

F 3/4/011

HW due: Answer the problem below and revisit parts (e) through (h) of yesterday’s problem. It is presumed, at this point, that you already have completely correct, clear, legible answers to parts (a) through (d). Log your points based on the quality of (e) through (k).

Supplementary problem:

Carefully sketch a bell-shaped sampling distribution for a statistic in a two-tailed significance test, and show the big fat red lines that mark the boundaries of the “reject H0” regions. Label all three regions appropriately. Then, use very thin tissue paper (or, if you prefer, a piece of clear plastic) to show an alternative sampling distribution of the same size and shape. Move your alternative distribution (your “Ousmane,” so to speak) left and right in order to answer the following questions. Bring your sketch and your overlay to class.

(i) If everything else stays the same, but the center point of the alternative distribution is shifted from +4 units to +6 units, what happens to power and β? Do they increase or decrease?

(j) If everything else stays the same, but the center point of the alternative distribution is shifted from +4 units to −4 units, what happens to power and β? Do they increase or decrease?

(k) If everything else stays the same, but the α value is increased, what happens to power and β? Do they increase or decrease?

 

M 3/7/011

No additional HW due. This is your chance to get some sleep and/or catch up on older assignments. However, please read the material below if you would like to correct your notes from Friday.

Note: Last Friday, in the notes written on the board, the sample wording was presented in this format: “The power of the test is _____ against the _____ alternative.” The first blank is obviously a number, expressed either as a decimal or as a percentage. As for the second blank, students requested some sample content, and your fearless instructor provided the following:

$\bar{x}_1 - \bar{x}_2 = 4$

Well, guess what? That cannot possibly be correct, since hypotheses (both null and alternative) must always refer to parameters, not to statistics. Therefore, the second blank could be filled in with something like the following:

$\mu_1 - \mu_2 = 4$

or

true ES = 4.

Here are some examples of correct ways to word statements involving statistical power:

1. The power of the 2-sample t test in this situation is 88% against an effect size of 4 units.
2. The power of the 2-sample t test is 0.88 against the alternative hypothesis that $\mu_1 - \mu_2 = 4$.
3. The power of the 1-prop. z test (with H0: p = 0.38) is 0.65 against the alternative that p = 0.32.
4. The power of the LSRL t test (with H0: ) is 99.3% against the  alternative.

 

T 3/8/011

HW due: Read pp. 695-697 (omitting green box on p. 695), tan box on p. 704, middle of p. 706 to bottom of p. 707. Instead of the tan box at the top of p. 707, read the simplified assumptions for the LSRL t test on our STAT TESTS handout. Then, write the problem below and log your points.

Problem:

(l) Carefully sketch plausible sampling distributions related to H0 and Ha in some imaginary problem involving a one-tailed t or z test for which . Label the “reject H0” and “fail to reject H0” zones clearly, and show the boundary between them as a thick line. If you have a second color of pencil (or a colored pen) that you can use for the boundary line and the Ha sampling distribution, so much the better. Position the Ha sampling distribution in such a way that the power (i.e., the portion in the “reject” zone) is 30%.

Question: If the sample size quadruples while all other aspects of the problem stay the same, what happens to the power? Estimate a numeric value for your answer—but do not attempt to use formulas to compute an answer unless you are a real glutton for punishment. A second sketch is required. Score your work based on neatness.

Big hints:

1. All of our s.e. formulas involve n (or some variation of n1 and n2) in the square root of a denominator. Therefore, a quadrupling of sample size will result in a halving of s.e., since $\sqrt{4} = 2$. Both distributions in the second sketch must therefore be taller and narrower.

2. The big fat red line will have to move as a result of hint #1. This is to be expected. Simply adjust the position of the big fat red line, in the second sketch, so that α is kept constant.

3. Leave ES unchanged in the second sketch. Remember, ES tells you the center of the Ha sampling distribution.
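
The three hints can be turned into a rough numeric check of your two sketches. The Python below is only a sketch under stated assumptions: it uses a one-tailed z test (a t test with many degrees of freedom behaves similarly), it assumes α = 0.05, which is a choice made here for illustration, and the function name `power_one_tailed` is invented for this example.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal curve

def power_one_tailed(shift_in_se, alpha):
    """Power of a right-tailed z test when the Ha curve is centered
    shift_in_se standard errors above the center of the H0 curve."""
    z_star = Z.inv_cdf(1 - alpha)            # boundary line, in s.e. units
    return 1 - Z.cdf(z_star - shift_in_se)   # part of the Ha curve past the boundary

alpha = 0.05          # ASSUMED value, chosen for illustration
target_power = 0.30   # the power required in the first sketch

# First sketch: how far (in s.e. units) must the Ha curve sit from 0
# so that 30% of it lies in the "reject H0" zone?
z_star = Z.inv_cdf(1 - alpha)
shift_before = z_star - Z.inv_cdf(1 - target_power)

# Second sketch: n quadruples, so the s.e. is cut in half (hint 1).
# The ES is unchanged (hint 3), so it is now twice as many s.e. units from 0.
# Alpha is held constant (hint 2), so the boundary stays at z_star s.e. units.
shift_after = 2 * shift_before

power_before = power_one_tailed(shift_before, alpha)
power_after = power_one_tailed(shift_after, alpha)

print(round(power_before, 2))
print(round(power_after, 2))
```

A different α would give a somewhat different second number, which is one reason an estimate read from a careful sketch is all that is required.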

 

W 3/9/011

HW due: Write #13.1, 13.4acd, 13.62ab. Then log your points. Remember, if you claim 4 points, you are pledging not only that you made a solid effort for 35 or more minutes on the night of the assignment, but also that you will correct the work fully based on any later classroom discussion.

For #13.62a, the answer is not “yes” or “no”; you need to run a PHA(S)TPC test using

H0:
Ha:

Also, note that there is a typo in the regression equation, which should be

For #13.62b, the hypotheses become

H0:
Ha:

For both #13.62a and #13.62b, you will need to refer to the raw data on p. 236 in order to check assumptions. You may use either the book’s list on p. 707 or the simplified list on our STAT TESTS handout.

If you choose the simplified approach, you would check assumptions 1-3 as follows:

1. Say, “The LSRL is a good fit, so that the mean y value for each x value lies on the LSRL.” A reasonably high value of r² (anything more than about 0.3) and a residual plot devoid of patterns or outliers will suffice. Be sure to show a sketch of the residual plot, and state the r or r² value.

2. Assert that the variability of the residuals does not depend on x. This conclusion is supported if the residual plot does not show any “flange” outward (as in Figure 13.15(c) on p. 717) for especially large or small values of x. Write, “The residual plot shows that the variability of residuals does not seem to change as x changes.”

3. Assert that the residuals are normally distributed about the LSRL. Again, this is impossible to check thoroughly, but if the NQP of the residuals is reasonably straight, the assertion is reasonable. You will need to show a histogram or an NQP of the residuals as documentation.

If you choose the book’s approach, you would write out 4 assumptions and would deal with them as follows:

1. Assert that the true residuals (what your book calls error values, e) are centered about the LSRL. There is no way to prove this, but if the residual plot (which you will have to sketch) shows no patterns, this is a plausible claim. Remember, we do not know what the e values are; all we know are the sampled residual values, which may or may not have the same distribution as the true e values.

2. Same as #2 in the simplified list (see above).

3. Same as #3 in the simplified list (see above).

4. Assert that the true residuals (e values) are independent. There is no convenient way to check or prove this, and this assumption is frequently violated in practice.

For #13.62a, use your calculator’s STAT TESTS LinRegTTest feature after entering the data in L1 and L2. For #13.62b, you will have to calculate a new t value, since you cannot use the 15.26 in the table on p. 742. Hint: The t value you will use, again with df = 9, is −1.485, but you need to show your work. The computation of t = −1.485 is easy (a single step, simple algebra), but you need to show it.

 

Th 3/10/011

HW due: Finish #13.62b based on the large extra hint given in class; write #13.64abcdef. In part (f), use 95% for your confidence level. Then, log your points and take Mr. Hansen’s AP poll.

 

F 3/11/011

HW due: Prepare for our review day by writing at least one item that is both (a) a question for which you care to know the answer and (b) a question, or a fragment of a question, that could legitimately be asked on Monday’s test. Then, log your points.

If you wish to write more than one, that is fine, too.

IMPORTANT: More than half the class also still needs to take Mr. Hansen’s AP poll. If you do not answer the poll, there will be a small point penalty, and Mr. Hansen will hound you mercilessly over the weekend.

Example review question (good): Use sketches to show that if all other aspects of a 2-tailed statistical test remain unchanged, then an attempt to reduce the probability of Type I error will always reduce the test’s power.

Example review question (mediocre): A LSRL t-test involves 37 data points (i.e., 37 ordered pairs). Compute df.

Example review question (mediocre): A χ² test for independence involves two categorical variables, one with 5 different values (VG, G, A, P, VP) and one with 3 different values (red, blue, green). Compute df.

Example review question (poor): What name is given to the quantity

In class: Review for Monday’s test. We will go through as many of your questions as time permits.

 

M 3/14/011

Test on Chapters 12 and 13 (χ² tests, LSRL t-tests, and power), 100 points. As announced in class last Friday, your writeup of #13.64abcdef will be collected before the test starts. The assignment will be graded for correctness, neatness, and completeness (4 points for each). Neatness need not be excessive, but legibility and proper notation are required. There is no need to log the points, because this assignment will be collected from each student. If the only way you can get the correct set of answers is by copying Mr. Hansen’s work (posted on hwstore.org), then that is what you will have to do. Copying is not the best way to learn, but it is a possible way to learn, provided you pay attention to what you are writing.

There are 103 points on the test, but the test will be scored out of 100. Thus there are 3 bonus points built-in, plus a fourth bonus point if you remember your spare batteries. Format of the test is as follows:

Part I: Definitions (8 terms, 1.5 points each, 12 points total)
You will be provided with 8 terms and a list of 14 possible definitions. On the list of 14 possible definitions, 8 are correct and 6 are phony. Your task is to write a correct definition for each of the 8 given terms. Note: You will not be allowed to do the matching by using letters or drawing lines. You must rewrite the definition exactly as presented in the list of 14 choices. Suggested time, since the answers are provided and need not be fully retrieved from “deep memory,” is 4 minutes, 6 minutes for extended timers.

Part II: Power Sketching (8 points)
You will be given a scenario involving either a one-tailed or a two-tailed t or z test, with specified values for α, s.e., and ES. You will be asked to estimate by how much the power will increase or decrease when some aspect of the test changes. Two reasonably neat sketches are required, with two sampling distributions for each. Suggested time is 4 minutes, 6 minutes for extended timers.

Part III: Calculation of an Expected Count in a 2-Way χ² Test (6 points)
You will be given a 2-way table and told to calculate the expected count at a certain position. Work is required. For example, if the table is a 3 x 5 table (3 rows, 5 columns), and you are asked to find the expected count at row 2, column 1, you would proceed as follows:
Original 2-way table:

Grand total is 281.
Row 2 total is 62.
Column 1 total is 41.

Expected count = (row 2 total)(column 1 total)/(grand total) = (62)(41)/281 ≈ 9.05,

which is easily verified by running STAT TESTS and checking the “expected” matrix. Note that this matrix also satisfies one of the assumptions for a 2-way χ² test, namely that all expected counts are at least 5. (If necessary, you can relax the assumption and say that all expected counts are at least 1, and no more than 20% of the expected counts are less than 5, but the “all expected counts are at least 5” assumption is OK and is certainly easy to check.)
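
The expected count at row 2, column 1 follows from the three totals listed above. As a minimal sketch in Python (the function name `expected_count` is made up for this illustration):

```python
def expected_count(row_total, column_total, grand_total):
    """Expected count for one cell of a 2-way table:
    (row total)(column total) / (grand total)."""
    return row_total * column_total / grand_total

# Row 2 total = 62, column 1 total = 41, grand total = 281
cell = expected_count(62, 41, 281)
print(round(cell, 2))   # 9.05
print(cell >= 5)        # this cell passes the "at least 5" check
```

On the test you would write the same product and quotient by hand, since work is required.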

Suggested time is 3 minutes, 4.5 minutes for extended timers.

Part IV: χ² Tests (two statistical tests, one scored 3,3,6,3,3,6 for PHA(S)TPC, the other 0,3,6,3,3,6)
One of the tests will be a g.o.f. test, and one of the tests will be a 2-way test for independence or homogeneity of proportions. The tests will not be identified. Extended timers will do only one of these, but the choice will not be announced until the test is given, and the problem type will not be identified. Suggested time is 13 minutes for each test, or 26 minutes total for regular timers (same as AP standard).

Note: Calculation of expected counts need not be shown, since that skill was already tested in Part III. Just let your calculator do the work when finding expected counts. The only work you are expected to show is the calculation of the first two (2) terms of the χ² statistic.

There is no need to define parameters for a 2-way χ² test (independence or homogeneity of proportions), since that would be unacceptably tedious, especially for a large 2-way table.

Here is an example of a g.o.f. problem, with full writeup and work:

Problem: George claims that the M&M’s in his store have a distribution pattern of 15% each for red, orange, yellow, and green, 20% for blue, and 20% for brown. His business partner, Georgette, pulls an SRS of 1000 candies and finds 15.1% red, 17.2% orange, 13.7% yellow, 14.2% green, 20.1% blue, and 19.7% brown. Is there evidence against what George has claimed? Use α = 0.05.

Solution:
    Let p_red, p_orange, etc. = true proportions of red, orange, etc.
    H0: p_red = .15, p_orange = .15, p_yellow = .15, p_green = .15, p_blue = .20, p_brown = .20
    Ha: At least one proportion is not as claimed.
    Assumptions for χ² g.o.f. test:
        SRS? Yes, given.
        [All data converted to counts? Yes, easily done by multiplying by n = 1000]
        All exp. counts ≥ 5? Yes, they are 150, 150, 150, 150, 200, 200, resp.
    Test statistic:
        $\chi^2 = \sum \frac{(\text{obs} - \text{exp})^2}{\text{exp}} = \frac{(151 - 150)^2}{150} + \frac{(172 - 150)^2}{150} + \cdots = 4.837$
    P = 0.436 with df = 5
    Concl.: There is no evidence (n = 1000, χ² = 4.837, df = 5, P > 0.4) against George’s claims regarding the true distribution of M&M colors.
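
If you would like to confirm the numbers in this writeup without a calculator, the Python sketch below recomputes the test statistic and the P-value from the counts in the problem. It is an illustration only: the function names are invented here, and the right-tail area is approximated with Simpson's rule (valid for df of 2 or more) in place of the calculator's built-in χ² cdf.

```python
import math

def chi2_pdf(x, df):
    """Density of the chi-square distribution with df degrees of freedom."""
    return x ** (df / 2 - 1) * math.exp(-x / 2) / (2 ** (df / 2) * math.gamma(df / 2))

def chi2_right_tail(stat, df, steps=2000):
    """Area to the right of stat, by Simpson's rule on [0, stat] (assumes df >= 2)."""
    h = stat / steps
    total = chi2_pdf(0.0, df) + chi2_pdf(stat, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * chi2_pdf(i * h, df)
    return 1 - total * h / 3

# Counts from Georgette's SRS of 1000: red, orange, yellow, green, blue, brown
observed = [151, 172, 137, 142, 201, 197]
claimed = [0.15, 0.15, 0.15, 0.15, 0.20, 0.20]   # George's claimed proportions

n = sum(observed)
expected = [n * p for p in claimed]
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

chi2_stat = sum(terms)
df = len(observed) - 1
p_value = chi2_right_tail(chi2_stat, df)

print([round(t, 4) for t in terms[:2]])   # the first two terms, which you must show by hand
print(round(chi2_stat, 3), df, round(p_value, 3))
```

Since the P-value is well above 0.05, the sketch agrees with the conclusion in the writeup.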


Part V: LSRL t Test (3,3,6,3,3,6 for PHA(S)TPC, 8 points for additional questions)
This problem will be similar to #13.62 or #13.64 in the textbook. Note that you may be required to deduce the value of  based on information in a computer-style printout or based on information disguised elsewhere in the problem.

You may also be required to calculate a C.I. for the LSRL slope, b1, and you may be asked to interpret the LSRL slope in the context of the problem. Interpretation must follow this template: “For each additional 1 unit of __________ , the model predicts an increase [or decrease] of __________ units of __________ .”

Suggested time is 13 minutes (same as AP standard).

Total time is 50 minutes, or 55.5 minutes for extended timers. Extended timers should plan on staying a few minutes late. Tardiness excuses to the next class will be provided for extended timers but not for regular timers.

 

T 3/15/011

HW due: Pick up the pieces on yesterday’s test, as described below. Your #13.64abcdef will also be collected. Do not log the points, since both assignments will be collected. Record your elapsed time for each problem. If you work with classmates, you must list their names. Working with classmates or tutors is permitted only if the written work on the page is your own work (not copied). Copying of someone else’s work would be an honor violation, except for #13.64, for which I stated earlier that you could copy from the version posted at hwstore.org if you had to.

The timings on the test were fairly accurate, but only for people who had practiced and kept moving relentlessly forward. Only a few students submitted papers that were essentially complete. Almost everyone omitted most or all of #13 from the test. Therefore, I expect everyone to redo the entire problem #13, beginning to end, except for Andrew, who needs only to use a correct value of  in part (c) when computing the confidence interval.

People who were sick on the day of the test also need to do #13 from the test, since it is quite similar to #13.64abcdef from the homework and will make a good review for the make-up test.

Alex, Chick, Daniel, Tip, Nick S., Jordan, Dominique, Edward, and the extra-timers all need to redo #12 from the test as part of their homework due today.

Phineas, Alex, Jordan, Preston, and Andrei all need to redo #10 and #11 from the test as part of their homework due today.

Finally, in the interest of fairness, anyone else who wishes to redo #10 and #11 or #12 may submit a version for “enhanced consideration.”

 

W 3/16/011

HW due: Finish your “patching up” from the test if you have not already done so. Starting today, start bringing your Barron’s AP review book to class instead of your course textbook.

 

Th 3/17/011

HW due: Prepare for a 2-question quiz based on the Must-Pass Quiz. You will be given 2 questions, a starred one and a non-starred one, both selected at random. If you miss any portion of the starred question, your score is 0. If you get the starred one correct, your score will be somewhere between 5 and 10.

Handwritten notes will be permitted for today’s quiz. However, when you take the MPQ for real, in May, no notes are allowed.

In class: There are still 3 students who need to select their spring break book to read. After all reading assignments are settled, we will finish going over the AP formula sheet, which is found near the end of your Barron’s review book. Be sure to bring the Barron’s book with you!

Chick is on the tote board with 44 (0.5 pt.), 2 (1 pt.), and 53 (0.6 pt.), 1.1 mulligans remaining. If he fails, there will be no penalty.

 

F 3/18/011

No additional HW due. However, a general HW check is likely. If you take a cut today or are absent for any reason, you will need to call Mr. Hansen over spring break and read randomly requested problems. (Or, scan and fax.)

 

 

Spring break.

 

 



Last updated: 05 Apr 2011