Monthly Schedule

(STAtistics, Period D)

W 9/5/07

First day of class. What is a statistic? What is “statistics” (the subject)? What is a parameter?

 

Th 9/6/07

HW due: Send me an e-mail message.

In class: anecdotal data, bias, Type I error, Type II error, tree diagram analysis of lie detectors.

 

F 9/7/07

HW due: Prepare a list of the four main topic areas for AP statistics. This assignment should not take long, since we already stumbled upon two of them in class, namely exploratory data analysis (i.e., the gathering and analyzing of data when there is no particular hypothesis being advanced) and probability. Please note, the “HW guidelines” link describes the required format for all written assignments.

Regarding the textbook situation . . . please do not make any markings at all in the textbook that you were sold earlier this week. Keep your textbook in a safe place where it will not get beaten up or damaged in any way. New books are on order and will be issued in about a week. In the meantime, we will discuss various statistical topics from supplementary sources.

Additional HW due: Check your e-mail. If you did not receive a thank-you message from me, then you still need to send me an e-mail. At this writing, I am missing Alek, Willie, Lawton, and John Z.

 

M 9/10/07

HW due:

1. Carefully read this PowerPoint briefing and be prepared to discuss all of its points. The subject is cause and effect. On the slides that have vertical bar graphs, green bars indicate the correct answers, and percentages indicate the responses by the audience. Two abbreviations used in the briefing are m.o.e. (margin of error) and COI (conflict of interest), which I explained to the audience when presenting this talk in June. I did not type the full terms on the slides, not because abbreviations are easier to read, but because abbreviations tap into the audience’s need to fill in missing pieces (Gestalt theory) and, if used in moderation, can help keep people from falling asleep.

2. What, in your opinion, is causing the recent sharp increase in the incidence of overweight and obesity in America? (Approximately two thirds of American adults are overweight nowadays, and the rates of obesity among children and adolescents are zooming upward. The rates have climbed dramatically over the past 20 years.) Give reasons to justify your answer.

3. Do you think that carbon dioxide (CO2) buildup in the atmosphere causes global warming? Why or why not?

4. Explain briefly why it is not possible to prove (or, for that matter, disprove) your answers to #2 and #3. Two to three sentences should be sufficient.

 

T 9/11/07

HW due: Design an experiment to address a cause-and-effect question that is important to you personally.

1. Phrase your topic in the form of a research question. For example:

Does Red Bull improve test performance?
How much of a slowdown in task completion is caused by multitasking?
How much of a cognitive impairment to the driving task does cell phone usage cause?
How much does an increase in MSG improve the perceived taste of cafeteria food?


2. Describe how you would divide your test subjects into control and experimental groups.

3. Describe how you would administer treatments.

4. Is double blinding possible in your experiment? Why or why not?

 

W 9/12/07

HW due: Only the recurring weekly assignment (see top of schedule). If you would like a bonus point, please bring in a clipping from a recent article, less than one week old, that prominently features a discussion of a statistical controversy. The article on SAT scores in Monday’s Post would not be a particularly good example, since it dealt mainly with what the statistics are on minority achievement “gaps” in SAT scores, not with a controversy over whether the scores are being measured correctly or whether the metric itself is valid.

Here is a link to yesterday’s “Quick Study” article. Note that there are 3 parts: cholesterol medications, eating disorders, and breast cancer.

In class: Random selection protocols.

 

Th 9/13/07

HW due:

1. Bring in a recent newspaper or magazine article (not Web-based) that provides an example of either “gee-whiz” graphs or manipulation of statistics. If you cannot find one, then bring in an article that uses statistics in a responsible manner, and describe in a short paragraph how the article could easily have been distorted through “gee-whiz” graphs or other means.

2. Write out the steps (and number them, please) for a methodology that would provide an unbiased selection of project group leaders and group members from our class. There should be 3 groups of 3 students and a group of 4 students. You may use Wednesday’s class discussion as a guide.

 

F 9/14/07

HW due: Prepare a one-paragraph proposal and a draft timeline for your exploratory data analysis project (keeping in mind that the dates are all different this year). If your group leader is absent today and has failed to appoint a deputy to hand in the assignment, then everyone in the group gets a zero.

 

M 9/17/07

HW due: Work toward tomorrow’s deadline. If you finish over the weekend, you will have no additional HW for Tuesday. As you do your reading, have your notebook (for making reading notes) and your calculator by your side.

 

T 9/18/07

HW due: Read pp. 1-48; write #1.38, 39, 41, 43, 44, and the following fill-in-the-blank exercise:

Mean and median are measures of __________ __________ , while s.d. and __________ are measures of __________ . Of these four, the resistant measures are __________ and __________ .

Note: Reading notes are required, as always. I will usually not provide a reminder of the reading notes requirement.

 

W 9/19/07

HW due: #1.53. In part (a), make both a histogram and a modified box plot. Also, read the “Quick Study” column in the Tuesday (9/18) Washington Post Health section. Handwritten notes are encouraged. There will be an open-notes quiz today and on many other Wednesdays. (This is a recurring assignment.)

Addendum: For some reason, there was no “Quick Study” this week. Therefore, the quiz is canceled. However, unless announced otherwise, there will be a “Quick Study” quiz next Wednesday.

 

Th 9/20/07

HW due: Start reading pp. 66-90, including the examples; write p. 81 #2.15, 2.16, 2.17. You can do all of these even if you have not finished the reading assignment. For #2.15, use the keystrokes given in #2.14.

 

F 9/21/07

Test (100 points) covering textbook to middle of p. 79 and all material discussed in class. For practice, you may wish to do problems 1-20 from the fall 2000 test, although the coverage of material represented on that test is incomplete. The following topics are among the many topics that were discussed in class and hence are “fair game” for the test:

Mathematics is the science of abstraction (i.e., patterns). The practice of statistics is a branch of applied mathematics and hence is not strictly part of math.

Terminology (partial list): statistic, statistics, parameter, protocol, anecdotal data, bias, response bias, voluntary response bias, nonresponse bias, overcoverage bias, undercoverage bias, COI, hidden bias, researcher or experimenter bias, placebo effect, wording of the question, blinding, double blinding, m.o.e., cause and effect, “post hoc ergo propter hoc,” Type I error, Type II error, PPV (positive predictive value), sensitivity, specificity, conditional probability, sample mean, sample standard deviation, population mean, population standard deviation, population variance, quartiles, range, interquartile range, median, outlier, 5-number summary, percentile, data distribution, symmetry, left skewness, right skewness, gaps, clusters.

Computation and/or graphing calculator techniques: finding a z score from the formula that is not on your formula sheet, namely z = (x − μ)/σ; displaying and interpreting a normal quantile plot; displaying and interpreting stemplots, boxplots, modified boxplots, histograms, and tree diagrams; using the table at the beginning of the book to find the percentile that corresponds to a z score, or equally well, the z score that corresponds to a percentile; computing and interpreting sample mean, sample s.d., and 5-number summary for any data set; applying the 1.5 IQR rule of thumb; using a tree diagram to compute PPV.
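For anyone who would like to check calculator or table results on a computer, here is a minimal Python sketch of three of the techniques listed above: the z score formula, the conversion between z scores and percentiles, and the 1.5 IQR rule of thumb. The function names and the sample data are hypothetical, chosen only for illustration. The quartiles are computed as medians of the lower and upper halves of the sorted data (excluding the overall median when the count is odd), which is an assumption about the convention in use; other software may compute quartiles differently.

```python
from statistics import NormalDist, median


def z_score(x, mu, sigma):
    """Standardize x: the number of standard deviations x lies from the mean."""
    return (x - mu) / sigma


def percentile_from_z(z):
    """Proportion of a standard normal distribution that lies below z."""
    return NormalDist().cdf(z)


def z_from_percentile(p):
    """The z score whose left-tail area is p, where 0 < p < 1."""
    return NormalDist().inv_cdf(p)


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max); needs at least two values.

    Assumed convention: Q1 and Q3 are the medians of the lower and
    upper halves of the sorted data.
    """
    xs = sorted(data)
    half = len(xs) // 2
    lower = xs[:half]    # values below the median position
    upper = xs[-half:]   # values above the median position
    return xs[0], median(lower), median(xs), median(upper), xs[-1]


def outliers_by_iqr_rule(data):
    """Values lying more than 1.5 * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(data)
    fence = 1.5 * (q3 - q1)
    return [x for x in data if x < q1 - fence or x > q3 + fence]


# Hypothetical examples (made-up numbers, not from the textbook).
print(z_score(130, 100, 15))                       # 2.0
print(round(percentile_from_z(2.0), 4))            # area to the left of z = 2
print(round(z_from_percentile(0.975), 2))          # z with 97.5% of area below it
print(five_number_summary([2, 4, 5, 6, 7, 8, 30]))
print(outliers_by_iqr_rule([2, 4, 5, 6, 7, 8, 30]))
```

The percentile functions play the same role as the table at the beginning of the book: one direction turns a z score into a left-tail proportion, and the other reverses the lookup.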

Symbols that you must be able to apply and interpret:












The most challenging topic listed above is probably PPV. Although we did two complete examples in class, here is a third to give you some additional practice:

Problem: AIDS affects 0.5% of the residents of Chiclawgo, a large Midwestern city. Compute the PPV of a screening test that has 99% sensitivity and 98% specificity.

Solution: This question asks for the positive predictive value, i.e., the probability of AIDS given that a person has a positive reading. This probability is denoted P(AIDS | positive test), where the vertical bar should be read as the word “given.”

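We are given that P(positive reading | AIDS) = 0.99 and P(negative reading | ~AIDS) = 0.98. Always make the first split in the tree based on disease status. For a hypothetical population of 1,000,000 residents, the first split gives 5000 people with AIDS and 995,000 without. The second split, based on the test reading, gives 4950 true positives and 50 false negatives among the infected, and 19,900 false positives and 975,100 true negatives among the non-infected.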



Now, we compute the conditional probability of interest. Of all the positive readings (24,850 people), most of them (19,900) are false positives. Only 4950 are true positives. PPV = P(true positive | positive) = 4950/24,850 = 0.199. Even though the screening test is highly accurate, a positive reading is not necessarily a cause for alarm. Fewer than 20% of the people who receive positive readings in the screening test are true AIDS victims.

Type I error occurs when a non-infected person receives a positive reading. This is a false positive. The unconditional probability of a false positive is 19,900/1,000,000 = 0.0199.

Type II error occurs when an infected person receives a negative reading. This is a false negative. The unconditional probability of a false negative is 50/1,000,000 = 0.00005. Although this is very small, the results of a false negative in a blood supply would be severe. That is why blood donation services rely on more than simple screening tests; they use a battery of invasive personal questions to make sure that the population of donors is not a random sample of people, but rather a sample that has overcoverage from the pool of people who are at extremely low risk for AIDS.

Some additional results we can glean from the tree diagram are the following unconditional probabilities:

P(negative reading) = (50 + 975,100)/1,000,000 = 0.97515
P(positive reading) = (4950 + 19,900)/1,000,000 = 0.02485
P(positive reading ∩ AIDS) = 0.00495
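
The tree diagram arithmetic in this example can also be checked with a short Python sketch. The function name `screening_tree` and the dictionary keys are hypothetical, and the population of 1,000,000 is simply the convenient round number used in the worked solution; any population size gives the same probabilities.

```python
def screening_tree(prevalence, sensitivity, specificity, population=1_000_000):
    """Fill in the four leaves of a screening-test tree diagram.

    The first split is on disease status, and the second split is on
    the test reading, as in the worked example.
    """
    diseased = population * prevalence
    healthy = population - diseased

    true_pos = diseased * sensitivity      # infected, positive reading
    false_neg = diseased - true_pos        # infected, negative reading (Type II error)
    true_neg = healthy * specificity       # not infected, negative reading
    false_pos = healthy - true_neg         # not infected, positive reading (Type I error)

    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "true_neg": true_neg,
        "false_pos": false_pos,
        # PPV = P(disease | positive reading)
        "ppv": true_pos / (true_pos + false_pos),
        # Unconditional probabilities read off the tree
        "p_positive": (true_pos + false_pos) / population,
        "p_negative": (false_neg + true_neg) / population,
        "p_false_pos": false_pos / population,
        "p_false_neg": false_neg / population,
    }


# The Chiclawgo example: 0.5% prevalence, 99% sensitivity, 98% specificity.
tree = screening_tree(prevalence=0.005, sensitivity=0.99, specificity=0.98)
print(f"PPV = {tree['ppv']:.3f}")                    # about 0.199
print(f"P(positive reading) = {tree['p_positive']:.5f}")
print(f"P(negative reading) = {tree['p_negative']:.5f}")
```

Changing the prevalence argument shows how strongly PPV depends on how common the disease is, even when sensitivity and specificity stay fixed.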

Finally, you need to know the alphabet. The letters I am most picky about are s, i, z, l, y, and the Greek letters mu and sigma. The single most important one for statistics class is z, which must always be crossed. No exceptions!

 

M 9/24/07

No additional HW due. Please enjoy your weekend. The assignment that was originally due today has been postponed one day.

 

T 9/25/07

HW due: Finish reading pp. 66-90; write p. 84 #2.20.

 

W 9/26/07

Quiz on the Washington Post “Quick Study” resumes this week. There are three interesting studies this time. Handwritten notes are allowed during the quiz.

HW due:


1. Read pp. 90-97.

2. Perform the work in Example 2.10 and leave your calculator set up as described there. I plan to go around the room, checking to see if everyone has the screen displays shown in the textbook on pp. 94-95.

3. Write #2.26b. Be sure to transcribe the normal probability plot (a.k.a. NQP, for normal quantile plot) onto your paper. Put your values in L2 so that they do not overwrite the values you stored in L1 for Example 2.10.

 

Th 9/27/07

HW due: Revised project proposal and revised draft timeline. This does not need to be in a final form yet, but it should be fairly close. If your group leader is absent today, he must appoint a deputy to deliver the proposal and timeline.

The quiz that was postponed yesterday will be held today.

In class: Discussion of control group, block design, matched pairs design, lurking variables, push polling.

 

F 9/28/07

HW due:

1. Read this article on push polling.
2. Some people have proposed outlawing push polling. How might “push polling” be defined from a law enforcement point of view?

 

 


Return to the STAtistics Zone

Return to Mr. Hansen’s home page

Return to Mathematics Department home page

Return to St. Albans home page

Last updated: 11 Oct 2007