Monthly Schedule

(AP Statistics, Period B)

M 10/3/05

HW due: Make sure that last Friday’s HW is complete. If not, then devote another 35 minutes. If it is complete, then remember that you have your group project to work on.

Warning: Anyone who does not have lists SALES and ADCST displayed in STAT EDIT mode (due Thursday 9/29) will not earn full credit on today’s HW check. You may need to recreate those lists if they were destroyed by another operation. Even if they were not destroyed, you may need to ask a classmate to show you how to add them to the STAT EDIT display so that they are shown as vertical lists. (I tried to show as many people as possible on Friday, but a few escaped before I could finish. The secret is the 2nd LIST menu.)

 

T 10/4/05

HW due: Read pp. 78-80. Also, for the several people who still did not have lists SALES and ADCST displayed in STAT EDIT mode yesterday, this will be checked yet again today (an easy 4 points for those who earned the points yesterday).

 

W 10/5/05

HW due: First, send me an e-mail with subject line beginning with double underscore (__) so that I can recognize the message as being non-spam. State your name, and send the message from the e-mail address that you check most frequently. IMPORTANT: STATE YOUR NAME. NO CREDIT WILL BE AWARDED FOR MESSAGES THAT ARRIVE WITH NO NAME. I will respond with data entry instructions. As you perform the data entry, you will need to make your best judgment concerning how the student would have entered the data if a computer had been provided.

Update as of 5:00 p.m.: I have received a grand total of two (2) e-mail addresses out of the 18 that I was expecting. So . . . below the dashed line are the instructions from Mr. Baad. Note: Sending me the e-mail message is still required as part of your HW assignment. When you have finished, send me a second e-mail with the serial numbers (found on top of first page) for the survey forms that you entered.

[Forwarded message from Mr. Baad follows.]
====================================
http://www.zoomerang.com/survey.zgi?p=WEB224LLTT6G44

      Here is the link. I’d like the boys to type in the text from the written answers as well as just clicking on the bubbles. There are about 15 surveys on which boys wrote comments on the back of the last page. If a boy gets one of those, make sure he types the comments in the box that is provided for question #30. Thanks again for your help.

 

Th 10/6/05

HW due: Work as many problems as you can on pp. 81-89. Note the following instructions.

1. Each time you get a problem correct, show your work. If there is no work to show, then write a suggested modification to the problem so that it would be more challenging.

2. Each time you get a problem incorrect, write a sentence explaining why you made the mistake you did and what (if anything) you have learned as a result.

3. All written work must be shown on a standard homework paper, not in the margin of your book.

 

F 10/7/05

No school.

 

M 10/10/05

No school.

 

T 10/11/05

No additional HW due. You have your group project to work on, and you should be sure you are fully caught up on existing HW. Also, although we were hobbled last Thursday by the unexpected absence of 39% (!) of the class, some worthwhile things were discussed, such as the Rule of 72. You are responsible, as always, for everything covered.

 

W 10/12/05

HW due: Group Project #1. Extensions are possible by special permission. Please note, last-minute extensions are generally not granted. (Reason: This is as good as an admission that you did not start working seriously until the last minute. If you had started earlier, you would have realized earlier that you needed to request an extension.)

 

Th 10/13/05

HW due: Send me an e-mail describing exactly what you did, or thought you did, during last week’s survey data entry. For example, where did you click on the “Submit” button? If you intentionally did not actually enter the forms whose serial numbers you gave me, state that. (Obviously, accidental omissions mean that you need to re-enter the data, but that is not an honor code violation.)

IMPORTANT: Please also comment on whether you think the following proposed combination of penalties for the students who confess to “pulling a Newman” would be too harsh, too lenient, or about right:

 

  • Zero on the homework
  • Zero on yesterday’s quiz (since, clearly, the answers were fanciful)
  • Both zeros undroppable
  • Additional 10-point penalty for the inconvenience and waste of time caused for Mr. Hansen and Mr. Baad
  • Must re-do data entry correctly for no points
  • Overall effect will be approximately half a letter grade lower on the quarter grade
  • No visit to the Honor Council

 

F 10/14/05

Big Quiz (60 pts.) on correlation, regression, residual plots, and curve fitting. This quiz will be quite similar to the Oct. 2000 Big Quiz. An answer key is available, but if you check the answer key before first attempting the problems, I will reach out through cyberspace and smite you with a reprimand.

You may also find the LSRL handout to be informative. It is a likely treasure trove of quiz questions.

 

M 10/17/05

HW due: Write #3 on p. 101 and the programmed learning exercise. A solution is provided to the p. 101 problem, but you should answer the question on your own, comparing only at the end.

 

T 10/18/05

Form VI retreat (no class).

 

W 10/19/05

HW due: Consider the data set below and answer the questions that follow.

{(1, 5), (2, 6.1), (3, 8), (4.2, 9.5), (7, 12), (8, 13), (8.2, 13.5), (9, 16), (10, 19), (12, 22)}

1. Prove that the LSRL fit is inappropriate despite the high r value. What value for r did you get?

2. One possibility is that X and Y are related by an exponential function. If this is so, what function do you need to apply to Y so that a transformation to achieve linearity can occur?

3. Apply the transformation (i.e., inverse function) to Y and perform a LSRL fit between X and the transformed Y. Write your results: log y ≈ ________________ + ________________ x

4. Use precal techniques to solve the equation in #3 for y.

5. Perform STAT CALC 0 on your calculator to prove that your calculator can do the same thing as you just did, only faster.

6. In #5, what do the r and r² values represent? ___________________________________

7. Another possibility, besides exponential, is that X and Y are related by a power function, i.e., a function of the form y ≈ ax^b where a and b are unknown constants. If this is true, then what do we get after taking the logarithm of both sides? (Simplify the equation, using precal techniques.)

8. If #7 is true, is it obvious to you that log y must be a linear function of log x? ________

9. Perform a LSRL fit with log x as explanatory variable and log y as response.

10. Summarize what you know: log y ≈ ________________ + ________________ log x, where ________________ plays the role of log a and ________________ plays the role of b.

11. Use precal techniques on the results of #10 to obtain a power-function mathematical model for y, i.e., an equation in the form yhat = ax^b.

12. Perform STAT CALC A on your calculator to prove that your calculator can do the same thing as you just did, only faster.

13. Which is the best model for the given data set: linear, exponential, or power? Why?
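For anyone who wants to check the calculator work on a computer, the three fits in this assignment can be sketched in Python as follows. This is only a rough sketch: the helper name lsrl and all variable names are made up for illustration, and base-10 logs are assumed (natural logs would give the same models after un-transforming). It prints r for each fit but does not answer #13 for you.

```python
import math

# Data set from the assignment above.
data = [(1, 5), (2, 6.1), (3, 8), (4.2, 9.5), (7, 12), (8, 13),
        (8.2, 13.5), (9, 16), (10, 19), (12, 22)]
xs = [x for x, _ in data]
ys = [y for _, y in data]


def lsrl(u, v):
    """Hypothetical helper: return (intercept, slope, r) for the LSRL of v on u."""
    n = len(u)
    mu_u, mu_v = sum(u) / n, sum(v) / n
    suu = sum((a - mu_u) ** 2 for a in u)
    svv = sum((b - mu_v) ** 2 for b in v)
    suv = sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v))
    slope = suv / suu
    return mu_v - slope * mu_u, slope, suv / math.sqrt(suu * svv)


log_x = [math.log10(x) for x in xs]
log_y = [math.log10(y) for y in ys]

lin_a, lin_b, lin_r = lsrl(xs, ys)        # y      vs. x
exp_a, exp_b, exp_r = lsrl(xs, log_y)     # log y  vs. x      (exponential)
pow_a, pow_b, pow_r = lsrl(log_x, log_y)  # log y  vs. log x  (power)

# Undo the logarithms to get models for y itself.
exp_model = (10 ** exp_a, 10 ** exp_b)    # yhat = A * B**x
pow_model = (10 ** pow_a, pow_b)          # yhat = a * x**b

print(f"linear:      yhat = {lin_a:.4f} + {lin_b:.4f} x, r = {lin_r:.4f}")
print(f"exponential: yhat = {exp_model[0]:.4f} * {exp_model[1]:.4f}^x, r = {exp_r:.4f}")
print(f"power:       yhat = {pow_model[0]:.4f} * x^{pow_model[1]:.4f}, r = {pow_r:.4f}")
```

Note that the r values for the exponential and power fits describe the linearized (transformed) data, not the original curve. A high r alone does not settle which model is best; residual plots are still needed.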

 

Th 10/20/05

HW due: Make sure yesterday’s assignment is ready to hand in, and add a paragraph (several sentences) of comment regarding our upcoming project to survey the dietary habits of STA students.

 

F 10/21/05

HW due: Read pp. 103-109.

 

M 10/24/05

HW due:

1. Explain Simpson’s paradox in your own words (approximately one paragraph). If possible, try to teach the paradox to a sibling who is old enough to understand percentages, or to a parent.

____________________________________________________________________

____________________________________________________________________

____________________________________________________________________

2. Then write a second paragraph explaining why the baseball statistics (see table below) illustrate Simpson’s paradox.

____________________________________________________________________

____________________________________________________________________

____________________________________________________________________

3. What is the lurking variable for the batters named Ahnold and Bloke? ____________________

4. Are the numbers in the BA columns correct? ______

5. Every pitcher is either right-handed or left-handed. Since Ahnold is a better batter than Bloke when facing right-handed pitchers, and a better batter than Bloke when facing left-handed pitchers, it seems logical that Ahnold must be a better batter than Bloke. How can it be that Bloke is better overall?

____________________________________________________________________

____________________________________________________________________

6. Consider two new batters named Cad and Dustbin. Let CR and CL denote the batting averages for Cad when facing right-handed and left-handed pitchers, respectively. Let DR and DL denote the batting averages for Dustbin when facing right-handed and left-handed pitchers, respectively. Invent four 3-digit decimal values for CR, CL, DR, and DL, none of them ending in zero, such that CR > DR and CL > DL, but in such a way that no amount of fiddling with the lurking variable you identified in #3 could ever cause Dustbin’s overall batting average to be higher than Cad’s. In other words, invent four values that make the data “paradox-proof” against Simpson’s paradox. No work is needed. List your four values here:

CR = __________     CL = __________     DR = __________     DL = __________

 

 

 

                      Ahnold                     Bloke
                  H      AB     BA           H      AB     BA
RH pitchers      112     400   .280         162     600   .270
LH pitchers      150     600   .250          36     150   .240
Total            262    1000   .262         198     750   .264

 

 

Note for those who did not grow up reading baseball statistics: H = hits, AB = at-bats (i.e., number of trips to the plate, not counting walks or getting on base by being beaned), and BA = batting average. BA is defined as H/AB, expressed as a 3-place decimal.
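If you would like to double-check the arithmetic in the table by machine, here is a small Python sketch that recomputes each BA as H/AB. The dictionary layout and the function names are invented for this illustration; only the H and AB counts come from the table.

```python
from fractions import Fraction

# (hits, at-bats) copied from the table above.
stats = {
    "Ahnold": {"RH": (112, 400), "LH": (150, 600)},
    "Bloke":  {"RH": (162, 600), "LH": (36, 150)},
}


def batting_average(hits, at_bats):
    """BA = H / AB, kept as an exact fraction until printing."""
    return Fraction(hits, at_bats)


def overall(player):
    """Pool a player's hits and at-bats across both kinds of pitchers."""
    hits = sum(h for h, _ in stats[player].values())
    at_bats = sum(ab for _, ab in stats[player].values())
    return batting_average(hits, at_bats)


for side in ("RH", "LH"):
    a = batting_average(*stats["Ahnold"][side])
    b = batting_average(*stats["Bloke"][side])
    print(f"{side} pitchers: Ahnold {float(a):.3f}, Bloke {float(b):.3f}")

print(f"Total:       Ahnold {float(overall('Ahnold')):.3f}, "
      f"Bloke {float(overall('Bloke')):.3f}")
```

The overall average is computed from pooled hits and pooled at-bats, not by averaging the two row averages. Keeping that distinction in mind should help with your written explanations.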

 

T 10/25/05

HW due: Read pp. 121-122, and work all the way through the example in tomorrow’s calendar entry.

In class: Review day. There will be a Quick Quiz (graded and returned immediately) on bivariate data to see if you are on your toes. Two-way tables, Simpson’s paradox, and methods of data collection may be included.

 

W 10/26/05

No additional HW due today. This is your chance to get caught up on old HW. There may be an audit of past assignments today.

In class: Review.

 

Studying for the test (evening of 10/26)

The Big Quiz will be available for pickup Wednesday afternoon in the Math Lab beginning at about 3:00 p.m. For those who cannot make it and would like to have a good practice review, here are a blank copy and an answer key that you can use to challenge yourself.

Note: You should be able to complete the entire Big Quiz in 10-15 minutes (about 20 minutes for extended time). You might want to take it a second time to see if you can improve your speed and proficiency.

 

Th 10/27/05

Test on Bivariate Data, Including Two-Way Tables, Simpson’s Paradox, and Methods of Data Collection. For multiple-choice questions involving simple transformations to achieve linearity, you are allowed simply to punch buttons on your calculator (e.g., ExpReg or PwrReg). However, for free-response questions, you may be asked to show the derivation of the mathematical model. Here is a completely worked example:

Consider the following data set:

{(375, .1), (425, .21), (480, .4), (510, .51), (560, .7), (630, .88), (700, .97), (770, .995)}

Make a scatterplot. The relationship is the S-shaped curve we learned about in class. The official name is ogive (pronounced O-JIVE), and in fact you may already recognize this as being the cumulative normal distribution for a typical SAT Math test. [The calculator notation for this is normalcdf, but you are not allowed to write calculator notation except on scratch paper. If you accidentally write normalcdf somewhere on your test paper, make an “X” through it so that it will be ignored during grading.]

Next step: Transform Y by applying the inverse of the supposed relationship. The inverse of the cumulative normal distribution is called the inverse normal. Therefore, let f represent the cumulative standard normal distribution (μ = 0, σ = 1), and let f⁻¹ represent the inverse normal. [Note to student: You cannot write invNorm on your paper, since that is considered to be “calculator notation.” Points are deducted on the AP if you write words like normalcdf or invNorm.]

Let y = g(x) represent the “true” relationship between X and Y. Since we think g is a cumulative normal distribution for some unknown values of μ and σ, we apply f⁻¹(Y) to produce a new list that should, hopefully, be linearly related to X.

Perform a LSRL fit between X and f⁻¹(Y). [Do it!] The LSRL model has extremely strong linear correlation (r = .99996), allowing us to say the following:

f⁻¹(y) ≈ –4.943. . . + .0097. . .x

The only remaining question is what we should use for μ and σ as we develop a model that relates the original X and Y. Plot the LSRL and note where the LSRL crosses the lines y = –1, y = 0, and y = 1. Because these indicate z scores of –1, 0, and 1, respectively [note to student: that’s the whole point of invNorm, to return a z score], we can use our calculator [2nd CALC INTERSECT] to find the crossing points, from which we deduce

μ = 507.13. . .
σ = 102.796. . .

Therefore, our model is

yhat = cumulative normal distribution with μ = 507.130, σ = 102.796.

[It would certainly be more efficient to write yhat = normalcdf(–99999,x,507.130,102.796), but alas, that is not allowed.]

Now, make a residual plot. [Calculator steps: Y= key, Y1=normalcdf(–99999,x,507.130,102.796) 2nd QUIT Y1(L1)→L4 ENTER to create an L4 column that contains yhat values for each x value. Then, press L2–L4→L5 ENTER to store all the values of y – yhat, i.e., the residuals, into L5. Finally, set up a scatterplot of L1 versus L5 and press ZOOM 9.] The residual plot is suitably random, telling us that our cumulative normal distribution with the μ and σ values that we found would be an appropriate fit.
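The worked example above can also be reproduced off the calculator. The following Python sketch assumes Python 3.8 or later (for statistics.NormalDist); the helper lsrl and the variable names are hypothetical. Instead of reading the crossing points off a graph, it solves for them algebraically: if f⁻¹(y) ≈ a + bx, then the line crosses z = 0 at x = –a/b, and each additional z unit spans 1/b units of x. The results should land close to, though not necessarily digit-for-digit on, the values quoted above.

```python
import math
from statistics import NormalDist

# Data set from the worked example above.
data = [(375, .1), (425, .21), (480, .4), (510, .51),
        (560, .7), (630, .88), (700, .97), (770, .995)]
xs = [x for x, _ in data]
ys = [y for _, y in data]


def lsrl(u, v):
    """Hypothetical helper: return (intercept, slope, r) for the LSRL of v on u."""
    n = len(u)
    mu_u, mu_v = sum(u) / n, sum(v) / n
    suu = sum((a - mu_u) ** 2 for a in u)
    svv = sum((b - mu_v) ** 2 for b in v)
    suv = sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v))
    slope = suv / suu
    return mu_v - slope * mu_u, slope, suv / math.sqrt(suu * svv)


# Transform Y with the inverse of the standard normal cdf (z scores).
std_normal = NormalDist(0, 1)
zs = [std_normal.inv_cdf(y) for y in ys]

a, b, r = lsrl(xs, zs)

mu = -a / b     # x value where the fitted line crosses z = 0
sigma = 1 / b   # change in x that corresponds to one z unit

# Model for the original data, and its residuals y - yhat.
model = NormalDist(mu, sigma)
residuals = [y - model.cdf(x) for x, y in data]

print(f"z = {a:.4f} + {b:.6f} x, r = {r:.5f}")
print(f"mu = {mu:.2f}, sigma = {sigma:.2f}")
for x, e in zip(xs, residuals):
    print(f"x = {x}: residual = {e:+.4f}")
```

On a test, of course, you would still need to write the model in words, not in Python or calculator notation.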

 

F 10/28/05

Make-up test, 7:00 a.m., Room R.

In class: No additional HW due, but some old HW may be checked or re-checked. Of course, a quiz on last Sunday’s “Unconventional Wiz” column in The Washington Post is always a possibility. (Search www.washingtonpost.com for Richard Morin if you forgot to read the column last week.)

The same writer, Richard Morin, also wrote a much longer article that ran on the same day (Sunday, Oct. 23) in The Washington Post Magazine. The subject was an opinion poll of teenagers in the D.C. area as compared with a sample of teenagers nationwide. This article may be worth discussing next week, but it will not be quizzed today.

 

M 10/31/05

HW due: Consider the following data set.

{(50, 9490), (51, 10400), (52, 11400), (53, 12500), (54, 13700), (55, 15000),
(56, 16400), (57, 17900), (58, 19500)}

1. State the linear model that best fits these points.
2. State the r value.
3. Sketch the scatterplot (with line overlaid) and the residual plot. What can you conclude?
4. Use the methods we learned to develop a fit from X to Y as a 7th power of a linear function.
5. Sketch a new scatterplot (with this new curved fit overlaid) and a new manually developed residual plot. What can you conclude?
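For checking your work afterward (not instead of doing it), here is a Python sketch of one way to carry out the transformation in #4: take the 7th root of each y value, fit a LSRL to the transformed data, and then raise the fitted line to the 7th power. The helper lsrl and the variable names are invented for this sketch, and this is only one reading of “the methods we learned.”

```python
import math

# Data set from the assignment above.
data = [(50, 9490), (51, 10400), (52, 11400), (53, 12500), (54, 13700),
        (55, 15000), (56, 16400), (57, 17900), (58, 19500)]
xs = [x for x, _ in data]
ys = [y for _, y in data]


def lsrl(u, v):
    """Hypothetical helper: return (intercept, slope, r) for the LSRL of v on u."""
    n = len(u)
    mu_u, mu_v = sum(u) / n, sum(v) / n
    suu = sum((a - mu_u) ** 2 for a in u)
    svv = sum((b - mu_v) ** 2 for b in v)
    suv = sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v))
    slope = suv / suu
    return mu_v - slope * mu_u, slope, suv / math.sqrt(suu * svv)


# Straight-line fit to the raw data (questions 1 and 2).
lin_a, lin_b, lin_r = lsrl(xs, ys)

# Apply the inverse of "7th power", i.e. the 7th root, to Y and refit.
roots = [y ** (1 / 7) for y in ys]
a, b, r_root = lsrl(xs, roots)

# Undo the transformation: yhat = (a + b*x)**7, then residuals y - yhat.
fitted = [(a + b * x) ** 7 for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]

print(f"linear: yhat = {lin_a:.1f} + {lin_b:.1f} x, r = {lin_r:.5f}")
print(f"curved: yhat = ({a:.4f} + {b:.4f} x)^7, r (transformed) = {r_root:.5f}")
for x, e in zip(xs, residuals):
    print(f"x = {x}: residual = {e:+.1f}")
```

The residuals printed at the end are measured in the original y units, so they can be plotted against x exactly as in #5.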

 

 



Last updated: 08 Nov 2005