AP Statistics / Mr. Hansen

Name: __________________________

Test #1, 9/30/1998

General Instructions: Raise your hand if you have a question. Write answers in the space provided. Use additional notebook paper or graph paper of your own if you need additional room.

Part I. Terminology.

Fill in the official name and standard notation for each of the following. The first one has been done for you as an example.

# | Description | Official name | Standard notation
1. | Total number of observations | sample size | n
2. | The median of the set from the maximum down to (but not including) the overall median | __________ | __________
3. | Q3 – Q1 | __________ | __________
4. | The sum of observations in a sample, divided by n | __________ | __________
5. | Take the deviations from the mean, square them, add them all up, and divide by n–1. Now take the square root of this result. What you have is a measure of the dispersion (i.e., variability) of your data. | __________ | __________
6. | A value from –1 to 1, depending on whether the observations trend upward, downward, or have no linear trend at all | __________ | __________

Part II. The False-True Challenge.

All of the numbered statements below are false. In each case, make a minor change (such as adding a word or two, crossing a few words out, or changing the wording) to make the statement true. However, you may not simply add or subtract the word "not" (since, obviously, that would be too easy). The first one has been done for you as an example. Important: Make changes that demonstrate your knowledge of the subject matter. Points will be deducted if you take the easy way out (for example, changing "useful" to "fairly useless" in #12).

7. Although human judgment is better, the "2.5 IQR" rule is useful in cases where automated determination of outliers is needed. [Example correction: cross out "2.5 IQR" and write "1.5 IQR" above it.]

8. Approximately 95% of the values in a distribution will lie within plus or minus 2 standard deviations of the mean.

9. In a distribution that is symmetric, the mean lies significantly to the left of the median.

10. The mean is a resistant measure of central tendency.

11. The five-number summary, standard deviation, and IQR are affected by changes in units involving only a shift (e.g., Celsius to Kelvin).

12. A boxplot is a useful diagram for showing the essential characteristics of a two-peaked (or "bimodal") distribution.

Part III. Problem Solving.

Use your knowledge of statistical methods to solve the following problems. Your graphing calculator will probably be a big help to you on most of these.

Judge Jeremy Jones (known as the Law of Speed Trap, South Carolina) is famous far and wide for the strictness of his speeding fines. The speed limit within the town of Speed Trap is 55 mph. Interstate 55 (get it?) passes through his jurisdiction and has been a reliable source of revenue for Judge Jeremy’s county for many years. Recently a researcher studied the speeds of people ticketed on the Interstate highway and the fine that Judge Jeremy assessed in each case. A representative subset of the data (shown below) might make a tempting regression study.

Fine assessed | Speed (according to radar)
$200 | 72 mph
$215 | 73 mph
$145 | 65 mph
$175 | 69 mph
$175 | 70 mph
$150 | 65 mph
$135 | 62 mph
$210 | 68 mph

13. If we are writing an article for a motorists’ club, we might want to be able to predict the likely fine that would result from various levels of speeding in Judge Jeremy’s area. What would be the explanatory variable? ______________________ What would be the response variable? _____________________

 

14. Code the speeds as speeds in excess of the speed limit, and create a histogram with divisions ("bins") at every multiple of 5 mph. Sketch your histogram here. What conclusions, if any, can you draw about the distribution of speeds from this histogram? Provide at least one sentence of explanation.
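
For checking a hand-drawn histogram afterward, the coding and binning in #14 might be sketched as follows. This is only an illustration: the variable names are made up, and the choice that each bin includes its left edge (so exactly 10 mph over lands in the 10-15 bin) is an assumption, since the problem does not say which way boundary values go.

```python
from collections import Counter

SPEED_LIMIT = 55  # mph, from the problem statement
BIN_WIDTH = 5     # mph, bins at every multiple of 5

# Radar speeds (mph) from the data table, in table order.
speeds = [72, 73, 65, 69, 70, 65, 62, 68]

# Code each speed as mph in excess of the speed limit.
coded = [s - SPEED_LIMIT for s in speeds]

# Assumption: a bin includes its left edge, e.g. 10 falls in [10, 15).
bin_counts = Counter((c // BIN_WIDTH) * BIN_WIDTH for c in coded)

# Print a rough text histogram, one row per bin.
for left in sorted(bin_counts):
    print(f"[{left}, {left + BIN_WIDTH}): {'*' * bin_counts[left]}")
```

A calculator that treats bin boundaries differently may place a boundary value in the neighboring bin, so compare conventions before comparing pictures.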

 

 

 

 

15. Compute the five-number summary for the coded speed data. No need to show work.
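
A minimal sketch of a five-number summary, assuming the quartile convention described in Part I (each quartile is the median of the half of the data on one side of the overall median, with the median itself left out when n is odd). The function name is hypothetical, and calculators or software that use a different quartile rule may give slightly different Q1 and Q3.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max); needs at least two values."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Skip the middle value when n is odd so it belongs to neither half.
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Coded speeds: radar speed minus the 55 mph limit.
coded = [s - 55 for s in [72, 73, 65, 69, 70, 65, 62, 68]]
print(five_number_summary(coded))
```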

 

 

16. Compute the standard deviation of the coded speed data. No need to show work.

 

 

17. What is the standard deviation of the original speed data (i.e., the values ranging from 62 to 73)? Important: For this problem, you must either show your work (ugh!) or provide a sentence of explanation.
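
The recipe in item 5 of Part I can be written out directly, which gives one way to check #16 and #17 after working them. The sketch below is illustrative only (the function name is hypothetical); it applies the same recipe to the coded and the original speeds so the two results can be compared.

```python
import math

def sample_sd(values):
    """Square the deviations from the mean, add them up,
    divide by n - 1, then take the square root."""
    n = len(values)
    mean = sum(values) / n
    sum_sq_dev = sum((v - mean) ** 2 for v in values)
    return math.sqrt(sum_sq_dev / (n - 1))

speeds = [72, 73, 65, 69, 70, 65, 62, 68]  # mph, from the data table
coded = [s - 55 for s in speeds]           # mph over the limit

sd_coded = sample_sd(coded)
sd_original = sample_sd(speeds)
print(sd_coded, sd_original)
```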

18. Create a SCATTERPLOT and a suitable regression line that we might be able to use as a way of predicting Judge Jeremy’s fines. Draw your scatterplot and regression line here.

 

 

 

 

 

 

 

 

19. What is the slope of your regression line from #18? ___________________________
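
For comparison with the calculator's output, a least-squares line for #18 and #19 can be computed from the usual sums of squares and cross-products. This is a sketch under the assumption that "suitable regression line" means the ordinary least-squares line of fine on speed; the function names are hypothetical.

```python
import math

speeds = [72, 73, 65, 69, 70, 65, 62, 68]         # mph (explanatory)
fines = [200, 215, 145, 175, 175, 150, 135, 210]  # dollars (response)

def least_squares(x, y):
    """Return (slope, intercept, r) for the least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

slope, intercept, r = least_squares(speeds, fines)

def predicted_fine(speed):
    """Fine (dollars) predicted by the fitted line at a given speed (mph)."""
    return intercept + slope * speed

print(slope, intercept, r)
```

The same `predicted_fine` helper can be evaluated at any speed, but the line is only supported by data between 62 and 73 mph.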

20. Is Judge Jeremy fairly predictable in his assessment of speeding fines? What can you say about the correlation between speed and fine? (Write your answer using AP-style wording.)

 

 

 

 

21. In our article for the motorists’ club, what would be a reasonable estimate of the fine Judge Jeremy would slap on a speeder who was clocked doing 70 mph? Show your work.

 

 

 

 

22. What would Judge Jeremy fine a speeder who was clocked at 57 mph? Is it reasonable to make such a prediction? Why or why not?

 

 

 

 

23. The speeder in the sample data who was going 68 mph was given a surprisingly hefty fine (surprising, that is, in terms of the regression line). What term do we apply to this case? ________________________

24. Remove the speeder who was going 68 mph from your data set and recompute the regression line. What is the new slope? _____________________

25. Is the case of the speeder who was going 68 mph an influential observation? ___________ Why or why not? _______________________________________

26. Sketch the residual plot here for the original data set. (In other words, make sure the case of the 68-mph speeder is included.) Does your residual plot suggest any systematic problem with the regression line? _____________ Explain your answer. ____________________________________________________
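
One way to check #24 through #26 after working them: compute the residuals (observed fine minus predicted fine) for the full data set, then refit the line with the 68-mph case left out. The sketch assumes an ordinary least-squares fit, and all names in it are hypothetical.

```python
speeds = [72, 73, 65, 69, 70, 65, 62, 68]         # mph
fines = [200, 215, 145, 175, 175, 150, 135, 210]  # dollars

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Residuals for the original data set (68-mph case included).
slope, intercept = fit_line(speeds, fines)
residuals = [f - (intercept + slope * s) for s, f in zip(speeds, fines)]

# Refit with the 68-mph case removed.
kept = [(s, f) for s, f in zip(speeds, fines) if s != 68]
slope_without, intercept_without = fit_line(
    [s for s, _ in kept], [f for _, f in kept]
)

for s, res in zip(speeds, residuals):
    print(f"{s} mph: residual {res:+.2f}")
print(slope, slope_without)
```

Plotting each residual against its speed gives the residual plot asked for in #26, and comparing the two slopes bears on #25.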

Questions 27-29 refer to the following scenario. For each question, you must document your assumptions, show a sketch, and write a conclusion. Suppose that the College Board has decided to phase out the SAT and replace it with a new test (the STA?) that has a normal distribution of scores, with a mean of 1500 and a standard deviation of 100.

27. On this new test, what percentage of the test takers will score 1600 or above?

 

 

 

 

 

28. Suppose that Joe Bulldog scores 1440 on the new test. What is this in terms of a percentile?

 

 

 

 

 

29. What fraction of the students taking the new test will receive scores between 1300 and 1550?
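
Answers to Questions 27-29 can be checked against the normal model with a few lines of Python. This is a sketch only, using the standard library's `statistics.NormalDist` (Python 3.8 or later) with the mean and standard deviation given in the scenario; the variable names are illustrative, and the required sketch and written conclusion still have to be done by hand.

```python
from statistics import NormalDist

# Normal model from the scenario: mean 1500, standard deviation 100.
scores = NormalDist(mu=1500, sigma=100)

# 27. Proportion scoring 1600 or above (area to the right of 1600).
p_1600_or_above = 1 - scores.cdf(1600)

# 28. Percentile of a score of 1440 (area to the left, as a percent).
percentile_1440 = 100 * scores.cdf(1440)

# 29. Proportion scoring between 1300 and 1550.
p_between = scores.cdf(1550) - scores.cdf(1300)

print(p_1600_or_above, percentile_1440, p_between)
```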

 

 

 

 

 

30. During his 15 years with the Yankees, Babe Ruth’s home run production numbers (in order by value, not by year) were 22, 25, 34, 35, 41, 41, 46, 46, 46, 47, 48, 54, 54, 59, and 60. Using the normal quantile values below, produce and analyze the normal quantile graph for this data set. [Note: Because of time limitations, we did this as a class exercise.]

-1.834
-1.282
-.9674
-.7279
-.5244
-.3407
-.1679
0
.1679
.3407
.5244
.7279
.9674
1.282
1.834
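
The listed normal quantile values appear to match standard normal quantiles at the positions (i - 0.5)/n for n = 15. That formula is an inference from the numbers, not something the test states. Under that assumption, the pairs for the normal quantile graph can be generated as in this sketch (variable names are illustrative):

```python
from statistics import NormalDist

# Babe Ruth's home run totals from #30, already sorted by value.
home_runs = [22, 25, 34, 35, 41, 41, 46, 46, 46, 47, 48, 54, 54, 59, 60]
n = len(home_runs)

# Assumption: the i-th smallest value is paired with the standard
# normal quantile at probability (i - 0.5) / n.
quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

# Each (quantile, value) pair is one point on the normal quantile graph.
for z, hr in zip(quantiles, sorted(home_runs)):
    print(f"{z:7.3f}  {hr}")
```

Points that fall close to a straight line suggest the data are roughly normal; systematic bending suggests skewness or unusual tails.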