STAtistics / Mr. Hansen

Name: _______________________________

9/21/2011

READ INSTRUCTIONS IN EACH PART! ______

 

Test #1 (100 points): Chapters 1-2 plus Class Discussions

General Instructions:

•  Calculator use is permitted throughout today’s test.

•  If you have spare batteries, raise your hand for a small bonus.

•  All final numeric answers should be correct to at least 3 decimal places. Do not round until the end.

 

 

Part I: Fill-Ins and Notation (2 pts. per blank, 30 pts. in all).
Write the name, word, or phrase that best fits. The blanks suggest the length of expected answers.

 

 

1.

Our course, statistics, involves the study and analysis of data. A statistic is a __________ computed from __________ . Statistics are used for the purpose of estimating _____________ , and the latter word (in the singular) means a number that ___________________________________________ . Statistics is not the same as mathematics but can be thought of as a branch of _______________ mathematics. Throughout the entire course, we will allow the symbol ≈ (approximately equals) to be written as ____________ without any point deduction, quibbling, or moaning.

 

 

 

2.

A batting average for a baseball player’s season can be thought of as a ____________ mean (relative to his or her entire career). The notation for this type of mean is __________ .

 

 

 

3.

The mean height of women in America is approximately 65 inches, with a standard deviation of 2.5 inches. The true mean height of women is a __________________ (write “parameter” or “statistic”) and therefore cannot ever be known by human beings. However, we can refer to the true mean height and true standard deviation of women’s heights by the symbols ____________ and ____________ , respectively.

 

 

 

4.

The square root of the sample variance is the sample _______________________________ , denoted __________ .

 

 

5.

If the true relationship between two quantitative variables is linear, then the true linear correlation coefficient is denoted ρ (Greek lowercase rho), and the “statistic” version computed by our calculator is denoted ______ . The coefficient of determination is a statistic denoted by ______ .

 

 

 

Part II: Quickie Computations (6 pts. per numbered problem, 30 pts. in all).
Use your calculator to fill in the blanks. No work is expected to be shown. However, if you desire partial credit in the event of a mistake, you are going to have to show a little bit of explanation.

 

 

6.

Six randomly chosen men have heights of 66, 68, 69, 70, 70, and 71 inches. Their shoe sizes are, in order, 9, 9, 10, 11, 10.5, and 11.5. If x = height and y = shoe size, give the equation of the LSRL (least-squares regression line) that predicts shoe size as a function of height. Correct notation is required.

 

 

 

_________________________________________________________________________
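For anyone checking a calculator result afterward, the least-squares computation in #6 can be sketched in Python. This is only an illustrative sketch, not part of the test: the function name least_squares_line and the variable names are assumptions introduced here, and the data are the six heights and shoe sizes given in #6.

```python
# Data from problem 6: x = height (inches), y = shoe size.
heights = [66, 68, 69, 70, 70, 71]
shoe_sizes = [9, 9, 10, 11, 10.5, 11.5]


def least_squares_line(x, y):
    """Return (intercept, slope) of the least-squares line y-hat = a + b*x.

    Hypothetical helper written for this sketch; it uses the standard
    formulas b = S_xy / S_xx and a = y-bar - b * x-bar.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


a, b = least_squares_line(heights, shoe_sizes)
# The "hat" on y marks a predicted value, which is the notation point in #6.
print(f"y-hat = {a:.3f} + {b:.3f}x")
```

The printed line uses the y-hat form because the equation describes predicted shoe size, not observed shoe size.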

 

 

7.

To the nearest tenth of a percentage point, what percentage of the variation in shoe size in #6 can be predicted from the variation in height? __________ What percentage of the variation in height can be predicted from the variation in shoe size? __________
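The two percentages in #7 both come from the coefficient of determination. A minimal sketch, assuming the data from #6 and a hypothetical helper named r_squared, computes it with each variable in the explanatory role so the two results can be compared directly.

```python
# Data from problem 6: heights in inches and shoe sizes, in the same order.
heights = [66, 68, 69, 70, 70, 71]
shoe_sizes = [9, 9, 10, 11, 10.5, 11.5]


def r_squared(x, y):
    """Coefficient of determination for a linear fit of y on x.

    Hypothetical helper for this sketch: r^2 = S_xy^2 / (S_xx * S_yy).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy ** 2 / (s_xx * s_yy)


# Shoe size predicted from height, then height predicted from shoe size.
r2_shoe_from_height = r_squared(heights, shoe_sizes)
r2_height_from_shoe = r_squared(shoe_sizes, heights)

print(f"shoe size from height: {100 * r2_shoe_from_height:.1f}%")
print(f"height from shoe size: {100 * r2_height_from_shoe:.1f}%")
```

Swapping the roles of x and y leaves the formula unchanged, since S_xy is symmetric and the denominator is a product of S_xx and S_yy.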

 

 

8.

Compute the standard deviation of height in #6, and express that standard deviation using correct notation: ___ = ________ .
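The notation asked for in #8 depends on whether the six heights are treated as a sample or as an entire population. The sketch below, which uses Python's standard statistics module and variable names chosen here for illustration, shows both divisors side by side; since #6 describes six randomly chosen men, the sample version is presumably the one intended.

```python
import statistics

# Heights (inches) of the six randomly chosen men in problem 6.
heights = [66, 68, 69, 70, 70, 71]

# Sample standard deviation: divides the sum of squared deviations by n - 1.
sample_sd = statistics.stdev(heights)

# Population standard deviation: divides by n instead.
# Shown only for contrast with the sample version.
population_sd = statistics.pstdev(heights)

print(f"divide by n - 1: {sample_sd:.3f}")
print(f"divide by n:     {population_sd:.3f}")
```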

 

 

9.

Compute the 5-number summary of the shoe sizes in #6:  ____ , ____ , ____ , ____ , ____

Also compute the interquartile range (IQR), which equals Q3 – Q1:  _________
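Quartile conventions differ between software packages, so a check of #9 has to pick one. The sketch below assumes the convention used by many graphing calculators: Q1 and Q3 are the medians of the lower and upper halves of the sorted data, with the middle value left out of both halves when the count is odd. The function name five_number_summary is a hypothetical helper introduced here.

```python
import statistics

# Shoe sizes from problem 6.
shoe_sizes = [9, 9, 10, 11, 10.5, 11.5]


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule.

    Assumed convention: for an odd count, the middle value is excluded
    from both halves before their medians are taken.
    """
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]
    upper = values[half:] if n % 2 == 0 else values[half + 1:]
    return (
        values[0],
        statistics.median(lower),
        statistics.median(values),
        statistics.median(upper),
        values[-1],
    )


summary = five_number_summary(shoe_sizes)
iqr = summary[3] - summary[1]  # IQR = Q3 - Q1
print("min, Q1, median, Q3, max:", summary)
print("IQR:", iqr)
```

Other tools, including the default quantile methods in some statistics libraries, interpolate differently and may report slightly different quartiles for a sample this small.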

 

 

10.

Use correct notation to state the sample mean height in #6, as well as the sample median. The median is sometimes also denoted M.

Sample mean:  ___ = ________     Sample median:  Q2 = ________

The sample, although it is very small, exhibits ___________ skewness. With a large sample, we would expect the data to show a bell-shaped “__________” distribution.

 

 

 

Part III: Critical Thinking Essays (10 pts. each, 30 pts. in all).
Complete sentences are not required. You may use bulleted lists, abbreviations, and sentence fragments as long as your meaning is clear. However, legibility and correct spelling are expected, and small point deductions may occur if your words are illegible or blatantly misspelled.

 

 

11.

Give a sketch of a scatterplot showing (a) r close to –1, and (b) r close to 0. Then answer this question: (c) If r is close to 0, can we conclude that the scatterplot has no patterns? ______ If yes, explain how you can conclude that there are no patterns. If no, explain why there may be some patterns (and you may also want to give another scatterplot or two in your answer to (b)).

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

12.

Hospital A has a mortality rate of 0.03 for inpatients and 0.005 for outpatients. Hospital B has a mortality rate of 0.04 for inpatients and 0.009 for outpatients. Which hospital is better?

(In answering this question, discuss what the meaning of the word “better” could be or should be. Indicate that you have enough understanding of statistics to tackle the question intelligently. No baloney, please. Longer answers are not necessarily better, and beyond a certain point, you may lose points for excessive verbosity.)
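One point worth exploring for #12 is how the mix of patients affects an overall mortality rate. The sketch below uses the four rates given in the problem, but the patient counts are entirely hypothetical (the problem does not supply any) and were chosen only to show what can happen when the two hospitals serve very different proportions of inpatients.

```python
# Mortality rates stated in problem 12.
rates = {
    "A": {"inpatient": 0.03, "outpatient": 0.005},
    "B": {"inpatient": 0.04, "outpatient": 0.009},
}

# HYPOTHETICAL patient counts, invented for illustration only.
# Hospital A is assumed to treat mostly inpatients, Hospital B mostly outpatients.
patients = {
    "A": {"inpatient": 9000, "outpatient": 1000},
    "B": {"inpatient": 1000, "outpatient": 9000},
}


def overall_rate(hospital):
    """Combined mortality rate: expected deaths divided by total patients."""
    deaths = sum(
        rates[hospital][kind] * patients[hospital][kind]
        for kind in rates[hospital]
    )
    total = sum(patients[hospital].values())
    return deaths / total


for name in ("A", "B"):
    print(f"Hospital {name}: overall rate = {overall_rate(name):.4f}")
```

Under these made-up counts, the hospital with the lower rate in each category ends up with the higher combined rate, which is one reason a single overall number may not settle which hospital is "better."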

 

 


 

13.

In our class, we define bias as any situation in which a statistic is systematically drawn to the high or low side of the true parameter value. Three common types of bias are nonresponse bias, in which the people who fail to respond to a survey are fundamentally different in some way from those who choose to respond; voluntary response bias (the flip side, if you will, of nonresponse bias), in which the people who respond tend to be those with the strongest opinions (which are usually not representative of the population); and response bias (a.k.a. lying). Another type of bias can be called “experimenter bias,” sometimes called “conflict of interest bias.” Explain why a medical researcher, especially if he or she holds stock in a pharmaceutical company, should not interact with patients in a single-blind clinical trial to evaluate the effectiveness of a new drug. How would you correct this situation? (Your answer should include a definition of “single-blind” and whatever strategy you propose to make the situation better.)

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Part IV: Head-Scratchers (5 pts. each, 10 pts. in all).

 

 

14.

A cluster sampling procedure is proposed in which Mr. Hansen, instead of taking an SRS of students in the refectory, will randomly choose 5 numbers from 1 to 36, and then use those 5 tables (everyone at the table, that is) as subjects in his survey. The survey is designed to estimate the average value of the 9th digit of people’s social security numbers (SSNs), which as you know, are assigned sequentially by the Social Security Administration, normally shortly after birth. Is there bias in Mr. Hansen’s survey design? _____ If so, list the type(s) of bias, with no explanation required. If not, explain why not.

 

 

 

 

 

 

 

 

 

 

15.

Why is Mr. Hansen absolutely fanatical about requiring the lowercase z to be crossed?