STAtistics / Mr. Hansen

Name: _______________________________________

9/16/2008

 

 

Test #1 (100 points): Calculator permitted throughout

 

1. (a)

(4 pts.) What is a statistic? ___________________________________________

 

 

(b)

(3 pts.) Give several examples of statistics. _________________________________

 

 

 

_____________________________________________________________

 

 

2.

(4 pts.) What is a parameter? _____________________________________________

 

 

3.

(3 pts.) The second half of our course, known as “inferential statistics,” can be summarized by the following sentence that uses some words from #1 and #2:

 

 

 

________________________________________________________________

 

 

4.

(2 pts.) The first half of our course, from now until midterm exams, is concerned with exploratory data analysis, experimental design, and probability. When we run our group surveys this week and next, which of these three topic areas will we be exploring? _________________________________ Hint: An experiment is a special type of study in which we impose some type of treatment (beyond mere data gathering) on our test subjects. As you know, a good experiment should also always have a control group that receives no treatment or only a dummy (placebo) treatment.

 

 

5.

(28 pts.) Carefully draw lines to match each letter with its official name and its description.

 

 

 

Letter             Official Name                     Description

s                  sample standard deviation         total of sampled data values, divided by n

[symbol missing]   linear correlation coefficient    mean squared deviation from the population mean

n                  sample size                       square root of sample variance

[symbol missing]   population standard deviation     square root of population variance

[symbol missing]   population variance               number of data points in the sample

[symbol missing]   population mean                   number between –1 and +1 that indicates strength and direction of linear fit in a scatterplot

r                  sample mean                       equals population median for a symmetric distribution

 

 

6.

(6 pts.) Fill in the blanks: IQR, which stands for __________________________ , is a ____________ measure of dispersion, which means that it is not strongly affected by outliers. However, s.d., which stands for _____________  _____________ , is strongly affected by outliers.

 

 

 

 

 

 

 

 

7.

(6 pts.) Describe how you would determine whether a certain data value, say, x, is an outlier in a data set. Be sure to use the terms Q1 and Q3 in your description. You do not need to define what they mean.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

8.

(4 pts.) What is the percentile for the Q3 value in a data set? ______ What is the z score (approximately) that corresponds to this percentile? _______ (No work is expected for either answer.)

 

 

9.

(6 pts.) Briefly state the distinction between a scientific theory and an unscientific theory. An explanation is not required.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

10.

(3 pts.) What number or value tells us how many standard deviations a point lies to the left or right of the mean? _________

 

 

11.

Guinea pig lifespans are skewed right. Make a sketch of a phony histogram and a phony normal quantile plot (NQP) to illustrate this right skewness. Label your axes (name, numbers, and units if applicable) in the histogram but not in the NQP. MAKE UP THE NUMBERS. DO NOT USE YOUR CALCULATOR.

 

 

(a)

(4 pts.) Histogram:

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(b)

(4 pts.) NQP:

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(c)

(6 pts.) The 5-number summary for human lifespans in the Republic of Oxonia is 0, 63, 77, 84, 108. Is the distribution skewed left, skewed right, or symmetric? _______________ Sketch a regular boxplot, making a reasonable effort to show scaling correctly. (In other words, do not randomly slap down a box in which the distance from 63 to 77 looks the same as the distance from 77 to 84.)

 

 

 

 

 

 

 

 

 

 

12.

In the 2004 Presidential election, exit pollsters incorrectly predicted that Kerry would win in Ohio, based on their interviews with (mostly) randomly chosen voters who were, in some cases, asked to wait in line for a while because of the heavy voter turnout. Pollsters exercised some personal discretion in their choice of polling subjects during periods when multiple voters were leaving simultaneously and when there was ambiguity in the results of the random number generation. Pollsters employed on election day were primarily college-aged people looking for extra income.

 

 

(a)

(6 pts.) Explain briefly what was wrong with the methodology and why it led to erroneous results.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(b)

(6 pts.) If the pollsters’ ages (in years) were normally distributed with mean 20 and standard deviation 2, compute the following.

 

 

 

Percentage of pollsters who were less than 21 years old = ____________

z score for a 24-year-old = ____________    Percentile for a 24-year-old = ____________

 

 

13.

(7 pts.) (Work is optional for this problem. However, there is no partial credit unless you show your work. Use the blank region if you wish to show your work.) Mr. Hansen’s uncle, a German teacher, is known as an easy grader. If student test scores are normally distributed in Mr. Hansen’s uncle’s classes, with mean 88 and standard deviation 8, find the score, to the nearest tenth, needed to be at . . .

the 50th percentile: ______


the 40th percentile: ______


the 85th percentile: ______