AP Statistics / Mr. Hansen
5/5/2003 [rev. 3/4/04, 4/23/05, 5/11/05, 5/22/06, 4/21/08, 5/7/10, 4/7/15, 4/21/16, 5/8/21]

Name: _________________________

The “Must-Pass” Quiz: Partial Answer Key

1.*

A number computed from data. [You should provide examples.]

 

 

2.*

A number that describes a population. [You should provide examples.]

 

 

3.*

An “adjustable constant” that defines the nature of a mathematical model, much as a tuning knob or volume slider adjusts the output of a television or radio.

 

 

4.

Uniform: min and max [also need to know whether distrib. is discrete or continuous]
Normal: μ and σ
Binomial: n and p
Geometric: p
t: df
χ²: df

 

 

5.

Uniform: flat line in relative frequency histogram
Normal: classic continuous bell-shaped curve, satisfies 68-95-99.7 rule
Binomial: discrete (“stairsteppy”); skew right if p < .5, skew left if p > .5, symmetric if p = q = .5
Geometric: discrete (“stairsteppy”), always skew right
t: continuous, bell-shaped; virtually normal for large df, except with more “flab” in the tails
χ²: continuous, always skew right

 

 

6.

Range is a single number for the spread of values in a column of data: range = max – min. People who say things like “the range is from 28 to 75” are misusing the term in its statistical sense.

 

 

7.

IQR (interquartile range) = Q3 – Q1. Use STAT CALC 1 to get 5-number summary, then
VARS 5 PTS 9 – VARS 5 PTS 7. You could write a program to do this if you wished.
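If you want to double-check the calculator with a computer, here is a sketch in Python using made-up data. (The helper mimics the TI-83's quartile convention: Q1 and Q3 are medians of the lower and upper halves, excluding the overall median when n is odd; other tools use slightly different conventions.)

```python
from statistics import median

def iqr(data):
    """IQR = Q3 - Q1, using the TI-83 quartile convention."""
    xs = sorted(data)
    half = len(xs) // 2          # excludes the middle value when n is odd
    q1 = median(xs[:half])
    q3 = median(xs[-half:])
    return q3 - q1

scores = [2, 4, 4, 5, 7, 9, 11, 12, 20]   # hypothetical data
print(iqr(scores))                         # Q3 - Q1 = 11.5 - 4 = 7.5
```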

 

 

8.  (a)

Easiest way is to make modified boxplot, then TRACE to see the points (use arrow keys). Outliers are more than 1.5IQR below Q1 or more than 1.5IQR above Q3.

(b)

No rule of thumb—just judge visually. Outliers have “large” residuals.
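The 1.5·IQR fences from part (a) can be sketched in code (hypothetical data; the helper name and quartile convention are illustrative):

```python
from statistics import median

def outliers(data):
    # Flags points more than 1.5*IQR below Q1 or above Q3 (part (a) rule).
    xs = sorted(data)
    half = len(xs) // 2
    q1, q3 = median(xs[:half]), median(xs[-half:])
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in xs if x < lo or x > hi]

print(outliers([1, 20, 22, 23, 24, 25, 26, 60]))   # flags 1 and 60
```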

 

 

9.

Explanatory, response.

 

 

10.

Mean squared error = pop. variance (mean squared deviation from the mean). Sample variance is different, since denom. is n – 1 instead of n.
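The n vs. n – 1 distinction can be verified with Python's statistics module (made-up data):

```python
from statistics import pvariance, variance, mean

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample
m = mean(data)                    # 5.0
mse  = sum((x - m) ** 2 for x in data) / len(data)        # divide by n
samp = sum((x - m) ** 2 for x in data) / (len(data) - 1)  # divide by n - 1

print(mse, pvariance(data))    # MSE = population variance = 4
print(samp, variance(data))    # sample variance = 32/7, slightly larger
```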

 

 

11.

Pop. s.d. (σ) and sample s.d. (s) are measures of data dispersion (“spread”). Use STAT CALC 1 to compute, never the formula on AP formula sheet. Technically, σ equals the square root of MSE (pop. variance), and s equals the square root of sample variance.

 

 

12.

In a normal distribution (required), the distribution curve is bell-shaped, satisfies the 68-95-99.7 rule, and has inflection points at μ ± σ.

 

 

13.

Lack of symmetry. Right skewness means the central hump dribbles out to the right, forcing mean > median, since mean is less resistant to extreme values. Left skewness is the opposite, forcing mean < median. Easy ways to detect skewness involve looking at histogram, boxplot, or stemplot to see where the tail is longer. If you use NPP, trace dots from left to right; if they bend to left, plot shows left skewness, but if they bend to right, plot shows right skewness.
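A quick simulation illustrates the mean > median effect of right skewness (hypothetical data drawn from an exponential distribution, which has a long right tail):

```python
import random
from statistics import mean, median

random.seed(1)
# Right-skewed data: exponential "waiting times"
data = [random.expovariate(1.0) for _ in range(10000)]

print(mean(data), median(data))      # mean exceeds median under right skew
skew_right = mean(data) > median(data)
```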

 

 

14.

Easiest way is to look for a pattern that is not straight in NPP. If you are a glutton for punishment (as on #4 from the Chap. 13-14 free response), you can use χ² g.o.f. to test for departures from expected bin counts. There are also several standard “canned” tests that are beyond the scope of AP Statistics.

 

 

15.*

Linear least-squares. [It is not sufficient to say linear, because the LSRL is not the only type of linear regression. For example, there is the median-median line, which is useful in some situations and which is more resistant than the LSRL.]

 

 

16.

Slope, since it estimates how many response units will increase (or decrease) for each additional explanatory unit. Intercept is less crucial, even meaningless in some contexts.

 

 

17.

Linear correlation coefficient. Signed strength of linear pattern (–1 = pure negative linear association, 0 = no linear association, +1 = pure positive linear association.) Use STAT CALC 8 and make sure your Diagnostics are on (2nd CATALOG DiagnosticOn).
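For reference, r can also be computed by hand from the sum-of-products-of-z-scores form on the AP formula sheet, as in this sketch with made-up paired data:

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical paired data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
mx, my = mean(xs), mean(ys)
n = len(xs)

# r = (1/(n-1)) * sum of (z-score of x)(z-score of y)
r = sum((x - mx) / stdev(xs) * (y - my) / stdev(ys)
        for x, y in zip(xs, ys)) / (n - 1)
print(r)   # between -1 and +1
```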

 

 

18.

Coefficient of determination. Tells what portion of the variation in one variable can be explained by variation in the other. If r = .8, then 64% of the variation in y (or x) can be explained by variation in x (or y). It is also acceptable to say that 64% of the variation in the response variable (y) is explained by the LSRL model. [The other 36% is due to randomness or other factors.]

 

 

19.

No; yes.

 

 

20.

No; yes.

 

 

21.

STAT CALC 8, or with formulas 6 and 8 on first page of AP formula sheet. (Never use formula 5.)

 

 

22.

[See LSRL Top Ten.]

 

 

23.

Resid. = y – ŷ (i.e., actual y – predicted y). Resid. plot is scatterplot with RESID on y-axis and either the x or y variable on the x-axis. (It doesn’t matter, since x and y are linearly related.) In beginning statistics courses, we usually make resid. plot with x on the x-axis and RESID on the y-axis, but there was at least one AP exam that had y values on the x-axis of the resid. plot. Don’t let that bother you.
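Residuals for a small, made-up data set can be computed directly from the usual LSRL formulas; note that LSRL residuals always sum to (essentially) zero:

```python
from statistics import mean

# Hypothetical data; compute the LSRL by hand
xs = [1, 2, 3, 4]
ys = [2.1, 3.9, 6.2, 7.8]
mx, my = mean(xs), mean(ys)
b1 = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
      / sum((x - mx) ** 2 for x in xs))
b0 = my - b1 * mx

resid = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]   # actual - predicted
print(resid, sum(resid))   # residuals sum to ~0 for an LSRL
```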

 

 

24.

[See LSRL Top Ten.]

 

 

25.

Regression outlier and influential observation are not synonyms. A point can be a regression outlier (large residual), but if it is near the center of the x values, it is usually not influential. Similarly, a point can be influential (large effect on slope or r if removed) but have only a small residual, meaning the point is not an outlier. It is also possible for a point to be both influential and an outlier.

 

 

26.

b0 = value of response if explanatory variable (x value) is set to 0
b1 = estimate of how many response units will increase (or decrease) for each additional explanatory unit

For example, suppose that a clinical trial of a diet pill shows that the mean weight change after a year is 2 – 3x lbs., where x = daily dosage (# of pills). Then b0 = 2, since a person taking 0 pills can expect to gain 2 lbs. in a year, and b1 = –3, since each additional pill in the daily dosage is associated with a weight about 3 lbs. lower after a year.

 

 

27.

Random variable (discrete or continuous). [You should provide examples.]

 

 

28.

r.v., μ, mean, expected value

 

 

29.

r.v., variance, Var(X), σ², square root, Var(X), σ

 

 

30.

sum, sum, means; yes; mean of difference equals difference of means

 

 

31.

sum, sum, variances; true only for independent r.v.’s; variance of difference (assuming indep. r.v.’s) equals sum of variances

Other consequences: s.d. of sum = square root of sum of variances (similar to Pythagorean Theorem), s.d. of difference = square root of sum of variances (same comment). Both are true only if the r.v.’s are independent.
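A simulation can confirm the “Pythagorean” behavior for independent r.v.’s (hypothetical parameters: σ of X is 3, σ of Y is 4, so the s.d. of the difference should be √(9 + 16) = 5):

```python
import random
from math import sqrt
from statistics import pvariance

random.seed(2)
X = [random.gauss(10, 3) for _ in range(100000)]
Y = [random.gauss(4, 4) for _ in range(100000)]   # independent of X

diff = [x - y for x, y in zip(X, Y)]
# Var(X - Y) = Var(X) + Var(Y) for independent r.v.'s
sd_diff = sqrt(pvariance(diff))
print(sd_diff)   # close to 5, not 3 - 4 or 3 + 4
```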

 

 

32.

scalar (i.e., a constant), scalar, ; yes

 

 

33.

r: no change
x̄: affected by both translation and dilation (fancy way of saying that the new x̄ = lin. fcn. of the old x̄)
s: affected by dilation (i.e., multiplication by scalar) but not by translation (shift left or right)
IQR: affected by dilation but not by translation
range: affected by dilation but not by translation

 

 

34.

Standardized (dimensionless) representation of a data point, in s.d.’s.
Can always be computed, even if data set is non-normal.
Use formula z = (x – μ)/σ.
Tells how many s.d.’s a data value is above or below the mean.
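The formula is a one-liner (values are made up):

```python
def z_score(x, mu, sigma):
    # z = (x - mu) / sigma: how many s.d.'s x lies above (+) or below (-) the mean
    return (x - mu) / sigma

print(z_score(76, 70, 4))   # 1.5 s.d.'s above the mean
print(z_score(64, 70, 4))   # 1.5 s.d.'s below the mean
```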

 

 

35.

events, mutually exclusive; independence; no; independence of A and B means P(A|B) = P(A), which is not at all the same as mutual exclusivity, i.e., P(A and B) = 0

 

 

36.*

The aspect of probability that we care most about is sampling distributions. If we understand the sampling distribution of a statistic, we can determine how statistically significant a result is. Without this, we would never know whether experiments or clinical trials of new drugs were showing anything of value or were merely “flukes.”

 

 

37.

Sampling distribution of x̄ or diff. of means: Follows z if σ is known (rare), otherwise t.

Sampling distribution of p̂: Really binomial, but almost normal if pop. is large,
np ≥ 10, and nq ≥ 10.

Sampling distribution of difference of proportions: Almost normal if pops. are large,
n1p1 ≥ 5, n1q1 ≥ 5, n2p2 ≥ 5, n2q2 ≥ 5.

Sampling distrib. of the χ² statistic: Follows χ², with df given either by
(# of bins – 1) for g.o.f., or by (rows – 1)(cols. – 1) for 2-way tables.

 

 

38.

standard error, s.e.

 

 

39.*

Law of large numbers.

CORRECT: As n → ∞, p̂ approaches p. (Sometimes stated as “x̄ → μ as n → ∞.”)

WRONG: If p̂ < p, then the proportion of successes will start to increase until we “catch up.” (Or, if p̂ > p, the proportion of successes will start to decrease until we are “back down to the correct value.”) These are both wrong, because what really happens is that the effect of any finite collection of observations becomes diluted as n → ∞. A coin has no memory, no desire to set things right, and no ability to iron out past discrepancies. Nevertheless, the proportion of heads—even if the coin is biased—will, over time, approach whatever the true probability is.
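A simulation of a biased coin shows the dilution effect: early luck is never “corrected,” yet the proportion of successes still drifts toward the true probability (hypothetical p = 0.3):

```python
import random

random.seed(3)
p = 0.3            # true probability of success (a biased "coin")
count = 0          # running number of successes
flips = 0
results = {}
for n in [100, 10_000, 1_000_000]:
    while flips < n:                      # keep flipping up to n total flips
        count += random.random() < p
        flips += 1
    results[n] = count / n                # proportion of successes so far

print(results)     # p-hat drifts toward 0.3 as n grows
```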

 

 

40.

Central Limit Theorem.

CORRECT: Consider any population, not necessarily normal, having finite σ. As n → ∞, the sampling distribution of x̄ approaches N(μ, σ/√n).

WRONG: “Everything is normal.” (Not true: Sampling distributions of s are certainly not normal. Geometric and χ² distributions are certainly not normal.) “Any sampling distribution of x̄ is normal.” (Not true: Sampling distributions of x̄ approximately follow a t distribution if σ is unknown.) “Sampling distribution of x̄ is not normal unless n is large.” (False: Sampling distribution of sample mean is normal, regardless of sample size, provided pop. is normal with known σ.)
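A sketch of the CLT in action, starting from a strongly right-skewed (exponential) population with μ = σ = 1; the sampling distribution of x̄ for n = 40 already has mean ≈ μ and s.d. ≈ σ/√n:

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(4)

def xbar(n):
    # one sample mean from a right-skewed population (exponential, mu = sigma = 1)
    return mean(random.expovariate(1.0) for _ in range(n))

means = [xbar(40) for _ in range(5000)]   # simulated sampling distribution, n = 40
print(mean(means), pstdev(means))         # approx mu = 1 and sigma/sqrt(40)
```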

 

 

41.

P-value, test; principles of good experimental design; [add your personal description, incorporating control, randomization of assignment, and replication; if you wish, add blocking (a form of control that reaches its ultimate expression in the case of matched pairs)]

 

 

42-52.

[Research on your own, please.]

 

 

53.

Two-tailed, since if the experiment goes the wrong way (as sometimes occurs in science), there will still be the possibility of making an inference. All decisions regarding methodology are supposed to be made before any data-gathering occurs. (Otherwise, people could say that the methodology was tailored toward achieving a low P-value. In theory, the experiment should be repeatable, so that anyone following the same methodology would likely reach a similar conclusion.)

The one-tailed/two-tailed decision should be based on the research question posed. If the researcher is wondering whether there is “a difference,” direction unspecified, then plan for a two-tailed test. If the researcher is wondering whether treatment X increases hair strength, decreases yellowness of teeth, or whatever, then plan for a one-tailed test.

 

 

54.

It is possible to write a true sentence using the words probability and confidence interval. However, it is also very easy to make an error along the way. That is why it is much better to say, “We are 95% confident that the true proportion of voters favoring Smedley is between 48% and 54%,” not anything involving probability. Probability is a technical term meaning long-run relative frequency, and it cannot be haphazardly misused in the way laypeople misuse it.

It would be correct to say, “If we repeatedly generated confidence intervals with samples of this size and with m.o.e. of 3%, then the probability that a future confidence interval will bracket the true proportion of voters favoring candidate Smedley is 95%; that is, 95% of the confidence intervals generated by this process will bracket the true value.” However, you cannot make a probability statement about a confidence interval once it has been generated, because then you are not making a statement about the process (which is legitimate), but rather about this one-shot confidence interval. There is no “long run” in a one-shot confidence interval!
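The process interpretation can be simulated: generate many intervals by the same method and count how many bracket the true p (all numbers here are hypothetical):

```python
import random
from math import sqrt

random.seed(5)
p = 0.51         # hypothetical true proportion favoring Smedley
n = 1068         # sample size per poll
trials = 2000
hits = 0
for _ in range(trials):
    x = sum(random.random() < p for _ in range(n))
    phat = x / n
    moe = 1.96 * sqrt(phat * (1 - phat) / n)   # 95% m.o.e.
    hits += (phat - moe <= p <= phat + moe)

coverage = hits / trials
print(coverage)   # close to 0.95: about 95% of intervals bracket the true p
```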

 

 

55.

We cannot prove H0. All we can do is judge whether the evidence against it is “sufficient to reject” or “insufficient to reject.”

 

 

56.

We can sometimes gather overwhelming evidence that H0 can be rejected in favor of Ha. In the real world, even in a court of law, that is good enough. (Of course, in the world of mathematics, that is not considered a proof—one of the reasons that mathematicians and statisticians do not consider themselves to be equivalent.)

 

 

57.*

[You’d better know this by now!]

 

 

58.*

inferential, use statistics to estimate parameters

 

 

59.

[I think everyone can do this.]

 

 

60.

Blocking is a form of control in which similar experimental units are grouped (for example, by age or gender) before being randomly assigned to treatment groups.

What it does: Blocking reduces variation between the experimental units in the same block.

Why we care: Reducing this variation makes it more likely for the experimental effect, if any, to stand out from the background noise.

If we “block to the max” and shrink each block down to the size of 1 experimental unit, then each experimental unit must serve as its own control. For example, we could test sunscreen against a placebo cream by using blocks of size 1, where each person applies placebo cream to one arm (randomly chosen) and the real sunscreen to the other arm. The technical name for this experimental design is matched pairs.

Another place where we frequently see matched pairs is in the context of “before and after” scores for each test subject. If each subject is measured before and after some treatment, then the mean of the differences will be exactly the same number as the mean difference you would get by incorrectly treating the “before” and “after” scores as independent samples. However, the s.e. for the matched-pairs differences will be much lower than the s.e. you would get by using the formula for 2-sample s.e.
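A simulation makes both claims concrete: the matched-pairs mean difference equals the difference of means, but the paired s.e. is far smaller (hypothetical before/after data with a true gain of about 3):

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(6)
n = 30
before = [random.gauss(70, 10) for _ in range(n)]
after  = [b + random.gauss(3, 2) for b in before]   # "after" correlated with "before"

diffs = [a - b for a, b in zip(after, before)]
se_paired  = stdev(diffs) / sqrt(n)
se_2sample = sqrt(stdev(after) ** 2 / n + stdev(before) ** 2 / n)

print(mean(diffs), mean(after) - mean(before))   # identical numbers
print(se_paired, se_2sample)                     # paired s.e. is much smaller
```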

Why do we care about smaller s.e. values? Smaller s.e. means a smaller m.o.e. (since m.o.e. is the product of s.e. and some critical value we look up in a table). Smaller s.e. also means a smaller P-value, meaning that it will be much easier to show that the change from “before” to “after” is statistically significant with matched pairs.

It is also possible in some situations to have matched triples, matched quadruples, and so forth. However, that would require multiple treatment protocols or dosage levels for each test subject, and that is much less common than matched pairs.

Remember that in a matched-pairs t test, the statistic of interest is the sample mean of the pairwise differences.

Final question: BLOCKING is to EXPERIMENTS as STRATIFICATION is to SURVEYS.

 

 

61.

The first one (unequal proportions) is for a 2-prop. z confidence interval, and the second one is usually for a 2-prop. z test.

There are rare situations in which you would use the first formula for a 2-prop. z test. The rule is this: Look at what H0 is claiming. If H0 is claiming that the proportions differ by a constant (for example, p1 = .025 + p2), then you would use the first formula (unequal proportions). However, a much more common situation, shown in virtually all of the practice AP problems, involves H0 claiming that p1 = p2, and in such a case you would use the second formula (equal proportions). If all of this is too confusing, you would not go too far wrong if you simply always used the second formula for 2-prop. z tests, leaving the first formula only for 2-prop. z intervals.

One additional note: In the second formula, you have to know how to estimate p: Take total # of successes divided by total # of subjects.
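The pooled estimate and the resulting z statistic can be sketched with made-up counts:

```python
from math import sqrt

# Hypothetical counts: x1 successes out of n1, x2 out of n2
x1, n1 = 56, 200
x2, n2 = 42, 200

p1, p2 = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)     # total successes / total subjects
se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se                   # 2-prop. z test statistic (H0: p1 = p2)
print(p_pooled, z)
```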

 

 

62.

False: If there are matched pairs, you have only one sample (namely, a column of differences).

 

 

63.

Bias = any situation in which the expected value of a statistic does not equal the parameter being estimated. Selection bias refers to a methodology that produces samples that are systematically different from the population in a way that causes a parameter to be systematically underestimated or overestimated. An SRS is not biased; although an SRS often fails to match the population, the differences are random differences, not systematic differences.

“Systematic” means that there are methodological flaws that may become evident over a period of time, because the flaws are built into the design of the process. For example, if we try to poll the STA parent body on the question, “How many days per year does your son spend traveling?” we will get a statistic that is biased on the high side if we use an SRS of all parents. (That is because students with stepparents, who may well travel more than the average, will be more likely to have a parent chosen as part of the SRS.) If the SRS were based on students instead of parents, we should be able to avoid this selection bias.

Common types of bias include selection bias (undercoverage or overcoverage), response bias (a.k.a. lying), nonresponse bias, voluntary response bias, hidden bias, experimenter bias, and wording of the question.

 

 

64.

x̄ is an unbiased estimator of μ; i.e., E(x̄) = μ

p̂ is an unbiased estimator of p; i.e., E(p̂) = p

 

 

65.

[I hope you have thought about this. This is a personal matter, but what I do is first to decide whether there are proportions involved or not. Then, do we have 1 sample, matched pairs (also 1 sample), or 2 real samples? Or is this a χ² problem? And if so, are we comparing against fixed proportions (g.o.f.) or looking for differences across a 2-way table?]

 

 

66-74.

[See TI-83 STAT TESTS Summary.]

 

 

75.

“Since P = ____ , which is less than α, there is good evidence that ...” It is a good idea to list the test statistic, along with n or df. Be sure to phrase the conclusion in the context of the problem.

Here is a completely filled-in example that you cannot use (since that would violate the rule against using canned answers):

Since P = 0.02, which is less than α = 0.05, there is good evidence (t = 2.6, df = 15) that the true mean melting point of the diet cola being tested is not equal to 32 degrees Fahrenheit.

 

 

76.

“Since P = ____ , which is greater than α, there is no evidence that ...” It is a good idea to list the test statistic, along with n or df. Be sure to phrase the conclusion in the context of the problem.

Here is a completely filled-in example that you cannot use (since that would violate the rule against using canned answers):

Since P = 0.392, which is greater than α = 0.10, there is no evidence (t = –0.874, df = 20.29) that the true mean lifespan of irradiated worms differs from the true mean lifespan of non-irradiated worms.

 

 

77.

“We are XX% confident that the true ... is between YY and ZZ.” Be sure to phrase the “...” in the context of the problem, e.g., “true mean boiling point,” “true difference in voter preference proportions,” “true mean improvement in test scores,” etc.

 

 

78.

Compute C.I. using TI-83. Then punch upper–lower, i.e., VARS 5 TEST I – VARS 5 TEST H, divide result by 2 and STO into M (for m.o.e.). You can then write your C.I. as est. ± M. Depending on the problem, “est.” will be x̄, x̄1 – x̄2, p̂, or p̂1 – p̂2.
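In other words, the keystrokes amount to this arithmetic (hypothetical interval endpoints):

```python
# m.o.e. = half the width of the interval; the point estimate is the midpoint.
lower, upper = 0.48, 0.54     # hypothetical C.I. for a proportion

m = (upper - lower) / 2       # margin of error
est = (upper + lower) / 2     # point estimate
print(est, m)                 # report as est +/- m, e.g., 0.51 +/- 0.03
```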

 

 

79.

[See AP formula sheet.]

 

 

80.

Since  in the LSRL t-test, .

 

 

81.*

convenience, anecdotal; voluntary response bias

 

 

82.*

Not really. For example, the m.o.e. (at a 95% confidence level) of a 1300-person poll will be about 3 percentage points, regardless of whether the poll is taken in California or in Wyoming. You do not need a larger sample to get the same accuracy in California, even though the population of California is about 39 million, more than 65 times larger than that of Wyoming.

Think of visiting a plant where M&M’s are made. Imagine taking a scoop of M&M’s out of a huge tub of randomly distributed M&M’s. Your goal, let’s say, is to estimate the proportion of blue M&M’s in the tub. What affects the accuracy of your estimate? Clearly, your m.o.e. will be large if you take a small scoop, and your m.o.e. will be smaller if you take a really big scoop. However (and this is where many people have trouble), the m.o.e. does not depend on the size of the tub. The m.o.e. depends only on how large a scoop you take (i.e., your sample size).

The reason that m.o.e. does not depend on population size is that m.o.e. is always equal to s.e. multiplied by a critical value. The s.e. (and, for t distributions, the critical value as well) will depend on n (sample size). However, s.e. and critical value do not depend on N (population size), as long as the population is “large” relative to the sample.

For small populations (e.g., the population of the Upper School), it is technically true that m.o.e. depends on population size. There is a formula known as the “finite population correction” that makes this relationship clear. However, this is not an AP topic. For the common situation where someone is trying to estimate a parameter from a large population (e.g., the proportion of voters in a state who support Smedley), the size of the population simply does not matter. The only thing that matters in that case is the size of the SRS.
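The point is visible in the formula itself: nothing about the population size N appears anywhere (sketch with hypothetical poll numbers):

```python
from math import sqrt

def moe_95(phat, n):
    # 95% m.o.e. for a proportion = critical value * s.e.  (no N anywhere!)
    return 1.96 * sqrt(phat * (1 - phat) / n)

# Same n = 1300 poll, worst-case phat = 0.5, in any state:
print(moe_95(0.5, 1300))   # about 0.027, whether N is 39 million or 580,000
```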

So, anyway, I hope that one of you will run for President or Vice President. Suppose your political consultant tells you that you need to spend more than $1 million conducting a poll of Californians, as compared to a few thousand dollars for a smaller poll of Wyomingites. I hope that you will fire that sorry consultant and send me a percentage of the money you saved. (Just kidding.) (Well, maybe not entirely.)

 

 

83.

No; q; yes.

 

 

84.

Simple random sample; a sample chosen in such a way that every possible subset of size n is equally likely to be selected.

 

 

85.

SRS, since bias can invalidate the results quite easily. Normality of population is not an issue in large samples (courtesy of CLT), since normality of the sampling distribution rescues us.

 

 

86.

Marginal probabilities = fractions involving row or column totals divided by grand total. Conditional probabilities = fractions involving individual cells divided by a row or column total. Both are usually concerned with categorical data in 2-way tables.
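A tiny made-up 2-way table makes the distinction concrete:

```python
# Hypothetical 2-way table: rows = gender, cols = preference
table = {("M", "yes"): 30, ("M", "no"): 20,
         ("F", "yes"): 25, ("F", "no"): 25}

grand = sum(table.values())                        # grand total = 100
row_M = table[("M", "yes")] + table[("M", "no")]   # row total = 50

p_marginal_M  = row_M / grand                  # marginal: row total / grand total
p_yes_given_M = table[("M", "yes")] / row_M    # conditional: cell / row total
print(p_marginal_M, p_yes_given_M)
```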

 

 

87.*

Just because an effect is not plausibly caused by chance alone does not mean that it is large enough to be of any real-world significance. The reverse situation is also possible. In the presidential election of 2000, a 0.01% difference in vote totals in Florida (a margin of no statistical significance whatsoever) was enough to permit George W. Bush to defeat Al Gore in Florida and thereby in the overall election, even though Bush lost to Gore by more than half a million votes nationwide. George W. Bush’s election was thus a chance outcome, not indicative of any general trend in voting preferences, but it had a huge effect on U.S. history.

 

 

88.*

[I think everyone knows this. In fact, you probably knew it before you took the course.]

 

 

89.*

Only a controlled experiment is considered convincing. In situations (e.g., smoking in humans) where it is not ethical to run a controlled experiment, various types of observational and correlative studies can suggest, but not prove, a cause-and-effect link.

 

 

90.*

[Everyone probably knows this. Remember to discuss placebo effect and hidden bias.]

 

 

91.

Yes; perhaps many new employees have been hired.

 

 

92.

Yes; the relative mix of employee categories could be a lurking variable. Perhaps there are now proportionally more employees in the higher-paid job categories, so that the weighted average salary has increased even while each category has had cuts in mean salaries. This would be an example of Simpson’s Paradox.

 

 

93.*

Using deceptive (“gee-whiz”) graphs, changing the subject, confusing correlation with causation, using inappropriate averages (e.g., mean with highly skewed distributions), citing anecdotal data, using biased samples, concealing the wording of a survey question, computing absurd precision with qualitative data (e.g., “74% more beautiful skin!”), etc., etc.

 

 

94.*

Who says so? How do they know? What’s missing? Did somebody change the subject? Does it make sense? (For example, a claim that a child is kidnapped every 30 seconds in America is absurd, since that would be more than a million children per year.)

 

 

95.

The last one. Statisticians are mostly from mathematical or scientific backgrounds, which means we are on a quest for truth. Our clients may mangle, misuse, and abuse our conclusions, but we try very hard not to do that ourselves.

 

 

96.

Nobody knows. The statement is usually attributed to Mark Twain, although he himself credited it to Benjamin Disraeli.

 

 

97.*

Odds in favor = ratio of favorable to unfavorable outcomes.
Odds against = ratio of unfavorable to favorable outcomes.
For example, if p = P(A) = 4/13, then the odds in favor of event A are 4 to 9, and the odds against A are 9 to 4.

Because casinos are in the business of making money, they never quote payoff odds that equal the mathematical odds. For example, in roulette, there are 36 numbers you can bet on, half of which are red and half of which are black. In Las Vegas, there are also a zero and a double zero for which nobody is paid off; in other words, the house takes all the bets if the ball falls in slot 0 or 00. A successful bet on black pays off at 1:1 odds (i.e., net profit of $1 for each $1 placed at risk). However, since the player’s probability of success is 18/38, the mathematical odds against are actually 20:18, which is slightly more than 1:1. The excess represents the casino’s profit margin in the long run, as proved by the LOLN.
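The house edge described above can be computed directly (a $1 bet on black):

```python
# Las Vegas roulette: 18 winning slots (black) out of 38 (incl. 0 and 00).
p_win = 18 / 38
payoff_win, payoff_lose = 1, -1     # 1:1 payoff odds

ev = p_win * payoff_win + (1 - p_win) * payoff_lose
print(ev)                           # about -0.0526: the house's long-run edge per $1

odds_against = (38 - 18) / 18       # 20:18, slightly worse than the 1:1 payoff
```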

 

 

98.

For the individual player who plays a few dozen hands of blackjack or pulls the arm on a slot machine a few dozen times, the sampling distribution of net outcomes is relatively short and wide, meaning that a good portion of the sampling distribution can spill into positive territory even though the mean is negative. This is why it is not rare for people to return home from Las Vegas as winners. (Fewer than half are this lucky, but since the lucky ones are usually the only ones who say anything, it is easy to get a false impression that winning is common.)

From the casino’s point of view, however, the sampling distribution for the vast numbers of dollars that are wagered in a day is completely different. The casino’s sampling distribution is extremely tall and pointed, with a mean that virtually guarantees a profit as a fixed percentage of the amount wagered. (This percentage varies depending on whether the game is roulette, or blackjack, or slot machines, or whatever.)

 

 

99.

Confounding means that group membership affects the response variable in a way that makes determining cause and effect difficult or impossible. Because the designers of a study or an experiment are often unaware of this relationship, confounding variables (a.k.a. confounding factors or confounders) are sometimes called “lurking variables.” Confounding makes us unable to see what portion of the response, if any, can be attributed to the explanatory variable and not to the lurking variables.

A classic example is the strong correlation between the number of Methodist ministers in the U.S. and the number of murders in the U.S. during the 19th century. The correlation between the two is extremely strong, close to +1. Does this mean that the increasing number of Methodist ministers caused more murders to occur? Or, perhaps the effect is the reverse: Did more murders cause a societal backlash leading more people to become Methodist ministers? Actually, neither is true. The number of Methodist ministers is closely tied to the population of the country: more people, more ministers. Of course, an increasing population also means a rising number of homicides: more people, more murders. The lurking variable in this example is population.

In many situations, a good way to eliminate a lurking variable is simply to divide by it. When we compare the number of Methodist ministers per 100,000 residents with the number of homicides per 100,000 residents, we do not see the two variables moving linearly in lock-step with each other.

In business, it is much more informative to compare ratios than it is to compare raw numbers. There is an entire branch of study, ratio analysis, that you can specialize in if you get an MBA. Is a business healthy? You can’t answer the question simply by looking at gross sales, or number of customers, or even gross profit. All of the informative answers require ratios of some sort or other: internal rate of return (expressed as a percentage), market share (ditto), growth rate, turnover percentage, customer acquisition cost (per capita), etc.

Another example of a lurking variable involves the supposed benefit of hormone supplements. Doctors noticed that their postmenopausal female patients on hormone replacement therapy (HRT) not only experienced fewer hot flashes but also had overall better health outcomes: fewer heart attacks, greater energy, greater flexibility, and so on. Many doctors started recommending HRT for ongoing treatment lasting years, and their female patients seemed happy to receive it. CONFOUNDING ALERT!! Were the HRT women, as a group, different from the non-HRT women in some way that affected the response variable? Unfortunately, yes. The HRT women were wealthier on average, not to mention healthier to begin with. That difference in group membership, it turned out, explained the better health outcomes—not the HRT itself. When the NIH funded an expensive multiyear controlled experiment (total budget of about $625 million) to find out, among other things, whether HRT really had a benefit for postmenopausal women, they learned that HRT was actually harmful when used over an extended time period.

What happened? Had the doctors been cooking the data? Were the doctors dishonest? No! The doctors simply observed that their HRT patients did better, on average, than their non-HRT patients. Without random assignment of treatments, there was no way to know what portion of the improved health was due to HRT and what portion was due to other confounding factors. When the NIH did a proper double-blind controlled experiment, the real truth emerged: HRT is actually harmful when used beyond its narrow approved use for treating hot flashes.

Here is how to summarize these examples, using correct wording:

    In the U.S. in the 19th century, the number of Methodist ministers and the
    number of homicides were both confounded with population.

    The better health outcomes for women on hormone supplements were
    confounded with group membership when treatment groups were self-
    selected instead of being randomly assigned.

How do we fight confounders? In observational studies, one has to be on high alert, since confounders can pop up all over the place. One strategy is to compare rates instead of raw numbers. Another strategy is to do what the NIH did: Invest the money to run a proper controlled experiment instead of an observational study. Random assignment of treatments is the ultimate cure for confounding variables.

 

 

100.

It is true that poker has chance elements. However, almost all games (such as golf, for example) have chance elements: a puff of wind, a sneeze by a spectator, a bad bounce off a yardage marker. In the long run, however, the most skillful players of poker and of golf come out on top. In a poker tournament involving many rounds, everyone’s luck will be approximately the same. The skill required to bluff convincingly and wager appropriately will guarantee that many of the same elite poker players end up battling each other at the World Series of Poker each year.