AP Statistics / Mr. Hansen
Name: ____________________________
Practice Quiz 13.2A
0. Make a two-way table suitable for a chi-square analysis.
1. Is high blood pressure associated with a higher death rate? Calculate and compare percentages to answer this question.
2. Make an appropriate graph to display the association.
3. Write null and alternative hypotheses for a chi-square analysis of these data.
4. Calculate the expected count for the entry in row 2, column 1.
5. Write the third term (i.e., the entry for row 2, column 1) of the chi-square statistic. What are the degrees of freedom?
6. The chi-square statistic is 8.86. Complete the chi-square analysis of the blood pressure data and write your conclusion.
7. Do heights of adult males follow a normal distribution? Use a suitable χ² test to analyze the following data from an SRS of 559 men.
height | # of men
≤ 5'0" | 3
5'0" < ht. ≤ 5'2" | 13
5'2" < ht. ≤ 5'4" | 38
5'4" < ht. ≤ 5'6" | 53
5'6" < ht. ≤ 5'8" | 70
5'8" < ht. ≤ 5'10" | 121
5'10" < ht. ≤ 6'0" | 108
6'0" < ht. ≤ 6'2" | 71
6'2" < ht. ≤ 6'4" | 58
6'4" < ht. ≤ 6'6" | 20
> 6'6" | 4
Answers (Don't Peek Too Soon!)
0. The tricky part here is that you have to separate the population into categorical variables in two dimensions. The column for "total" cannot be used as it stands because it is not a subdivision of the population. Here is a suitable two-way table:
Blood pressure | Died (count) | Survived (count)
Low blood pressure (< 140 mm Hg) | 21 | 2,655
High blood pressure (≥ 140 mm Hg) | 55 | 3,283
1. The death rate among low blood pressure subjects was 21/2676, or 0.785%. Among high blood pressure subjects, the rate was much higher, namely 55/3338, or 1.648%. [These results are actually conditional probabilities, namely P(death | low blood pressure) and P(death | high blood pressure).]
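As a cross-check on the percentages above, here is a minimal Python sketch that computes each conditional death rate from the two-way table counts. The names `counts`, `death_rate`, and `rates` are illustrative only and are not part of the quiz or the calculator workflow.

```python
# Counts from the two-way table in answer 0: (died, survived) for each group.
counts = {
    "low blood pressure": (21, 2655),
    "high blood pressure": (55, 3283),
}


def death_rate(died, survived):
    """Return the proportion of a group that died, i.e. P(death | group)."""
    return died / (died + survived)


# Conditional death rate for each blood pressure group.
rates = {group: death_rate(*cells) for group, cells in counts.items()}

for group, rate in rates.items():
    print(f"{group}: {rate:.3%}")
# low blood pressure: 0.785%
# high blood pressure: 1.648%
```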
2. You could make two side-by-side pie graphs, but the difference between a 0.785% slice in one pie and a 1.648% slice in the other pie would not be very visible. Much better is to make two vertical bars of equal width, side by side, with a gap in between. Make one bar slightly over 3/4 of an inch tall, and label it "low blood pressure = 0.785%." Make the other bar slightly over 1 5/8 inches tall and label it "high blood pressure = 1.648%." Put a large bold title over the top of the graph: "5-year death rates for men, by blood pressure type."
3. H0: There is no association between blood pressure and death rate.
Ha: There is an association between blood pressure and death rate.
4. You must show your work for this. The row 2, column 1 expected count is (row total)(column total)/(grand total) = (55 + 3283)(21 + 55)/(21 + 2655 + 55 + 3283) = 42.1829. [Of course, you can easily check your work by looking at the matrix [B] that your STAT TESTS C produces.]
5. You must show your work for these. The row 2, column 1 contribution to χ² is (obs. − exp.)²/exp. = (55 − 42.1829)²/42.1829 = 3.894, and df = (rows − 1)(cols − 1) = (1)(1) = 1. [Of course, you can easily check your work by using STAT TESTS C followed by the CONTRIB program. Always be sure to use STAT TESTS C first.]
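The arithmetic in answers 4 and 5 can be sketched in a few lines of Python. This is only an illustrative check under the counts given in answer 0; the helper names `expected_count` and `contribution` are made up for this sketch and do not correspond to any calculator program.

```python
# Observed counts from answer 0; rows are blood pressure groups,
# columns are (died, survived).
observed = [
    [21, 2655],   # row 1: low blood pressure
    [55, 3283],   # row 2: high blood pressure
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)


def expected_count(r, c):
    """Expected count under H0 for the cell in 0-based row r, column c."""
    return row_totals[r] * col_totals[c] / grand_total


def contribution(r, c):
    """(obs - exp)^2 / exp for one cell of the table."""
    exp = expected_count(r, c)
    return (observed[r][c] - exp) ** 2 / exp


exp_21 = expected_count(1, 0)    # row 2, column 1 in the quiz's 1-based numbering
term_21 = contribution(1, 0)
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(round(exp_21, 4), round(term_21, 3), df)
# 42.1829 3.894 1
```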
6. There is strong evidence (χ² = 8.864, df = 1, P = 0.0029) of an association between blood pressure and death rate, and we should add that the association is positive: high blood pressure is associated with a higher death rate among men similar to those followed in the study. [Note that the assumptions for χ² are satisfied with all expected counts ≥ 5. We do not know to what larger population the conclusions apply, because these men were probably not an SRS.]
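For readers who want to reproduce the test statistic and P-value without a calculator, the following Python sketch sums the four cell contributions and converts the result to a P-value. It assumes no continuity correction, and it relies on the fact that for df = 1 the upper-tail chi-square probability equals erfc(√(x/2)), so it does not generalize to larger tables as written. All variable names are illustrative.

```python
import math

# Observed counts from answer 0; rows are (low, high) blood pressure,
# columns are (died, survived).
observed = [
    [21, 2655],
    [55, 3283],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected counts under H0 (no association).
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

# Chi-square statistic: sum of (obs - exp)^2 / exp over all cells,
# with no continuity correction applied.
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Valid for df = 1 only: P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_sq / 2))

# Condition check used in the answer: every expected count is at least 5.
all_expected_ok = all(e >= 5 for row in expected for e in row)

print(f"chi-square = {chi_sq:.3f}, P = {p_value:.4f}, expected counts ok: {all_expected_ok}")
```

The printed values should agree with the χ² = 8.864 and P = 0.0029 quoted in answer 6.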
7. Because we do not have the raw data, we will have to accept much more granularity than would otherwise be the case. However, we can still estimate the mean and standard deviation to be 69.6 inches and 3.96 inches, respectively, by treating the 3 shortest men as 59 inches tall, the 4 tallest men as 79 inches tall, and everyone else as being at the midpoint of his height category. By using a z table or the cumulative normal feature of our calculator, we can estimate the following percentages for each height category, assuming the N(69.6, 3.96) distribution:
height | approx. % of men at this height if distribution is N(69.6, 3.96)
≤ 5'0" | 0.767%
5'0" < ht. ≤ 5'2" | 1.981%
5'2" < ht. ≤ 5'4" | 5.118%
5'4" < ht. ≤ 5'6" | 10.299%
5'6" < ht. ≤ 5'8" | 16.144%
5'8" < ht. ≤ 5'10" | 19.714%
5'10" < ht. ≤ 6'0" | 18.753%
6'0" < ht. ≤ 6'2" | 13.898%
6'2" < ht. ≤ 6'4" | 8.023%
6'4" < ht. ≤ 6'6" | 3.608%
> 6'6" | 1.695%
TOTAL | 100.0%
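The estimated mean and standard deviation, and the category percentages in the table above, can be cross-checked with a short Python sketch. It assumes the stand-in heights described in the answer (59 inches for the shortest group, 79 inches for the tallest, midpoints otherwise) and uses the sample standard deviation with an n − 1 divisor, which is an assumption; the source does not say which divisor was used, and both round to 3.96 here. The variable names are illustrative.

```python
from statistics import NormalDist

# Observed counts for the 11 height categories, shortest to tallest.
counts = [3, 13, 38, 53, 70, 121, 108, 71, 58, 20, 4]

# Stand-in height (inches) for every man in each category:
# 59 for "<= 5'0"", 79 for "> 6'6"", and the category midpoint otherwise.
stand_ins = [59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79]

n = sum(counts)
mean = sum(h * c for h, c in zip(stand_ins, counts)) / n

# Sample standard deviation (n - 1 divisor); this choice is an assumption.
sd = (sum(c * (h - mean) ** 2 for h, c in zip(stand_ins, counts)) / (n - 1)) ** 0.5

# Category boundaries in inches: 5'0", 5'2", ..., 6'6".
boundaries = [60, 62, 64, 66, 68, 70, 72, 74, 76, 78]

# Probability of each category under N(69.6, 3.96).
model = NormalDist(mu=69.6, sigma=3.96)
cdf = [0.0] + [model.cdf(b) for b in boundaries] + [1.0]
probs = [hi - lo for lo, hi in zip(cdf, cdf[1:])]

print(f"n = {n}, mean = {mean:.1f}, sd = {sd:.2f}")
for p in probs:
    print(f"{p:.3%}")
```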
Multiply each of these percentages by 559 to get the expected number of men in each height category, and put those expected counts in list L2. As a check on your work, compute sum(L2) and make sure it equals 559. Although not all expected counts are ≥ 5, we easily pass the more relaxed criteria given in the box on p. 709. Thus we elect to continue with the χ² goodness-of-fit test on the following hypotheses:
H0: Population distribution is N(69.6, 3.96).
Ha: Population distribution is not N(69.6, 3.96).
By calculator [CHISQGOF], we have χ² = 17.46, df = (# of categories − 1) = 10, and P = 0.065. We should subject our data to a sensitivity analysis to see how much the P-value fluctuates when the raw data are less granular, but it appears that there is no strong evidence (χ² = 17.46, df = 10, P = 0.065) to cast the normality of the population in doubt.
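The goodness-of-fit computation can also be sketched in Python as a check on the calculator output. This is not the CHISQGOF program itself, only an illustrative equivalent under the same N(69.6, 3.96) model and the same df = 10. The P-value helper uses a closed-form sum that holds only when df is even, and its name, like the other names here, is made up for this sketch.

```python
import math
from statistics import NormalDist

# Observed counts for the 11 height categories, shortest to tallest.
observed = [3, 13, 38, 53, 70, 121, 108, 71, 58, 20, 4]

# Category boundaries in inches: 5'0", 5'2", ..., 6'6".
boundaries = [60, 62, 64, 66, 68, 70, 72, 74, 76, 78]

# Expected counts under H0: population distribution is N(69.6, 3.96).
model = NormalDist(mu=69.6, sigma=3.96)
cdf = [0.0] + [model.cdf(b) for b in boundaries] + [1.0]
n = sum(observed)
expected = [n * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]

# Goodness-of-fit statistic and degrees of freedom as used in the answer.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1


def chi_sq_upper_tail_even_df(x, df):
    """P(chi-square > x) for EVEN df only, via the closed-form finite sum."""
    if df % 2 != 0:
        raise ValueError("this closed form requires an even df")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


p_value = chi_sq_upper_tail_even_df(chi_sq, df)

print(f"sum of expected = {sum(expected):.1f}")
print(f"chi-square = {chi_sq:.2f}, df = {df}, P = {p_value:.3f}")
```

The results should land close to the χ² = 17.46 and P = 0.065 reported above; small differences in the last digit can arise because this sketch uses unrounded category probabilities.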