STAtistics / Mr. Hansen
Name: _______________________________________
11/18/2009

Simulation Ideas
(Pick one, or develop your own)
1. When a passenger aircraft is being boarded, passengers will board either
“conventionally” (i.e., priority boarding, then first class, then rows 35-40,
then 25-40, then 15-40, and finally all rows) or “Southwest style.” In the
Southwest boarding style, there is no first class, and all passengers are
divided into boarding groups (A, B, or C) based on time of arrival at the
gate. The A group will board first, then the B group, then the C group.
Passengers may choose any open seat. Determine which method is faster, on
average, and what the histograms of times look like under each scenario. You
will need to make a large number of simplifying assumptions in order to make
this problem tractable, but you can work with Mr. Hansen to develop a project
that is both feasible and meaty.

2. Hitting streaks are notable in professional baseball, with the longest on
record being Joe DiMaggio’s 56-game streak in 1941. Using parameters with
probability distributions for number of at-bats in each game and batting
average in each game, determine relative frequency histograms for length of
hitting streaks (in number of games) subject to various pairings of
centerline parameter values.
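
As a starting point for idea 2, here is a minimal sketch of one possible setup. Everything in it is an assumption to be replaced by your own modeling choices: the function names, the choice of 3, 4, or 5 at-bats per game (equally likely), and a game-level batting average drawn from a normal distribution around a centerline and clipped to the interval from 0 to 1.

```python
import random
from collections import Counter

def streak_lengths(n_games, mean_avg=0.300, sd_avg=0.050,
                   at_bat_choices=(3, 4, 5), seed=None):
    """Simulate n_games and return the length of every hitting streak."""
    rng = random.Random(seed)
    streaks, current = [], 0
    for _ in range(n_games):
        # Assumption: this game's batting average varies around the centerline.
        avg = min(1.0, max(0.0, rng.gauss(mean_avg, sd_avg)))
        # Assumption: the number of at-bats is chosen uniformly from the list.
        at_bats = rng.choice(at_bat_choices)
        # The streak continues if at least one at-bat produces a hit.
        got_hit = any(rng.random() < avg for _ in range(at_bats))
        if got_hit:
            current += 1
        elif current > 0:
            streaks.append(current)
            current = 0
    if current > 0:  # record a streak still alive at the end
        streaks.append(current)
    return streaks

def relative_frequencies(streaks):
    """Map each streak length to its share of all observed streaks."""
    counts = Counter(streaks)
    total = len(streaks)
    return {length: counts[length] / total for length in sorted(counts)}
```

Calling relative_frequencies(streak_lengths(100_000, mean_avg=0.325)) for several centerline values gives the data for one histogram per parameter pairing.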

3. Compare pseudorandom data (either numeric or text) generated by a computer
program with “attempted random” data generated by human test subjects using
no electronics, no coins, no dice, no assistance of any kind. Develop markers
(e.g., number of repeated characters, maximum length of repeated characters,
maximum number of sequential characters separated by a certain fixed spacing)
that make it easy to distinguish the computer results from the human results,
and then test your markers for reliability by asking a naive group of test
subjects to generate “attempted random” sequences of the target length.
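
Two of the markers mentioned in idea 3 could be computed along the following lines. This is only a sketch: the function names are made up for illustration, and it assumes the data are strings of decimal digits.

```python
import random
from itertools import groupby

def longest_run(seq):
    """Length of the longest block of identical adjacent characters."""
    return max((len(list(group)) for _, group in groupby(seq)), default=0)

def repeat_count(seq):
    """Number of positions where a character equals the one before it."""
    return sum(a == b for a, b in zip(seq, seq[1:]))

def pseudorandom_digits(n, seed=None):
    """Computer-generated comparison sequence of n decimal digits."""
    rng = random.Random(seed)
    return "".join(rng.choice("0123456789") for _ in range(n))
```

Applying both markers to many computer-generated sequences and to the human-generated ones of the same length gives two distributions per marker that can then be compared.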

4. Devise a college admissions scenario that is similar in spirit to Example
6.31 on pp. 340-341. In other words, determine your probability of being
admitted to H. U. (Hansen University) given that there are several random
elements (probability of being selected given legacy status, probability of
being selected given sports talent, etc.) and a maximum of 250 slots that
will be offered to incoming freshmen. Twins and triplets who apply together
should always be accepted together. Again, you will need to work with Mr.
Hansen to determine suitable simplifying assumptions to make this problem
tractable.

5. Devise a yield estimation strategy for STA to determine how many freshmen
should be offered admission. Each freshman applicant has a number of
characteristics, each of which has a conditional probability of offer
acceptance associated with it. (You can make these parameters up, or you can
ask Mr. Hansen to provide them as givens.) The goal is to send out enough
offer letters so that the probability of having at least 72 acceptances is at
least 95%, and the probability of having more than 84 acceptances is less
than 10%.
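
One way to check a candidate number of offers in idea 5 is sketched below. It assumes, purely for illustration, that each applicant's characteristics have already been boiled down to a single acceptance probability and that applicants decide independently; the function names and the ranked list of probabilities are hypothetical.

```python
import random

def estimate_targets(accept_probs, trials=10_000, low=72, high=84, seed=None):
    """Estimate P(acceptances >= low) and P(acceptances > high)."""
    rng = random.Random(seed)
    at_least_low = more_than_high = 0
    for _ in range(trials):
        # Each offer is accepted independently with its own probability.
        accepted = sum(rng.random() < p for p in accept_probs)
        at_least_low += accepted >= low
        more_than_high += accepted > high
    return at_least_low / trials, more_than_high / trials

def feasible_offer_counts(ranked_probs, candidates, **kwargs):
    """Return the offer counts n that meet both goals when the first n
    applicants in the ranked list receive offers."""
    feasible = []
    for n in candidates:
        p_low, p_high = estimate_targets(ranked_probs[:n], **kwargs)
        if p_low >= 0.95 and p_high < 0.10:
            feasible.append(n)
    return feasible
```

If no offer count satisfies both goals for your parameters, that is a finding worth reporting rather than a bug.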

6. Students who do not know the answer to a multiple-choice question should
first eliminate the impossible choices, then make a completely random guess
from the choices that remain. However, they seldom do this. Instead, they
play hunches, make educated guesses, and usually fall into pitfalls that the
test designers have laid for them. At least, that is what Mr. Hansen’s data
from several years ago suggested. Design an experiment to see whether
subjects taking multiple-choice test questions for which they do not know the
answers will score higher or lower than chance alone would predict. Sample
question: Which of the following movies did Mr. Hansen see in 2006, given
that the answer is not D?

7. Design and execute a simulation to demonstrate that switching in the Monty
Hall game is a better strategy than sticking. Since that is fairly easy, also
design and execute simulations to answer both versions of the chest of
drawers problem.
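
The Monty Hall part of idea 7 could be simulated roughly as follows. The sketch assumes the standard rules (three doors, the host always opens an unpicked door that hides no prize, choosing at random when two such doors exist); the function name is made up.

```python
import random

def monty_hall_win_rate(switch, trials=100_000, seed=None):
    """Fraction of games won when the player always switches or always sticks."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        prize = rng.randrange(3)
        pick = rng.randrange(3)
        # The host opens a door that is neither the player's pick nor the prize.
        opened = rng.choice([d for d in range(3) if d != pick and d != prize])
        if switch:
            # Move to the one door that is neither picked nor opened.
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += pick == prize
    return wins / trials
```

Comparing monty_hall_win_rate(True) with monty_hall_win_rate(False) should give values near 2/3 and 1/3, respectively.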

8. Activity 6.3 on pp. 347-348.

9. Use a simulation to solve the classic Type I/Type II error problem of
disease screening. In other words, given a screening test having a certain
sensitivity and specificity, run 100,000 simulated test subjects through the
probability tree and determine the PPV (positive predictive value) of the
test. Repeat for several additional pairings of sensitivity and specificity
values. Determine whether the simulated values match the a priori
computations within an acceptable margin.
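
A minimal sketch of the probability tree in idea 9 is shown below. It assumes a disease prevalence, which the problem statement does not give and which you must choose; "specificity" here means the true-negative rate, and the function names are illustrative only.

```python
import random

def simulated_ppv(prevalence, sensitivity, specificity, n=100_000, seed=None):
    """Run n simulated subjects through the tree and return the observed PPV."""
    rng = random.Random(seed)
    true_pos = false_pos = 0
    for _ in range(n):
        diseased = rng.random() < prevalence
        if diseased:
            positive = rng.random() < sensitivity    # true-positive branch
        else:
            positive = rng.random() >= specificity   # false-positive branch
        if positive:
            if diseased:
                true_pos += 1
            else:
                false_pos += 1
    total = true_pos + false_pos
    return true_pos / total if total else float("nan")

def exact_ppv(prevalence, sensitivity, specificity):
    """A priori PPV from Bayes' rule, for comparison with the simulation."""
    tp = prevalence * sensitivity
    fp = (1 - prevalence) * (1 - specificity)
    return tp / (tp + fp)
```

For example, with an assumed prevalence of 0.05, sensitivity 0.95, and specificity 0.90, exact_ppv gives 1/3, and the simulated value should land close to that.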

10. Simulate 5 million hands of poker (5-card draw, no discards) and determine
whether the observed counts of royal flushes, full houses, straights, etc.
match the a priori probabilities within an acceptable margin. Here, nearly
all of the points will be awarded for your detailed methodology describing
your simulation, since the results are all readily available online.
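
The core of idea 10 is a hand classifier plus a dealing loop. The sketch below is one possible arrangement, not a required design: the card encoding (ranks 2 through 14 with 14 as the ace, suits as letters) and the function names are assumptions. In pure Python, 5 million hands will take a while, so try a smaller count first.

```python
import random
from collections import Counter

RANKS = range(2, 15)          # 11 = jack, 12 = queen, 13 = king, 14 = ace
SUITS = "CDHS"
DECK = [(rank, suit) for rank in RANKS for suit in SUITS]

def classify(hand):
    """Name the poker category of a 5-card hand of (rank, suit) tuples."""
    ranks = sorted(rank for rank, _ in hand)
    flush = len({suit for _, suit in hand}) == 1
    counts = sorted(Counter(ranks).values(), reverse=True)
    distinct = len(counts) == 5
    # The ace may also play low in the straight A-2-3-4-5.
    straight = distinct and (ranks[4] - ranks[0] == 4
                             or ranks == [2, 3, 4, 5, 14])
    if straight and flush:
        return "royal flush" if ranks[0] == 10 else "straight flush"
    if counts[0] == 4:
        return "four of a kind"
    if counts == [3, 2]:
        return "full house"
    if flush:
        return "flush"
    if straight:
        return "straight"
    if counts[0] == 3:
        return "three of a kind"
    if counts == [2, 2, 1]:
        return "two pair"
    if counts[0] == 2:
        return "one pair"
    return "high card"

def simulate_hands(n_hands, seed=None):
    """Deal n_hands independent 5-card hands and tally their categories."""
    rng = random.Random(seed)
    return Counter(classify(rng.sample(DECK, 5)) for _ in range(n_hands))
```

The tallies from simulate_hands can then be divided by the number of hands and compared with the a priori probabilities.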

11. Resolve the following puzzles: (1) If early Cum Laude membership were
assigned by chance, what is the probability that 5/8 of the selectees would
have surnames from the second half of the alphabet? Compute both a priori and
empirical (simulated) results. (2) If a fair coin is flipped 250 times, what
is the probability of (a) at least one run of 6 heads in a row, (b) at least
one run of THTHTH or HTHTHT, (c) at least one run of HHHTTTHHH or TTTHHHTTT?
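
The empirical side of the coin-flip puzzle in idea 11 could be approached as sketched here. The sketch treats each set of 250 flips as a string and looks for the target patterns as substrings; the names and the default trial count are arbitrary choices.

```python
import random

PATTERNS = {
    "a": ("HHHHHH",),
    "b": ("THTHTH", "HTHTHT"),
    "c": ("HHHTTTHHH", "TTTHHHTTT"),
}

def pattern_probabilities(n_flips=250, trials=20_000, seed=None):
    """Estimate, for each part, the chance that at least one of its
    patterns appears somewhere in n_flips fair coin flips."""
    rng = random.Random(seed)
    hits = dict.fromkeys(PATTERNS, 0)
    for _ in range(trials):
        flips = "".join(rng.choices("HT", k=n_flips))
        for part, patterns in PATTERNS.items():
            hits[part] += any(p in flips for p in patterns)
    return {part: count / trials for part, count in hits.items()}
```

Increasing the trial count narrows the margin of error on each estimate, which matters when comparing against an a priori answer.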