STAtistics / Mr. Hansen

Name: _______________________________________

11/18/2009

 

 

Simulation Ideas
(Pick one, or develop your own)

 

1.

When a passenger aircraft is being boarded, passengers will board either “conventionally” (i.e., priority boarding, then first class, then rows 35-40, then 25-40, then 15-40, and finally all rows) or “Southwest style.” In the Southwest boarding style, there is no first class, and all passengers are divided into boarding groups (A, B, or C) based on time of arrival at the gate. The A group will board first, then the B group, then the C group. Passengers may choose any open seat. Determine which method is faster, on average, and what the histograms of times look like under each scenario. You will need to make a large number of simplifying assumptions in order to make this problem tractable, but you can work with Mr. Hansen to develop a project that is both feasible and meaty.

 

 

2.

Hitting streaks are notable in professional baseball, with the longest on record being Joe DiMaggio’s 56-game streak in 1941. Using probability distributions for the number of at-bats in each game and the batting average in each game, determine relative frequency histograms for the length of hitting streaks (in number of games) under various pairings of the center (mean) values of those two parameters.
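One possible starting point, offered as a minimal sketch and not as the required approach: the Python code below simulates one season and records the length of every hitting streak. The function names, the normal model for at-bats (standard deviation 0.8, at least 1 at-bat), and the use of a fixed batting average within a season are all assumptions made for illustration.

```python
import random
from collections import Counter

def streak_lengths(n_games, avg, mean_ab, rng):
    """Return the lengths of all hitting streaks in one simulated season."""
    streaks = []
    current = 0
    for _ in range(n_games):
        # Assumed model: at-bats are a rounded normal draw, never below 1.
        at_bats = max(1, round(rng.gauss(mean_ab, 0.8)))
        # The game counts toward a streak if any at-bat produces a hit.
        got_hit = any(rng.random() < avg for _ in range(at_bats))
        if got_hit:
            current += 1
        else:
            if current > 0:
                streaks.append(current)
            current = 0
    if current > 0:
        streaks.append(current)  # streak still alive at season's end
    return streaks

def relative_frequencies(streaks):
    """Map each streak length to its share of all observed streaks."""
    counts = Counter(streaks)
    total = len(streaks)
    return {length: counts[length] / total for length in sorted(counts)}

rng = random.Random(2009)
season = streak_lengths(162, 0.300, 4.0, rng)
print(relative_frequencies(season))
```

Repeating the season many times for each pairing of avg and mean_ab, and pooling the streak lengths, would give the data for the histograms.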

 

 

3.

Compare pseudorandom data (either numeric or text) generated by a computer program with “attempted random” data generated by human test subjects using no electronics, no coins, no dice, no assistance of any kind. Develop markers (e.g., number of repeated characters, maximum length of repeated characters, maximum number of sequential characters separated by a certain fixed spacing) that make it easy to distinguish the computer results from the human results, and then test your markers for reliability by asking a naive group of test subjects to generate “attempted random” sequences of the target length.
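As a hedged illustration of what two of these markers might look like in code, the Python sketch below counts adjacent repeated characters and finds the maximum run length in a sequence. The function names are hypothetical, and these are only two of many markers you could choose.

```python
import random
from itertools import groupby

def adjacent_repeats(seq):
    """Count positions where a character equals the one just before it."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a == b)

def longest_run(seq):
    """Length of the longest block of identical consecutive characters."""
    return max((len(list(group)) for _, group in groupby(seq)), default=0)

# A computer-generated sequence of 100 digits for comparison purposes.
rng = random.Random(34)
computer_seq = "".join(rng.choice("0123456789") for _ in range(100))
print(adjacent_repeats(computer_seq), longest_run(computer_seq))
```

Applying the same two functions to the human-generated sequences would let you compare the distributions of each marker side by side.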

 

 

4.

Devise a college admissions scenario that is similar in spirit to Example 6.31 on pp. 340-341. In other words, determine your probability of being admitted to H. U. (Hansen University) given that there are several random elements (probability of being selected given legacy status, probability of being selected given sports talent, etc.) and a maximum of 250 slots that will be offered to incoming freshmen. Twins and triplets who apply together should always be accepted together. Again, you will need to work with Mr. Hansen to determine suitable simplifying assumptions to make this problem tractable.

 

 

5.

Devise a yield estimation strategy for STA to determine how many freshmen should be offered admission. Each freshman applicant has a number of characteristics, each of which has a conditional probability of offer acceptance associated with it. (You can make these parameters up, or you can ask Mr. Hansen to provide them as givens.) The goal is to send out enough offer letters so that the probability of having at least 72 acceptances is at least 95%, and the probability of having more than 84 acceptances is less than 10%.
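The two conditions above can be checked by simulation. The Python sketch below is one hedged way to do it, assuming each applicant who receives an offer has already been assigned a single acceptance probability (how you combine the characteristics into that probability is up to you). The function names and the made-up probabilities are illustrative only.

```python
import random

def simulate_acceptances(probs, rng):
    """Number of acceptances when offer i is accepted with probability probs[i]."""
    return sum(rng.random() < p for p in probs)

def estimate_tail_probs(probs, trials, rng, low=72, high=84):
    """Estimate P(acceptances >= low) and P(acceptances > high)."""
    at_least_low = 0
    more_than_high = 0
    for _ in range(trials):
        k = simulate_acceptances(probs, rng)
        if k >= low:
            at_least_low += 1
        if k > high:
            more_than_high += 1
    return at_least_low / trials, more_than_high / trials

rng = random.Random(50)
# Made-up example: 130 offers, each with an invented acceptance probability.
offers = [rng.uniform(0.4, 0.8) for _ in range(130)]
p_enough, p_too_many = estimate_tail_probs(offers, 5000, rng)
print(p_enough, p_too_many)
```

Running the estimate for a range of offer counts and keeping those where the first value is at least 0.95 and the second is below 0.10 is one way to search for an acceptable number of letters.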

 

 

6.

Students who do not know the answer to a multiple-choice question should first eliminate the impossible choices, then make a completely random guess from the choices that remain. However, they seldom do this. Instead, they play hunches, make educated guesses, and usually fall into pitfalls that the test designers have laid for them. At least, that is what Mr. Hansen’s data from several years ago suggested. Design an experiment to see whether subjects taking multiple-choice test questions for which they do not know the answers will score higher or lower than chance alone would predict. Sample question: Which of the following movies did Mr. Hansen see in 2006, given that the answer is not D?

     (A)   The Devil Wears Prada
     (B)   Dave Chappelle’s Block Party
     (C)   A Prairie Home Companion
     (D)   Click
     (E)   Happy Feet

 

 

7.

Design and execute a simulation to demonstrate that switching in the Monty Hall game is a better strategy than sticking. Since that is fairly easy, also design and execute simulations to answer both versions of the chest of drawers problem.
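For the Monty Hall portion, a minimal Python sketch might look like the following. It assumes the standard rules (three doors, one prize, and a host who always opens a non-prize door other than the contestant's pick); the function names are hypothetical.

```python
import random

def play_monty_hall(switch, rng):
    """Play one game; return True if the contestant wins the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the prize.
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

def win_rate(switch, trials, rng):
    """Fraction of games won over the given number of trials."""
    wins = sum(play_monty_hall(switch, rng) for _ in range(trials))
    return wins / trials

rng = random.Random(7)
print("stick: ", win_rate(False, 100_000, rng))
print("switch:", win_rate(True, 100_000, rng))
```

The sticking rate should settle near 1/3 and the switching rate near 2/3, which gives you known targets to check your own simulation against.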

 

 

8.

Activity 6.3 on pp. 347-348.

 

 

9.

Use a simulation to solve the classic Type I/Type II error problem of disease screening. In other words, given a screening test having a certain sensitivity and specificity, run 100,000 simulated test subjects through the probability tree and determine the PPV (positive predictive value) of the test. Repeat for several additional pairings of sensitivity and specificity values. Determine whether the simulated values match the a priori computations within an acceptable margin.
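A hedged sketch of the probability tree in Python follows. It needs a disease prevalence, which is not given above and is an assumed input here, and it treats the second test characteristic as the true-negative rate (called specificity in the code). The function names and the example values are illustrative only.

```python
import random

def simulate_ppv(prevalence, sensitivity, specificity, n_subjects, rng):
    """Estimate PPV by sending simulated subjects through the tree."""
    true_pos = 0
    false_pos = 0
    for _ in range(n_subjects):
        diseased = rng.random() < prevalence
        if diseased:
            # A diseased subject tests positive with probability = sensitivity.
            if rng.random() < sensitivity:
                true_pos += 1
        else:
            # A healthy subject tests positive with probability 1 - specificity.
            if rng.random() >= specificity:
                false_pos += 1
    total = true_pos + false_pos
    return true_pos / total if total else float("nan")

def exact_ppv(prevalence, sensitivity, specificity):
    """A priori PPV from Bayes' rule, for comparison with the simulation."""
    tp = prevalence * sensitivity
    fp = (1 - prevalence) * (1 - specificity)
    return tp / (tp + fp)

rng = random.Random(9)
# Example values only: 1% prevalence, 99% sensitivity, 95% specificity.
print(simulate_ppv(0.01, 0.99, 0.95, 100_000, rng))
print(exact_ppv(0.01, 0.99, 0.95))
```

Deciding what counts as an "acceptable margin" between the two printed values is part of the project.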

 

 

10.

Simulate 5 million hands of poker (5-card draw, no discards) and determine whether the observed counts of royal flushes, full houses, straights, etc. match the a priori probabilities within an acceptable margin. Here, nearly all of the points will be awarded for your detailed methodology describing your simulation, since the results are all readily available online.
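To give a sense of the methodology involved, here is one possible Python sketch of dealing and classifying hands. The card encoding, the function names, and the choice to report royal flushes separately from other straight flushes are assumptions; the example call uses far fewer than 5 million hands so that it runs quickly.

```python
import random
from collections import Counter

# Cards are (rank, suit) pairs: ranks 2..14 (14 = ace), suits 0..3.
DECK = [(rank, suit) for rank in range(2, 15) for suit in range(4)]

def classify(hand):
    """Return the name of the best poker category for a 5-card hand."""
    ranks = sorted(rank for rank, _ in hand)
    flush = len({suit for _, suit in hand}) == 1
    # Multiplicities of each rank, largest first, e.g. [3, 2] for a full house.
    counts = sorted(Counter(ranks).values(), reverse=True)
    distinct = len(set(ranks)) == 5
    # Five consecutive ranks, or the ace-low straight A-2-3-4-5.
    straight = distinct and (ranks[4] - ranks[0] == 4 or ranks == [2, 3, 4, 5, 14])
    if straight and flush:
        return "royal flush" if ranks[0] == 10 else "straight flush"
    if counts == [4, 1]:
        return "four of a kind"
    if counts == [3, 2]:
        return "full house"
    if flush:
        return "flush"
    if straight:
        return "straight"
    if counts == [3, 1, 1]:
        return "three of a kind"
    if counts == [2, 2, 1]:
        return "two pair"
    if counts == [2, 1, 1, 1]:
        return "one pair"
    return "high card"

def simulate_hands(n_hands, rng):
    """Deal n_hands independent 5-card hands and tally their categories."""
    tally = Counter()
    for _ in range(n_hands):
        tally[classify(rng.sample(DECK, 5))] += 1
    return tally

rng = random.Random(10)
print(simulate_hands(50_000, rng))
```

Each hand is dealt from a fresh 52-card deck, which matches the a priori probabilities for a single 5-card hand.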

 

 

11.

Resolve the following puzzles: (1) If early Cum Laude membership were assigned by chance, what is the probability that 5/8 of the selectees would have surnames from the second half of the alphabet? Compute both a priori and empirical (simulated) results. (2) If a fair coin is flipped 250 times, what is the probability of (a) at least one run of 6 heads in a row, (b) at least one run of THTHTH or HTHTHT, (c) at least one run of HHHTTTHHH or TTTHHHTTT?
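For the empirical side of puzzle (2), a minimal Python sketch is shown below. It represents each trial as a string of H and T characters and checks whether any target pattern appears as a substring; the function names and the trial count are assumptions.

```python
import random

def flip_sequence(n_flips, rng):
    """Return a string of n_flips fair coin flips, e.g. 'HTTHH...'."""
    return "".join(rng.choice("HT") for _ in range(n_flips))

def prob_contains(patterns, n_flips, trials, rng):
    """Estimate the probability that at least one pattern occurs in a sequence."""
    hits = 0
    for _ in range(trials):
        seq = flip_sequence(n_flips, rng)
        if any(pattern in seq for pattern in patterns):
            hits += 1
    return hits / trials

rng = random.Random(11)
print("(a)", prob_contains(["HHHHHH"], 250, 20_000, rng))
print("(b)", prob_contains(["THTHTH", "HTHTHT"], 250, 20_000, rng))
print("(c)", prob_contains(["HHHTTTHHH", "TTTHHHTTT"], 250, 20_000, rng))
```

Because a run of 6 heads is counted whenever the substring HHHHHH appears, longer runs of heads also count, which matches the "at least one run" wording.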