AP Statistics / Mr. Hansen
Name: _________________________
Project #1: Exploratory Data Analysis
This assignment is due in final form at the start of class
Wednesday, |
1. Log in under one partner’s name, open the spreadsheet (FAKEDATA.xls), and save it to your personal area under \\BONEBOX\USERS on the network.
2. Use the Excel on-line help facility to figure out how to set “worksheet titles” so that the top 3 rows remain fixed and do not scroll when you scroll the rest of the screen. Raise your hand when you have accomplished this task. Mr. Hansen’s initials: ______
3. Create a formula in cell E4 that sums the two cells to the left (C4 and D4). Copy this formula, highlight to the end of the spreadsheet (SHIFT+CTRL+END), and paste.
4. Highlight the cells in column E and use either F11 or the Chart Wizard to create a time series (a.k.a. time plot, to use your textbook’s term). Mr. Hansen’s initials: ______
5. Change the format of the time series from bars to a line without markers. Mr. Hansen’s initials: ______
6. The time series looks quite random, doesn’t it? Paste a
second column onto the chart, formed by generating random values between 2
and 13. (The |
7. Use the FREQUENCY function and the Chart Wizard to create a histogram of column E. You will need to consult the on-line help for the FREQUENCY function in order to adapt the examples shown to your needs. Note that an “array formula” must be entered by highlighting an entire range and then pressing SHIFT+CTRL+ENTER (instead of the usual ENTER) to create the formula. Show Mr. Hansen your histogram before proceeding (initials: _____). AFTER YOUR HISTOGRAM HAS BEEN APPROVED, please make a printout.
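If you want to check your histogram counts outside Excel, the following is a minimal Python sketch of how FREQUENCY tallies values: each bin counts the values that are greater than the previous bin value and less than or equal to its own bin value, and one extra slot at the end counts everything above the last bin. The function name `frequency`, the sample values, and the bin values are all hypothetical; they are not taken from FAKEDATA.xls. The sketch assumes the bin values are listed in ascending order.

```python
from bisect import bisect_left

def frequency(data, bins):
    """Tally values per bin: slot i counts values <= bins[i] and > bins[i-1];
    the final extra slot counts values above the last bin."""
    edges = sorted(bins)                 # assumption: bins are meant in ascending order
    counts = [0] * (len(edges) + 1)      # one extra slot for the overflow
    for value in data:
        # bisect_left finds the first bin value that is >= value
        counts[bisect_left(edges, value)] += 1
    return counts

# Hypothetical values standing in for column E
sleep_totals = [5, 6.5, 7, 7, 8, 8.5, 9, 10, 12]
bin_tops = [6, 8, 10]                    # hypothetical upper limits of each bin
counts = frequency(sleep_totals, bin_tops)
print(counts)
```

Note that the result has one more entry than the list of bins, which is why the range you highlight for the array formula is usually one cell longer than your bin range.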
8. Use the terminology we discussed in class to describe the distribution of values in column E. Write your description directly on the printout you made in step 7.
9. Click the tab for Sheet 2 of your workbook. On Sheet 2,
create formulas to compute each of the statistics shown below for the data in
column E. For example, you can use the MEDIAN function to find the median and
other functions under the “Statistical” category to find the rest. Label your
formulas in some neat, clear fashion. Mr. Hansen’s initials: ______
10. Draw a modified boxplot of the data. Attach numeric labels
to the outliers, Q1, M, and Q3. Do a rough sketch in the area below, and then
recopy it neatly on the reverse side of your histogram at home or after class.
Neatness counts.
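To double-check the numbers you label on your boxplot, here is a minimal Python sketch of the calculations behind a modified boxplot. It rests on two assumptions that you should compare against your class notes: Q1 and Q3 are taken as the medians of the lower and upper halves of the sorted data (leaving out the middle value when the count is odd), and an outlier is any value more than 1.5 × IQR below Q1 or above Q3. The function names and the sample data are hypothetical.

```python
from statistics import median

def quartiles(sorted_values):
    """Q1 and Q3 as medians of the lower and upper halves
    (the middle value is excluded when the count is odd)."""
    n = len(sorted_values)
    half = n // 2
    lower = sorted_values[:half]
    upper = sorted_values[half + (n % 2):]
    return median(lower), median(upper)

def modified_boxplot_summary(values):
    data = sorted(values)
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr           # assumed 1.5 x IQR outlier rule
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "whisker_low": inside[0],        # whiskers stop at the last non-outlier
        "q1": q1,
        "median": median(data),
        "q3": q3,
        "whisker_high": inside[-1],
        "outliers": outliers,
    }

# Hypothetical data, not the values in FAKEDATA.xls
sample = [2, 6, 7, 7, 8, 8, 8, 9, 9, 10, 15]
summary = modified_boxplot_summary(sample)
print(summary)
```

Excel's built-in QUARTILE function interpolates between values, so its Q1 and Q3 may differ slightly from the ones this sketch produces.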
11. In cell F3 of Sheet 1, type the words “Day of Week.” Create a date serial number formula in cell F4 by typing the formula =DATE(2002,[entry from col. A],[entry from col. B]). Apply the custom format dddd to cell F4 before proceeding. Mr. Hansen’s initials: ______
12. In cell G4, type the formula =MOD(F4,7) to generate a number between 0 and 6 that denotes the day of the week. If the MOD formula displays as a day of the week instead of a number, reset its cell format to General. In cell H4, type the formula =E4 to create a duplicate copy of the sleep totals. Now copy the formulas in cells F4, G4, and H4 all the way to the bottom of the data set. If you have forgotten how to do this, refer to the instructions for step 3. Mr. Hansen’s initials: ______
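If you are curious why MOD(F4,7) identifies the day of the week, the following minimal Python sketch reproduces the arithmetic. It assumes Excel's default 1900 date system, in which serial numbers for dates after February 1900 equal the number of days since December 30, 1899. Under that assumption a remainder of 0 falls on a Saturday, 1 on a Sunday, and so on through 6 on a Friday. The names `excel_serial`, `day_code`, and `DAY_NAMES` are hypothetical helpers, and January 1 is used only as a sample date.

```python
from datetime import date

# Assumption: Excel's 1900 date system; valid for dates after February 1900.
EXCEL_EPOCH = date(1899, 12, 30)

# Index = remainder after dividing the serial number by 7
DAY_NAMES = ["Saturday", "Sunday", "Monday", "Tuesday",
             "Wednesday", "Thursday", "Friday"]

def excel_serial(year, month, day):
    """Mimic =DATE(year, month, day): days elapsed since the assumed epoch."""
    return (date(year, month, day) - EXCEL_EPOCH).days

def day_code(serial):
    """Mimic =MOD(serial, 7)."""
    return serial % 7

serial = excel_serial(2002, 1, 1)        # sample date: column A = 1, column B = 1
code = day_code(serial)
print(serial, code, DAY_NAMES[code])
```

Keeping the numeric code in column G, rather than the day name, makes it easy to sort and group the data by day of week later.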
13. Replace all the cells in columns G and H with values. (Copy, followed by Edit / Paste Special / Values.) Mr. Hansen’s initials: ______
14. Compute the mean trace, median trace, and IQR, by day of
week, for the data in column H. You may find it useful to sort columns G and
H. Summarize your findings in the table below.
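The grouping in this step can be sketched in Python as follows. This is only an illustration under stated assumptions: the day codes are the 0 to 6 values from column G, the quartiles use the "inclusive" interpolation method (which should agree with Excel's QUARTILE function, though your textbook's hand method may give slightly different quartiles), and every day of the week has at least two observations. The function name `summarize_by_day` and the sample numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median, quantiles

def summarize_by_day(day_codes, values):
    """Group values by day code, then compute mean, median, and IQR per group."""
    groups = defaultdict(list)
    for code, value in zip(day_codes, values):
        groups[code].append(value)

    table = {}
    for code in sorted(groups):
        # Assumption: interpolated quartiles; each group needs at least 2 values
        q1, _, q3 = quantiles(groups[code], n=4, method="inclusive")
        table[code] = {
            "mean": mean(groups[code]),
            "median": median(groups[code]),
            "iqr": q3 - q1,
        }
    return table

# Hypothetical stand-ins for columns G (day code) and H (sleep total)
codes = [0, 1, 0, 1, 0, 1]
totals = [9, 6, 10, 7, 11, 8.5]
table = summarize_by_day(codes, totals)
for code, stats in table.items():
    print(code, stats)
```

Reading the means (or medians) across the seven day codes in order gives the mean trace (or median trace) for the week.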
15. Perform “smoothing” on the original data in column E by computing the 7-day and 14-day moving averages. Plot these two time series on the same set of axes, and make a printout. Mr. Hansen’s initials: ______
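As a cross-check on your smoothing, here is a minimal Python sketch of a moving average. It assumes a trailing window, meaning each smoothed value is the mean of the current day and the days just before it, so the first 6 days have no 7-day average and the first 13 days have no 14-day average. A centered window is another reasonable choice, and the assignment does not say which to use. The function name `moving_average` and the sample series are hypothetical.

```python
def moving_average(values, window):
    """Trailing moving average; positions without a full window are None."""
    result = [None] * len(values)
    running = 0.0
    for i, value in enumerate(values):
        running += value                     # add the newest day
        if i >= window:
            running -= values[i - window]    # drop the day that left the window
        if i >= window - 1:
            result[i] = running / window
    return result

# Hypothetical stand-in for column E
series = [8, 7, 9, 6, 10, 8, 8, 9, 7, 9, 6, 10, 8, 8, 15]
ma7 = moving_average(series, 7)
ma14 = moving_average(series, 14)
print(ma7)
print(ma14)
```

In Excel the same trailing average can be built with the AVERAGE function over a 7-row or 14-row range that you copy down the column.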
16. Write a paragraph or two (below, or on the reverse side of your smoothed plots) in which you describe the nature of the data. Discuss any trends or cyclical patterns that you observed. Refer to statistics in your writing. Clarity, spelling, and grammar count.