Purpose: 

Excel 1 helps you to make sure that you have understood the presentation on Descriptive Statistics for Decision Making.  Recall this presentation had two parts: (1) The Math and (2) Using Excel to do the Math.

This lab is a very skills-based lab. That is, the issue of how descriptive statistics can be used for decision making is not addressed.  This will be the focus of Excel 2.

The data sets you will work with in Excel 1 are very tiny, so that you can do the relevant calculation both by hand and also using Excel.  The reason for this is that by doing the computations two ways, one method can be used to check the other.

Math Concepts you will need to understand:

For a given data set: mean, median, mode, standard deviation, frequency table, histogram.

Excel Concepts you will need to understand:

How to do within Excel many things you can do  "BY HAND"  In particular, entering functions, including, MEAN, MEDIAN, MODE and the  "array" function, FREQUENCY.  You should also be able to use Excel's graphing capabilities to use the frequency values you computed to create a Histogram.

In addition you should know how to enter a section of a COLUMN of numbers [as in B4:B9 for rows 4 through 9 of column B] as the argument of a function [as in =MEAN(B4:B9), which computes the mean of the numbers from B4 through B9].

Assignment

OVERVIEW: Among other things, in this problem we will compute a number descriptive statistics of a certain data set by two distinct methods: BY HAND and using Excel.  The reason that we are doing this twice is to make sure each is correct.  Only when there is agreement with both methods should you have confidence that you did the Lab correctly.  The data set was chosen to be rather small just so that you can do the computations BY HAND.

When you do Excel 2, the data sets will be enormous, and so you'll only be able to do the required analysis with Excel.  If you were successful with Excel 1, then you should be confident that your Excel analyses are accurate when you do Excel 2.

Here, then, is the assignment for this lab: Consider the follow data set consisting of the seven numbers: 
             5, 6, 6, 6, 8, 3, 1

(1) By HAND, compute (and show all of your work and intermediary steps) the Mean [=Average], Median, Mode, and (Population) Standard Deviation of this data set.

(2) BY HAND, Using the bins (- infinity,3], (3,6], (6,9], (9,12], (12,infinity), construct the frequency table for this data.

(3) BY HAND, draw the Histogram for this frequency table.  Label each axis and each column carefully.

(4) Consider the last entry, 1, of this data set.  Change its value so that the mean is larger than median. Explain why your answer is correct. [Hint, make a guess. If you are right, then you are done; if you are not right, make another guess.  With some luck you'll either get lucky or figure out how to figure it out.]

(5) Using the original data set, {5, 6, 6, 6, 8, 3, 1}, consider, again, the last entry, 1, of this data set.  Change its value so that the mean is smaller than median.  Explain why your answer is correct

(6) [Harder]  Using the original data set, { 5, 6, 6, 6, 8, 3, 1}, consider, again, the last entry, 1, of this data set.  Change its value so that the median is equal to the mean.  Explain why your answer is correct

(7) Using the original data set, { 5, 6, 6, 6, 8, 3, 1}, consider, again, the last entry, 1, of this data set.  Change its value so that the (population) standard deviation is more than 10.  Explain why your answer is correct

(8) Create an Excel Spreadsheet with the following three columns:

  1. The data set [Excel calls this a Data_array] listed above (so it will be a column of length 7).  [You just need to TYPE the entries of this column.]
  2. A "Bins_array" for the bins (infinity,3], (3,6], (6,9], (9,12], (12,infinity).  HINT: The column starts with 3 and ends in 12.  [Again, you just need to TYPE the (appropriate) entries of this column.]
  3. Using Excel's FREQUENCY function, have Excel create a :"Frequency" Column right next to the Bins_array of exactly one row longer [so it will have 5 rows.].  

NOTE:  Excel, in THIS case, will put a 0 in this fifth row of the frequency column: WHY?  Also, label each column with a text label to document the contents of each column.

NOTE:  The resulting Frequency Column you get should agree with your answer for part (2).  If not, either you made a mistake in the BY HAND part or in the Excel part.

(9) Use Excel's built-in functions: MEAN, MEDIAN, MODE, STDEVP to compute the mean, median, mode and standard deviation of this data set.  Make sure you use some text entries to label each of these four numbers.

NOTE: The answers you get should, as above, agree with your answers to part (1).  If not, either you made a mistake in the BY HAND part or in the Excel part.

(10) Using the resulting Frequency Column, use Excel's graphing capabilities draw a HISTOGRAM for you.  Put his histogram on the same Excel sheet as the data.

How to Hand In

For questions 1-7, hand in your results on sheets of paper, to your seminar instructor.

[NOTE: When you write up these homework problems, you MUST use words and sentences.  One cannot explain mathematics with just strings of symbols.  You need to say things like: "The MEAN is defined to be the sum of the data values divided by the number of data values.  So, in our case, we get ...... (show the steps) the answer _____."]

Since questions 8, 9, and 10 ask you to create a spreadsheet, you'll need to get this digital spreadsheet to your instructor.

You'll have to EITHER give your instructor a disk or e-mail this spreadsheet (as an attachment) to your seminar instructor.  FIND OUT FROM YOUR PARTICULAR INSTRUCTOR THE METHOD TO USE.

Make sure you include on the spreadsheet:

Here is a sample format of your spreadsheet:

Point Values

Excel 1 is worth 20 points.


How To Use Excel to Make a Frequency Table

It's a bit of a trick to enter the FREQUENCY function, as you have to 

  1. Put the data that your are analyzing in a column.  This column is called the Data_array.

  2. Create column of numbers that has the bin boundaries in it.

  3. Highlight (with the mouse) the column NEXT to that column but of length ONE LONGER, so that the highlight starts level with the top of the bin list but extends one more row down.

  4. Go to the menus on top of the Excel page and choose:  Insert, Function and pick Statistical from the left section of the dialog box.  Then pick FREQUENCY from the right dialog box and choose ok.

  5. A new dialog box opens up: type the range for your data in the "Data_array" spot and type the range for your bins in the "Bins_array" spot.  Remember, the format for a range is A4:A23, if, for example, the data starts in cell A4 and extends through cell A23.

  6. Here is the tricky part:  You DON'T choose OK next.  

  7. Instead you use the keys "Control-Shift-Enter."  [Press, first, the Control and Shift keys and with them held down, press Enter.  When you've done that, you'll get the frequencies entered right next to their appropriate bins, just where you highlighted.]

Sample Frequency Table

For a bin of the form (a,b]  - that is "bigger than a and less than or equal to b" - having N items in that bin:

recall: the "by hand" table should look like:

Bin Boundaries

Number of Items in Bin
[Frequency]
Bigger than Less than or equal to
... ...  
a b

N

... ...  

and that "Excel" expects this table to be transformed into the following form:

Bins Frequency
...  
a  
b

N

...

 

[The point is, Excel does not understand the "bigger than" and "the less than or equal to" columns in the "by hand" version.  Please look at the "All but the Histogram"  tab on THIS spreadsheet (that was used in the presentation)  for a sample excel frequency table.]

NOTE: There is a bit of a complication with the first and last rows of the frequency table.

The FIRST ROW of the BY HAND table should look like:

Bin Boundaries Number of Items in Bin
[Frequency]
Bigger than Less than or equal to 
- infinity x number of values less than or equal to x

The corresponding FIRST ROW of the table Excel makes should look like:

Bins Frequency
x

number of values less than or equal to x

The LAST ROW of the BY HAND table should look like:

Bin Boundaries Number of Items in Bin
[Frequency]
Bigger than Less than or equal to 
y infinity number of values greater than y

The corresponding LAST ROWS of the table Excel makes should look like:

Bins Frequency
y  
[leave blank!!]

number of values greater than y

Sample of Spreadsheet to Hand In

Fill in the x's.  Some of the x's you just type in (like the Data Set) and some Excel computes (like the frequencies, mean, median, mode and standard deviation.)  Of course, Excel draws the histogram.

SAMPLE SPREADSHEET FOR Excel 1

NAME: x
INSTRUCTOR: x
Data Set Bins_Array Frequency
x x x
x x x
x x x
x x x
x x
x
x
mean x
median x
mode x
std dev x