Project 1: Stats

Finding the mean and median of numerical data

Educational Objectives: After successfully completing this assignment, the student should be able to accomplish the following:

Operational Objectives: Create a project that computes the mean and median of a sequence of integers received via standard input.

Deliverables: Files: stats.h, stats.cpp, main.cpp, log.txt. Note that with the supplied makefile these files constitute a self-contained project.

Assessment Rubric: The following will be used as a guide when assessing the assignment:

build test.x                          [0..4]:   x
test.x < data1.in  [supplied main()]  [0..4]:   x
test.x < data2.in  [supplied main()]  [0..4]:   x
build stats.x                         [0..4]:   x
stats.x < data1.in [student main()]   [0..4]:   x
stats.x < data2.in [student main()]   [0..4]:   x
code quality                        [-20..6]:  xx  # note negative points awarded during assessment
dated submission deduction [(2) pts per]:     (xx) # note negative points awarded during assessment
                                               --
total                                [0..30]:  xx

Please self-evaluate your work as part of the development process.

Background

Given a finite collection of n numbers:

  1. The mean is the sum of the numbers divided by n, and
  2. The median is the middle value (in case n is odd) or the average of the two middle values (in case n is even).

Note that to find the median of a collection of data, it is convenient to first sort the data, that is, put the data in increasing (or non-decreasing) order. Then the median is just the middle datum in the sorted sequence (or the average of the two middle data, if the number of data is even).
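The two definitions can be sketched in a few lines of C++. This is an illustration of the definitions only, not the required project code: the names MeanOf and MedianOf are hypothetical, the sketch uses std::vector and std::sort rather than the array-based functions and Insertion Sort required below, and it assumes the collection is non-empty.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not one of the required project functions).
// Mean = sum / n, with the division done in floating point so that
// the fractional part is not truncated as it would be with int / int.
double MeanOf(const std::vector<int>& data)
{
  assert(!data.empty());              // the mean of zero numbers is undefined
  double sum = 0.0;
  for (int x : data) sum += x;
  return sum / data.size();
}

// Hypothetical helper. Takes the vector by value, so the sort works on a
// copy and the caller's data keeps its original order.
double MedianOf(std::vector<int> data)
{
  assert(!data.empty());
  std::sort(data.begin(), data.end());
  const std::size_t n = data.size();
  if (n % 2 == 1) return data[n / 2];                 // odd: the middle value
  return (data[n / 2 - 1] + data[n / 2]) / 2.0;       // even: average of the two middle values
}
```

For the data 4 1 3 2, the sorted order is 1 2 3 4, so both the mean (10 / 4) and the median ((2 + 3) / 2) are 2.5. Dividing by 2.0 rather than 2 matters: with integer operands the result would be truncated to 2.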

One of the more intuitive sort algorithms is called Insertion Sort, which operates on an array a[0..n-1] of elements. The idea is to "insert" the value of a[i] into the sub-array a[0..i-1] at the largest possible index that results in the expanded sub-array a[0..i] sorted. We insert at the highest possible index in order not to place the value ahead of any previously inserted elements with the same value. The subarray a[0..i-1] is assumed to be sorted at the beginning of each insertion step. The base case consists of a one-element array a[0..0], which is always sorted.

Here is a "pseudocode" description of the algorithm:

for (i = 1; i < n; ++i)
  t = a[i];            // remember the value in a[i]
  starting at j = i
  while j > 0 and t < a[j - 1]
    copy a[j-1] to a[j]
    j = j - 1
  end while            // now j == 0 or t >= a[j - 1]
  copy t into a[j]
end for

The inner loop copies each element of a[0..i-1] that is greater than t up one index, stopping when the correct place for t is found. The value t is then stored in that place.
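The remark about inserting at the highest possible index can be made concrete with a small sketch. The Record type and the function names here are hypothetical and are not part of the project; the sketch sorts records by an integer key and carries a tag along so that elements with equal keys can be told apart.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record type: sorted by key, with a tag to tell equal keys apart.
struct Record
{
  int  key;
  char tag;
};

// Insertion Sort on the key field only, following the pseudocode above.
void InsertionSortByKey(std::vector<Record>& a)
{
  for (std::size_t i = 1; i < a.size(); ++i)
  {
    const Record t = a[i];                    // remember the value in a[i]
    std::size_t j = i;
    while (j > 0 && t.key < a[j - 1].key)     // strict <, so t never passes an equal key
    {
      a[j] = a[j - 1];                        // copy the larger element up one index
      --j;
    }
    assert(j == 0 || !(t.key < a[j - 1].key)); // t belongs at index j
    a[j] = t;
  }
}

// Collects the tags in array order into a string.
std::string Tags(const std::vector<Record>& a)
{
  std::string s;
  for (const Record& r : a) s += r.tag;
  return s;
}
```

Sorting the records (2,a) (1,b) (2,c) (1,d) with this sketch leaves the tags in the order bdac: b stays ahead of d and a stays ahead of c, so records with equal keys keep their original relative order. Replacing the strict < with <= would still sort the keys but would reverse those pairs.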

Procedural Requirements:

  1. Begin a log file named log.txt. This should be an ASCII text file in cop3330/proj1 with the following header:

    log.txt # log file for Stats project
    <date file created>
    <your name>
    <your CS username>
    

    This file should document all work done by date and time, including all testing and test results.

  2. Create and work within a separate subdirectory cop3330/proj1. Review the COP 3330 rules found in Introduction/Work Rules.

  3. Copy these files

    LIB/proj1/makefile
    LIB/proj1/deliverables.sh
    LIB/scripts/submit.sh
    

    from the course distribution library into your project directory.

  4. Create three more files

    stats.h
    stats.cpp
    main.cpp
    

    complying with the Technical Requirements and Specifications stated below.

  5. Turn in four files stats.h, stats.cpp, main.cpp, and makefile using the submit script.

    Warning: Submit scripts do not work on the program and linprog servers. Use shell.cs.fsu.edu to submit projects. If you do not receive the second confirmation with the contents of your project, there has been a malfunction.

  6. After submission, take Quiz 1 in Blackboard. This quiz covers these areas:

    1. Casting; integer and floating point arithmetic.
    2. Function calls
    3. Loops
    4. This assignment
    5. Course Syllabus

    Note that the quiz may be taken several times. The highest grade will be recorded and will count as 20 points (40 percent of the assignment).

Technical Requirements and Specifications

  1. The project should compile error- and warning-free on linprog with the command make stats.x.

  2. The number of integers input by the user is not known in advance, except that it will not exceed 100. Numbers are input through standard input, either from the keyboard or by file redirect. The program should read numbers until a non-digit or end-of-file is encountered or 100 numbers have been read.

  3. Once the input numbers have been read, the program should calculate the mean and median and then report these values to standard output.

  4. The source code should be structured as follows:

    1. Implement separate functions with the following prototypes:
      float Mean   (const int* array, size_t size); // calculates mean of data in array
      float Median (int* array, size_t size);       // calculates median of data in array
      void  Sort   (int* array, size_t size);       // sorts the data in array
      
    2. I/O is handled by function main(); no other functions should do any I/O
    3. Function main() calls Mean() and Median()
    4. Function Median() calls Sort()

  5. The source code should be organized as follows:

    1. Prototypes for Mean, Median, and Sort should be in file stats.h
    2. Implementations for Mean, Median, and Sort should be in file stats.cpp
    3. Function main should be in file main.cpp

  6. The Sort() function should implement the Insertion Sort algorithm.

  7. When in doubt, your program should behave like the distributed executable example stats_i.x in area51. Identical behavior is not required, but the general I/O behavior should be the same. In particular, the data input loop should not be interrupted by prompts for a next datum, because such prompts make file redirect cumbersome. Just ask for the data one time, then read until a non-digit or end of file is encountered.
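The input behavior described in requirements 2 and 7 can be sketched as a single loop that stops on whichever condition comes first: the array is full, end of file is reached, or the next token is not an integer. The function name ReadData and its parameters are hypothetical and are used only so the loop can be tried on a string; in the project itself the loop would sit directly in main(), since no other function may do I/O.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <sstream>   // only needed to try the loop on a string instead of std::cin

// Hypothetical helper, for illustration only (see the note above).
// Reads integers into data[0..capacity-1] and returns how many were stored.
std::size_t ReadData(std::istream& in, int* data, std::size_t capacity)
{
  assert(data != nullptr);
  std::size_t count = 0;
  // No prompt inside the loop. The extraction in >> data[count] fails,
  // ending the loop, at end of file or at the first token that cannot
  // be read as an integer (for example a letter).
  while (count < capacity && (in >> data[count]))
  {
    ++count;
  }
  return count;
}
```

Called as ReadData(std::cin, data, 100) after a single prompt, the same loop serves both keyboard entry and file redirect. With a std::istringstream holding "10 20 30 x 40", it stores 10, 20, and 30 and returns 3.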

Hints