Draft now open for comment in the discussion forum.

Homework 5: String Objects

Educational Objectives: After completing this assignment the student should have the following knowledge, ability, and skills:

Operational Objectives: Supply missing code to complete the implementation of the class fsu::String. Fully test the class with fstring.cpp, fstringb.cpp, and stringsort.cpp.

Deliverables: Two files xstring.cpp and stringsort.cpp.

Expandable String Objects

The class String is the official fsu string object tool. The new type fsu::String will have these features:

Implementation Plan

The implementation plan for fsu::String uses two private variables:

size_t size_;
char * data_;

The class has the responsibility for maintaining these variables in a consistent manner and for allocating and de-allocating resources (i.e., memory assigned to data_). In general, a String object will have size_ + 1 bytes allocated and will maintain null-termination of data_, which leaves size_ elements of data_ where client characters can be stored.

There is a safe allocator method intended to be used by constructors, copy constructor, assignment operator, the input operator, and any other methods or functions that need to create a new object:

static char* NewStr(size_t size)

This method returns a pointer to allocated memory after (1) checking that the allocation request was fulfilled and (2) null-terminating the allocated character array. The destructor de-allocates this resource.
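
As a rough illustration of the allocator contract just described, the sketch below writes NewStr as a free function so that it stands alone; in the assignment it is a static member of fsu::String. The use of std::nothrow, the wording of the error message, and the call to std::exit are assumptions made for this sketch. The distributed xstring.h and xstring.cpp-partial determine the actual signature and failure behavior.

```cpp
#include <cstddef>
#include <cstdlib>
#include <iostream>
#include <new>

// Hypothetical stand-alone version of the safe allocator.
char* NewStr(std::size_t size)
{
  // room for size client characters plus one byte for the null terminator
  char* p = new (std::nothrow) char[size + 1];
  if (p == nullptr)  // (1) check that the allocation request was fulfilled
  {
    std::cerr << "** NewStr: memory allocation failure\n";  // assumed policy
    std::exit(EXIT_FAILURE);
  }
  p[size] = '\0';  // (2) null-terminate the allocated array
  return p;
}
```

Memory obtained this way is released with delete [], which is the job the text assigns to the destructor. Note that only the last byte is set here; the size client characters are left for the caller to fill.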

I/O Operator Expectations

The overloaded input operator for a type T has this prototype:

std::istream& operator >> (std::istream& is, T& t);

and its implementation should always conform to these expectations:

  1. First skip clear space until a visible character is encountered.
  2. Read visible characters until either the end of file is encountered or the next character cannot be interpreted as part of an object of type T. In the case of strings, this translates to: read visible characters until either end of file or a clear space character is encountered. The string so read is called a "token".
  3. Construct a T object from this token and place it in t.
  4. Return the stream is without removing any characters past the last token character.
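
The four steps above can be sketched for a small stand-in type. Token and CountTokens are hypothetical names introduced only for this illustration (they are not part of fsu::String), and the choice to set failbit when no token is found is an assumption that mirrors how the standard extractors behave.

```cpp
#include <cctype>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical token type used only for this sketch.
struct Token
{
  std::string text;
};

std::istream& operator>>(std::istream& is, Token& t)
{
  const int eof = std::istream::traits_type::eof();

  // 1. skip clear space until a visible character (or end of file) is found
  int c = is.peek();
  while (c != eof && std::isspace(c))
  {
    is.get();
    c = is.peek();
  }

  // 2. read visible characters until end of file or clear space
  std::string token;
  while (c != eof && !std::isspace(c))
  {
    token.push_back(static_cast<char>(is.get()));
    c = is.peek();
  }

  // 3. construct the object from the token
  //    (assumption: finding no token marks the read as failed)
  if (token.empty())
    is.setstate(std::ios_base::failbit);
  else
    t.text = token;

  // 4. the character after the token was only peeked, so it stays in the stream
  return is;
}

// Usage helper: counts the tokens in a block of text.
std::size_t CountTokens(const std::string& text)
{
  std::istringstream in(text);
  Token t;
  std::size_t n = 0;
  while (in >> t) ++n;
  return n;
}
```

Using peek() before each get() is one way to satisfy step 4, because the character that ends the token is inspected without being removed.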

Similarly, the overloaded output operator for a type T has this prototype:

std::ostream& operator << (std::ostream& os, const T& t);

and its implementation should output a token representation of the object of type T.

The two I/O operators should be mutually consistent in the sense that outputting the object t to a token string with

os << t;

followed by inputting the token string with

is >> t;

will result in reconstructing the object t as it was before output, and inputting to a token string with

is >> t;

followed by outputting with

os << t;

will result in the same token representation of a T object.
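
The mutual consistency requirement can be checked mechanically. The sketch below is one possible check, using std::string and int as stand-ins for T; RoundTrips and Reprint are hypothetical helper names, and the sketch assumes T is default constructible and supports operator ==.

```cpp
#include <sstream>
#include <string>

// Output t to a token string, read it back, and compare with the original.
template <typename T>
bool RoundTrips(const T& original)
{
  std::stringstream buffer;
  buffer << original;  // os << t
  T copy;
  buffer >> copy;      // is >> t
  return !buffer.fail() && copy == original;
}

// Input from a token string, then output again; returns the new token string.
template <typename T>
std::string Reprint(const std::string& token)
{
  std::istringstream in(token);
  T t;
  in >> t;             // is >> t
  std::ostringstream out;
  out << t;            // os << t
  return out.str();
}
```

A std::string that contains a space does not survive the first check, because the input operator stops at the first clear space character. This is why the output operator is expected to write a token representation.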

MergeSort Algorithm

The file mergesort.h contains a template function with the following interface (prototype):

template <class RAIterator>
void fsu::g_merge_sort(RAIterator beg, RAIterator end);

This is a generic algorithm in the technical sense that it is a template function whose template arguments are intended to be "iterators" (in this case, random access iterators). The template parameter name RAIterator is chosen as a self-documenting reminder of the type expected by the algorithm.

You already have some limited experience with two kinds of random access iterators: (1) pointers and (2) std::vector iterators.

  1. Pointers are often used to "iterate" through the elements of an array. For example:
    const size_t size = 20;
    int a [size];
    ...
    for (int* i = a; i != a + size; ++i)
    {
      // whatever
    }
    is a loop that traverses the array a. Here the loop counter i is a pointer and the increment ++i is "pointer arithmetic". Note also that a is (the beginning address of) the array and that a + size is the address that is "one past the end" of the array. All this comes together in how you would call merge sort to sort the array:
    // code setting up the array a
    ...
    fsu::g_merge_sort(a, a + size); // sorts the range [a, a+size)
    // now a is sorted
    ...
    Note how we pass a "range" in the half-open convention to the generic algorithm, which then works its magic on the range you give it. The client programmer (i.e., you) has the responsibility to ensure that the expectations of the generic algorithm are met. The compiler will enforce this on you.
  2. The other experience you have with random access iterators is from the MergeSort assignment. Here we used a standard vector to hold data to be sorted and handed iterator arguments to merge sort.
    // code setting up the vector v
    ...
    fsu::g_merge_sort(v.begin(), v.end()); // sorts the range [v.begin(), v.end())
    // now v is sorted
    ...
    In this usage the standard vector has methods begin() and end() that return "iterators" pointing to the beginning and one-past-the-end of the vector range. Again we invoke the generic algorithm by passing a range in the half-open convention and allow the algorithm to work on the range specified.
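
To make the two calling conventions concrete without relying on the contents of mergesort.h, the sketch below defines a stand-in generic merge sort. The namespace sketch, the function merge_sort, and the helper BothCallStylesAgree are hypothetical names, and the use of std::inplace_merge for the merge step is a choice made for this sketch rather than a description of how fsu::g_merge_sort is written.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

namespace sketch  // hypothetical namespace, not the course library
{
  // Stand-in generic merge sort over the half-open range [beg, end).
  template <class RAIterator>
  void merge_sort(RAIterator beg, RAIterator end)
  {
    typename std::iterator_traits<RAIterator>::difference_type n = end - beg;
    if (n < 2) return;                  // 0 or 1 elements: already sorted
    RAIterator mid = beg + n / 2;       // random access iterator arithmetic
    merge_sort(beg, mid);               // sort [beg, mid)
    merge_sort(mid, end);               // sort [mid, end)
    std::inplace_merge(beg, mid, end);  // merge the two sorted halves
  }
}

// Applies both call styles from the text to the same data.
bool BothCallStylesAgree()
{
  const std::size_t size = 6;
  int a[size] = {5, 2, 9, 1, 5, 6};
  std::vector<int> v(a, a + size);

  sketch::merge_sort(a, a + size);         // pointer range [a, a + size)
  sketch::merge_sort(v.begin(), v.end());  // iterator range [v.begin(), v.end())

  return std::is_sorted(a, a + size) && std::equal(a, a + size, v.begin());
}
```

The same template body serves both calls because it uses only operations that pointers and vector iterators share: subtraction, addition of an offset, and dereferencing inside the merge step.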

A String Sort Application

We are getting to the point that we have significantly useful technology. The fsu::String class has the less-than operator overloaded (to mean lexicographical order), and fsu::g_merge_sort() is a generic sort algorithm that should apply to both vectors and arrays of such objects. We can now create a very simple application to sort strings in a file and write the result to another file:

stringsort.x infile outfile

reads the strings from infile, sorts them, and then writes them in order to outfile. Depending on user preference, this program could be changed to use OS redirect at the command line instead of command line arguments, operating like this:

stringsort.x < infile > outfile

Either way, stringsort.x can be a useful desktop tool.
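
The overall shape of such a tool can be sketched with standard library stand-ins. In the sketch below std::string takes the place of fsu::String and std::sort takes the place of fsu::g_merge_sort; SortStream and SortFiles are hypothetical names, and writing one string per line is an assumed output format.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Reads clear-space-delimited strings from in, sorts them, and writes them to out.
void SortStream(std::istream& in, std::ostream& out)
{
  std::vector<std::string> v;
  std::string s;
  while (in >> s) v.push_back(s);  // one token per read

  std::sort(v.begin(), v.end());   // stand-in for the single generic sort call

  for (std::size_t i = 0; i < v.size(); ++i)
    out << v[i] << '\n';           // assumed format: one string per line
}

// File-name form; returns false if either file cannot be opened.
bool SortFiles(const char* infile, const char* outfile)
{
  std::ifstream in(infile);
  if (!in) return false;
  std::ofstream out(outfile);
  if (!out) return false;
  SortStream(in, out);
  return true;
}
```

Writing the core in terms of std::istream and std::ostream lets one function serve both usages: a main function can pass std::cin and std::cout for the redirect form, or open the two named files for the command line argument form.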

Procedural Requirements

  1. Create and work within a separate subdirectory cop3330/hw5. Review the COP 3330 rules found in Introduction/Work Rules.

  2. Begin by copying the following files from the course directory /home/courses/cop3330p/fall08/ into your hw5 directory:

    hw5/xstring.h              // complete header file for fsu::String 
    hw5/xstring.cpp-partial    // partial implementation file
    hw5/fstring.cpp            // command-line functionality test
    hw5/fstringb.cpp           // command-file functionality test
    hw5/mergesort.h            // g_merge_sort
    

    These files contain the header file in which class fsu::String is defined, a start for the implementation file xstring.cpp, two functionality test clients for class fsu::String (one is designed to take a file of commands), and the generic sort algorithm discussed above.

  3. Copy the distribution file xstring.cpp-partial to xstring.cpp.

  4. Modify the file xstring.cpp according to the requirements and specifications below.

  5. Create a string sort application stringsort.cpp using the ideas above. This should read strings from standard input into an array, sort the array, then write the strings to standard output. Use one call to g_merge_sort to sort the array. Be sure the program works with command line file names as shown above. Keep it simple!

  6. Test your code thoroughly using fstring.cpp, fstringb.cpp, and stringsort.cpp.

  7. Turn in two files xstring.cpp and stringsort.cpp using the hw5submit.sh submit script.

    Warning: Submit scripts do not work on the program and linprog servers. Use shell.cs.fsu.edu to submit projects. If you do not receive the second confirmation with the contents of your project, there has been a malfunction.

Code Requirements and Specifications: xstring.cpp

  1. Your implementation file xstring.cpp should use the distributed header file xstring.h (not a modified version).

  2. Your implementation file should use the function headers as distributed in xstring.cpp-partial.

  3. Your code should compile without errors or warnings using the command:

    g++ -c -I. -Wall -Wextra xstring.cpp
    

  4. Your code should function correctly with any (legal) sequence of commands for the test harnesses.

Code Requirements and Specifications: stringsort.cpp

  1. The command:

    stringsort.x infile outfile
    

    should result in outfile being a sorted permutation of the strings in infile.

  2. Stringsort should call the template sorting function g_merge_sort from mergesort.h using one of these two calls:

    #include <mergesort.h>
    ...
    fsu::g_merge_sort(a, a + n);             // array case
    fsu::g_merge_sort(v.begin(), v.end());   // std::vector case
    ...
    

    where a is an array of strings or v is a std::vector of strings, respectively. The choice of using arrays or vectors is yours.

Hints