Project 5: ADT-Based Iterators

ADT-Based Iterators for BST-based OAA and Map.

Revision dated 06/24/18: signatures for InorderIterator support corrected.

Educational Objectives: After completing this assignment, the student should be able to accomplish the following:

Describe and explain in detail the concept of forward and bidirectional iterators on Set and Map container classes.
Implement Stack-based bidirectional "in-order", "pre-order", and "post-order" iterators for binary trees
Implement Queue-based forward "level-order" iterators for binary trees
Implement threaded BSTs and associated threaded iterators
Explain the utility of iterators on Sets and Maps
Give examples of Set and Map applications that are simple with iterators but difficult without iterators.
Explain what the runtime and other efficiency considerations are between RBLL Trees and BSTs.
Explain what the runtime and other efficiency considerations are between ADT-based iterators and thread-based iterators.

=======================================================================
Rubric used in assessment
-----------------------------------------------------------------------
builds
 student makefile                                          [0..5]:   x
 assess  makefile                                          [0..5]:   x
tests:
 fmap.x  [OAA operations Put, Get, Retrieve, Erase]        [0..8]:   x
 fmap.x  [Iterator operations, Includes]                   [0..8]:   x
 fmap.x  [standard traversals, Analysis]                   [0..8]:   x
 wb3.x   [english text only]                               [0..8]:   x
 mmap.x  [hammer test]                                     [0..8]:   x
other:
 log & testing report                                    [-50..0]: ( x)
 requirements and specs                                  [-50..0]: ( x)
 software engineering                                    [-50..0]: ( x)
 dated submission deduction                          [2 pts each]: ( x)
                                                                    --
total:                                                    [0..50]:  50
=======================================================================

Background Knowledge Required: Be sure that you have mastered the material in these chapters before beginning the assignment:
Iterators, Introduction to Sets, Introduction to Maps, Binary Search Trees, Balanced BSTs, and BST Iterators.

Operational Objectives: Implement class templates ConstInorderMapIterator, InorderMapIterator, and LevelorderMapIterator and use these classes to complete the implementations of the class template Map_BST.

Deliverables:

map_bst_rec_adt.h # Map_BST class template
mapiter_adt.h     # Map_BST::Iterator class templates
wordbench3.h      # refactors wordbench2 using fsu::Map_BST
wordbench3.cpp    #     "         "        "        "
wordify.cpp       # same file as used for previous project - submitted for convenience
makefile.wb3      # makefile builds fmap.x, mmap.x, and wb3.x
log.txt           # your project work log

Discussion

This project explores the addition of Iterator classes associated with the Ordered Associative Array (OAA) class that was the subject of the previous project. Recall that in the previous project we did not have iterators, but nevertheless created a serviceable Map-like container supporting the associative array API (Put, Get, Retrieve). We had to go to extraordinary lengths to obtain a useful traversal, and we were handicapped by the lack of an equality operator among OAA objects. We had no way to make sense of the fundamental Table operation Includes. (But Retrieve is a useful work-around.) Features we will easily obtain using iterators include the following:

Includes method, returning an Iterator object
Operators ==() and !=() defined among Map objects
The Standard Traversal is operational

And we may put aside the special methods for traversals and Display and their recursive implementations. The two APIs are compared in the following table.

Associative Array API
&data = Get(key)          // returns &data stored at key; ensures key exists in table
Put(key,data)             // unimodal insert of (key,data) pair; alias for Insert
&data = operator[](key)   // AA bracket operator; alias for Get

Table API
Insert(key,data)          // unimodal insert of (key,data); alias for Put
iter = Includes(key)      // returns iter to key if found, End() otherwise (const and non-const)
Begin() and End()         // supporting bidirectional iterators (const and non-const versions)

The APIs have these in common:
bool Retrieve(key,&data) const             // if true, &data is a copy of data stored with key
Remove(key) or ( Erase(key) and Rehash() ) // Erase implements "lazy removal" (dead/alive flag), Rehash reconstructs the tree with dead nodes removed

Both APIs are equipped with the usual container boiler plate: Empty, Size, Clear, Constructors, Destructor, operator=

In practice, one has either OAA nicknamed "Table Lite", the iterator-free associative array, or Map nicknamed "Full Table" or "Dictionary", which includes both Associative Array and Table APIs.

One may wonder about code bloat - throwing unwanted operations into the executable code. A big advantage of using templates is that a function template is not translated to object code unless it is actually used in the program. For class templates, this means that member functions that are never called are not instantiated and add nothing to the executable. This makes it sensible and convenient to have both the Associative Array API and the Table API supported by the Map container.
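The point about unused member functions can be checked with a small experiment. The sketch below is only an illustration: Opaque, Holder, and the two helper functions are hypothetical names, not part of the course library. Holder<T>::Smaller requires operator< on T, yet Holder<Opaque> compiles because Smaller is never called for that type.

```cpp
#include <cassert>

// Hypothetical type with no operator< (illustration only).
struct Opaque
{
  int id_;
};

// Hypothetical class template: Smaller() needs T to support operator<,
// Item() does not.
template < typename T >
class Holder
{
public:
  explicit Holder (const T& t) : item_(t) {}
  bool     Smaller (const Holder& h) const { return item_ < h.item_; }
  const T& Item    () const { return item_; }
private:
  T item_;
};

// Holder<Opaque>::Smaller is never called, so it is never instantiated
// and the missing operator< for Opaque causes no compile error.
inline int OpaqueId ()
{
  Opaque o = {7};
  Holder<Opaque> h(o);
  return h.Item().id_;
}

// Holder<int>::Smaller is called here, so it is instantiated for int.
inline bool IntSmaller ()
{
  Holder<int> a(1), b(2);
  return a.Smaller(b);
}

// Usage: both helpers compile and behave as described.
inline void LazyInstantiationDemo ()
{
  assert(OpaqueId() == 7);
  assert(IntSmaller());
}
```

Adding a call such as h.Smaller(h) inside OpaqueId would turn the missing operator< into a compile error, which is the same mechanism that keeps unused Map methods out of the executable.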

Varieties of Iterators

With a sequential structure such as a list, there is one obvious way for an iterator to go through the elements, from front to back. In a Set or Map structure, the "correct" order is neither unique nor obvious. When using any of the BST implementations, any kind of traversal might be used to define iterators, and we have at least 4: Inorder, Preorder, Postorder, and Levelorder. We will use all of these defined as external or "ADT-based" iterator classes. The native iterator class will be thread-based and have the advantage over the stack-based InorderIterator in that it requires +O(1) memory and is very fast. Please be familiar with the chapter on tree iterators before diving deeper into this project.

Map implementations also present the issue of what it means to de-reference an iterator. Sometimes we want the key, sometimes the data, and sometimes both. The solution is to package the (key,data) pair in a single object called an entry. An Entry object is similar to a Pair, except: (1) the names of the data are key_ and data_ (rather than first_ and second_), and (2) the key_ is a constant, so that it can never be changed. Mimicking the terminology in the standard library, we name the internal type to be returned by a dereferenced iterator "ValueType".
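To make the Entry idea concrete, here is a minimal sketch of such a class. It is a hypothetical stand-in named EntrySketch, not the real fsu::Entry from entry.h, and it assumes the key type supports operator==. The assignment operator follows the behavior described in the Development Guide: it never copies the key and copies the data only when the keys agree. What the real class does when the keys differ is not specified here; this sketch simply leaves the target unchanged.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for fsu::Entry<K,D> (illustration only).
template < typename K , typename D >
class EntrySketch
{
public:
  const K key_;   // constant: cannot be changed after construction
  D       data_;  // freely modifiable by clients

  EntrySketch (const K& k, const D& d) : key_(k), data_(d) {}
  EntrySketch (const EntrySketch& e)   : key_(e.key_), data_(e.data_) {}

  // Assignment never copies the key; data is copied only when keys agree.
  EntrySketch& operator= (const EntrySketch& e)
  {
    if (this != &e && key_ == e.key_)
      data_ = e.data_;
    return *this;
  }
};

// Usage: assignment between entries with equal and with different keys.
inline void EntryDemo ()
{
  EntrySketch<std::string,int> a("cat", 1), b("cat", 5), c("dog", 9);
  a = b;                 // same key: data is copied
  assert(a.data_ == 5);
  a = c;                 // different key: a is left unchanged
  assert(a.key_ == "cat" && a.data_ == 5);
  a.data_ = 2;           // direct data update is always legal
  assert(a.data_ == 2);
  // a.key_ = "dog";     // would not compile: key_ is const
}
```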

The two varieties of iterator we will implement embody two of the fundamental algorithms of computer science: Inorder iterators follow Depth First Search [DFS] from the root of the tree, and Levelorder iterators follow Breadth First Search [BFS] from the root of the tree. We implement DFS with a stack controller and BFS with a queue controller.

Procedural Requirements

The official development/testing/assessment environment is specified in the Course Organizer.
Create and work within a separate subdirectory cop4530/proj5.

Begin by copying all files in the directory LIB/proj5 into your proj5 directory. At this point you should see these files (at least) in your directory:

deliverables.sh
fmap.cpp               # test harness for Map
mapiter_adt.start      # start for mapiter_adt.h
map_bst_adt_tools.cpp  # slave file for map_bst_rec_adt.h
main_wb3.cpp           # 3rd time's a charm for wordbench
mmap.cpp               # hammer tester
rantable.cpp           # create test files to be loaded by fmap.x

Then copy these relevant executables from LIB/area51/:

fmap_i.x
mmap_i.x
wb3_i.x

Create the deliverables

map_bst_rec_adt.h
mapiter_adt.h
wordbench3.h
wordbench3.cpp
makefile.wb3
log.txt

satisfying the requirements and specifications below.

Test thoroughly, using the area51 executables as benchmarks. (See Hints on testing.)
Submit the assignment using the command submit.sh.

Warning: Submit scripts do not work on the program and linprog servers. Use shell.cs.fsu.edu to submit assignments. If you do not receive the second confirmation with the contents of your assignment, there has been a malfunction.

Development Guide

Getting started.
1. Begin by copying several of your proj4 files, with name changes, as follows (assuming you are in directory proj5):
```
    source                  destination
    ------                  -----------
cp ../proj4/oaa.h           map_bst_rec_adt.h
cp ../proj4/wordbench2.h    wordbench3.h
cp ../proj4/wordbench2.cpp  wordbench3.cpp
cp ../proj4/wordify.cpp     .
cp ../proj4/foaa.cpp        .
```
  All subsequent modifications should be to these files in proj5.
2. Replace the include statements "#include<oaa.h>" in foaa.cpp and wordbench3.h with "#include<map_bst_rec_adt.h>". foaa.cpp will serve as a temporary test for map_bst_rec_adt until the transition to packaged data is complete.
3. Next, in map_bst_rec_adt.h, in the definition of Node, replace the two naked data fields with a packaged pair as follows:
```
    class Node
    {
      ...
      // const KeyType   key_;
      //       DataType  data_;
      fsu::Entry<K,D>    value_;
      ...
    };
```
  Also add "#include <entry.h>" to your include statements near the top of the file. This change in the way data is housed in a link will require some minor modifications in the code (for example, replacing "nptr->key_" with "nptr->value_.key_"). Make the necessary changes so that your foaa compiles and functions normally. You can also build wb3.x from wordbench3 source code and make sure it functions as wordbench2 did for project 4. Use the 4530 compile macros for debugging.
4. Now change the class name from OAA to Map_BST throughout the remaining files in the directory. Use " grep 'OAA' * " to find these places.
Once these steps have been completed, you have a working "map" class that has the Associative Array API implemented. We now work toward adding the Table API.

Adding Iterator support in file map_bst_rec_adt.h

Remove both prototype and implementation of the following entities:
- Inorder, Preorder, Postorder (template methods)
- Display (method)
- PrintNode (class)
- CopyNode and/or InsertNode, function classes used in the implementation of Rehash
These will not be needed once we have iterators.

Modify the beginning of your class definition to be as follows:

  class Map_BST
  {
  public:

    // family ties
    friend class ConstInorderMapIterator < fsu::Map_BST<K,D,P> >;
    friend class InorderMapIterator      < fsu::Map_BST<K,D,P> >;
    friend class LevelorderMapIterator   < fsu::Map_BST<K,D,P> >;
    friend class PreorderMapIterator     < fsu::Map_BST<K,D,P> >;
    friend class PostorderMapIterator    < fsu::Map_BST<K,D,P> >;

    // terminology support
    typedef K                                      KeyType;
    typedef D                                      DataType;
    typedef P                                      PredicateType;
    typedef fsu::Entry<K,D>                        ValueType;
    typedef InorderMapIterator < Map_BST <K,D,P> > Iterator;
    typedef ConstInorderMapIterator < Map_BST <K,D,P> > ConstIterator;

    // iterator support
    Iterator      Begin  ();
    Iterator      End    ();
    Iterator      rBegin ();
    Iterator      rEnd   ();
    ConstIterator Begin  () const;
    ConstIterator End    () const;
    ConstIterator rBegin () const;
    ConstIterator rEnd   () const;

    // special iterators
    typedef LevelorderMapIterator   < Map_BST <K,D,P> > LevelorderIterator;
    LevelorderIterator BeginLevelorder   () const;
    LevelorderIterator EndLevelorder     () const;

    typedef InorderMapIterator      < Map_BST <K,D,P> > InorderIterator;
    InorderIterator BeginInorder         ();
    InorderIterator EndInorder           ();
    InorderIterator rBeginInorder        ();
    InorderIterator rEndInorder          ();

    typedef ConstInorderMapIterator < Map_BST <K,D,P> > ConstInorderIterator;
    ConstInorderIterator BeginInorder    () const;
    ConstInorderIterator EndInorder      () const;
    ConstInorderIterator rBeginInorder   () const;
    ConstInorderIterator rEndInorder     () const;

    typedef PreorderMapIterator     < Map_BST <K,D,P> > PreorderIterator;
    PreorderIterator BeginPreorder       () const;
    PreorderIterator EndPreorder         () const;
    PreorderIterator rBeginPreorder      () const;
    PreorderIterator rEndPreorder        () const;

    typedef PostorderMapIterator    < Map_BST <K,D,P> > PostorderIterator;
    PostorderIterator BeginPostorder     () const;
    PostorderIterator EndPostorder       () const;
    PostorderIterator rBeginPostorder    () const;
    PostorderIterator rEndPostorder      () const;

    // structural iterator support
    ConstIterator BeginStructuralInorder () const;
    ...

Most of this is new. There is more elaborate terminology support and a lot of iterator support prototyping. You can see from this what is coming: (1) develop Iterator and ConstIterator classes; (2) develop LevelorderIterator class; and (3) develop structural traversal for special ops. ("Structural" means including tombstones, as they are part of the tree structure, but not part of the map data.)

The "InorderIterator" versions of iterator and iterator support are (in this implementation of Map) identical to the native "Iterator". This amounts only to having alternate terminology. But it also leaves the door open to change the native iterator type, to "threaded" for example, while retaining the ADT-based Inorder versions.

In the definition of Node, add friendships as follows:

    class Node
    {
      friend class Map_BST<K,D,P>;
      friend class ConstInorderMapIterator < fsu::Map_BST<K,D,P> >;
      friend class InorderMapIterator      < fsu::Map_BST<K,D,P> >;
      friend class LevelorderMapIterator   < fsu::Map_BST<K,D,P> >;
      friend class PreorderMapIterator     < fsu::Map_BST<K,D,P> >;
      friend class PostorderMapIterator    < fsu::Map_BST<K,D,P> >;

      // const KeyType   key_;
      //       DataType  data_;
      fsu::Entry<K,D>    value_;
      ...
    };

Add these prototypes above the class definition:

  template < typename K , typename D , class P >
  class Map_BST;

  template < typename K , typename D , class P > 
  bool operator == (const Map_BST<K,D,P>& map1, const Map_BST<K,D,P>& map2);

  template < typename K , typename D , class P > 
  bool operator != (const Map_BST<K,D,P>& map1, const Map_BST<K,D,P>& map2);

The global operators will need to be implemented below the class definition. The implementation of equality uses iterators:

  template < typename K , typename D , class P > 
  bool operator == (const Map_BST<K,D,P>& map1, const Map_BST<K,D,P>& map2)
  {
    typename Map_BST<K,D,P>::ConstIterator i,j;
    for (i = map1.Begin(), j = map2.Begin(); i != map1.End() && j != map2.End(); ++i, ++j)
    {
      if ( (*i).key_ != (*j).key_ || (*i).data_ != (*j).data_ ) return false;
    }
    if (i != map1.End() || j != map2.End()) return false;
    return true;
  }

and the not-equal operator always follows the same pattern of implementation:

  template < typename K , typename D , class P > 
  bool operator != (const Map_BST<K,D,P>& map1, const Map_BST<K,D,P>& map2)
  {
    return !(map1 == map2);
  }

Note that we are now using a Standard Traversal, even before figuring out how to implement the iterators required to perform it.

Implement the various iterator support methods. Keep in mind, these are container class methods. They are typically implemented by invoking an initializing method in the iterator class. Some examples, from which the remainder can be extrapolated:

  template < typename K , typename D , class P >
  typename Map_BST<K,D,P>::Iterator Map_BST<K,D,P>::Begin()
  {
    Iterator i;
    i.Init(root_);
    return i;
  }

  template < typename K , typename D , class P >
  typename Map_BST<K,D,P>::Iterator Map_BST<K,D,P>::End()
  {
    Iterator i;
    return i;
  }

  template < typename K , typename D , class P >
  typename Map_BST<K,D,P>::Iterator Map_BST<K,D,P>::rBegin()
  {
    Iterator i;
    i.rInit(root_);
    return i;
  }

  template < typename K , typename D , class P >
  typename Map_BST<K,D,P>::ConstIterator Map_BST<K,D,P>::Begin() const
  {
    ConstIterator i;
    i.Init(root_);
    return i;
  }

  template < typename K , typename D , class P >
  typename Map_BST<K,D,P>::LevelorderIterator Map_BST<K,D,P>::BeginLevelorder() const
  {
    LevelorderIterator i;
    i.Init(root_);
    return i;
  }

  template < typename K , typename D , class P >
  typename Map_BST<K,D,P>::ConstIterator Map_BST<K,D,P>::BeginStructuralInorder() const
  {
    ConstIterator i;
    i.sInit(root_);
    return i;
  }

Note we are now committed to these methods in our Iterator classes:

  Init    // initialize an iterator - Inorder and Levelorder
  rInit   // reverse-Initialize a bidirectional iterator - Inorder
  sInit   // structurally initialize an iterator - Inorder

Implementing InorderIterator in file mapiter_adt.h
1. A large amount of design work and boiler plate code is provided in mapiter_adt.start. Copy this file to mapiter_adt.h ONE TIME. Fix the file header NOW.
2. There is guidance on implementing the various operations in the class lecture notes.
3. Throughout the implementation, keep in mind that the stack stk_ always contains the unique path from root_ to the current node in the tree, with root_ at the bottom of the stack and the current node at the top. It may be helpful for the visually inclined to imagine stk_ as a guide rope for climbing around in the tree. Pop takes you up by shortening the rope. Push takes you down by lengthening the rope. The end of the rope (top of the stack) is where you are in the tree. (In this visual, the stack, like the tree itself, is "upside down" with the top of the stack down.)
4. Note the subtle distinction between Init and sInit. Init sets the iterator at the first element of the map, which is to say the first live node. Whereas sInit sets the iterator at the first node in the tree, dead or alive.
5. Similarly: Increment() marches the iterator around in the tree to the next node in BST order, dead or alive. Operator++ is implemented by calling Increment() once, for sure, and then repeatedly until a live node is found or the iterator runs off the end of the tree (a perfect place for a do-while loop).
6. The Increment algorithm has 2 branches: if the current node has a right child, go to that child and then "slide left" to the bottom of the tree; otherwise ascend (Pop) until the node you just left was a left child, and stop at its parent. If you ascend past the root, the stack is empty and the iterator has reached the end.
7. rInit, Decrement, and operator-- are left-right mirror images of Init, Increment, and operator++, respectively.
8. The postfix versions of ++ and -- always use the best practice pattern illustrated in various places in the lecture notes.
9. Work only on the base class ConstInorderMapIterator. Aside from class boiler plate for proper type and equality operators, the derived (non-const) class InorderMapIterator adds only one functionality: a non-const version of the dereference operator*. The non-const version is completely implemented in the start file.
10. This is an unusual situation here: having a non-const operator* means that we will allow client programs to modify data in a pointed-to object with an assignment, as in *i = e for some Entry object e or (*i).data_ = d for some DataType object d. The latter poses no risk, but the former might: if e contained a different key, such an assignment would vandalize the carefully maintained BST structure used by Map_BST.
  
  But we have planned solid defense against such things. The assignment operator for class Entry does not copy the key_ data. Instead, it checks that the keys are equal, and only then goes forward with the data assignment. Similarly, an assignment such as (*i).key_ = k; is made illegal by the const modifier for Entry<K,D>::key_.
  
  We have gone to such trouble to make assignment into data legal because it is extremely useful and efficient. For example, consider this client code snippet:
```
Iterator i = map.Includes(key); // binary search
if (i != map.End())             // search was successful
  (*i).data_ = newdata;         // update data in constant time
```
  Without the ability to directly over-write the data, we would have to make the call map.Put(key,data), which requires another binary search with logarithmic runtime.
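The stack mechanics in items 3 through 8 can be sketched on a toy tree. Everything below is a simplified illustration under stated assumptions: Node holds a bare int key and an alive flag rather than an Entry, InorderSketch is a hypothetical class (not ConstInorderMapIterator from the start file), std::stack stands in for the course stack, and Valid() plays the role of comparing with End(). The reverse operations rInit, Decrement, and operator-- are omitted, since they are mirror images.

```cpp
#include <cassert>
#include <stack>
#include <vector>

// Hypothetical minimal node: an int key plus the dead/alive flag.
struct Node
{
  int   key_;
  bool  alive_;
  Node* lchild_;
  Node* rchild_;
  Node (int k, bool a = true) : key_(k), alive_(a), lchild_(0), rchild_(0) {}
};

class InorderSketch
{
public:
  bool Valid () const { return !stk_.empty(); }
  int  Key   () const { assert(Valid()); return stk_.top()->key_; }

  // Structural init: slide left from the root to the first node, dead or alive.
  void sInit (Node* root)
  {
    while (!stk_.empty()) stk_.pop();
    for (Node* n = root; n != 0; n = n->lchild_) stk_.push(n);
  }

  // Init: as sInit, then advance to the first live node.
  void Init (Node* root)
  {
    sInit(root);
    while (Valid() && !stk_.top()->alive_) Increment();
  }

  // Structural step to the next node in BST order.
  void Increment ()
  {
    if (stk_.empty()) return;
    if (stk_.top()->rchild_ != 0)
    {
      // Branch 1: go right once, then slide left to the bottom.
      for (Node* n = stk_.top()->rchild_; n != 0; n = n->lchild_) stk_.push(n);
      return;
    }
    // Branch 2: ascend until the node just left was a left child.
    Node* child;
    do
    {
      child = stk_.top();
      stk_.pop();
    }
    while (!stk_.empty() && stk_.top()->rchild_ == child);
  }

  // Prefix ++: step at least once, then keep stepping past dead nodes.
  InorderSketch& operator++ ()
  {
    do
    {
      Increment();
    }
    while (Valid() && !stk_.top()->alive_);
    return *this;
  }

  // Postfix ++: copy, advance self, return the copy.
  InorderSketch operator++ (int)
  {
    InorderSketch old(*this);
    operator++();
    return old;
  }

private:
  std::stack<Node*> stk_;  // path from root (bottom) to current node (top)
};

// Collects keys by a standard traversal (live nodes only).
inline std::vector<int> LiveKeys (Node* root)
{
  std::vector<int> keys;
  InorderSketch i;
  for (i.Init(root); i.Valid(); ++i) keys.push_back(i.Key());
  return keys;
}

// Collects keys by a structural traversal (tombstones included).
inline std::vector<int> StructuralKeys (Node* root)
{
  std::vector<int> keys;
  InorderSketch i;
  for (i.sInit(root); i.Valid(); i.Increment()) keys.push_back(i.Key());
  return keys;
}
```

For a tree with keys 20, 30, 40, 50, 60, 70 in which 20 and 60 are tombstones, LiveKeys visits 30, 40, 50, 70 while StructuralKeys visits all six nodes in order.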

Implementing LevelorderIterator in file mapiter_adt.h
1. Once Inorder iterators are implemented, there should be no difficulties or surprises working with the Levelorder case. Levelorder is simplified further by being only a forward iterator and only following the ConstIterator pattern.
2. The Increment algorithm is quite straightforward and requires no loop: Push the children of the Front element, and Pop.
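The queue-controlled step can be sketched in the same toy setting. This is an illustration only: LevelorderSketch and Node are hypothetical names, std::queue stands in for the course queue, and the dead/alive flag is left out so that the focus stays on the Push-children-then-Pop step.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical minimal node (no tombstone flag in this sketch).
struct Node
{
  int   key_;
  Node* lchild_;
  Node* rchild_;
  explicit Node (int k) : key_(k), lchild_(0), rchild_(0) {}
};

class LevelorderSketch
{
public:
  bool Valid () const { return !que_.empty(); }
  int  Key   () const { assert(Valid()); return que_.front()->key_; }

  // Start at the root: the queue holds just the root, if there is one.
  void Init (Node* root)
  {
    while (!que_.empty()) que_.pop();
    if (root != 0) que_.push(root);
  }

  // Push the children of the front element, then Pop. No loop required.
  void Increment ()
  {
    if (que_.empty()) return;
    Node* n = que_.front();
    if (n->lchild_ != 0) que_.push(n->lchild_);
    if (n->rchild_ != 0) que_.push(n->rchild_);
    que_.pop();
  }

  LevelorderSketch& operator++ ()
  {
    Increment();
    return *this;
  }

private:
  std::queue<Node*> que_;  // front is the current node
};

// Collects keys level by level, left to right within each level.
inline std::vector<int> LevelorderKeys (Node* root)
{
  std::vector<int> keys;
  LevelorderSketch i;
  for (i.Init(root); i.Valid(); ++i) keys.push_back(i.Key());
  return keys;
}
```

A version that respects tombstones would wrap Increment in the same do-while pattern used for the Inorder operator++.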

Implementing the Includes method in file map_bst_rec_adt.h
1. Implementing Includes is a delightful exercise in binary search and a test of understanding how the stack in an iterator works. The idea is to perform the usual binary search from the root of the tree, recording the path by pushing the non-null nodes as you search. If the key is found, make sure the node is alive before returning the iterator. Otherwise, return End().
  
  We expect a stand-alone implementation of Includes without a call to other similar (but more complicated) methods such as LowerBound.
2. There is a const and a non-const version of Includes.
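The path-recording search can be sketched as follows. This is a simplified illustration, not the required implementation: IncludesSketch and FoundKey are hypothetical free functions, keys are plain ints compared with operator< (the real class uses its predicate object), and a std::vector plays the role of the iterator's stack, with an empty vector standing in for End().

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal node: an int key plus the dead/alive flag.
struct Node
{
  int   key_;
  bool  alive_;
  Node* lchild_;
  Node* rchild_;
  Node (int k, bool a = true) : key_(k), alive_(a), lchild_(0), rchild_(0) {}
};

// Returns the root-to-node path (back() is the node holding key), or an
// empty path when the key is absent or its node is dead.
inline std::vector<Node*> IncludesSketch (Node* root, int key)
{
  std::vector<Node*> path;
  Node* n = root;
  while (n != 0)
  {
    path.push_back(n);                 // record the path while descending
    if (key < n->key_)
      n = n->lchild_;
    else if (n->key_ < key)
      n = n->rchild_;
    else                               // key found at n
    {
      if (!n->alive_) path.clear();    // tombstone: report "not found"
      return path;
    }
  }
  path.clear();                        // fell off the tree: not found
  return path;
}

// Convenience accessor for a successful search.
inline int FoundKey (const std::vector<Node*>& path)
{
  assert(!path.empty());
  return path.back()->key_;
}
```

Because the recorded path is exactly the stack an Inorder iterator would hold at that node, the returned iterator can be incremented or decremented immediately after the search. The loop visits one node per level, so the search costs O(h).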

Implementing Rehash
1. Build a new (locally declared) tree by inserting only the alive nodes. A straightforward (iterator-based) preorder traversal encounters the nodes in correct order for rebuilding.
2. Swap the root pointers.
3. As the locally declared tree goes out of scope, its data will disappear. (That's the old tree due to the Swap.)
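The three steps can be sketched on a toy tree of int keys. This is an illustration under stated assumptions: TreeSketch is a hypothetical class, and an explicit stack-driven preorder loop stands in for the iterator-based preorder traversal that the project uses.

```cpp
#include <cassert>
#include <cstddef>
#include <stack>

struct Node
{
  int   key_;
  bool  alive_;
  Node* lchild_;
  Node* rchild_;
  explicit Node (int k) : key_(k), alive_(true), lchild_(0), rchild_(0) {}
};

class TreeSketch
{
public:
  TreeSketch  () : root_(0) {}
  ~TreeSketch () { Release(root_); }

  void Insert (int key)            // unimodal insert; revives a dead node
  {
    Node** link = &root_;
    while (*link != 0)
    {
      if (key < (*link)->key_)      link = &(*link)->lchild_;
      else if ((*link)->key_ < key) link = &(*link)->rchild_;
      else { (*link)->alive_ = true; return; }
    }
    *link = new Node(key);
  }

  void Erase (int key)             // lazy removal: mark the node dead
  {
    Node* n = root_;
    while (n != 0 && n->key_ != key)
      n = (key < n->key_) ? n->lchild_ : n->rchild_;
    if (n != 0) n->alive_ = false;
  }

  void Rehash ()
  {
    TreeSketch fresh;              // step 1: locally declared new tree
    std::stack<Node*> stk;         // preorder: parents before children
    if (root_ != 0) stk.push(root_);
    while (!stk.empty())
    {
      Node* n = stk.top();
      stk.pop();
      if (n->alive_) fresh.Insert(n->key_);
      if (n->rchild_ != 0) stk.push(n->rchild_);
      if (n->lchild_ != 0) stk.push(n->lchild_);
    }
    Node* old = root_;             // step 2: swap the root pointers
    root_ = fresh.root_;
    fresh.root_ = old;
  }                                // step 3: fresh destroys the old nodes

  std::size_t NodeCount () const { return Count(root_); }

private:
  TreeSketch (const TreeSketch&);             // copying disabled in this sketch
  TreeSketch& operator= (const TreeSketch&);

  static void Release (Node* n)
  {
    if (n != 0) { Release(n->lchild_); Release(n->rchild_); delete n; }
  }
  static std::size_t Count (const Node* n)
  {
    return n == 0 ? 0 : 1 + Count(n->lchild_) + Count(n->rchild_);
  }
  Node* root_;
};

// Usage: two of five nodes are erased, then Rehash removes the tombstones.
inline std::size_t RehashDemo ()
{
  TreeSketch t;
  int keys[] = {50, 30, 70, 20, 40};
  for (int i = 0; i < 5; ++i) t.Insert(keys[i]);
  t.Erase(30);
  t.Erase(20);
  assert(t.NodeCount() == 5);      // tombstones still occupy the tree
  t.Rehash();
  return t.NodeCount();
}
```

Preorder visits every parent before its children, so each surviving node is inserted into the new tree before any of its surviving descendants.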

Implementing WordBench3
1. The only difference between WordBench2 and WordBench3 is that WordBench2 uses OAA<String,size_t> and WordBench3 uses Map_BST<String,size_t>.
2. You will have to re-implement WriteReport using a standard traversal of the underlying Map object to replace the clunky Display method of OAA.
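The shape of a traversal-based report can be sketched with a stand-in container. The assumptions here are significant: std::map replaces fsu::Map_BST so that the block is self-contained, which means begin()/end() and first/second appear where your code will use Begin()/End() and key_/data_. WriteReportSketch and its one-line-per-word output format are hypothetical; your report must match the format of your WordBench2.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Stand-in for Map_BST<String,size_t> (assumption for illustration).
typedef std::map<std::string, std::size_t> WordMap;

// Writes one "word<TAB>count" line per entry using a standard traversal
// and returns the total of all counts.
inline std::size_t WriteReportSketch (const WordMap& words, std::ostream& os)
{
  std::size_t total = 0;
  for (WordMap::const_iterator i = words.begin(); i != words.end(); ++i)
  {
    os << i->first << '\t' << i->second << '\n';
    total += i->second;
  }
  return total;
}

// Usage: entries come out in key order, with no Display method needed.
inline void ReportDemo ()
{
  WordMap words;
  words["tree"] = 2;
  words["map"]  = 3;
  std::ostringstream os;
  std::size_t total = WriteReportSketch(words, os);
  assert(total == 5);
  assert(os.str() == "map\t3\ntree\t2\n");
}
```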

Code Requirements and Specifications

WordBench3 must use fsu::Map_BST and behave in a manner identical to your WordBench2.
Map_BST must implement all of the Associative Array and Table APIs.
Map_BST must exhibit all characteristics of unimodal ordered associative containers.
Map_BST methods Get, Put, Erase, Insert, and Includes must have runtime O(h), where h is the height of the BST.
In general, behavior should match that of the benchmark programs in area51.

Hints

The supplied file map_bst_adt_tools.cpp is a slave file for map_bst_rec_adt.h. It contains an integrity checker to verify all of the BST order properties. You incorporate the tools by putting prototypes of the methods in the class with public access and #including the slave file at the end of your code, but inside the fsu namespace.
The (Const)Iterator types Preorder and Postorder are fully implemented in mapiter_adt.start. These serve as models for the remaining types Levelorder and Inorder. A fair amount of code for these latter two is also supplied. This cuts down on the code writing work but adds to the code reading task ...
Testing is fun and important. The supplied "rantable.cpp" compiles to a random generator of <string,int> data written to a file. Such data files can be "Loaded" into the table with the fmap 'L' option, and also a command file can be used to perform a sequence of commands after the data is loaded. This feature of fmap allows repeatable tests comparing the results between your executable and a benchmark.
Note also that your wb3 should perform in a manner identical to that of YOUR wb2.x since they use the same wordify.cpp code.
fmap has a number of interesting tests: 4 varieties of Dump (see caution bullet), structural test with 3 levels of output (again see caution), and all 4 options for traversal (F = Forward, R = Reverse, L = Levelorder, and ! = reciprocity test). And of course fmap makes the AA and Table APIs accessible.
Caution. Some of the choices available from fmap may be inadvisable for large tables. Any kind of Dump or Traverse with a table of 100,000 entries is going to occupy your screen for a while, as will the level 2 output from structural testing. Try these out on modest size tables.