Project 5: ADT-Based Iterators

ADT-Based Iterators for BST-based OAA and Map.

Revision dated 06/24/18: signatures for InorderIterator support corrected.

Educational Objectives: After completing this assignment, the student should be able to accomplish the following:

Background Knowledge Required: Be sure that you have mastered the material in these chapters before beginning the assignment:
Iterators, Introduction to Sets, Introduction to Maps, Binary Search Trees, Balanced BSTs, and BST Iterators.

Operational Objectives: Implement class templates ConstInorderMapIterator, InorderMapIterator, and LevelorderMapIterator and use these classes to complete the implementation of the class template Map_BST.

Deliverables:

map_bst_rec_adt.h # Map_BST class template
mapiter_adt.h     # Map_BST::Iterator class templates
wordbench3.h      # refactors wordbench2 using fsu::Map_BST
wordbench3.cpp    #     "         "        "        "
wordify.cpp       # same file as used for previous project - submitted for convenience
makefile.wb3      # makefile builds fmap.x, mmap.x, and wb3.x
log.txt           # your project work log

Discussion

This project explores the addition of Iterator classes associated with the Ordered Associative Array (OAA) class that was the subject of the previous project. Recall that in the previous project we did not have iterators, but nevertheless created a serviceable Map-like container supporting the associative array API (Put, Get, Retrieve). We had to go to extraordinary lengths to obtain a useful traversal, and we were handicapped by the lack of an equality operator among OAA objects. We had no way to make sense out of the fundamental Table operation Includes. (But Retrieve is a useful work-around.) Features we will easily obtain using iterators include the following:

  1. Includes method, returning an Iterator object
  2. Operators ==() and !=() defined among Map objects
  3. The Standard Traversal is operational

And we may put aside the special methods for traversals and Display and their recursive implementations. The two APIs are compared in the following table.

 Associative Array API
 &data = Get(key)        // returns &data stored at key; ensures key exists in table
 Put(key,data)           // unimodal insert of (key,data) pair; alias for Insert
 &data = operator[](key) // AA bracket operator; alias for Get
 
 Table API
 Insert(key,data)     // unimodal insert of (key,data); alias for Put
 iter = Includes(key) // returns iter to key if found, End() otherwise (const and non-const) 
 Begin() and End()    // supporting bidirectional iterators (const and non-const versions)
 
 The APIs have these in common:
 bool Retrieve(key,&data) const             // if true, &data is a copy of data stored with key 
 Remove(key) or ( Erase(key) and Rehash() ) // Erase implements "lazy removal" (dead/alive flag), Rehash reconstructs the tree with dead nodes removed
 
 Both APIs are equipped with the usual container boilerplate: Empty, Size, Clear, Constructors, Destructor, operator=

In practice, one has either OAA, nicknamed "Table Lite", the iterator-free associative array, or Map, nicknamed "Full Table" or "Dictionary", which includes both the Associative Array and Table APIs.

One may wonder about code bloat - throwing unwanted operations into the executable code. A big advantage of using templates is that a function template is not translated to object code unless it is actually called in the program. For class templates, this means that unused member functions are not compiled. This makes it sensible and convenient to have both the Associative Array API and the Table API supported by the Map container.
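
The point about unused member functions can be checked with a small experiment. The sketch below assumes a hypothetical class template Box and a hypothetical type NoCompare that has no operator<; neither is part of the course library. Box<NoCompare> compiles and works as long as its Less method is never called.

```cpp
#include <cassert>

// Hypothetical key type that deliberately has no operator<.
struct NoCompare { int id; };

// Hypothetical stand-in container, not the course's Map_BST.
template <typename T>
class Box
{
public:
  explicit Box(const T& t) : item_(t) {}
  const T& Get() const { return item_; }

  // Needs T to support operator<. The body is compiled for a given T
  // only if some code actually calls Less on a Box of that T.
  bool Less(const Box& other) const { return item_ < other.item_; }

private:
  T item_;
};

inline void LazyInstantiationExample()
{
  Box<int> x(1), y(2);
  assert(x.Less(y));            // Box<int>::Less is instantiated here

  Box<NoCompare> b(NoCompare{7});
  assert(b.Get().id == 7);      // fine: Box<NoCompare>::Less is never called
  // b.Less(b);                 // uncommenting this line is a compile error
}
```

The same mechanism is what lets a Map carry both APIs without cost to a client that uses only one of them.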

Varieties of Iterators

With a sequential structure such as a list, there is one obvious way for an iterator to go through the elements, from front to back. In a Set or Map structure, the "correct" order is neither unique nor obvious. When using any of the BST implementations, any kind of traversal might be used to define iterators, and we have at least 4: Inorder, Preorder, Postorder, and Levelorder. We will use all of these, defined as external or "ADT-based" iterator classes. The native iterator class will be thread-based and have the advantage over the stack-based InorderIterator in that it requires only O(1) additional memory and is very fast. Please be familiar with the chapter on tree iterators before diving deeper into this project.

Map implementations also present the issue of what it means to de-reference an iterator. Sometimes we want the key, sometimes the data, and sometimes both. The solution is to package the (key,data) pair in a single object called an Entry. An Entry object is similar to a Pair, except: (1) the names of the data are key_ and data_ (rather than first_ and second_), and (2) the key_ is a constant, so that it can never be changed. Mimicking the terminology in the standard library, we name the internal type to be returned by a dereferenced iterator "ValueType".
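
A minimal sketch of the Entry idea, assuming a hypothetical type named EntrySketch; the real fsu::Entry in entry.h may differ in detail. It shows the constant key, the freely assignable data, and an assignment operator that copies data only when the keys agree. Silently ignoring a mismatched key is a choice made for this sketch only.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the library's entry type.
template <typename K, typename D>
struct EntrySketch
{
  const K key_;   // fixed at construction, never changed afterwards
  D       data_;  // freely assignable

  EntrySketch(const K& k, const D& d) : key_(k), data_(d) {}
  EntrySketch(const EntrySketch&) = default;

  // Assignment never copies the key: data is copied only when keys agree.
  EntrySketch& operator=(const EntrySketch& e)
  {
    if (this != &e && key_ == e.key_) data_ = e.data_;
    return *this;
  }
};

inline void EntrySketchExample()
{
  EntrySketch<std::string, int> a("apple", 1), b("apple", 5), c("pear", 9);
  a = b;               // same key: data is copied
  assert(a.data_ == 5);
  a = c;               // different key: ignored in this sketch
  assert(a.key_ == "apple" && a.data_ == 5);
  a.data_ = 42;        // direct data update is allowed
  assert(a.data_ == 42);
  // a.key_ = "plum";  // would not compile: key_ is const
}
```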

The two varieties of iterator we will implement embody two of the fundamental algorithms of computer science: Inorder iterators follow Depth First Search [DFS] from the root of the tree, and Levelorder iterators follow Breadth First Search [BFS] from the root of the tree. We implement DFS with a stack controller and BFS with a queue controller.

Procedural Requirements

  1. The official development/testing/assessment environment is specified in the Course Organizer.

  2. Create and work within a separate subdirectory cop4530/proj5.

  3. Begin by copying all files in the directory LIB/proj5 into your proj5 directory. At this point you should see these files (at least) in your directory:

    deliverables.sh
    fmap.cpp               # test harness for Map
    mapiter_adt.start      # start for mapiter_adt.h
    map_bst_adt_tools.cpp  # slave file for map_bst_rec_adt.h
    main_wb3.cpp           # 3rd time's a charm for wordbench
    mmap.cpp               # hammer tester
    rantable.cpp           # create test files to be loaded by fmap.x
    

    Then copy these relevant executables from LIB/area51/:

    fmap_i.x
    mmap_i.x
    wb3_i.x
    

  4. Create the deliverables

    map_bst_rec_adt.h
    mapiter_adt.h
    wordbench3.h
    wordbench3.cpp
    makefile.wb3
    log.txt        
    

    satisfying the requirements and specifications below.

  5. Test thoroughly, using the area51 executables as benchmarks. (See Hints on testing.)

  6. Submit the assignment using the command submit.sh.

    Warning: Submit scripts do not work on the program and linprog servers. Use shell.cs.fsu.edu to submit assignments. If you do not receive the second confirmation with the contents of your assignment, there has been a malfunction.

Development Guide

  1. Getting started.

    1. Begin by copying several of your proj4 files, with name changes, as follows (assuming you are in directory proj5):

          source                  destination
          ------                  -----------
      cp ../proj4/oaa.h           map_bst_rec_adt.h
      cp ../proj4/wordbench2.h    wordbench3.h
      cp ../proj4/wordbench2.cpp  wordbench3.cpp
      cp ../proj4/wordify.cpp     .
      cp ../proj4/foaa.cpp        .
      

      All subsequent modifications should be to these files in proj5.

    2. Replace the include statements "#include<oaa.h>" in foaa.cpp and wordbench3.h with "#include<map_bst_rec_adt.h>". foaa.cpp will serve as a temporary test for map_bst_rec_adt until the transition to packaged data is complete.

    3. Next, in map_bst_rec_adt.h, in the definition of Node, replace the two naked data fields with a packaged pair as follows:

          class Node
          {
            ...
            // const KeyType   key_;
            //       DataType  data_;
            fsu::Entry<K,D>    value_;
            ...
          };
      

      Also add "#include <entry.h>" to your include statements near the top of the file. This change in the way data is housed in a link will require some minor modifications in the code (for example, replacing "nptr->key_" with "nptr->value_.key_"). Make the necessary changes so that your foaa compiles and functions normally. You can also build wb3.x from wordbench3 source code and make sure it functions as wordbench2 did for project 4. Use the 4530 compile macros for debugging.

    4. Now change the class name from OAA to Map_BST throughout the remaining files in the directory. Use " grep 'OAA' * " to find these places.

    Once these steps have been completed, you have a working "map" class that has the Associative Array API implemented. We now work toward adding the Table API.

  2. Adding Iterator support in file map_bst_rec_adt.h

    1. Remove both prototype and implementation of the following entities:

      • Inorder, Preorder, Postorder (template methods)
      • Display (method)
      • PrintNode (class)
      • CopyNode and/or InsertNode, function classes used in the implementation of Rehash

      These will not be needed once we have iterators.

    2. Modify the beginning of your class definition to be as follows:

        class Map_BST
        {
        public:
      
          // family ties
          friend class ConstInorderMapIterator < fsu::Map_BST<K,D,P> >;
          friend class InorderMapIterator      < fsu::Map_BST<K,D,P> >;
          friend class LevelorderMapIterator   < fsu::Map_BST<K,D,P> >;
          friend class PreorderMapIterator     < fsu::Map_BST<K,D,P> >;
          friend class PostorderMapIterator    < fsu::Map_BST<K,D,P> >;
      
          // terminology support
          typedef K                                      KeyType;
          typedef D                                      DataType;
          typedef P                                      PredicateType;
          typedef fsu::Entry<K,D>                        ValueType;
          typedef InorderMapIterator < Map_BST <K,D,P> > Iterator;
          typedef ConstInorderMapIterator < Map_BST <K,D,P> > ConstIterator;
      
          // iterator support
          Iterator      Begin  ();
          Iterator      End    ();
          Iterator      rBegin ();
          Iterator      rEnd   ();
          ConstIterator Begin  () const;
          ConstIterator End    () const;
          ConstIterator rBegin () const;
          ConstIterator rEnd   () const;
      
          // special iterators
          typedef LevelorderMapIterator   < Map_BST <K,D,P> > LevelorderIterator;
          LevelorderIterator BeginLevelorder   () const;
          LevelorderIterator EndLevelorder     () const;
      
          typedef InorderMapIterator      < Map_BST <K,D,P> > InorderIterator;
          InorderIterator BeginInorder         ();
          InorderIterator EndInorder           ();
          InorderIterator rBeginInorder        ();
          InorderIterator rEndInorder          ();
      
          typedef ConstInorderMapIterator < Map_BST <K,D,P> > ConstInorderIterator;
          ConstInorderIterator BeginInorder    () const;
          ConstInorderIterator EndInorder      () const;
          ConstInorderIterator rBeginInorder   () const;
          ConstInorderIterator rEndInorder     () const;
      
          typedef PreorderMapIterator     < Map_BST <K,D,P> > PreorderIterator;
          PreorderIterator BeginPreorder       () const;
          PreorderIterator EndPreorder         () const;
          PreorderIterator rBeginPreorder      () const;
          PreorderIterator rEndPreorder        () const;
      
          typedef PostorderMapIterator    < Map_BST <K,D,P> > PostorderIterator;
          PostorderIterator BeginPostorder     () const;
          PostorderIterator EndPostorder       () const;
          PostorderIterator rBeginPostorder    () const;
          PostorderIterator rEndPostorder      () const;
      
          // structural iterator support
          ConstIterator BeginStructuralInorder () const;
          ...
      

      Most of this is new. There is more elaborate terminology support and a lot of iterator support prototyping. You can see from this what is coming: (1) develop Iterator and ConstIterator classes; (2) develop LevelorderIterator class; and (3) develop structural traversal for special ops. ("Structural" means including tombstones, as they are part of the tree structure, but not part of the map data.)

      The "InorderIterator" versions of iterator and iterator support are (in this implementation of Map) identical to the native "Iterator". This amounts only to having alternate terminology. But it also leaves the door open to change the native iterator type, to "threaded" for example, while retaining the ADT-based Inorder versions.

    3. In the definition of Node, add friendships as follows:

          class Node
          {
            friend class Map_BST<K,D,P>;
            friend class ConstInorderMapIterator < fsu::Map_BST<K,D,P> >;
            friend class InorderMapIterator      < fsu::Map_BST<K,D,P> >;
            friend class LevelorderMapIterator   < fsu::Map_BST<K,D,P> >;
            friend class PreorderMapIterator     < fsu::Map_BST<K,D,P> >;
            friend class PostorderMapIterator    < fsu::Map_BST<K,D,P> >;
      
            // const KeyType   key_;
            //       DataType  data_;
            fsu::Entry<K,D>    value_;
            ...
          };
      
    4. Add these prototypes above the class definition:

        template < typename K , typename D , class P >
        class Map_BST;
      
        template < typename K , typename D , class P > 
        bool operator == (const Map_BST<K,D,P>& map1, const Map_BST<K,D,P>& map2);
      
        template < typename K , typename D , class P > 
        bool operator != (const Map_BST<K,D,P>& map1, const Map_BST<K,D,P>& map2);
      

      The global operators will need to be implemented below the class definition. The implementation of equality uses iterators:

        template < typename K , typename D , class P > 
        bool operator == (const Map_BST<K,D,P>& map1, const Map_BST<K,D,P>& map2)
        {
          typename Map_BST<K,D,P>::ConstIterator i,j;
          for (i = map1.Begin(), j = map2.Begin(); i != map1.End() && j != map2.End(); ++i, ++j)
          {
            if ( (*i).key_ != (*j).key_ || (*i).data_ != (*j).data_ ) return false;
          }
          if (i != map1.End() || j != map2.End()) return false;  // one map has entries left over
          return true;
        }
      

      and the not-equal operator always follows the same pattern of implementation:

        template < typename K , typename D , class P > 
        bool operator != (const Map_BST<K,D,P>& map1, const Map_BST<K,D,P>& map2)
        {
          return !(map1 == map2);
        }
      

      Note that we are now using a Standard Traversal, even before figuring out how to implement the iterators required to perform it.

    5. Implement the various iterator support methods. Keep in mind, these are container class methods. They are typically implemented by invoking an initializing method in the iterator class. Some examples, from which the remainder can be extrapolated:

        template < typename K , typename D , class P >
        typename Map_BST<K,D,P>::Iterator Map_BST<K,D,P>::Begin()
        {
          Iterator i;
          i.Init(root_);
          return i;
        }
      
        template < typename K , typename D , class P >
        typename Map_BST<K,D,P>::Iterator Map_BST<K,D,P>::End()
        {
          Iterator i;
          return i;
        }
      
        template < typename K , typename D , class P >
        typename Map_BST<K,D,P>::Iterator Map_BST<K,D,P>::rBegin()
        {
          Iterator i;
          i.rInit(root_);
          return i;
        }
      
        template < typename K , typename D , class P >
        typename Map_BST<K,D,P>::ConstIterator Map_BST<K,D,P>::Begin() const
        {
          ConstIterator i;
          i.Init(root_);
          return i;
        }
      
        template < typename K , typename D , class P >
        typename Map_BST<K,D,P>::LevelorderIterator Map_BST<K,D,P>::BeginLevelorder() const
        {
          LevelorderIterator i;
          i.Init(root_);
          return i;
        }
      
        template < typename K , typename D , class P >
        typename Map_BST<K,D,P>::ConstIterator Map_BST<K,D,P>::BeginStructuralInorder() const
        {
          ConstIterator i;
          i.sInit(root_);
          return i;
        }
      

      Note we are now committed to these methods in our Iterator classes:

        Init    // initialize an iterator - Inorder and Levelorder
        rInit   // reverse-Initialize a bidirectional iterator - Inorder
        sInit   // structurally initialize an iterator - Inorder
      
  3. Implementing InorderIterator in file mapiter_adt.h

    1. A large amount of design work and boilerplate code is provided in mapiter_adt.start. Copy this file to mapiter_adt.h ONE TIME. Fix the file header NOW.

    2. There is guidance on implementing the various operations in the class lecture notes.

    3. Throughout the implementation, keep in mind that the stack stk_ always contains the unique path from root_ to the current node in the tree, with root_ at the bottom of the stack and the current node at the top. It may be helpful for the visually inclined to imagine stk_ as a guide rope for climbing around in the tree. Pop takes you up by shortening the rope. Push takes you down by lengthening the rope. The end of the rope (top of the stack) is where you are in the tree. (In this visual, the stack, like the tree itself, is "upside down" with the top of the stack down.)

    4. Note the subtle distinction between Init and sInit. Init sets the iterator at the first element of the map, which is to say the first live node. sInit, by contrast, sets the iterator at the first node in the tree, dead or alive.

    5. Similarly, Increment() marches the iterator around in the tree to the next node in BST order, dead or alive. Operator++ is implemented by calling Increment() once, for sure, and then repeatedly until a live node is found (a perfect place for a do-while loop).

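    6. The Increment algorithm has 2 branches: if the current node has a right child, go to that child and then "slide left" to the bottom of the tree; otherwise, ascend until you leave a node that was a left child, and stop at its parent. (If no such node exists, the stack empties and the iterator equals End().)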

    7. rInit, Decrement, and operator-- are left-right mirror images of Init, Increment, and operator++, respectively.

    8. The postfix versions of ++ and -- always use the best practice pattern illustrated in various places in the lecture notes.

    9. Work only on the base class ConstInorderMapIterator. Aside from class boilerplate for proper type and equality operators, the derived (non-const) class InorderMapIterator adds only one functionality: a non-const version of the dereference operator*. The non-const version is completely implemented in the start file.

    10. We have an unusual situation here: having a non-const operator* means that we will allow client programs to modify data in a pointed-to object with an assignment, as in *i = e for some Entry object e or (*i).data_ = d for some DataType object d. The latter poses no risk, but the former might: if e contained a different key, such an assignment would vandalize the carefully maintained BST structure used by Map_BST.

      But we have planned solid defense against such things. The assignment operator for class Entry does not copy the key_ data. Instead, it checks that the keys are equal, and only then goes forward with the data assignment. Similarly, an assignment such as (*i).key_ = k; is made illegal by the const modifier for Entry<K,D>::key_.

      We have gone to such trouble to make assignment into data legal because it is extremely useful and efficient. For example, consider this client code snippet:

      Iterator i = map.Includes(key); // binary search
      if (i != map.End())             // search was successful
        (*i).data_ = newdata;         // update data in constant time
      

      Without the ability to directly overwrite the data, we would have to make the call map.Put(key,newdata), which requires another binary search with logarithmic runtime.
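
The stack discipline described in this section (Init versus sInit, the two branches of Increment, and operator++ skipping dead nodes) can be sketched in miniature. Everything named here is hypothetical: Node holds only an int key and an alive flag, and InorderSketch is not the class laid out in mapiter_adt.start. Treat it as an illustration of the algorithm under those assumptions, not as the expected solution.

```cpp
#include <cassert>
#include <stack>
#include <vector>

// Hypothetical miniature node: an int key plus the dead/alive flag.
struct Node
{
  int   key_;
  bool  alive_;
  Node* lchild_;
  Node* rchild_;
};

class InorderSketch
{
public:
  // sInit: stop at the first node in the tree, dead or alive.
  void sInit(Node* root)
  {
    stk_ = std::stack<Node*>();
    SlideLeft(root);
  }

  // Init: stop at the first live node.
  void Init(Node* root)
  {
    sInit(root);
    while (Valid() && !stk_.top()->alive_) Increment();
  }

  bool Valid() const { return !stk_.empty(); }
  int  Key()   const { assert(Valid()); return stk_.top()->key_; }

  // One structural step for sure, then more until a live node (or the end).
  InorderSketch& operator++()
  {
    do Increment(); while (Valid() && !stk_.top()->alive_);
    return *this;
  }

private:
  void SlideLeft(Node* n)
  {
    for (; n != nullptr; n = n->lchild_) stk_.push(n);
  }

  void Increment()
  {
    if (stk_.empty()) return;
    if (stk_.top()->rchild_ != nullptr)   // branch 1: right, then slide left
    {
      SlideLeft(stk_.top()->rchild_);
      return;
    }
    Node* child;                          // branch 2: ascend
    do
    {
      child = stk_.top();
      stk_.pop();
    }
    while (!stk_.empty() && stk_.top()->rchild_ == child);
  }

  std::stack<Node*> stk_;  // path from the root (bottom) to the current node (top)
};

// Collects the live keys in BST order using the traversal above.
inline std::vector<int> LiveKeys(Node* root)
{
  std::vector<int> keys;
  InorderSketch i;
  for (i.Init(root); i.Valid(); ++i) keys.push_back(i.Key());
  return keys;
}
```

An empty stack plays the role of End() in this sketch, which matches the way End() returns a default-constructed iterator in the container code.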

  4. Implementing LevelorderIterator in file mapiter_adt.h

    1. Once Inorder iterators are implemented, there should be no difficulties or surprises working with the Levelorder case. Levelorder is simplified further by being only a forward iterator and only following the ConstIterator pattern.

    2. The Increment algorithm is quite straightforward and requires no loop: Push the children of the Front element, and Pop.
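
The queue-controlled step can be sketched as follows, assuming a hypothetical Node with an int key and a hypothetical class LevelorderSketch. Tombstone handling is left out of this sketch to keep the queue logic in view.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical miniature node (no dead/alive flag in this sketch).
struct Node
{
  int   key_;
  Node* lchild_;
  Node* rchild_;
};

class LevelorderSketch
{
public:
  void Init(Node* root)
  {
    que_ = std::queue<Node*>();
    if (root != nullptr) que_.push(root);
  }

  bool Valid() const { return !que_.empty(); }
  int  Key()   const { assert(Valid()); return que_.front()->key_; }

  // Push the children of the front element, then pop it. No loop needed.
  LevelorderSketch& operator++()
  {
    if (que_.empty()) return *this;
    Node* n = que_.front();
    if (n->lchild_ != nullptr) que_.push(n->lchild_);
    if (n->rchild_ != nullptr) que_.push(n->rchild_);
    que_.pop();
    return *this;
  }

private:
  std::queue<Node*> que_;  // front is the current node
};

// Collects keys level by level, left to right within each level.
inline std::vector<int> LevelorderKeys(Node* root)
{
  std::vector<int> keys;
  LevelorderSketch i;
  for (i.Init(root); i.Valid(); ++i) keys.push_back(i.Key());
  return keys;
}
```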

  5. Implementing the Includes method in file map_bst_rec_adt.h

    1. Implementing Includes is a delightful exercise in binary search and a test of understanding how the stack in an iterator works. The idea is to perform the usual binary search from the root of the tree, recording the path by pushing the non-null nodes as you search. If the key is found, make sure the node is alive before returning the iterator. Otherwise, return End().

      We expect a stand-alone implementation of Includes without a call to other similar (but more complicated) methods such as LowerBound.

    2. There is a const and a non-const version of Includes.
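
A hedged sketch of the search-and-record idea, using a hypothetical Node with int keys and a hypothetical free function IncludesPath. It compares keys with operator< directly instead of a predicate object, and it returns the bare stack where the real method would return an iterator; an empty stack stands in for End().

```cpp
#include <cassert>
#include <stack>

// Hypothetical miniature node: an int key plus the dead/alive flag.
struct Node
{
  int   key_;
  bool  alive_;
  Node* lchild_;
  Node* rchild_;
};

// Returns the root-to-node path when key is found in a live node;
// returns an empty stack (playing the role of End()) otherwise.
inline std::stack<Node*> IncludesPath(Node* root, int key)
{
  std::stack<Node*> path;
  Node* n = root;
  while (n != nullptr)
  {
    path.push(n);                              // record every node visited
    if      (key < n->key_) n = n->lchild_;
    else if (n->key_ < key) n = n->rchild_;
    else                                       // key found at n
    {
      if (n->alive_) return path;
      break;                                   // found, but a tombstone
    }
  }
  return std::stack<Node*>();
}

inline void IncludesExample()
{
  Node n20{20, true,  nullptr, nullptr};
  Node n40{40, true,  nullptr, nullptr};
  Node n70{70, false, nullptr, nullptr};       // tombstone
  Node n30{30, true,  &n20, &n40};
  Node n50{50, true,  &n30, &n70};

  std::stack<Node*> p = IncludesPath(&n50, 40);
  assert(p.size() == 3 && p.top() == &n40);    // path is 50, 30, 40
  assert(IncludesPath(&n50, 70).empty());      // found, but dead
  assert(IncludesPath(&n50, 60).empty());      // not in the tree
}
```

Because the recorded path is exactly the stack an Inorder iterator carries, an iterator built this way can be incremented or decremented from the found position.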

  6. Implementing Rehash

    1. Build a new (locally declared) tree by inserting only the alive nodes. A straightforward (iterator-based) preorder traversal encounters the nodes in correct order for rebuilding.

    2. Swap the root pointers.

    3. As the locally declared tree goes out of scope, its data will disappear. (That's the old tree, due to the Swap.)
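
The three steps can be sketched on a hypothetical miniature tree, TinyBST, which stores int keys only and is not the course's Map_BST. The preorder walk here uses an explicit stack in place of a PreorderIterator, purely so that the sketch is self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <stack>
#include <utility>

// Hypothetical miniature BST with lazy removal.
class TinyBST
{
  struct Node { int key_; bool alive_; Node* lchild_; Node* rchild_; };

public:
  TinyBST() : root_(nullptr) {}
  ~TinyBST() { Release(root_); }
  TinyBST(const TinyBST&) = delete;
  TinyBST& operator=(const TinyBST&) = delete;

  void Insert(int key)
  {
    Node** link = &root_;
    while (*link != nullptr)
    {
      if      (key < (*link)->key_) link = &(*link)->lchild_;
      else if ((*link)->key_ < key) link = &(*link)->rchild_;
      else { (*link)->alive_ = true; return; }   // key present: revive
    }
    *link = new Node{key, true, nullptr, nullptr};
  }

  void Erase(int key)   // lazy removal: mark the node dead, keep it in place
  {
    Node* n = root_;
    while (n != nullptr && n->key_ != key)
      n = (key < n->key_) ? n->lchild_ : n->rchild_;
    if (n != nullptr) n->alive_ = false;
  }

  void Rehash()
  {
    TinyBST fresh;                       // step 1: locally declared tree
    std::stack<Node*> stk;
    if (root_ != nullptr) stk.push(root_);
    while (!stk.empty())                 // preorder: node, left, right
    {
      Node* n = stk.top();
      stk.pop();
      if (n->alive_) fresh.Insert(n->key_);
      if (n->rchild_ != nullptr) stk.push(n->rchild_);
      if (n->lchild_ != nullptr) stk.push(n->lchild_);
    }
    std::swap(root_, fresh.root_);       // step 2: swap the root pointers
  }                                      // step 3: fresh releases the old tree

  std::size_t NumNodes() const { return Count(root_); }

private:
  static void Release(Node* n)
  {
    if (n == nullptr) return;
    Release(n->lchild_);
    Release(n->rchild_);
    delete n;
  }
  static std::size_t Count(const Node* n)
  {
    return n == nullptr ? 0 : 1 + Count(n->lchild_) + Count(n->rchild_);
  }
  Node* root_;
};

inline void RehashExample()
{
  TinyBST t;
  const int keys[] = {50, 30, 70, 20, 40};
  for (int k : keys) t.Insert(k);
  t.Erase(30);
  t.Erase(70);
  assert(t.NumNodes() == 5);   // tombstones still occupy the tree
  t.Rehash();
  assert(t.NumNodes() == 3);   // only 50, 20, 40 remain
}
```

Preorder visits each node before its descendants, so a surviving ancestor is always inserted ahead of the live nodes below it.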

  7. Implementing WordBench3

    1. The only difference between WordBench2 and WordBench3 is that WordBench2 uses OAA<String,size_t> and WordBench3 uses Map_BST<String,size_t>.

    2. You will have to re-implement WriteReport using a standard traversal of the underlying Map object to replace the clunky Display method of OAA.
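
The shape of a standard traversal for reporting can be sketched as below. All names are hypothetical: FakeMap is a vector-backed stand-in that exposes only Begin, End, and entries with key_ and data_, and the tab-separated output is an arbitrary format chosen for the sketch, not the required WordBench report format.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-ins exposing the traversal surface of the Table API.
struct WordCount
{
  std::string key_;
  std::size_t data_;
};

struct FakeMap
{
  typedef std::vector<WordCount>::const_iterator ConstIterator;
  std::vector<WordCount> items_;
  ConstIterator Begin() const { return items_.begin(); }
  ConstIterator End()   const { return items_.end(); }
};

// Standard traversal: visit every entry from Begin() up to End().
template <class MapType>
void WriteCounts(const MapType& m, std::ostream& os)
{
  typename MapType::ConstIterator i;
  for (i = m.Begin(); i != m.End(); ++i)
    os << (*i).key_ << '\t' << (*i).data_ << '\n';
}

inline void WriteCountsExample()
{
  FakeMap m;
  m.items_.push_back(WordCount{"apple", 2});
  m.items_.push_back(WordCount{"pear", 1});
  std::ostringstream out;
  WriteCounts(m, out);
  assert(out.str() == "apple\t2\npear\t1\n");
}
```

Since WriteCounts depends only on Begin, End, and the entry fields, the same loop should carry over to any container that offers the Table API's iterator support.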

Code Requirements and Specifications

  1. WordBench3 must use fsu::Map_BST and behave in a manner identical to your WordBench2.

  2. Map_BST must implement all of the Associative Array and Table APIs.

  3. Map_BST must exhibit all characteristics of unimodal ordered associative containers.

  4. Map_BST methods Get, Put, Erase, Insert, and Includes must have runtime O(h), where h is the height of the BST.

  5. In general, behavior should match that of the benchmark programs in area51.

Hints