The general subject of this chapter is "tables". You may find this surprising, since the name of the chapter is Introduction to Maps. Here's what's going on with terminology: it's all over the map.
In computer science, most professionals will understand what is meant by a table: a way to store (key, data) pairs associatively, so that a pair, or its data, can be looked up, inserted, or removed by key value. In short, a table is a data retrieval system. Other terms have evolved over the years, including these, which seem to be the most prevalent:
Table: Probably the most generic language-independent usage
Dictionary: Synonym for Table
Mapping: Term arising from (discrete) math describing a function from one set to another
Map: In math, shorthand for "mapping"; adopted specifically by the C++ STL to mean an ordered table
Associative Array: A table with a special bracket operator
A good way to start thinking about tables technically is to envision a Set of Pairs, where a Pair object consists of a key and a datum:
typedef Set < Pair < KeyType , DataType > > Table;
This is a somewhat crude way to define tables, but it can be made to work quite effectively, and it definitely works as a valid analogy. The main reason for not just adopting the "table is a set of pairs" definition completely is that there are interesting and useful things that are specific to tables that don't arise naturally from sets.
Importantly, tables have unimodal semantics that is key-centric. This means:
The table API is very similar to that of Set: it is an associative container whose primary functionality is Insert, Remove, and some form of look-up. For a Set, look-up takes the form of finding an object in the set and returning an iterator pointing to the object. Dereferencing the iterator is the way the data stored in the object is obtained.
Tables have the advantage that the structural information is held separately in a KeyType value, while the data is stored in a DataType value. Therefore a natural form of look-up in a Table is Retrieve(key_value,d), in which d becomes a valid reference to the data stored with key_value. Here are the three primary operations for Table, along with the variations found in Associative Array [aka Symbol Table]:
Associative Container APIs (unimodal)

associative operation   Table            AA             Set
insert                  Insert(k,d)      Put(k,d)       Insert(t)
look-up                 Retrieve(k,&d)   &d = Get(k)    i = Includes(t)
remove                  Remove(k)        Erase(k)       Remove(t)
We will give precise descriptions of the behavior of each of these
operations. Note also that one can define analogs of any of these operations
across the columns of the table. For example, we could define Set::Get as
&x = Get(x), meaning that Get(x) returns a reference to the stored value x,
and Table::Includes as returning a table iterator to the stored pair. But the
operations listed in the table are considered to be the primary defining
operations for the Set, Table, and Associative Array abstract data types.
We will look at the Associative Array semantics below.
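As a rough illustration of the three primary Table operations, here is a minimal sketch. The class name SimpleTable is hypothetical, std::map is used as a stand-in for whatever storage the chapter's Table actually uses, and two behaviors are assumptions rather than statements from the text: Insert overwrites the data of a key that is already present, and Retrieve copies the stored data into d and reports success with a bool.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical sketch of the Table API; std::map supplies the storage.
template <typename KeyType, typename DataType>
class SimpleTable
{
public:
  // Insert(k,d): store the pair (k,d).
  // Assumption: data already stored with k is overwritten.
  void Insert(const KeyType& k, const DataType& d) { map_[k] = d; }

  // Retrieve(k,d): if k is present, copy its data into d and return true.
  bool Retrieve(const KeyType& k, DataType& d) const
  {
    typename std::map<KeyType, DataType>::const_iterator i = map_.find(k);
    if (i == map_.end()) return false;
    d = i->second;
    return true;
  }

  // Remove(k): delete the pair with key k; true if a pair was removed.
  bool Remove(const KeyType& k) { return map_.erase(k) > 0; }

  std::size_t Size() const { return map_.size(); }

private:
  std::map<KeyType, DataType> map_;
};

// Usage sketch: exercises the three primary operations.
inline void SimpleTableDemo()
{
  SimpleTable<std::string, int> t;
  t.Insert("one", 1);
  int d = 0;
  assert(t.Retrieve("one", d) && d == 1);
  assert(t.Remove("one"));
  assert(!t.Retrieve("one", d));
}
```

Every operation takes a key and nothing else identifies the pair, which is the key-centric flavor the API table describes.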
Just as with Set, the ordered Table has additional search operations, LowerBound
and UpperBound, that locate an item using the table's order structure.
The principal additional functionality distinguishing the ordered table is that a standard traversal encounters stored pairs in sorted order by key, whereas in an unsorted, or hashed, table, a traversal encounters stored pairs in pseudo-random key order.
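The sorted-traversal property can be seen with std::map, which the terminology list above identifies as the STL's ordered table. The sketch below assumes that LowerBound and UpperBound behave like std::map::lower_bound and std::map::upper_bound; the alias OrderedTable and the helper names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, int> OrderedTable;  // hypothetical alias for this sketch

// Build a small table, inserting keys out of sorted order on purpose.
OrderedTable MakeSample()
{
  OrderedTable t;
  t["pear"]  = 3;
  t["apple"] = 1;
  t["fig"]   = 2;
  return t;
}

// A standard traversal of an ordered table visits keys in sorted order.
std::vector<std::string> KeysInTraversalOrder(const OrderedTable& t)
{
  std::vector<std::string> keys;
  for (OrderedTable::const_iterator i = t.begin(); i != t.end(); ++i)
    keys.push_back(i->first);
  return keys;
}

// lower_bound(k): first entry whose key is not less than k.
// Precondition (asserted): such an entry exists.
std::string LowerBoundKey(const OrderedTable& t, const std::string& k)
{
  OrderedTable::const_iterator i = t.lower_bound(k);
  assert(i != t.end());
  return i->first;
}

// upper_bound(k): first entry whose key is greater than k.
// Precondition (asserted): such an entry exists.
std::string UpperBoundKey(const OrderedTable& t, const std::string& k)
{
  OrderedTable::const_iterator i = t.upper_bound(k);
  assert(i != t.end());
  return i->first;
}
```

With the sample table, the traversal yields apple, fig, pear regardless of insertion order. A hashed container such as std::unordered_map offers no lower_bound or upper_bound, since it keeps no order structure to search.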
Associative Arrays, sometimes also called Symbol Tables, are essentially the same as tables with a redefined API. Aside from different terminology, the Associative Array brings a novel kind of operation: a bracket operator. But unlike the bracket operator of ordinary arrays, vectors, and deques, the AA bracket operator takes a value of the client-defined KeyType as its argument. The prototype of this operator, shown next to its array counterpart, is:
DataType& operator[] (KeyType k); // AA bracket operator
DataType& operator[] (size_t i);  // array bracket operator
(Note: often the parameters shown passed by value are coded to pass by const reference, which is a programming efficiency without functional distinction.) The seemingly simple change from size_t to KeyType has profound consequences, the most striking being
For example, KeyType might be type String.
Exercise. Given two keys of type string, and using lexicographical ordering, explain how there are infinitely many keys between the two.
Among other things, these facts mean that there is no simple analog of the ordinary array traversal loop and, even though a table may contain many entries, the number of keys NOT in the table is often much larger than the table size. We will define the semantics of the AA Put, Get, and bracket operator shortly, in terms of the basic table operations.
We will discuss the semantics of Associative Array operations in more detail later on. Our last point to discuss here is the "syntactic sugar" nature of the AA bracket operator in terms of a named operation Get:
DataType& operator[] (KeyType k) { return Get(k); }
where the Get operation has the prototype
DataType& Get (KeyType k);
We continue with a discussion of class Entry which formalizes what we mean by a
"pair" in a table.
Here we illustrate typical client code using an Associative Array to store table entries. It takes some getting used to.
AssociativeArray<KeyType,DataType> aa; // declare an (empty) associative array object
Note that aa[k] may appear on either side of an assignment, and either way there is a lot going on:
aa[k] = d;
The more obvious effect is that the data associated with k is set to d. Less obvious is that the invocation aa[k] itself ensures that the pair (k,some_data) is actually in the table.
d = aa[k];
Again, it is fairly obvious that d is set to the data associated with k in the table. Less obvious is that the invocation aa[k] ensures that there is such a data item associated with k in the table.
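The same effect can be observed with std::map, whose bracket operator also inserts a default-valued entry for an absent key. The sketch below uses std::map only as a stand-in for AssociativeArray, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Returns the table size after what looks like a pure read of key k.
std::size_t SizeAfterRead(std::map<std::string, int>& aa, const std::string& k)
{
  int d = aa[k];  // if k is absent, the pair (k, int()) is inserted first
  (void)d;        // the value itself is not needed here
  return aa.size();
}

// Usage sketch: the bracket operator on either side of an assignment.
inline void BracketDemo()
{
  std::map<std::string, int> aa;
  int d = aa["x"];          // "x" is absent: (x, 0) is inserted, then d = 0
  assert(d == 0);
  assert(aa.size() == 1);
  aa["x"] = 7;              // assigns through the returned reference
  assert(aa["x"] == 7);
  assert(aa.size() == 1);   // no second entry for the same key
}
```

A client that only wants to test for membership therefore needs an operation other than the bracket operator, since every bracket access leaves an entry behind.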
Exercise. Consider the loop implementing a form of sequential search:
AssociativeArray<String,DataType> aa;
...
String s = stringBegin;
while (s != stringEnd)
{
  if (aa[s] == searchValue)
    return true;
  s = NextString(s);
}
return false;
Explain why this is a bad idea.
We already mentioned that a Table can be usefully thought of as a set of pairs. Here we define exactly what kind of "pair" to store in a table.
struct Entry // a key,data pair
{
  const KeyType key_;
  DataType      data_;

  Entry (const KeyType& k, const DataType& d) : key_(k), data_(d) {}

  // assignment must be a member; it copies data only, and only between equal keys
  Entry& operator= (const Entry& that)
  {
    if (this->key_ == that.key_)
    {
      this->data_ = that.data_;
    }
    else
    {
      ERROR();
    }
    return *this;
  }
};

// comparison operators are key-centric: the data plays no role
bool operator == (const Entry& e1, const Entry& e2)
{
  return (e1.key_ == e2.key_);
}

bool operator != (const Entry& e1, const Entry& e2)
{
  return !(e1 == e2); // best practices pattern
}

bool operator < (const Entry& e1, const Entry& e2)
{
  return (e1.key_ < e2.key_);
}
Notes:
We now consider a table as a Set of Entry pairs and use the Set operations to
define the Table and Associative Array counterparts.
Table::Insert(k,d) wraps k,d in an Entry object e and calls Set::Insert(e).
Table::Retrieve(k,&d) is a little more convoluted, but still simple: create any entry object e with key k and find it in the set with i = Set::Includes(e). The iterator i now points to the table entry with key k, so (*i).data_ is the data stored with the key. (The parentheses matter: *i.data_ would parse as *(i.data_).)
Table::Remove(k) follows the same pattern: create an entry object e with key k, and remove e from the set.
Note how the key-centric semantics of the Entry comparison operators ensure that unimodal Set
semantics transfer to unimodal Table semantics.
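The three reductions above can be sketched in code. This is only an illustration under stated assumptions: std::set stands in for the chapter's Set, the class name SetTable is hypothetical, Entry is written as a template, and Insert is assumed to replace an existing entry with the same key.

```cpp
#include <cassert>
#include <set>
#include <string>

// A key,data pair compared by key only (modeled on the chapter's Entry).
template <typename K, typename D>
struct Entry
{
  const K key_;
  D       data_;
  Entry(const K& k, const D& d) : key_(k), data_(d) {}
};

// Key-centric ordering: the data plays no role in comparisons.
template <typename K, typename D>
bool operator<(const Entry<K, D>& e1, const Entry<K, D>& e2)
{
  return e1.key_ < e2.key_;
}

// Hypothetical Table whose operations are expressed through set operations.
template <typename K, typename D>
class SetTable
{
public:
  // Wrap k,d in an Entry and insert it into the set.
  void Insert(const K& k, const D& d)
  {
    Entry<K, D> e(k, d);
    set_.erase(e);   // assumed unimodal behavior: replace any entry with key k
    set_.insert(e);
  }

  // Probe the set with an entry that has key k; its data part is ignored.
  bool Retrieve(const K& k, D& d) const
  {
    typename std::set<Entry<K, D> >::const_iterator i = set_.find(Entry<K, D>(k, D()));
    if (i == set_.end()) return false;
    d = (*i).data_;
    return true;
  }

  // Same pattern: build an entry with key k and remove it from the set.
  bool Remove(const K& k) { return set_.erase(Entry<K, D>(k, D())) > 0; }

private:
  std::set<Entry<K, D> > set_;
};

// Usage sketch.
inline void SetTableDemo()
{
  SetTable<std::string, int> t;
  t.Insert("k", 1);
  t.Insert("k", 2);   // same key: the stored data is replaced
  int d = 0;
  assert(t.Retrieve("k", d) && d == 2);
  assert(t.Remove("k"));
}
```

Because operator< looks only at key_, the set treats two entries with the same key as the same element, which is what carries the unimodal behavior over to the table.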
The only distinct operation Associative Array has (other than nomenclature) is
Get and its alias, the AA bracket operator. To implement Get(KeyType k) in the
Set<Entry> model, define an entry object with key k and default data:
Entry e(k,DataType()). First search to see if e is in the set, and if it is not
then insert it. In either case, return a reference to the data stored with k.
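The Get procedure just described might look like the following sketch. It again uses std::set as a stand-in for Set. Because std::set exposes its elements as const, this variant of Entry declares data_ as mutable so that Get can hand back a modifiable reference; that is a workaround for the stand-in container, not something the chapter prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Variant of Entry for this sketch: data_ is mutable because std::set
// elements are const; ordering depends on key_ only, so this is safe.
template <typename K, typename D>
struct Entry
{
  const K   key_;
  mutable D data_;
  Entry(const K& k, const D& d) : key_(k), data_(d) {}
};

template <typename K, typename D>
bool operator<(const Entry<K, D>& e1, const Entry<K, D>& e2)
{
  return e1.key_ < e2.key_;
}

template <typename K, typename D>
class AssociativeArray
{
public:
  D& Get(const K& k)
  {
    Entry<K, D> e(k, D());                    // entry with key k and default data
    typename std::set<Entry<K, D> >::iterator i = set_.find(e);  // search first
    if (i == set_.end())
      i = set_.insert(e).first;               // not found: insert the default entry
    return i->data_;                          // either way, reference the stored data
  }

  // The bracket operator is syntactic sugar for Get.
  D& operator[](const K& k) { return Get(k); }

  std::size_t Size() const { return set_.size(); }

private:
  std::set<Entry<K, D> > set_;
};

// Usage sketch.
inline void GetDemo()
{
  AssociativeArray<std::string, int> aa;
  aa["x"] = 5;            // inserts (x, 0), then assigns 5 through the reference
  assert(aa["x"] == 5);
  assert(aa.Size() == 1);
}
```

The returned reference stays valid after later insertions because std::set does not relocate its elements; an ordered-vector Set would not give that guarantee.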
The analysis for ordered Table is based on the Set<Entry> model, where we use the most efficient implementation of Set that we have covered so far: the ordered vector. These estimates carry over directly from the Introduction to Sets chapter.
Exercise. Analyze the AssociativeArray::Get operation using the ordered
vector implementation of Set.