Tables

The general subject of this chapter is "tables". You may find this surprising, since the name of the chapter is Introduction to Maps. Here's what's going on with terminology: it's all over the map.

In computer science, most professionals will understand what is meant by a table: a way to store (key, data) pairs in an associative manner, so that a pair, or its data, can be looked up, inserted, or removed by key value. In short, a table is a data retrieval system. Other terms have evolved over the years, including these, which seem to be the most prevalent:

Table              Probably the most generic language-independent usage
Dictionary         Synonym for Table
Mapping            Term arising from (discrete) math describing a function from one set to another
Map                In math, shorthand for "mapping"; adopted specifically by the C++ STL to mean an ordered table
Associative Array  A table with a special bracket operator

A good way to start thinking about tables technically is to envision a Set of Pairs, where a Pair object consists of a key and a datum:

typedef Set < Pair < KeyType , DataType > > Table;

This is a somewhat crude way to define tables, but it can be made to work quite effectively, and it definitely works as a valid analogy. The main reason for not just adopting the "table is a set of pairs" definition completely is that there are interesting and useful things that are specific to tables that don't arise naturally from sets.

Importantly, tables have unimodal semantics that is key-centric. This means:

  1. Duplicate keys are not allowed in a table.
  2. Insert operations have a "dual personality": Insert(key_value,data_instance) adds the pair to the table if key_value is not in the table, but overwrites the existing data field with data_instance otherwise.
  3. Key values may not be modified in the table. If some data_instance needs to be associated with a different key new_key, a new pair (new_key, data_instance) must be inserted into the table.
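
The three rules above can be sketched in a few lines. This is only an illustration under stated assumptions: std::map stands in for the Table class, and the free functions Insert, Rekey, and DemoUnimodal are hypothetical names introduced here, not part of the chapter's API. Rekey also removes the old pair, which is a choice made for this sketch rather than something rule 3 requires.

```cpp
#include <cassert>
#include <map>
#include <string>

// Assumption: std::map is used here only as a stand-in for Table.
typedef std::map<std::string, int> StandInTable;

// Hypothetical helper showing the "dual personality" of Insert:
// adds (key,data) when key is absent, overwrites the data otherwise.
void Insert(StandInTable& table, const std::string& key, int data)
{
  table[key] = data;
}

// Hypothetical helper: a key cannot be changed in place, so the data
// is re-inserted under new_key. Erasing the old pair is a choice made
// in this sketch.
void Rekey(StandInTable& table, const std::string& old_key,
           const std::string& new_key)
{
  StandInTable::iterator i = table.find(old_key);
  if (i == table.end()) return;   // nothing stored under old_key
  int data = i->second;           // copy the data before erasing
  table.erase(i);
  Insert(table, new_key, data);
}

// Small usage example exercising rules 1 through 3.
void DemoUnimodal()
{
  StandInTable t;
  Insert(t, "x", 1);
  Insert(t, "x", 2);              // same key: overwrite, no duplicate
  assert(t.size() == 1);
  assert(t["x"] == 2);
  Rekey(t, "x", "y");             // data now lives under the new key
  assert(t.count("x") == 0);
  assert(t["y"] == 2);
}
```

Calling DemoUnimodal() runs without tripping an assertion: the second Insert with the same key leaves exactly one pair in the table.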


Table API

The table API is very similar to that of Set: it is an associative container with primary functionality Insert, Remove, and some form of look-up. For a Set, look-up takes the form of finding an object in the set and returning an iterator pointing to the object. De-referencing the iterator is the way the data stored in the object is obtained.

For Tables, there is the advantage that the structural data is separated as a KeyType value, while the data is separately stored in a DataType value. Therefore a natural form of look-up in a Table is Retrieve(key_value,d), in which d becomes a valid reference to the data stored with key_value. Here are the three primary operations for Table, along with the variations found in Associative Array [aka Symbol Table]:

Associative Container APIs (unimodal)

associative operation   Table            AA            Set
insert                  Insert(k,d)      Put(k,d)      Insert(t)
look-up                 Retrieve(k,&d)   &d = Get(k)   i = Includes(t)
remove                  Remove(k)        Erase(k)      Remove(t)

We will give precise descriptions of the behavior of each of these operations. Note also that one can define analogs of any of these operations across the columns of the table. For example, we could define Set::Get as &x = Get(x), meaning that Get(x) returns a reference to the stored value x, and we could define Table::Includes to return a table iterator to the stored pair. But the operations listed in the table are considered to be the primary defining operations for the Set, Table, and Associative Array abstract data types.

Table Semantics

We will look at the Associative Array semantics below.

Ordered Table API

Just as with Set, the ordered Table has additional search operations LowerBound and UpperBound that locate an item using the table's order structure.

Ordered Table Semantics

The principal additional functionality distinguishing the ordered table is that a standard traversal encounters stored pairs in sorted order by key, whereas in an unsorted, or hashed, table, a traversal encounters stored pairs in pseudo-random key order.
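
As a rough illustration of sorted traversal and bound searches, the sketch below uses std::map as a stand-in for an ordered table. The helper names TraversalKeys, LowerBoundKey, UpperBoundKey, and DemoOrdered are hypothetical, and the std::map meanings of lower_bound (first key not less than k) and upper_bound (first key greater than k) are assumed to match the chapter's LowerBound and UpperBound.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Assumption: std::map stands in for the ordered Table.
typedef std::map<std::string, int> OrderedTable;

// Collects the keys in the order a standard traversal encounters them.
std::vector<std::string> TraversalKeys(const OrderedTable& table)
{
  std::vector<std::string> keys;
  for (OrderedTable::const_iterator i = table.begin(); i != table.end(); ++i)
    keys.push_back(i->first);
  return keys;
}

// First stored key that is not less than k, or "" if there is none.
std::string LowerBoundKey(const OrderedTable& table, const std::string& k)
{
  OrderedTable::const_iterator i = table.lower_bound(k);
  return (i == table.end()) ? std::string() : i->first;
}

// First stored key that is greater than k, or "" if there is none.
std::string UpperBoundKey(const OrderedTable& table, const std::string& k)
{
  OrderedTable::const_iterator i = table.upper_bound(k);
  return (i == table.end()) ? std::string() : i->first;
}

// Small usage example: insertion order does not affect traversal order.
void DemoOrdered()
{
  OrderedTable t;
  t["pear"] = 3;
  t["apple"] = 1;
  t["fig"] = 2;
  std::vector<std::string> keys = TraversalKeys(t);
  assert(keys.size() == 3);
  assert(keys[0] == "apple" && keys[1] == "fig" && keys[2] == "pear");
  assert(LowerBoundKey(t, "banana") == "fig");  // "banana" itself is not stored
  assert(UpperBoundKey(t, "fig") == "pear");
}
```

Swapping std::map for std::unordered_map in the typedef would remove lower_bound and upper_bound entirely and leave the traversal order unspecified, which mirrors the distinction drawn above between ordered and hashed tables.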


Associative Arrays

Associative Arrays, sometimes also called Symbol Tables, are essentially the same as tables with a redefined API. Aside from different terminology, the Associative Array brings a novel kind of operation, a kind of bracket operator. But unlike the bracket operator of ordinary arrays, vectors, and deques, the AA bracket operator takes a value of the client-defined KeyType as its argument. The prototype of this operator, shown alongside the ordinary array version for comparison, is:

DataType& operator[] (KeyType k); // AA bracket operator
DataType& operator[] (size_t i);  // array bracket operator

(Note: often the parameters shown passed by value are coded to pass by const reference, which is a programming efficiency without functional distinction.) The seemingly simple change from size_t to KeyType has profound consequences, the most striking being

  1. KeyType is often not an integral type, so that it is not clear what "next" and "previous" might mean; and
  2. KeyType may be infinite in size, and in fact between any two keys there may be an infinite number of other keys.

For example, KeyType might be type String.

Exercise. Given two keys of type string, and using lexicographical ordering, explain how there are infinitely many keys between the two.

Among other things, these facts mean that there is no simple analog of the ordinary array traversal loop and, even though a table may contain many entries, the number of keys NOT in the table is often much larger than the table size. We will define the semantics of the AA Put, Get, and bracket operator shortly, in terms of the basic table operations.

We will discuss the semantics of Associative Array operations in more detail later on. Our last point to discuss here is the "syntactic sugar" nature of the AA bracket operator in terms of a named operation Get:

DataType& operator[] (KeyType k)
{
  return Get(k);
}

where the Get operation has the prototype

DataType& Get (KeyType k);

We continue with a discussion of class Entry which formalizes what we mean by a "pair" in a table.

Using Associative Array

Here we illustrate typical client code using an Associative Array to store table entries. It takes some getting used to.

AssociativeArray<KeyType,DataType> aa;  // declare an (empty) associative array object 

Note that aa[k] may appear on either side of an assignment, and either way there is a lot going on:

aa[k] = d;

The more obvious effect is that the data associated with k is set to d. Less obvious is that the invocation aa[k] itself ensures that the pair (k,some_data) is actually in the table.

d = aa[k];

Again, it is fairly obvious that d is set to the data associated with k in the table. Less obvious is that the invocation aa[k] ensures that there is such a data item associated with k in the table.
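
The less obvious half of both statements can be observed directly. The sketch below is only an illustration: it assumes std::map as a stand-in for AssociativeArray, since std::map::operator[] also inserts a default-constructed datum when the key is absent. The names SizeAfterLookup and DemoBracket are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Assumption: std::map stands in for AssociativeArray<KeyType,DataType>.
typedef std::map<std::string, int> StandInAA;

// Reads aa[k] on the right-hand side of an assignment and reports
// the number of stored pairs afterwards.
std::size_t SizeAfterLookup(StandInAA& aa, const std::string& k)
{
  int d = aa[k];        // looks like a pure read, but inserts (k, int()) if k is absent
  (void)d;              // the value itself is not needed here
  return aa.size();
}

// Small usage example covering both sides of the assignment.
void DemoBracket()
{
  StandInAA aa;
  assert(aa.size() == 0);
  assert(SizeAfterLookup(aa, "k") == 1);  // the read created an entry
  assert(aa["k"] == 0);                   // holding default data
  aa["k"] = 7;                            // left-hand use: same entry, new data
  assert(aa.size() == 1);
  assert(aa["k"] == 7);
}
```

The point of the sketch is only that a bracket look-up is never a pure query: after it returns, the key is in the table.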

Exercise. Consider the loop implementing a form of sequential search:

AssociativeArray aa;
...
String s = stringBegin;
while (s != stringEnd)
{
  if (aa[s] == searchValue)
    return true;
  s = NextString(s);
}
return false;

Explain why this is a bad idea.

Class Entry<KeyType, DataType>

We already mentioned that a Table can be usefully thought of as a set of pairs. Here we define exactly what kind of "pair" to store in a table.

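template < typename KeyType , typename DataType >
struct Entry // a key,data pair
{
  const KeyType  key_;
        DataType data_;
  Entry (const KeyType& k, const DataType& d) : key_(k),data_(d) {}

  // assignment must be a member function; it copies data only, and only
  // between entries whose keys are equal
  Entry& operator= (const Entry& that)
  {
    if (this->key_ == that.key_) { this->data_ = that.data_; }
    else { ERROR(); } // ERROR() stands for an error handler declared elsewhere
    return *this;
  }
};

template < typename K , typename D >
bool operator == (const Entry<K,D>& e1, const Entry<K,D>& e2)
{
  return (e1.key_ == e2.key_);
}

template < typename K , typename D >
bool operator <  (const Entry<K,D>& e1, const Entry<K,D>& e2)
{
  return (e1.key_ < e2.key_);
}

template < typename K , typename D >
bool operator != (const Entry<K,D>& e1, const Entry<K,D>& e2)
{
  return !(e1 == e2); // best practices pattern
}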

Notes:

  1. const KeyType key_ is immutable and must be given a value when it comes into scope. We do this with the constructor.
  2. Assignment first checks that the two keys are equal and then assigns the data. Since keys are immutable, we would not be allowed to assign one key to another.
  3. The various comparison operators are key-centric: only the key in a pair matters when deciding on equality or order.

We now consider a table as a Set of Entry pairs and use the Set operations to define the Table and Associative Array counterparts.

Table as Set<Entry>

Table::Insert(k,d) wraps k,d in an Entry object e and calls Set::Insert(e).

Table::Retrieve(k,&d) is a little more convoluted, but still simple: create an entry object e with key k (the data value does not matter) and find it in the set with i = Set::Includes(e). The iterator i now points to the table entry with key k, so (*i).data_ is the data stored with the key.

Table::Remove(k) follows the same pattern: create an entry object e with key k, and remove e from the set.

Note how the key-centric semantics of the Entry comparison operators ensure that unimodal Set semantics transfer to unimodal Table semantics.
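
The three paragraphs above can be sketched as a small class. This is a minimal sketch under several assumptions, not the course's implementation: std::set stands in for Set, data_ is declared mutable because std::set only hands out const access to its elements, Retrieve copies the data into d and returns a bool rather than binding a reference, and Size and DemoTable are hypothetical additions for the usage example.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

template <typename KeyType, typename DataType>
struct Entry
{
  const KeyType    key_;
  mutable DataType data_;  // assumption: mutable so data can change inside std::set
  Entry(const KeyType& k, const DataType& d) : key_(k), data_(d) {}
};

// Key-centric ordering: std::set uses this to decide both order and equivalence.
template <typename K, typename D>
bool operator<(const Entry<K, D>& e1, const Entry<K, D>& e2)
{
  return e1.key_ < e2.key_;
}

template <typename KeyType, typename DataType>
class Table
{
public:
  // Wrap (k,d) in an Entry and insert it; overwrite the data if k is present.
  void Insert(const KeyType& k, const DataType& d)
  {
    Entry<KeyType, DataType> e(k, d);
    auto result = set_.insert(e);             // result.second is false if k was present
    if (!result.second) result.first->data_ = d;
  }

  // Build a probe entry with key k (its data is irrelevant) and look it up.
  bool Retrieve(const KeyType& k, DataType& d) const
  {
    auto i = set_.find(Entry<KeyType, DataType>(k, DataType()));
    if (i == set_.end()) return false;
    d = (*i).data_;
    return true;
  }

  // Same pattern: remove whatever entry matches the probe's key.
  bool Remove(const KeyType& k)
  {
    return set_.erase(Entry<KeyType, DataType>(k, DataType())) > 0;
  }

  std::size_t Size() const { return set_.size(); }

private:
  std::set< Entry<KeyType, DataType> > set_;
};

// Small usage example.
void DemoTable()
{
  Table<std::string, int> t;
  t.Insert("a", 1);
  t.Insert("a", 2);              // same key: data overwritten
  int d = 0;
  assert(t.Size() == 1);
  assert(t.Retrieve("a", d) && d == 2);
  assert(t.Remove("a"));
  assert(!t.Retrieve("a", d));
}
```

Because the comparison looks only at key_, a probe entry with default data is enough to find or remove the stored pair, which is exactly how the unimodal Set behavior carries over to the Table.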

Associative Array as Set<Entry>

The only distinct operation Associative Array has (other than nomenclature) is Get and its alias, the AA bracket operator. To implement Get(KeyType k) in the Set<Entry> model, define an entry object with key k and default data: Entry e(k,DataType()). First search to see if e is in the set, and if it is not then insert it. In either case, return a reference to the data stored with k.
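
A sketch of this search-then-insert logic follows. It rests on the same assumptions as any std::set based model of Set<Entry>: std::set stands in for Set, data_ is declared mutable so that a non-const reference to the stored data can be returned, and Size and DemoGet are hypothetical additions for the usage example.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

template <typename KeyType, typename DataType>
struct Entry
{
  const KeyType    key_;
  mutable DataType data_;  // assumption: mutable so Get can return DataType&
  Entry(const KeyType& k, const DataType& d) : key_(k), data_(d) {}
};

// Key-centric ordering used by std::set.
template <typename K, typename D>
bool operator<(const Entry<K, D>& e1, const Entry<K, D>& e2)
{
  return e1.key_ < e2.key_;
}

template <typename KeyType, typename DataType>
class AssociativeArray
{
public:
  // Search for an entry with key k; insert (k, DataType()) if it is absent.
  // Either way, return a reference to the data stored with k.
  DataType& Get(const KeyType& k)
  {
    Entry<KeyType, DataType> e(k, DataType());
    auto i = set_.find(e);
    if (i == set_.end()) i = set_.insert(e).first;
    return i->data_;
  }

  // The bracket operator is syntactic sugar for Get.
  DataType& operator[](const KeyType& k) { return Get(k); }

  std::size_t Size() const { return set_.size(); }

private:
  std::set< Entry<KeyType, DataType> > set_;
};

// Small usage example.
void DemoGet()
{
  AssociativeArray<std::string, int> aa;
  aa["x"] = 5;               // Get inserts ("x", 0), then 5 is assigned through the reference
  assert(aa.Size() == 1);
  assert(aa["x"] == 5);
  int d = aa["y"];           // a read of a missing key still inserts ("y", 0)
  assert(d == 0);
  assert(aa.Size() == 2);
}
```

The sketch performs a separate search and insert to follow the description literally; a single call to insert would have the same effect with std::set, since insert leaves an existing entry untouched.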

Ordered Table Analysis (summary - so far)

The analysis for ordered Table is based on the Set<Entry> model, where we use the most efficient implementation of Set that we have covered (so far) - the ordered vector. These estimates carry over directly from the Introduction to Sets chapter.

Exercise. Analyze the AssociativeArray::Get operation using the ordered vector implementation of Set.