Program Modularity

There are many reasons why all modern programming languages have features used to break a program into components that may be written, compiled, and tested as a unit. The existence and use of such features is called program modularity. The most important benefits of program modularity are:

Reduction of program complexity: By breaking a program into modules, its complexity is reduced and its design is compartmentalized
Distribution of programming effort: Different modules may be created by different people or teams without interfering with the work of other components of a project
Re-use of program components: Once created, a sub-program may be used by any other program that may be created in the future

It is unthinkable to create any sizeable program without generous use of modularity in its design.

Classes in C++

Functions and classes constitute the principal means of modularization in C++. C++ classes and objects are used to:

Encapsulate data, that is, limit the scope of data variables
Protect data from outside manipulation
Limit scope of functionality by providing class methods
Specialize functionality to the particular data and intended use of the class

Classes and objects may be created as part of an application program, or they may have been created in a previous project and be used again, or they may be part of the language system and distributed in libraries (such as iostream).

C++ shares the use of stand-alone functions with C and the use of classes with Java. C++ can thus be thought of as covering two independent dimensions of programming language design: procedural programming as in C and object-oriented programming as in Java. In this sense, C++ can be thought of as a multi-dimensional language with C and Java representing orthogonal subspaces of C++.

Classes and Objects

According to [Stroustrup, p. 224], a class in C++ is a user-defined type, and an object is an instance of that type. Of course, there is a lot more to classes and objects: taken together, with all their complexity and subtlety, they form the basis of two entire programming paradigms: object-based and object-oriented programming.

The following steps must be accomplished before an object can be used:

The class, or new type, must be defined.
Member functions of the class must be implemented.
The object itself must be declared.

Here is a simple example of a class definition, based on one from [Stroustrup]:

class Date
{
public:
  void Init(int d, int m, int y);   // sets the date
  void AddYear(int n);              // adds n to the year
  void AddMonth(int n);             // adds n to the month
  void AddDay(int n);               // adds n to the day

private:
  int day_, month_, year_;  // the basic date data
} ;

Before proceeding, let's look at formatting for classes. The above is perfectly OK, and meets a minimal standard of format readability. The following is equivalent code, but is even more readable and facilitates an organized understanding of the class definition:

class Date
{
public:
  void Init     (int d, int m, int y);   // sets the date
  void AddYear  (int n);                 // adds n to the year
  void AddMonth (int n);                 // adds n to the month
  void AddDay   (int n);                 // adds n to the day

private:
  int day_,         // the day of the month, restricted to [1..31]
      month_,       // the month of the year, restricted to [1..12]
      year_;        // positive = AD, negative = BC, restricted to non-zero
} ;

Notice that this second format lines up the critical elements of declarations in columns: return values, member function names, parameter lists, and documentation statements. The data is also better documented by placing each item on a separate line. We will generally follow a formatting system like this second example.

Class Definition

Returning to the actual code, here is a third rendering, with the documentation removed to make room for explanations. The comments point out the reserved words and required syntactical components:

class Date     // "class" followed by programmer-spec'd class name
{
public:        // portion of class accessible through objects 
  void Init     (int d, int m, int y);   // member function
  void AddYear  (int n);                 // member function
  void AddMonth (int n);                 // member function
  void AddDay   (int n);                 // member function

private:       // portion of class with restricted access
  int day_, month_, year_;                // member data
} ;            // brace and semicolon required

A class definition begins with the reserved word class followed by the class name, which is chosen by the programmer subject to the usual identifier constraints. The body of the definition follows, enclosed in braces. Finally there is a required semicolon, whose purpose is to allow for declaring objects of the new type immediately as it is defined.
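The role of the closing semicolon can be seen in a small sketch. The names Point, Empty, origin, corner, and SumX are illustrative only and are not part of the Date example.

```cpp
// A class definition may be followed directly by a list of objects of the
// new type; the semicolon terminates that (possibly empty) list.
class Point
{
public:
  int x, y;
} origin, corner;   // two Point objects declared along with the class

class Empty
{
} ;                 // no objects declared, but the semicolon is still required

// Sets the two objects declared above and returns the sum of their x values.
int SumX ()
{
  origin.x = 1;  origin.y = 0;
  corner.x = 4;  corner.y = 3;
  return origin.x + corner.x;
}
```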

Note that there are two kinds of members defined for this class: member functions and member variables. The member functions are declared via prototypes, and the member variables are declared as typed variables, both in the usual way. These class members have scope equal to the class itself. In this way the class is its own namespace.

The reserved words public and private are used to designate the access control to that portion of the class. Public access means that any code holding an object of the class may access that member through the object using the "dot" notation. Private access means that the member can be accessed only by other class members. In this example, all of the methods are public and all of the data is private. This is a typical design constraint, but it is not required by the language and there are exceptions where the constraint is best not followed.

There are three increasingly restrictive levels of access control status: public, protected, and private, with private being the default for a class. Levels protected and private are distinguished only under class inheritance, which we will discuss in a following chapter.
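Here is a minimal sketch of access control in action. The class Counter and the function UseCounter are hypothetical names introduced only for this illustration, and the member functions are defined inside the class body for brevity.

```cpp
class Counter
{
  int count_;                  // no access label yet: private by default

public:
  void Reset     () { count_ = 0; }
  void Increment () { ++count_; }
  int  Value     () { return count_; }
} ;

// Client code: only the public members are reachable through the object.
int UseCounter ()
{
  Counter c;
  c.Reset();
  c.Increment();               // OK: public member, "dot" notation
  c.Increment();
  // c.count_ = 10;            // would not compile: count_ is private
  return c.Value();
}
```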

Friends

The keyword friend is used to grant a non-member function or another class access to private data. For example:

class Date
{
friend bool operator == (const Date& d1, const Date& d2); 
// allows access to private data by non-member function of class

public:        // portion of class accessible through objects 
  void Init     (int d, int m, int y);   // member function
  void AddYear  (int n);                 // member function
  void AddMonth (int n);                 // member function
  void AddDay   (int n);                 // member function

private:       // portion of class with restricted access
  int day_, month_, year_;               // member data
} ;

This declaration allows the boolean operator==() access to the private data of class Date for its implementation, which would go something like this:

bool operator == (const Date& d1, const Date& d2)
{
  return d1.day_ == d2.day_ && d1.month_ == d2.month_ && d1.year_ == d2.year_;
}

Friendship is necessary in many cases, especially to implement operators as in the example above. Granting friendship to another class is an especially generous "giveaway" of privacy rights and should be done with extreme caution in code design.
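The pieces above can be assembled into one compilable sketch. The class is trimmed to what the comparison needs, and SameDate is a hypothetical helper added only to exercise the operator.

```cpp
class Date
{
  friend bool operator == (const Date& d1, const Date& d2);

public:
  void Init (int d, int m, int y);

private:
  int day_, month_, year_;
} ;

void Date::Init (int d, int m, int y)
{
  day_ = d;
  month_ = m;
  year_ = y;
}

// Non-member function: it may read the private data only because
// class Date declares it a friend.
bool operator == (const Date& d1, const Date& d2)
{
  return d1.day_ == d2.day_ && d1.month_ == d2.month_ && d1.year_ == d2.year_;
}

// Hypothetical helper: builds two dates and compares them with operator==.
bool SameDate (int d1, int m1, int y1, int d2, int m2, int y2)
{
  Date a, b;
  a.Init(d1, m1, y1);
  b.Init(d2, m2, y2);
  return a == b;
}
```

Without the friend declaration, the body of operator== would fail to compile, because day_, month_, and year_ are private.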

Implementing Class Member Functions

The member function prototypes in a class definition must be implemented. This is the same process as implementing a stand-alone function, except that the class scope and class membership must be taken into account. Class scope is resolved using the scope resolution operator ::. Class membership means that the class variables may be used by the class member function implementation without any re-declaration. Here are two examples:

void Date::Init (int d, int m, int y)
{
  day_ = d;
  month_ = m;
  year_ = y;
}

void Date::AddYear (int n)
{
  year_ += n;
}

Note how the class scope is indicated by the class name and the scope resolution operator and how the member variables are accessed by the function bodies without declaration.
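As a sketch of how these implementations fit together with a class definition, the following trimmed version of Date can be compiled on its own. The accessor Year and the function YearAfter are hypothetical additions, introduced only so that the effect of AddYear can be observed.

```cpp
class Date
{
public:
  void Init    (int d, int m, int y);
  void AddYear (int n);
  int  Year    ();          // hypothetical accessor, not in the original class

private:
  int day_, month_, year_;
} ;

// "Date::" places each function in the scope of class Date, so the member
// variables are visible in the body without being declared again.
void Date::Init (int d, int m, int y)
{
  day_ = d;
  month_ = m;
  year_ = y;
}

void Date::AddYear (int n)
{
  year_ += n;
}

int Date::Year ()
{
  return year_;
}

// Hypothetical client function: starts at January 1 of year start, adds n years.
int YearAfter (int start, int n)
{
  Date d;
  d.Init(1, 1, start);
  d.AddYear(n);
  return d.Year();
}
```

If the Date:: qualifier were omitted, the compiler would treat AddYear as an unrelated stand-alone function, and the reference to year_ would not compile.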

Declaring and Using Objects

Objects are just variables of user-defined type. (In fact, we often speak of variables of the built-in types as objects, though they are not strictly instances of a class.) Here is a sample code fragment creating objects of several types:

#include <fstream>  // defines classes ifstream and ofstream

int main()
{
  std::ifstream ifs;   // ifs can be used to open files for reading
  std::ofstream ofs;   // ofs can be used to open files for writing
  Date d1, d2;         // d1 and d2 are Date objects
  ...
}

Note that ifs, ofs, d1, and d2 are all objects of various types. We may call them "objects" or "variables" as taste and circumstances encourage. These objects may access the public members of their class using the "dot" notation, as in the following code fragments:

#include <fstream>  // defines classes ifstream and ofstream

int main()
{
  std::ifstream ifs;   // ifs can be used to open files for reading
  std::ofstream ofs;   // ofs can be used to open files for writing
  Date d1, d2;         // d1 and d2 are Date objects
  ...
  ifs.open("file1");   // opens file "file1" for reading
  ofs.open("file2");   // opens file "file2" for writing
  d1.Init(7,9,2004);   // sets the date to Sep 7, 2004
  ...
}

This "dot" notation is part of the legacy of C and the classic struct of that language. The C++ class is a generalization of the C struct. Note that "dot" is actually another C++ operator.

Constructors

Whenever an object is created, for example by a declaration such as the declarations of ifs, ofs, d1, and d2 above, a constructor is called as the object is created. The object doesn't officially exist until these two steps have been completed:

Memory for the object footprint is allocated
A constructor for the class has been called (and returns)

A memory footprint is a contiguous block of memory that is correctly sized for the object: the number of bytes should equal the value returned by sizeof(T), where T is the type (i.e., class) of the object. A constructor is a member function of the class that is prototyped with a special syntax and implemented along with all other member functions of the class. The purpose served by the constructor is to appropriately initialize the object. The constructor syntax is special in two ways: first, constructors have no return type; and second, they are named the same as the name of the class. Here is our class Date with some constructors added:
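The footprint can be inspected directly with sizeof. The sketch below uses a trimmed Date and a hypothetical function DateFootprint. The exact value is implementation-dependent (12 bytes is common where int occupies 4 bytes), so the code only checks that there is room for the three data members. Ordinary (non-virtual) member functions take up no space in the object.

```cpp
#include <cstddef>

class Date
{
public:
  void AddYear (int n);

private:
  int day_, month_, year_;
} ;

void Date::AddYear (int n)
{
  year_ += n;
}

// The footprint must hold at least the three int data members.
static_assert(sizeof(Date) >= 3 * sizeof(int), "Date holds three ints");

// Number of bytes in the memory footprint of every Date object.
std::size_t DateFootprint ()
{
  return sizeof(Date);
}
```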

class Date
{
public:
       Date      ();                      // sets the date to today
       Date      (int d, int m, int y);   // sets the date to arguments
  void AddYear   (int n);                 // adds n to the year
  void AddMonth  (int n);                 // adds n to the month
  void AddDay    (int n);                 // adds n to the day

private:
  int day_,         // the day of the month, restricted to [1..31]
      month_,       // the month of the year, restricted to [1..12]
      year_;        // positive = AD, negative = BC, restricted to non-zero
} ;

The two new methods, each named "Date", are constructors. Note that no return type is given. Also note that these are overloads of the same name, so they must be distinguished by their parameter lists, as they are in this example. We could implement these something like the following:

Date::Date (int day, int month, int year)
{
  day_ = day;
  month_ = month;
  year_ = year;
}

Date::Date ()
{
  day_ = today.d;
  month_ = today.m;
  year_ = today.y;
}

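where today stands for a structure holding the current date as reported by the operating system. (The standard C library does not define a structure by this name; the current date is normally obtained by calling time() and localtime(), declared in time.h, and reading the fields of the resulting struct tm.) Note that Date() sets the date to today's date. Thus with these constructors we may now make declarations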

Date d1;
Date d2(11,9,2001);

Then d1 is initialized to today's date, while d2 is initialized to September 11, 2001.
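One way the current date might be obtained is sketched below, using std::time and std::localtime from the standard header ctime. The struct Today, the function GetToday, and the accessors Day, Month, and Year are hypothetical names introduced for this illustration; they are not part of any library.

```cpp
#include <ctime>

// Hypothetical stand-in for the "today" structure used in the text.
struct Today
{
  int d, m, y;
} ;

// Reads the system clock and converts it to day, month, and year.
Today GetToday ()
{
  std::time_t now   = std::time(0);            // current calendar time
  std::tm*    local = std::localtime(&now);    // broken-down local time
  Today t;
  t.d = local->tm_mday;                        // day of the month, 1..31
  t.m = local->tm_mon + 1;                     // tm_mon counts from 0
  t.y = local->tm_year + 1900;                 // tm_year counts from 1900
  return t;
}

class Date
{
public:
       Date  ();       // sets the date to today
  int  Day   ();       // hypothetical accessors, added for observation
  int  Month ();
  int  Year  ();

private:
  int day_, month_, year_;
} ;

Date::Date ()
{
  Today today = GetToday();
  day_ = today.d;
  month_ = today.m;
  year_ = today.y;
}

int Date::Day   () { return day_; }
int Date::Month () { return month_; }
int Date::Year  () { return year_; }
```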

According to [Stroustrup, p. 226], the use of functions such as Init() to provide initialization for class objects is inelegant and error-prone. Most theorists and practitioners of OBP and OOP would agree. Therefore we have removed this function from our class definition and will refrain from introducing such functions in the future. Initialization should be handled exclusively through constructors.

On the other hand, it is a good thing to isolate relatively complicated segments of code into separate function calls instead of repeating them in several places. This practice is an aspect of modularity and highly encouraged. If, for example, several constructors (and perhaps also an assignment operator and copy constructor - see the next chapter) all use the same body of code, that code may be used to define a class method that is in turn called by each of these. Such helper functions should be made private so that they are not accessible to client programs.
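A sketch of such a private helper follows. The helper name Set, its clamping policy, the placeholder default date, and the accessors are all assumptions made for illustration; the point is only that both constructors share one body of code that client programs cannot call.

```cpp
class Date
{
public:
       Date  ();
       Date  (int d, int m, int y);
  int  Day   ();       // hypothetical accessors, added for observation
  int  Month ();
  int  Year  ();

private:
  void Set   (int d, int m, int y);   // hypothetical helper, hidden from clients
  int day_, month_, year_;
} ;

// Shared code: forces each value into its documented range (assumed policy).
void Date::Set (int d, int m, int y)
{
  if (d < 1)  d = 1;
  if (d > 31) d = 31;
  if (m < 1)  m = 1;
  if (m > 12) m = 12;
  if (y == 0) y = 1;
  day_ = d;
  month_ = m;
  year_ = y;
}

// Both constructors delegate to the same helper.
Date::Date ()
{
  Set(1, 1, 1970);     // placeholder default date, an assumption
}

Date::Date (int d, int m, int y)
{
  Set(d, m, y);
}

int Date::Day   () { return day_; }
int Date::Month () { return month_; }
int Date::Year  () { return year_; }
```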

Default Constructors

We stated above that a constructor is called whenever an object is created. What happens if there is no constructor defined for the class?

Under that circumstance, a default constructor for the class is created by the compiler. This default constructor does little more than tag the object as belonging to its class (after the memory footprint has been created) and then make sure that the default constructors for the data members of the class are called. The default constructor is "parameterless".

If on the other hand any constructor is defined for a class, then the compiler does not generate a parameterless constructor for that class: once a constructor is defined, the programmer is responsible for defining a parameterless constructor if one is needed. For simplicity, a parameterless constructor, whether defined by the compiler or the class programmer, is often called the "default" constructor for the class. Using this terminology, Date has a default constructor.

A parameterless, or default, constructor is almost always a desired feature of a class. Among other reasons, a default constructor is called by a parameterless declaration. For example:

Date d1;               // calls the parameterless (default) constructor
Date d2 (11, 9, 2001); // calls the 3-parameter constructor

The first declaration above would result in a compile error if the class defined the 3-parameter constructor but no parameterless/default constructor.
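These rules can be checked at compile time with std::is_default_constructible from the standard header type_traits (C++11 or later). The three class names below are hypothetical and exist only for this sketch.

```cpp
#include <type_traits>

// No constructor defined: the compiler supplies a default constructor.
class Plain
{
public:
  int x;
} ;

// Only a 3-parameter constructor: no default constructor is generated.
class DateNoDefault
{
public:
  DateNoDefault (int d, int m, int y) : day_(d), month_(m), year_(y) {}
  int day_, month_, year_;
} ;

// The programmer supplies the parameterless constructor explicitly.
class DateWithDefault
{
public:
  DateWithDefault () : day_(1), month_(1), year_(1970) {}
  DateWithDefault (int d, int m, int y) : day_(d), month_(m), year_(y) {}
  int day_, month_, year_;
} ;

static_assert( std::is_default_constructible<Plain>::value,
               "compiler-generated default constructor");
static_assert(!std::is_default_constructible<DateNoDefault>::value,
               "declaring 'DateNoDefault d;' would not compile");
static_assert( std::is_default_constructible<DateWithDefault>::value,
               "programmer-defined default constructor");
```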

One final point is that constructors may have default parameter values and these may implicitly define a parameterless constructor. For example, in the following version of Date:

class Date
{
public:
       Date      (int d = 0, int m = 0, int y = 0);
  void AddYear   (int n);
  void AddMonth  (int n);
  void AddDay    (int n);

private:
  int day_, month_, year_;
} ;

the parameters have default value zero. A clever observation makes this work: because 0 is not a valid day, month, or year, the value 0 can be used to trigger a change to today's value:

Date::Date (int d, int m, int y) : day_(d), month_(m), year_(y)
{
  if (day_ == 0)   day_   = today.d; 
  if (month_ == 0) month_ = today.m; 
  if (year_ == 0)  year_  = today.y; 
}

The single constructor with three default parameter values works with 0, 1, 2, or 3 arguments and thus replaces separate definitions of 0-, 1-, 2-, and 3-parameter constructors.
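The following sketch exercises all four call forms. To keep the result predictable, today is replaced by a fixed stand-in value (an assumption; a real program would read the system clock), and the accessors Day, Month, and Year are hypothetical additions.

```cpp
// Fixed stand-in for the "today" structure, so the example is deterministic.
struct Today
{
  int d, m, y;
} ;
const Today today = { 15, 6, 2010 };

class Date
{
public:
       Date  (int d = 0, int m = 0, int y = 0);
  int  Day   ();       // hypothetical accessors, added for observation
  int  Month ();
  int  Year  ();

private:
  int day_, month_, year_;
} ;

// Any argument left at 0 is replaced by the corresponding value for today.
Date::Date (int d, int m, int y) : day_(d), month_(m), year_(y)
{
  if (day_ == 0)   day_   = today.d;
  if (month_ == 0) month_ = today.m;
  if (year_ == 0)  year_  = today.y;
}

int Date::Day   () { return day_; }
int Date::Month () { return month_; }
int Date::Year  () { return year_; }
```

With these definitions, Date a; takes all three values from today, Date b(7); supplies only the day, Date c(7,9); supplies day and month, and Date d(7,9,2004); supplies all three.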

Initialization Lists

The comma-separated list that is part of the header of the implementation of the constructor above is called an initialization list. The items of the initialization list are specific calls to constructors for the data members of the class.

Date::Date (int d, int m, int y) : day_(d), month_(m), year_(y)
{
  // optional code here also
}

The effect of using an initialization list is the same as performing the initializations in the body of the constructor implementation. However, use of the initialization list is preferred for several reasons, including the following:

Readability: the initializations are organized in a special place where they are immediately recognizable
Efficiency: the initial values of data members are set by the compiler as the object is created, rather than creating the object first and then copying new values into the data members

Initialization lists can be used for constructors with and without parameters as well as copy constructors.
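The efficiency point is easiest to see when a data member is itself an object of class type. In the sketch below, the hypothetical class Tracker counts which of its constructors run. UsesBody initializes its member in the constructor body, and UsesList uses an initialization list. All names here are illustrative.

```cpp
// Hypothetical member type that counts how its objects are constructed.
class Tracker
{
public:
  Tracker ()      : value_(0) { ++defaultCtorCalls; }
  Tracker (int v) : value_(v) { ++intCtorCalls; }
  int Value () { return value_; }

  static int defaultCtorCalls;
  static int intCtorCalls;

private:
  int value_;
} ;

int Tracker::defaultCtorCalls = 0;
int Tracker::intCtorCalls = 0;

class UsesBody
{
public:
  // t_ is default-constructed first, then overwritten by assignment
  UsesBody (int v) { t_ = Tracker(v); }
  int Value () { return t_.Value(); }

private:
  Tracker t_;
} ;

class UsesList
{
public:
  // t_ is constructed directly from v; no default construction occurs
  UsesList (int v) : t_(v) {}
  int Value () { return t_.Value(); }

private:
  Tracker t_;
} ;
```

Creating a UsesBody object runs two Tracker constructors (the default one for t_ and the one-argument one for the temporary), while creating a UsesList object runs only one.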

Code Organization

C++ class code is usually organized into header files and implementation files, following the practice established by C for functions. The header file contains the class definition, and the implementation file contains the implementations of the class methods. Here then are typical files and contents for the Date class. First the header file:

/*
    file: date.h

    Containing the definition of the class Date

    (other file documentation)
*/

class Date
{
public:
       Date      ();                      // sets the date to today
       Date      (int d, int m, int y);   // sets the date to arguments
  void AddYear   (int n);                 // adds n to the year
  void AddMonth  (int n);                 // adds n to the month
  void AddDay    (int n);                 // adds n to the day

private:
  int day_,         // the day of the month, restricted to [1..31]
      month_,       // the month of the year, restricted to [1..12]
      year_;        // positive = AD, negative = BC, restricted to non-zero
} ;

Then the implementation file:

/*
    file: date.cpp

    Implementations of methods of the Date class

    (other file documentation)
*/

#include "date.h"    // defines the class Date
#include <time.h>    // C time library; the struct "today" used below is assumed, not standard

Date::Date (int d, int m, int y) : day_(d), month_(m), year_(y)
{
  // empty body - used initialization list
}

Date::Date () : day_(today.d), month_(today.m), year_(today.y)
{
  // empty body - used initialization list
}

void Date::AddYear (int n)
{
  year_ += n;
}

void Date::AddMonth (int n)
{
  month_ += n;
}

void Date::AddDay (int n)
{
  day_ += n;
}

Just as with functions, the implementation file can be separately compiled to object code. Note that the implementation file must include the header file so that the class Date, which supplies both the scope named by Date:: and the member variables used in the function bodies, is defined. Code libraries typically pre-compile the implementation files and make the header files available to the compiler.

As with functions, there are exceptions where both the class definition and member function implementations are placed in the same file. Most of these exceptions are template files, which are discussed in a future chapter.

Note that we have used initialization lists to initialize class variables in the constructors. This is best practice, replacing initialization in the constructor body.

Structures and Classes

The C structure, keyword struct, served as the basis for the C++ class. In C, struct defines only a way to organize data: structures have no member functions and all data in a struct is public. In C++, the concept of structure (keyword struct) also exists. However, in C++, structures may have all of the features of classes, including member data and member functions. In fact, the only distinction between C++ struct and C++ class is that the default access for struct is public, whereas the default access for class is private. In particular, structures in legacy C code may be used with the same meaning in C++. As an example, the following two declarations are exactly equivalent:

struct Widget
{
       Widget  ();
  void Func1   ();
  int  Func2   ();

private:
  char Func3   ();
  int  data1_;
  char data2_;
};

class Widget
{
  char Func3   ();
  int  data1_;
  char data2_;

public:
       Widget  ();
  void Func1   ();
  int  Func2   ();
};

Either way it is defined, Widget has a public constructor, public member functions Func1 and Func2, a private member function Func3, and two private data members data1_ and data2_.
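The difference in default access can be seen in a short sketch. The names SPoint, CPoint, and DefaultAccessDemo are hypothetical and serve only this illustration.

```cpp
struct SPoint
{
  int x;                       // no access label: public by default in a struct
} ;

class CPoint
{
  int x;                       // no access label: private by default in a class

public:
  void Set (int v) { x = v; }
  int  Get ()      { return x; }
} ;

int DefaultAccessDemo ()
{
  SPoint s;
  s.x = 3;                     // OK: x is public
  CPoint c;
  // c.x = 4;                  // would not compile: x is private
  c.Set(4);                    // access goes through the public members
  return s.x + c.Get();
}
```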

In practice, it is best to use the keyword class when defining classes in C++. Some professionals enjoy using struct under special circumstances (when the class is identical to a C struct) but even this practice is discouraged. It does little except to let the reader know that "hey - I'm an old C programmer, and I use this new-fangled C++ just because it is better".