There are many reasons why all modern programming languages have features used to break a program into components that may be written, compiled, and tested as a unit. The existence and use of such features is called program modularity. The most important benefits of program modularity are:
It is unthinkable to create any sizeable program without generous use of
modularity in its design.
Functions and classes constitute the principal means of modularization in C++. C++ classes and objects are used to:
Classes and objects may be created as part of an application program, or they may have been created in a previous project and be used again, or they may be part of the language system and distributed in libraries (such as iostream).
C++ shares the use of stand-alone functions with C and the use of classes with
Java. C++ can thus be thought of as covering two independent dimensions of
programming language design: procedural programming as in C and
object-oriented programming as in Java. In this sense, C++ can be
thought of as a multi-dimensional language with C and Java representing
orthogonal subspaces of C++.
According to [Stroustrup, p. 224], a class in C++ is a user-defined type, and an object is an instance of that type. Of course, there is a lot more to classes and objects: taken together, with all their complexity and subtlety, they form the basis of two entire programming paradigms: object-based and object-oriented programming.
The following steps must be accomplished before an object can be used:
Here is a simple example of a class definition, based on one from [Stroustrup]:
class Date
{
public:
  void Init(int d, int m, int y); // sets the date
  void AddYear(int n); // adds n to the year
  void AddMonth(int n); // adds n to the month
  void AddDay(int n); // adds n to the day
private:
  int day_, month_, year_; // the basic date data
} ;
Before proceeding, let's look at formatting for classes. The above is perfectly OK, and meets a minimal standard of format readability. The following is equivalent code, but is even more readable and facilitates an organized understanding of the class definition:
class Date
{
public:
  void Init     (int d, int m, int y);  // sets the date
  void AddYear  (int n);                // adds n to the year
  void AddMonth (int n);                // adds n to the month
  void AddDay   (int n);                // adds n to the day
private:
  int day_,    // the day of the month, restricted to [1..31]
      month_,  // the month of the year, restricted to [1..12]
      year_;   // positive = AD, negative = BC, restricted to non-zero
} ;
Notice that this second format lines up the critical elements of declarations
in columns: return values, member function names, parameter lists, and
documentation statements. The data is also better documented by placing each
item on a separate line. We will generally follow a formatting system like this
second example.
Returning to the actual code, here is a third rendering, with the documentation removed to make room for explanations. Reserved words and required syntactical components are pointed out in the comments:
class Date                              // "class" followed by programmer-spec'd class name
{
public:                                 // portion of class accessible through objects
  void Init     (int d, int m, int y);  // member function
  void AddYear  (int n);                // member function
  void AddMonth (int n);                // member function
  void AddDay   (int n);                // member function
private:                                // portion of class with restricted access
  int day_, month_, year_;              // member data
} ;                                     // brace and semicolon required
A class definition begins with the reserved word class followed by the class name, which is chosen by the programmer subject to the usual identifier constraints. The body of the definition follows, enclosed in braces. Finally there is a required semicolon, whose purpose is to allow for declaring objects of the new type immediately as it is defined.
Note that there are two kinds of members defined for this class: member functions and member variables. The member functions are declared via prototypes, and the member variables are declared as typed variables, both in the usual way. These class members have scope equal to the class itself. In this way the class is its own namespace.
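As a small illustration of the last two points, the sketch below declares objects in the same statement that defines a class, and shows two classes using the same member names without conflict. The classes Counter and Timer and the objects hits and misses are hypothetical names chosen for this example only.

```cpp
// Hypothetical classes, used only to illustrate class scope.
class Counter
{
public:
  void Increment ();        // adds one to the count
  void Reset     ();        // sets the count back to zero
  int  Value     () const;  // returns the current count
private:
  int value_;
} hits, misses;  // two objects declared as part of the class definition

class Timer
{
public:
  void Tick  ();            // advances the timer by one
  void Reset ();            // same member name as Counter::Reset, no conflict
  int  Value () const;
private:
  int value_;               // same name as Counter::value_, no conflict
} ;

// Each implementation names its class, so the compiler knows which
// Reset and which value_ is meant.
void Counter::Increment ()       { ++value_; }
void Counter::Reset     ()       { value_ = 0; }
int  Counter::Value     () const { return value_; }

void Timer::Tick  ()       { ++value_; }
void Timer::Reset ()       { value_ = 0; }
int  Timer::Value () const { return value_; }
```

Because hits and misses are declared at file scope, their data starts at zero. A Counter or Timer declared inside a function has no such guarantee and should call Reset() before use.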
The reserved words public and private designate the access control for the portion of the class that follows them. Public access means that any code may access that class member through an object, using the "dot" notation. Private access means that the member can be accessed only by members of the class (and by friends, discussed below). In this example, all of the methods are public and all of the data is private. This is a typical design constraint, but it is not required by the language and there are exceptions where the constraint is best not followed.
There are three increasingly restrictive levels of access control: public, protected, and private, with private
being the default for a class. The levels protected and private are distinguished only under class inheritance,
which we will discuss in a following chapter.
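The following sketch shows the access rules in action. The class Account and all of its members are hypothetical; the commented-out client lines are ones a compiler would reject.

```cpp
class Account
{
  int pin_;                        // no label yet: private by default in a class
public:
  void SetPin   (int pin);         // public: callable through any Account object
  bool CheckPin (int pin) const;
private:
  int  Scramble (int pin) const;   // private: callable only by members
} ;

void Account::SetPin (int pin)
{
  pin_ = Scramble(pin);            // members may use private members freely
}

bool Account::CheckPin (int pin) const
{
  return pin_ == Scramble(pin);
}

int Account::Scramble (int pin) const
{
  return pin ^ 0x5A5A;             // placeholder transformation, not real security
}

// Client code:
//   Account a;
//   a.SetPin(1234);   // OK: public member
//   a.pin_ = 0;       // error: pin_ is private
//   a.Scramble(1);    // error: Scramble is private
```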
The keyword friend is used to grant a non-member function or another class access to private data. For example:
class Date
{
  friend bool operator == (const Date& d1, const Date& d2);
  // allows access to private data by non-member function of class
public:                                 // portion of class accessible through objects
  void Init     (int d, int m, int y);  // member function
  void AddYear  (int n);                // member function
  void AddMonth (int n);                // member function
  void AddDay   (int n);                // member function
private:                                // portion of class with restricted access
  int day_, month_, year_;              // member data
} ;
allows the boolean operator==() access to the private data of class Date for its implementation, which would go something like this:
bool operator == (const Date& d1, const Date& d2)
{
  return d1.day_ == d2.day_ && d1.month_ == d2.month_ && d1.year_ == d2.year_;
}
Friendship is necessary in many cases, especially to implement operators as in
the example above. Granting friendship to another class is an especially generous
"giveaway" of privacy rights and should be done with extreme caution in code
design.
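To see the friend declaration and its implementation working together, here is a compact, self-contained sketch. The three-argument constructor and operator!= are additions made for this example so that the comparison can be exercised.

```cpp
class Date
{
  friend bool operator == (const Date& d1, const Date& d2);
public:
  Date (int d, int m, int y) { day_ = d; month_ = m; year_ = y; }
private:
  int day_, month_, year_;
} ;

// A friend, not a member: no Date:: prefix, yet it may read private data
bool operator == (const Date& d1, const Date& d2)
{
  return d1.day_ == d2.day_ && d1.month_ == d2.month_ && d1.year_ == d2.year_;
}

// Needs no friendship: it is built entirely on operator ==
bool operator != (const Date& d1, const Date& d2)
{
  return !(d1 == d2);
}
```

Only operator== touches the private data, so it is the only function that needs to be named a friend. Keeping the list of friends this short is one way to exercise the caution recommended above.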
The member function prototypes in a class definition must be implemented. This is the same process as implementing a stand-alone function, except that the class scope and class membership must be taken into account. Class scope is resolved using the scope resolution operator ::. Class membership means that the class variables may be used by the class member function implementation without any re-declaration. Here are two examples:
void Date::Init (int d, int m, int y)
{
  day_ = d;
  month_ = m;
  year_ = y;
}

void Date::AddYear (int n)
{
  year_ += n;
}
Note how the class scope is indicated by the class name and the scope
resolution operator and how the member variables are accessed by the function
bodies without declaration.
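Member functions often need more logic than a single assignment. As one possibility, AddMonth might carry overflow into the year so that month_ stays within [1..12]. The sketch below assumes that n is non-negative and that the year is positive; the accessors Month() and Year() are hypothetical additions so the result can be inspected.

```cpp
class Date
{
public:
  void Init     (int d, int m, int y);
  void AddMonth (int n);        // adds n months, assuming n >= 0
  int  Month    () const;       // hypothetical accessor
  int  Year     () const;       // hypothetical accessor
private:
  int day_, month_, year_;
} ;

void Date::Init (int d, int m, int y)
{
  day_ = d;
  month_ = m;
  year_ = y;
}

void Date::AddMonth (int n)
{
  int total = (month_ - 1) + n;  // months counted from 0 = January
  year_  += total / 12;          // whole years carried over
  month_  = total % 12 + 1;      // back to the range [1..12]
}

int Date::Month () const { return month_; }
int Date::Year  () const { return year_; }
```

For example, starting from September 2004 and adding 5 months gives February 2005. The sketch does not adjust day_, so a date such as January 31 would still need attention after adding one month.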
Objects are just variables of user-defined type. (In fact, we often speak of variables of the built-in types as objects, though they are not strictly instances of a class.) Here is a sample code fragment creating objects of several types:
#include <fstream>  // defines classes ifstream and ofstream

int main()
{
  std::ifstream ifs;  // ifs can be used to open files for reading
  std::ofstream ofs;  // ofs can be used to open files for writing
  Date d1, d2;        // d1 and d2 are Date objects
  ...
}
Note that ifs, ofs, d1, and d2 are all objects of various types. We may call them "objects" or "variables" as taste and circumstances encourage. These objects may access the public members of their class using the "dot" notation, as in the following code fragments:
#include <fstream>  // defines classes ifstream and ofstream

int main()
{
  std::ifstream ifs;  // ifs can be used to open files for reading
  std::ofstream ofs;  // ofs can be used to open files for writing
  Date d1, d2;        // d1 and d2 are Date objects
  ...
  ifs.open("file1");   // opens file "file1" for reading
  ofs.open("file2");   // opens file "file2" for writing
  d1.Init(7,9,2004);   // sets the date to Sep 7, 2004
  ...
}
This "dot" notation is part of the legacy of C and the classic struct of that
language. The C++ class is a generalization of the C
struct. Note that "dot" is actually another C++ operator.
Whenever an object is created, for example by a declaration such as the declarations of ifs, ofs, d1, and d2 above, a constructor is called as the object is created. The object doesn't officially exist until these two steps have been completed:
A memory footprint is a contiguous block of memory that is correctly sized for the object: the number of bytes should equal the value returned by sizeof(T), where T is the type (i.e., class) of the object. A constructor is a member function of the class that is prototyped with a special syntax and implemented along with all other member functions of the class. The purpose served by the constructor is to appropriately initialize the object. The constructor syntax is special in two ways: first, constructors have no return type; and second, they are named the same as the name of the class. Here is our class Date with some constructors added:
class Date
{
public:
  Date     ();                     // sets the date to today
  Date     (int d, int m, int y);  // sets the date to arguments
  void AddYear  (int n);           // adds n to the year
  void AddMonth (int n);           // adds n to the month
  void AddDay   (int n);           // adds n to the day
private:
  int day_,    // the day of the month, restricted to [1..31]
      month_,  // the month of the year, restricted to [1..12]
      year_;   // positive = AD, negative = BC, restricted to non-zero
} ;
The two new methods, each named "Date", are constructors. Note that no return type is given. Also note that these are overloads of the same name, so they must be distinguished by their parameter lists, as they are in this example. We could implement these something like the following:
Date::Date (int day, int month, int year)
{
  day_ = day;
  month_ = month;
  year_ = year;
}

Date::Date ()
{
  day_ = today.d;
  month_ = today.m;
  year_ = today.y;
}
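where today stands for a structure holding the current date as reported by the operating system. (The standard C library does not define an object by that name; the current date is normally obtained by calling time() and localtime(), declared in time.h, which deliver the date broken down into the fields of a struct tm.) Note that Date() sets the date to today's date. Thus with these constructors we may now make declarations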
Date d1;
Date d2(11,9,2001);
Then d1 is initialized to today's date, while d2 is initialized to September 11, 2001.
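Standard C++ obtains the current date through std::time and std::localtime, declared in the header ctime. The sketch below shows one way the parameterless constructor might be written with those calls. The accessors Day(), Month(), and Year() are hypothetical additions for inspecting the result, and error handling (localtime can fail) is omitted for brevity.

```cpp
#include <ctime>

class Date
{
public:
  Date ();                       // sets the date to today
  Date (int d, int m, int y);    // sets the date to the arguments
  int Day   () const { return day_; }    // hypothetical accessors
  int Month () const { return month_; }
  int Year  () const { return year_; }
private:
  int day_, month_, year_;
} ;

Date::Date (int d, int m, int y)
{
  day_ = d;
  month_ = m;
  year_ = y;
}

Date::Date ()
{
  std::time_t now   = std::time(nullptr);    // seconds since the epoch
  std::tm*    local = std::localtime(&now);  // broken-down local time
  day_   = local->tm_mday;                   // 1..31
  month_ = local->tm_mon + 1;                // tm_mon is 0..11
  year_  = local->tm_year + 1900;            // tm_year counts from 1900
}
```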
According to [Stroustrup, p. 226], the use of functions such as Init() to provide initialization for class objects is inelegant and error-prone. Most theorists and practitioners of OBP and OOP would agree. Therefore we have removed this function from our class definition and will refrain from introducing such functions in the future. Initialization should be handled exclusively through constructors.
On the other hand, it is a good thing to isolate relatively complicated segments
of code into separate function calls instead of repeating them in several
places. This practice is an aspect of modularity and is highly
encouraged. If, for example, several constructors (and perhaps also an
assignment operator and copy constructor - see the next chapter) all use the
same body of code, that code may be used to define a class method that is in
turn called by each of these. Such helper functions should be made
private so that they are not accessible to client programs.
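A sketch of this idea follows: both constructors delegate to one private helper, here given the hypothetical name Assign, so the range checking is written only once. The fallback date of 1 January 1970 and the simplified checks (every month is allowed 31 days) are assumptions made to keep the example short.

```cpp
class Date
{
public:
  Date ();                        // a fixed date, to keep the sketch self-contained
  Date (int d, int m, int y);
  int Day   () const { return day_; }    // hypothetical accessors
  int Month () const { return month_; }
  int Year  () const { return year_; }
private:
  void Assign (int d, int m, int y);     // helper: hidden from client programs
  int day_, month_, year_;
} ;

Date::Date ()                    { Assign(1, 1, 1970); }
Date::Date (int d, int m, int y) { Assign(d, m, y); }

// Stores the date if it passes the (simplified) range checks,
// and falls back to 1 January 1970 otherwise.
void Date::Assign (int d, int m, int y)
{
  bool ok = d >= 1 && d <= 31 && m >= 1 && m <= 12 && y != 0;
  day_   = ok ? d : 1;
  month_ = ok ? m : 1;
  year_  = ok ? y : 1970;
}
```

Unlike a public Init(), the helper cannot be called by client code, so an object can never be observed before one of the constructors has run it.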
We stated above that a constructor is called whenever an object is created. What happens if there is no constructor defined for the class?
Under that circumstance, a default constructor for the class is created by the compiler. This default constructor does little more than tag the object as belonging to its class (after the memory footprint has been created) and then make sure that the default constructors for the data members of the class are called. The default constructor is "parameterless".
If on the other hand there is any constructor defined for a class, then the compiler no longer supplies a parameterless constructor: once a constructor is defined, the programmer is responsible for defining a parameterless constructor if one is needed. (The copy constructor, discussed in the next chapter, is still supplied by default.) For simplicity, a parameterless constructor, whether defined by the compiler or the class programmer, is often called the "default" constructor for the class. Using this terminology, Date has a default constructor.
A parameterless, or default, constructor is almost always a desired feature of a class. Among other reasons, a default constructor is called by a parameterless declaration. For example:
Date d1;                 // calls the parameterless (default) constructor
Date d2 (11, 9, 2001);   // calls the 3-parameter constructor
The first declaration above would result in a compile error if no parameterless/default constructor had been defined.
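These rules can be checked at compile time with std::is_default_constructible from the header type_traits. The three class names below are hypothetical and exist only for this check.

```cpp
#include <type_traits>

class NoConstructors            // compiler supplies a default constructor
{
  int day_, month_, year_;
} ;

class OnlyThreeArgs             // defining any constructor suppresses it
{
public:
  OnlyThreeArgs (int d, int m, int y) : day_(d), month_(m), year_(y) {}
private:
  int day_, month_, year_;
} ;

class BothConstructors          // the programmer adds it back explicitly
{
public:
  BothConstructors () : day_(1), month_(1), year_(1970) {}
  BothConstructors (int d, int m, int y) : day_(d), month_(m), year_(y) {}
  int Year () const { return year_; }
private:
  int day_, month_, year_;
} ;

static_assert( std::is_default_constructible<NoConstructors>::value,
               "implicit default constructor");
static_assert(!std::is_default_constructible<OnlyThreeArgs>::value,
               "no default constructor");
static_assert( std::is_default_constructible<BothConstructors>::value,
               "user-written default constructor");

// OnlyThreeArgs x;   // would be a compile error: no parameterless constructor
```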
One final point is that constructors may have default parameter values and these may implicitly define a parameterless constructor. For example, in the following version of Date:
class Date
{
public:
  Date (int d = 0, int m = 0, int y = 0);
  void AddYear  (int n);
  void AddMonth (int n);
  void AddDay   (int n);
private:
  int day_, month_, year_;
} ;
the parameters have default value zero. A clever observation makes this work: because 0 is not a valid day, month, or year, the value 0 can be used to trigger a change to today's value:
Date::Date (int d, int m, int y) : day_(d), month_(m), year_(y)
{
  if (day_ == 0) day_ = today.d;
  if (month_ == 0) month_ = today.m;
  if (year_ == 0) year_ = today.y;
}
The single constructor with three default parameter values works with 0, 1, 2,
or 3 arguments and thus replaces separate definitions of 0-, 1-, 2-, and
3-parameter constructors.
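The calls at the end of the sketch below show how one constructor with default arguments serves all four argument counts. To make the example compile on its own, it reads the current date with std::time and std::localtime from the header ctime rather than from today; the accessors are hypothetical additions.

```cpp
#include <ctime>

class Date
{
public:
  Date (int d = 0, int m = 0, int y = 0);   // 0 means "use today's value"
  int Day   () const { return day_; }       // hypothetical accessors
  int Month () const { return month_; }
  int Year  () const { return year_; }
private:
  int day_, month_, year_;
} ;

Date::Date (int d, int m, int y) : day_(d), month_(m), year_(y)
{
  std::time_t now   = std::time(nullptr);
  std::tm*    local = std::localtime(&now);
  if (day_   == 0) day_   = local->tm_mday;
  if (month_ == 0) month_ = local->tm_mon + 1;      // tm_mon is 0..11
  if (year_  == 0) year_  = local->tm_year + 1900;  // tm_year counts from 1900
}

// Date d0;                 // today
// Date d1 (15);            // the 15th of the current month and year
// Date d2 (15, 6);         // 15 June of the current year
// Date d3 (15, 6, 1999);   // 15 June 1999
```

Note that the default values appear only in the declaration inside the class; they are not repeated in the implementation.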
The comma-separated list that is part of the header of the implementation of the constructor above is called an initialization list. The items of the initialization list are specific calls to constructors for the data members of the class.
Date::Date (int d, int m, int y) : day_(d), month_(m), year_(y)
{
  // optional code here also
}
The effect of using an initialization list is the same as performing the initializations in the body of the constructor implementation. However, use of the initialization list is preferred for several reasons, including the following:
Initialization lists can be used for constructors with and
without parameters as well as copy constructors.
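One case where an initialization list is not merely preferred but required is a member that cannot be assigned to after it exists, such as a const member, or a member whose class has no default constructor. The class Appointment below is hypothetical, as is the trimmed-down Date it contains; both serve only to illustrate this.

```cpp
class Date
{
public:
  Date (int d, int m, int y) : day_(d), month_(m), year_(y) {}
  int Year () const { return year_; }
private:
  int day_, month_, year_;
} ;

class Appointment
{
public:
  Appointment (int id, int d, int m, int y);
  int Id   () const { return id_; }
  int Year () const { return when_.Year(); }
private:
  const int id_;    // const: must be initialized, cannot be assigned
  Date      when_;  // this Date has no default constructor
} ;

// Members are initialized in the order they are declared in the class
Appointment::Appointment (int id, int d, int m, int y)
  : id_(id), when_(d, m, y)
{
  // Moving these initializations into the body would not compile:
  // id_ is const, and when_ would first need a default constructor
  // that this Date does not have.
}
```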
C++ class code is usually organized into header files and implementation files, following the practice established by C for functions. The header file contains the class definition, and the implementation file contains the implementations of the class methods. Here then are typical files and contents for the Date class. First the header file:
/*
  file: date.h
  Containing the definition of the class Date
  (other file documentation)
*/

class Date
{
public:
  Date     ();                     // sets the date to today
  Date     (int d, int m, int y);  // sets the date to arguments
  void AddYear  (int n);           // adds n to the year
  void AddMonth (int n);           // adds n to the month
  void AddDay   (int n);           // adds n to the day
private:
  int day_,    // the day of the month, restricted to [1..31]
      month_,  // the month of the year, restricted to [1..12]
      year_;   // positive = AD, negative = BC, restricted to non-zero
} ;
Then the implementation file:
/*
  file: date.cpp
  Implementations of methods of the Date class
  (other file documentation)
*/

#include "date.h"   // defines the class Date
#include <time.h>   // C time library; today below stands in for the current date

Date::Date (int d, int m, int y) : day_(d), month_(m), year_(y)
{
  // empty body - used initialization list
}

Date::Date () : day_(today.d), month_(today.m), year_(today.y)
{
  // empty body - used initialization list
}

void Date::AddYear (int n)
{
  year_ += n;
}

void Date::AddMonth (int n)
{
  month_ += n;
}

void Date::AddDay (int n)
{
  day_ += n;
}
Just as with functions, the implementation file can be separately compiled to object code. Note that the implementation file requires inclusion of the header file so that the class Date, both as a scope and with its member variables, is defined. Code libraries typically pre-compile the implementation files and make the header files available to the compiler.
As with functions, there are exceptions where both the class definition and member function implementations are placed in the same file. Most of these exceptions are template files, which are discussed in a future chapter.
Note that we have used initialization lists to initialize class
variables in the constructors. This is best practice, replacing initialization
in the constructor body.
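One detail worth adding to a header file in practice is an include guard, which keeps the class from being defined twice when the header is included from several places. The sketch below shows the two files in one listing so that it compiles as a single unit; the macro name DATE_H is a common convention assumed here, Year() is a hypothetical accessor, and the class is trimmed to one constructor and one method.

```cpp
// ----- date.h -----
#ifndef DATE_H
#define DATE_H

class Date
{
public:
  Date (int d, int m, int y);   // sets the date to arguments
  void AddYear (int n);         // adds n to the year
  int  Year    () const;        // hypothetical accessor
private:
  int day_, month_, year_;
} ;

#endif  // DATE_H

// ----- date.cpp -----
// #include "date.h"   // quotes: look in the project directory first

Date::Date (int d, int m, int y) : day_(d), month_(m), year_(y)
{
  // empty body - used initialization list
}

void Date::AddYear (int n)
{
  year_ += n;
}

int Date::Year () const
{
  return year_;
}
```

With the files separated, the implementation can be compiled on its own, for example with g++ -c date.cpp, producing the object file date.o.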
The C structure, keyword struct, served as the basis for the C++ class. In C, struct defines only a way to organize data: structures have no member functions and all data in a struct is public. In C++, the concept of structure (keyword struct) also exists. However, in C++, structures may have all of the features of classes, including member data and member functions. In fact, the only distinction between C++ struct and C++ class is that the default access for struct is public, whereas the default access for class is private. In particular, structures in legacy C code may be used with the same meaning in C++. As an example, the following two declarations are exactly equivalent:
struct Widget
{
  Widget ();
  void Func1 ();
  int  Func2 ();
private:
  char Func3 ();
  int  data1_;
  char data2_;
};

class Widget
{
  char Func3 ();
  int  data1_;
  char data2_;
public:
  Widget ();
  void Func1 ();
  int  Func2 ();
};
Either way it is defined, Widget has a public constructor, public member functions Func1 and Func2, a private member function Func3, and two private data members data1_ and data2_.
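The default-access rule can also be seen in a very small sketch. PointS and PointC are hypothetical names; the commented-out lines show what client code may and may not do.

```cpp
struct PointS
{
  int x, y;        // no label: public by default in a struct
} ;

class PointC
{
  int x_, y_;      // no label: private by default in a class
public:
  void Set (int x, int y) { x_ = x; y_ = y; }
  int  X   () const { return x_; }
  int  Y   () const { return y_; }
} ;

// PointS p;  p.x = 3;       // OK: x is public
// PointC q;  q.x_ = 3;      // error: x_ is private
// PointC q;  q.Set(3, 4);   // OK: Set is public
```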
In practice, it is best
to use the keyword class when defining classes in C++. Some
professionals enjoy using struct under special circumstances (when the
class is identical to a C struct) but even this practice is
discouraged. It does little except to let the reader know that "hey - I'm an old
C programmer, and I use this new-fangled C++ just because it is better".