C++ Program Structure

A C++ program must adhere to certain structural constraints.

The #include "..." form is not recommended for the following reason: using "" builds an assumption about the location of files into the source code file itself. Using <> makes the project structure independent of the location of files, leaving location issues to the project build, where they are handled with the -I compile option. It is very common for project files to be developed in one directory and then migrated into various directories after development. Using <> keeps the code files independent of changes in file location, whereas using "" means the files would have to be edited when their locations change.

Native Data Types

Native (aka "built in" or "atomic") data types are the types defined by the C++ language.

The sizes of the various types are implementation dependent, with some constraints. The size of char is always one byte (sizeof(char) is 1 by definition), and the sizes must be non-decreasing in the order char, short, int, long for the integer types and float, double, long double for the floating point types. To see what a particular installation uses, run this program:

#include <iostream>
int main()
{
   std::cout << "Size of bool           = " << sizeof(bool)   << " bytes\n\n";

   std::cout << "Size of char           = " << sizeof(char)   << " bytes\n";
   std::cout << "Size of short          = " << sizeof(short)  << " bytes\n";
   std::cout << "Size of int            = " << sizeof(int)    << " bytes\n";
   std::cout << "Size of long           = " << sizeof(long)   << " bytes\n\n";

   std::cout << "Size of unsigned char  = " << sizeof(unsigned char)   << " bytes\n";
   std::cout << "Size of unsigned short = " << sizeof(unsigned short)  << " bytes\n";
   std::cout << "Size of unsigned int   = " << sizeof(unsigned int)    << " bytes\n";
   std::cout << "Size of unsigned long  = " << sizeof(unsigned long)   << " bytes\n\n";

   std::cout << "Size of float          = " << sizeof(float)  << " bytes\n";
   std::cout << "Size of double         = " << sizeof(double) << " bytes\n";
   std::cout << "Size of long double    = " << sizeof(long double) << " bytes\n";
   return 0;
}

The sizeof operator may be applied to user-defined types as well as native types. It also applies to variables and other expressions. The result is in units of bytes.

Declared Variable Attributes

Every declared variable has the following attributes:

The declaration of variables is discussed later in this chapter.

Declared variables are static, meaning that (1) they have names that are fixed and determined at the time and place they are declared and (2) there is memory bound to the variable at the time the program is compiled. We say that these variables are bound to memory at compile time. When a program is compiled, a symbol table is created mapping the static variable name to its type, size, address and other attributes. Another kind of variable is bound at run time and called dynamic. These are discussed in another chapter.

Naming Variables

The names of variables (and other identifiers chosen by the programmer, such as names of constants, classes, and types) are subject to constraints:

Here is a list of reserved keywords in C++:

asm               auto              bool              break
case              catch             char              class
const             const_cast        continue          default
delete            do                double            dynamic_cast
else              enum              explicit          export
extern            false             float             for
friend            goto              if                inline
int               long              mutable           namespace
new               operator          private           protected
public            register          reinterpret_cast  return
short             signed            sizeof            static
static_cast       struct            switch            template
this              throw             true              try
typedef           typeid            typename          union
unsigned          using             virtual           void
volatile          wchar_t           while

Note that other words are defined in standard libraries and are therefore reserved implicitly. Examples include size_t (defined in <cstddef> and several other standard headers) and the library class names type_info, bad_cast, and bad_typeid (defined in <typeinfo>). Note also that wchar_t, which is a library-defined type in C, is a keyword in C++.

Declaring Variables

The basic format for declaration is:

typeName variableName;

Note that the left-hand item is a type, and must be understood as such by the compiler. The right-hand item is the identifier chosen by the programmer. The following are some examples of declarations:

int x;
float y;
int a, b, c;        // can also list several variables in one declaration 
                    //  statement, separated by commas
int a=0, b=0, c;    // can also initialize variables in 
                    //  declaration statements
int a(0), b(0), c;  // alternate syntax for initialization

A variable can be declared constant by using the keyword const; a constant must be initialized in the same statement, as in the following example:

const double PI = 3.14159;

Here are more examples:

int main()
{
   int x;
   float average;
   char letter;
   int y = 0;
   int bugs, daffy = 3, sam;
   double larry = 2.4, moe = 6, curly;
   char c, ch = 't', option;
   long a;
   long double ld;
   unsigned int u = 12;
}

Literals


Comments

A comment in a program is a portion that is ignored by the compiler. The very important purpose of comments is to serve the humans who create, read, and maintain the code. The C block-style technique of commenting carries over to C++, i.e., comments can be enclosed in delimiters /* (comment here) */. This form of comment is useful for multiple lines of commentary.

/* This is a comment.
   It can stretch over
   several lines.  */ 

C++ allows an additional comment style, designed for shorter remarks embedded in code but suitable for only one line of comment:

int x;    // This is a comment that ends at the end of this line
x = 3;    // This is a comment that ends at the end of this line

Everything from the double slash // to the end of the line is a comment.

Operators

Operators are functions with special evaluation syntax. It's a good idea to keep in mind that operators are functions, because the function evaluation syntax must be used when re-defining operators, a topic we will study in this course.

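Here is an example of the familiar addition operator used with operator syntax and, for comparison, operator function syntax:

int x, y, z;          // declare three int variables
...                   // code that gives x and y each a value
z = x + y;            // operator syntax
z = operator+(x,y);   // operator function syntax (conceptual for native types)

Conceptually, the last two lines of this code have identical behavior: z is assigned a value equal to the sum of x and y. Note, however, that a compiler accepts the explicit operator function form only for operators that have been overloaded for user-defined types. For native types such as int, the last line illustrates the idea but will not compile.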

A unary operator is an operator function that has one operand. A binary operator is an operator function that has two operands. A ternary operator is an operator function that has three operands. With the exception of the "conditional expression" operator expr ? expr : expr inherited from C, all C++ operators are either unary or binary.

C++ has a very rich set of operators. (There are 67 C++ operators listed on pp. 120-121 of [Stroustrup], with 18 levels of precedence.) We will discuss only a few here. Others will be introduced as they are needed.

Arithmetic Operators

The basic arithmetic operators are defined as native for integer types as follows:

Name       Symbol   Type     Usage
Add        +        binary   x + y
Subtract   -        binary   x - y
Multiply   *        binary   x * y
Divide     /        binary   x / y
Modulo     %        binary   x % y
Minus      -        unary    -x

These operators perform arithmetic as expected for integers. Note that division x/y produces the quotient and modulo x%y produces the remainder when x is divided by y. All but modulo are overloaded for floating point types and have meaning as expected in that context. Here is an illustration:

int   p = 23, q = 5, r;
float x = 23, y = 5, z;
r = p / q; // r has the value 4
r = p % q; // r has the value 3
z = x / y; // z has the value 4.6

When doing an arithmetic operation on two operands of the same type, the result is also the same type. What about operations on mixed types? Suppose we have the declarations

int x = 5; 
float y = 3.6; 

If we do (x + y), what type will the result have? In this case, a float. The rule is: in an arithmetic operation on two mixed types, the operand of the smaller type is converted to the larger type, and the result has the larger type. The result of (x + y) will be 8.6.

Please keep in mind that we are glossing over the issue of internal representation. A computer uses binary representation for internal storage of numbers. We are using decimal representation in this discussion. Decimal notation is only an external representation for human use. This is what we would see if we output the values to screen.

Operator Precedence

As noted above, there are 18 levels of operator precedence in C++, way beyond the scope of this review. (We will encounter many of these operators during this course, but not all.) This is both very rich and extraordinarily complicated to remember. The best rule is: When in doubt, use parentheses to force the order of operator evaluation.

For the more familiar and oft-used operators, however, it is easy to remember their syntax, precedence, and associativity. Here is a table of most of the C++ operators:

Common C++ Operators, Grouped by Precedence (High to Low)

Name                              Usage

binary scope resolution           class_name :: member
binary scope resolution           namespace_name :: member
unary (global) scope resolution   :: name

value construction                type ( expr )
run-time checked conversion       dynamic_cast<type> ( expr )
compile-time checked conversion   static_cast<type> ( expr )
unchecked conversion              reinterpret_cast<type> ( expr )
const conversion                  const_cast<type> ( expr )
post increment                    Lvalue++
post decrement                    Lvalue--
member selection                  object.member
member selection                  pointer->member
bracket operator                  pointer [ expression ]
function call                     function ( parameter list )

size of object                    sizeof expression
size of type                      sizeof ( type )
pre increment                     ++Lvalue
pre decrement                     --Lvalue
not                               ! expression
unary minus                       - expression
address of                        & lvalue
dereference                       * pointer
create                            new type
destroy                           delete pointer

member selection                  object.*pointer-to-member
member selection                  pointer->*pointer-to-member

multiply                          expr * expr
divide                            expr / expr
modulo                            expr % expr

add                               expr + expr
subtract                          expr - expr

shift left                        expr << expr
shift right                       expr >> expr

less than                         expr < expr
less than or equal                expr <= expr
greater than                      expr > expr
greater than or equal             expr >= expr

equal                             expr == expr
not equal                         expr != expr

bitwise AND                       expr & expr

bitwise XOR                       expr ^ expr

bitwise OR                        expr | expr

logical AND                       expr && expr

logical OR                        expr || expr

conditional expression            expr ? expr : expr

assignment                        lvalue = expr
multiply and assign               lvalue *= expr
divide and assign                 lvalue /= expr
modulo and assign                 lvalue %= expr
add and assign                    lvalue += expr
subtract and assign               lvalue -= expr
shift left and assign             lvalue <<= expr
shift right and assign            lvalue >>= expr
bitwise AND and assign            lvalue &= expr
bitwise OR and assign             lvalue |= expr
bitwise XOR and assign            lvalue ^= expr

throw exception                   throw expr

comma (sequencing)                expr , expr

Note that the arithmetic operators have relative precedence in the language that follows normal mathematical usage. Note also that assignment (and its embellishments) have very low precedence, so that in a statement such as

x = a + b * c;

the evaluation is as you would hope and expect, namely x is assigned the value a + ( b * c ). There are surprises lurking in all this complexity, however, so remember the "when in doubt" rule.

Increment and Decrement

C++ has a number of unary operators. Among the most used are the four increment/decrement operators:

int x,y;
...
++x;   // prefix increment   same as x = x + 1; returns reference to (new) x
x++;   // postfix increment  same as x = x + 1; returns value of old x
--x;   // prefix decrement   same as x = x - 1; returns reference to (new) x
x--;   // postfix decrement  same as x = x - 1; returns value of old x

Note the distinction between the pre- and post- versions. The prefix returns a reference to the (newly updated) variable. The postfix returns the value of the variable before updating it. The behaviors are illustrated in this code example:

x = 2;
y = ++x;    // x and y have the value 3
y = x++;    // y has the value 3 and x has the value 4

Because the postfix versions of increment/decrement must build and return a value, they are slightly less efficient than the prefix versions. It is therefore good practice to use the prefix versions unless there is a specific need for postfix.

Operator Associativity

Each operator has a default associativity used when an otherwise ambiguous expression is formed. For example, the statement

sum = x + y + z;

is technically ambiguous, because operator+( , ) requires exactly two arguments, and there are three in the statement. The default associativity takes over in such situations to provide consistent meaning:

sum = x + y + z;    
sum = (x + y) + z;  // statement meaning identical to first

That is, first x and y are added, then z is added to the result. For this operator, default associativity is left-to-right, or LR. Most binary operators have LR default associativity. A notable exception is the assignment operator = which associates right-to-left (RL):

a = b = c;    // valid statement
a = (b = c);  // identical meaning

That is, first c is assigned to b, then b is assigned to a. The result is that all three variables a, b, and c have the same value.

Assignment and Equality Operators

The symbol = has ambiguous meaning in algebra, sometimes asserting that two things have the same value (as in "let x = y"), and other times asking the question whether two things have the same value (as in "solve x = y"). These two usages must be separated in a programming language. The first is assignment and is done in C/C++ with operator =. The second is equality and is done in C/C++ with operator ==. These two operators are very different in meaning and usage. Unfortunately, they are very similar in appearance, which can cause problems debugging programs when they are inadvertently interchanged.

Assignment is an operator that first, as a side effect, makes its Lvalue (the operand on its left) equal to its Rvalue (the operand on its right), and second returns a reference to the (new) Lvalue. Here are some example usages:

x = 5;       // x is assigned the value 5
y = 10;      // y is assigned the value 10
z = x + y;   // z is assigned the value of the expression x + y,
             //  which is evaluated first, obtaining 15, which is assigned to z

Equality is an operator that returns "true" or "false" (either a boolean value or an integer), depending on whether the arguments are in fact equal or not. Equality is commonly used to test for conditional branching or loop termination:

if ( x == y )         // conditionally execute one of two statements
  z = 1;
else
  z = 2;

do 
{
  whatever();
}
while (x == 100);   // conditionally terminate loop

There is an entire family of assignment operator derivatives, such as operator += and operator &=. There is another family of equality/inequality operators such as operator < and operator >=.

Implicit Type Conversion

Whenever a variable of an unexpected type is used in an expression, either the ambiguities must be resolved by implicit, or "automatic" type conversion, or an error will occur. The general rule is that when there is a known rule for converting the unexpected type to the expected type, that rule will be invoked and computation can proceed.

For example, when the types on the left and right sides of an assignment statement do not match, that is, the Rvalue and Lvalue have different types, the assignment statement is allowed to proceed if and only if there is a way provided to convert from the Rvalue type to the Lvalue type. For native arithmetic types, conversion from a smaller type to a larger type is safe. Conversion from a larger type to a smaller type is also performed implicitly, but it may silently lose information, and a compiler may or may not issue a warning. (Care must also be taken when converting between signed and unsigned types of the same size.) Similarly, when mixed types appear in an arithmetic expression, the operands are converted to the largest type appearing in the expression:

char          a, b;
int           m, n;
float         x, y;
unsigned int  u, v;
m = a;  // OK
a = m;  // dangerous - compiles, but the value of m may not fit in a char
x = m;  // OK
u = m;  // dangerous - possibly no warning
x = a + n; // result of (a + n) is type int; converted to type float for assignment

Type conversion is another place where a "when in doubt" rule should be used: When in doubt, make type conversions explicit.

Explicit Type Conversion: Casting

It is excellent practice to always explicitly convert types rather than rely on the compiler, which may not always know the programmer's intent. Most experienced programmers use implicit type conversion only within one of these two families of native types:

Signed Family = {char, short, int, long, float, double, long double}
Unsigned Family = {unsigned char, unsigned short, unsigned int, unsigned long}

and otherwise use explicit type conversion in the form of cast operators. The C cast operator is invoked like this:

c = (char)y;   // cast a copy of the value of y as a char, and assign to c
x = (int)b;    // cast a copy of the value of b as an int, and assign to x

Standard C++ still accepts the C cast operator, but C++ has a richer set of casting operators that give the programmer better control of how and when the type conversion occurs. The analog of C casting is the operator static_cast<type_name>(expr), which converts the value of expr to type type_name. This new style cast is invoked like this:

c = static_cast<char>(y); 
x = static_cast<int>(b); 

There are three other C++ cast operators: dynamic_cast<type_name>(expr), reinterpret_cast<type_name>(expr), and const_cast<type_name>(expr). The angle brackets in these operators follow the syntax used for template arguments, a topic we will cover later in the course.

Scope

The scope of a variable is the portion of the source code where the variable is valid, or "visible", to the computational context. Scope is determined implicitly by program structure.

A variable that is declared outside of any compound blocks and is usable anywhere in the file from its point of declaration is called a global variable and is said to have global scope. A variable declared within a block (i.e. a compound statement) has scope only within that block.

C++ allows the declaration of variables anywhere within a program, subject to the declare before use rule. C (before the C99 standard) requires variable declarations at the beginning of a block. Here is code illustrating the scope of four variables:

                                //   scopes:
                                //  x  i  j  k
float x;                        //  |
int main()                      //  |
{                               //  |
  int i;                        //  |  |
  for (int j = 0; j < 100; ++j) //  |  |  |
  {                             //  |  |  |
    std::cin >> i;              //  |  |  |
    int k = i;                  //  |  |  |  |
    // more code                //  |  |  |  |
  }                             //  |  |
  return 0;                     //  |  |
}                               //  |

Note that x is global, while i, j, and k are local. The following code is more subtle.

#include <iostream>
int main()
{
  std::cout << "\nStarting Program\n";
  {
    int x = 5;   // declare new variable
    std::cout << "x = " << x << '\n';
    {
      int x = 8;
      std::cout << "x = " << x << '\n';
    }
    std::cout << "x = " << x << '\n';
    {
      x = 3;
    }
    std::cout << "x = " << x << '\n';
  }
}

You can run the program to be sure you understand how scope rules affect the values of the variables.

Namespaces

Namespaces are also used to limit the scope of identifiers. A namespace is created using the namespace keyword as follows:

// filename: mystuff.h
namespace mystuff
{
  // declare, define things here, such as
  int myfunction (int x)
  {
    // code here
  }
} // end namespace mystuff

Any declarations and definitions made within the namespace block will not be in the global namespace, but will be in the namespace mystuff. Thus statements using items from mystuff must "resolve" the scope with a scope resolution operator ::, as in this code:

#include <mystuff.h>
int x, y;
x = myfunction(y);          // error - unrecognized identifier
x = mystuff::myfunction(y); // OK - namespace resolved

Implicit namespace resolution may be used by invoking a using directive, as follows:

#include <mystuff.h>
using namespace mystuff;    // makes the names in mystuff visible without qualification
int x, y;
x = myfunction(y);          // OK
x = mystuff::myfunction(y); // OK

A namespace may be opened and added to in several places, which enhances the convenience of their use. For example, several different files could add items to the mystuff namespace. Namespaces are extremely useful when more than one person is working on code for a single project, a situation quite common in the professional world. Using namespaces can prevent name conflicts and ambiguities created when two programmers (or one programmer at two different times) happen to use the same identifier for different purposes.

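In general, global variables should be avoided, because names in the global namespace are visible everywhere and can collide. A using directive reintroduces the same risk by making every name in a namespace visible without qualification. For this reason, the using directive should also be avoided in this course.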

Storage Class and Linkage

The storage class of a variable determines the period during which it exists in memory. (Note that this period must be at least as long as the variable is in scope, but it may exceed that time.) There are two storage classes: automatic and static. Confusingly, C++ provides five storage class specifiers which determine not just the storage class but also other things such as linkage and how the variable is treated by various components of a running program.

The storage class specifiers are as follows:

Clearly these specifiers affect the storage class, the linkage, and in some cases the scope of the variable.

C++ Standard I/O

In C, I/O (Input/Output) is handled with functions found in stdio.h: printf and scanf. These are examples of "formatted" I/O statements and assume a certain file-oriented format.

In C++, I/O is handled through objects called streams. This is a very useful, flexible, and programmer-modifiable system; it is also somewhat complex. We will begin using two streams cin and cout that are pre-defined in the library iostream. cin is an object of type istream and cout is an object of type ostream. These stream objects reside in the namespace std. cin is typically bound to the keyboard and cout is typically bound to the screen, although files can be substituted using re-direction.

To use these stream objects:

  1. Include the library where they are defined: #include<iostream>
  2. Resolve the namespace where they are defined: std::cin and std::cout
  3. Typically work with the input and output operators >> and <<, respectively

/* example use of cout with output operator */
#include <iostream>           // includes library
std::cout << "Hello World";   // sends string "Hello World" to screen
std::cout << 'a';             // sends character 'a' to screen
std::cout << x << y << z;     // sends values of x, y, and z to screen, in that order

/* example use of cin with input operator */
#include <iostream>           // includes library
std::cin >> x;                // read entered value into x
std::cin >> a >> b >> c;      // read three entered values to a, b, c (in that order)

Note that literals cannot be used on the right side of the input operator. The input operator reads data IN to a variable location: The right side of the operator must specify an Lvalue.

The input operator is sometimes called the extraction operator - it "extracts" data from the input stream object and puts it into the variable. Similarly, the output operator is sometimes called the insertion operator - it "inserts" a copy of the data into the output stream object. Some people find this terminology somewhat convoluted and prefer "input/output" to "extraction/insertion".

Notes on Archaic Code

C++ was officially standardized in 1998, so some older code will not reflect some of the newer features. A few worth noting:

The compilers in the FSU CS student computing environment are versions of the gnu g++ compiler. They are in fairly good compliance with the C++ standard. There is also an active standards group that will soon adopt a revised standard for C++. This will present yet another target for textbook and compiler writers as this rich language evolves.

References