Copyright R.A. van Engelen, FSU Department of Computer Science, 2000-2003

Background Information and Glossary




Programming languages:
Programming languages are central to Computer Science. They reflect many aspects of Computer Science in a nutshell, such as language syntax, programming language semantics (meaning), application of theorem proving (for type checking and inference), abstract and virtual machines, data structures, software engineering, computer architecture and hardware issues, etc.

Programming languages follow simple syntactic conventions (as opposed to natural languages), see also BNF syntax. Programming languages are compiled into machine code (also known as object code) by a compiler and linker, or interpreted by an interpreter, or executed by a hybrid compiler/interpreter.

Programming languages can be classified as imperative or declarative. This classification is further subdivided as follows:

declarative: implicit solution, "what the computer should do"
    functional (e.g. Lisp, Scheme, ML, Haskell)
    logic (e.g. Prolog)
    dataflow
imperative: explicit solution, "how the computer should do it"
    procedural ("von Neumann", e.g. Fortran, Pascal, Basic, C)
    object-oriented (e.g. Smalltalk, Eiffel, C++, C#, Java)


Imperative programming languages:
Programs written in imperative programming languages describe exactly the computational steps necessary for the computer to obtain a result. In contrast, declarative languages allow a programming problem to be stated without certain explicit details by which the calculation should proceed. Imperative languages are procedural languages (e.g. Fortran, Pascal, Basic, C) and (most) object-oriented languages (Smalltalk, Eiffel, C++, C#, Java).


Declarative programming languages:
Programs written in a declarative programming language lack explicit details by which the calculation should proceed. Rather, a program is written in a style that assumes a more implicit execution ordering. Typically, recursion is used in declarative programming, possibly in combination with higher-order functions. Declarative languages are the functional languages (e.g. Lisp, Scheme, ML, Haskell), logic languages (e.g. Prolog), and dataflow languages.


Functional programming languages:
The underlying machinery of functional programming languages is based on Church's lambda calculus. The computational model is based on recursive functions, and a program is considered a function that maps inputs to outputs. Through the process of top-down refinement, a program is defined in terms of simpler functions.

Example languages in this category are Lisp, Scheme, ML, and Haskell.

Functional program example (Haskell):

import Prelude hiding (gcd)  -- the Prelude already defines a gcd function

gcd a b
  | a == b = a
  | a >  b = gcd (a-b) b
  | a <  b = gcd a (b-a)

Dataflow programming languages:
Dataflow programming languages model computation as the flow of information among primitive functional nodes. An example of this model of computation is a spreadsheet program. Each cell can be viewed as a primitive computational unit that communicates with other cells to obtain the values its formula needs to calculate the value displayed in the cell.
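
The spreadsheet view of dataflow can be sketched in C. This is only an illustration under stated assumptions: the cell type, the two-input formulas, and the cell names A1 to A4 are hypothetical and not taken from any spreadsheet program. Each formula cell obtains its value by asking its input cells for theirs.

```c
#include <stddef.h>

/* A cell is either a constant or a formula over two other cells. */
typedef struct cell cell;
struct cell {
    double (*formula)(double, double); /* NULL for a constant cell */
    const cell *left, *right;          /* input cells of a formula */
    double constant;                   /* value of a constant cell */
};

static double add(double x, double y) { return x + y; }
static double mul(double x, double y) { return x * y; }

/* Demand-driven evaluation: a cell pulls the values of its inputs. */
double cell_value(const cell *c)
{
    if (c->formula == NULL)
        return c->constant;
    return c->formula(cell_value(c->left), cell_value(c->right));
}

/* Hypothetical sheet: A1 = 2, A2 = 3, A3 = A1 + A2, A4 = A3 * A1. */
double example_sheet(void)
{
    cell a1 = { NULL, NULL, NULL, 2.0 };
    cell a2 = { NULL, NULL, NULL, 3.0 };
    cell a3 = { add, &a1, &a2, 0.0 };
    cell a4 = { mul, &a3, &a1, 0.0 };
    return cell_value(&a4);            /* (2 + 3) * 2 */
}
```

No statement order is prescribed for the cells; the order of evaluation follows from the data dependencies alone.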


Logic programming languages:
Logic programming languages are declarative languages that derive results by logical inference. Prolog, for example, is based on predicate logic (more precisely, Horn clauses). The computational model consists of an inference process on a database of facts and rules to find values that satisfy certain constraints and relationships.

Logic program example (Prolog):

gcd(A, A, A).  % note: if the first two arguments are the same, the third argument (GCD) is A
gcd(A, B, G) :- A > B, N is A-B, gcd(N, B, G).
gcd(A, B, G) :- A < B, N is B-A, gcd(A, N, G).

Procedural ("von Neumann") programming languages:
Although object-oriented programming is gaining popularity, the procedural languages are still the most familiar and successful languages. The basic mode of operation is the modification of variables, which is sometimes referred to as computing via side effects: procedural languages are based on statements that influence subsequent computation by changing the value of memory. The success of these languages can mainly be attributed to the efficiency of their implementation on current computer architectures, called von Neumann architectures. This common architecture has a central processing unit (CPU) and memory, which are connected by a bus.

Example procedural languages are Fortran 77, Basic, Pascal, Ada, and C.

Procedural program example (C):

int gcd(int a, int b)
{ while (a != b)
    if (a > b) a = a-b; else b = b-a;
  return a;
}

Object-oriented programming languages:
Most object-oriented programming languages are closely related to the procedural languages. The fundamental difference in programming style, which is known as object-oriented programming (OOP), is that object-oriented languages put objects and their interactions at the forefront rather than computation as the operation of a processor on a monolithic memory. Each object has an internal state and executable functions to manage that state.

Example object-oriented languages are Smalltalk, Eiffel, C++, C#, and Java.


Safe programming languages:
Strong typing is considered the most important safety issue of a programming language, as typing errors are always detected. Other safety issues concern the use of a programming language on the Internet, such as in Java and C#, which have elaborate authentication schemes. Languages such as C and C++ are not safe, because the compiler and runtime environment cannot guarantee type safety. For example, pointer casts can be used to change the type of the data pointed to without actually converting the data.

Here is an example of a safe cast to convert an integer to a float:

int n = 5;
float f = (float)n;
This is an example of an unsafe cast, which is prohibited in type-safe languages:
int n = 5;
float *fp = (float*)&n;
Such casts, whether explicit as above or implicit, can lead to disaster. Type-safe languages usually do not support pointer arithmetic to prevent accessing data of the "wrong" type.
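
The difference between the two casts can be made visible with a small C sketch. The function names are hypothetical, and the sketch assumes that int and float have the same size (32 bits on most current machines). Because dereferencing the cast pointer is undefined behavior in ISO C, the reinterpretation is done with memcpy, which copies the same bytes.

```c
#include <string.h>

/* Value conversion: the integer 5 becomes the float 5.0. */
float convert(int n)
{
    return (float)n;
}

/* Reinterpretation: the bytes of n are read as if they were a float,
   which is what reading through the cast pointer amounts to. */
float reinterpret(int n)
{
    float f = 0.0f;
    memcpy(&f, &n, sizeof f < sizeof n ? sizeof f : sizeof n);
    return f;
}
```

Under these assumptions convert(5) yields 5.0, while reinterpret(5) yields whatever float happens to share the bit pattern of the integer 5, which is not 5.0.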


Strong typing:
In a strongly typed programming language typing errors are always detected. The detection can be at compile time or at run time. A strongly typed language is considered safer, because it prevents operations from being applied to the wrong type of object, which can cause unintended modifications to the state of the program. For example, Ada, Java, and Haskell are strongly typed languages. C and C++ are not, e.g. because void pointers can point to any type of object that can be manipulated, see the example under Safe programming languages. Pascal is "almost" strongly typed. The exception is the use of a variant record (union) without discriminator. The variant record can hold alternative types of objects during the execution of a program.
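
The role of the discriminator can be sketched in C, where a union plays the part of the variant record. The type and function names below are hypothetical. The tag field records which alternative is currently stored, so that an access of the wrong type can be detected at run time. Note that C itself does not enforce the check; it only works if all accesses go through such functions.

```c
typedef enum { HOLDS_INT, HOLDS_FLOAT } tag;

/* A variant with a discriminator ("tagged union"). */
typedef struct {
    tag kind;                        /* records which member is valid */
    union { int i; float f; } as;
} variant;

variant from_int(int i)     { variant v; v.kind = HOLDS_INT;   v.as.i = i; return v; }
variant from_float(float f) { variant v; v.kind = HOLDS_FLOAT; v.as.f = f; return v; }

/* Checked access: yields the stored int, or the fallback value if the
   variant currently holds something else. */
int int_or(variant v, int fallback)
{
    return v.kind == HOLDS_INT ? v.as.i : fallback;
}
```

Without the kind field, nothing would stop a program from storing a float and reading it back as an int, which is exactly the loophole of the variant record without discriminator.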



Relocatable:
When machine code is relocatable in memory it means that the code can be moved from one location to another to make room for new or modified routines in memory. Relative addressing is used in the code and/or the absolute addresses in the code are converted before the code is executed. This conversion can take place during the loading of an executable program in memory by a loader.
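
The conversion of absolute addresses can be sketched as follows. This is a simplified illustration, not the format of any real loader: it assumes the code was linked as if it started at address 0 and comes with a hypothetical list (fixups) of the words that hold absolute addresses.

```c
#include <stddef.h>
#include <stdint.h>

/* Add the load address to every word of the image that holds an
   absolute address. fixups[] lists the indices of those words. */
void relocate(uint32_t *image, const size_t *fixups, size_t nfixups,
              uint32_t load_address)
{
    for (size_t k = 0; k < nfixups; k++)
        image[fixups[k]] += load_address;
}

/* Hypothetical three-word image linked for address 0; word 1 holds an
   absolute address. Returns that word after loading at 0x4000. */
uint32_t example_relocation(void)
{
    uint32_t image[3]  = { 0x11111111u, 0x00000008u, 0x22222222u };
    size_t   fixups[1] = { 1 };
    relocate(image, fixups, 1, 0x4000u);
    return image[1];
}
```

In the example the address 0x8 becomes 0x4008, while the other two words are left untouched.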



Machine code or object code:
Machine code (also known as object code) consists of machine-specific operations expressed in binary code. A central processing unit (CPU) of a computer executes the binary machine code, which is typically fetched from the main memory of a machine. An executable program consists of object code, which contains a sequence of machine instructions. An executable program is loaded (sometimes with a loader and/or by an operating system (OS)) into main memory for execution. See also assembly language. Assembly is translated into machine code (object code) by an assembler.



Assembler:
Translator of assembly programs (mnemonic instructions) to machine code (or object code).



Assembly language and machine/object code:
An assembly language is a processor-specific language that uses mnemonic abbreviations to define low-level machine instructions. The mnemonic abbreviations are translated into machine code (also called object code) by an assembler.

The abbreviations usually consist of the name for the instruction followed by operands, which are register names such as 'sp' (stack pointer) and 'a0' (argument register 0), memory references such as 'A' (local label) and 'putint' (function label), and memory offsets such as '20(sp)' (20 bytes/words from the location pointed to by the stack pointer).

Example MIPS assembly program to compute GCD (from textbook page 1):

   addiu    sp,sp,-32
   sw       ra,20(sp)
   jal      getint
   nop
   jal      getint
   sw       v0,28(sp)
   lw       a0,28(sp)
   move     v1,v0
   beq      a0,v0,D
   slt      at,v1,a0
A: beq      at,zero,B
   nop
   b        C
   subu     a0,a0,v1
B: subu     v1,v1,a0
C: bne      a0,v1,A
   slt      at,v1,a0
D: jal      putint
   nop
   lw       ra,20(sp)
   addiu    sp,sp,32
   jr       ra
   move     v0,zero
Example MIPS R4000 machine code of the above assembly program (from textbook page 1):
27bdffd0 afbf0014 0c1002a8 00000000 0c1002a8 afa2001c 8fa4001c
00401825 10820008 0064082a 10200003 00000000 10000002 00832023
00641823 1483fffa 0064082a 0c1002b2 00000000 8fbf0014 27bd0020
03e00008 00001025
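
How an assembler arrives at such words can be sketched for one MIPS instruction format. The function below is a hypothetical helper, not part of any real assembler; it packs the fields of an I-type instruction (opcode, two register numbers, and a 16-bit immediate) into a 32-bit word. The opcode and register numbers are the standard MIPS values.

```c
#include <stdint.h>

enum { SP = 29, RA = 31 };                 /* MIPS register numbers */
enum { ADDIU = 9, LW = 35, SW = 43 };      /* MIPS I-type opcodes   */

/* I-type layout: opcode (6 bits), rs (5), rt (5), immediate (16). */
uint32_t encode_itype(unsigned opcode, unsigned rs, unsigned rt, int imm)
{
    return ((uint32_t)opcode << 26) | ((uint32_t)rs << 21)
         | ((uint32_t)rt << 16) | ((uint32_t)imm & 0xffffu);
}
```

For example, encode_itype(SW, SP, RA, 20) gives afbf0014 for "sw ra,20(sp)", encode_itype(LW, SP, RA, 20) gives 8fbf0014 for "lw ra,20(sp)", and encode_itype(ADDIU, SP, SP, 32) gives 27bd0020 for "addiu sp,sp,32", all of which appear in the machine code above.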

Structured programming:
Considered a revolution in programming in the 70s (much like object-oriented programming in the late 80s and early 90s). Structured programming is a programming technique that emphasizes top-down design, modularization of code (large routines are broken down into smaller, modular routines), structured types (e.g. records, sets, pointers, and multi-dimensional arrays), descriptive variable and constant names, and extensive commenting conventions. The use of the GOTO statement is discouraged to avoid spaghetti code (code that exhibits a criss-cross control flow behavior at run-time). Certain programming statements are indented in order to make loops and other program logic easier to follow.

Structured languages, such as Pascal and Ada, force the programmer to write a structured program. Unstructured languages such as Fortran 77, Cobol, and Basic, however, rely on the discipline of the programmer to follow structured programming practices.

Here is an example of a non-structured program in C that counts the number of goto's in a file whose filename is given as an argument on the command line. This program can be used to measure the "spaghettiness" of a C program:

#include <stdio.h>
#include <malloc.h>
main(togo,toog)
int togo;
char *toog[];
{char *ogto,   tgoo[80];FILE  *ogot;  int    oogt=0, ootg,  otog=79,
ottg=1;if (    togo==  ottg)   goto   gogo;  goto    goog;  ggot:
if (   fgets(  tgoo,   otog,   ogot)) goto   gtgo;   goto   gott;
gtot:  exit(); ogtg: ++oogt;   goto   ogoo;  togg:   if (   ootg > 0)
goto   oggt;   goto    ggot;   ogog:  if (  !ogot)   goto   gogo;
goto   ggto;   gtto:   printf( "%d    goto   \'s\n", oogt); goto
gtot;  oggt:   if (   !memcmp( ogto, "goto", 4))     goto   otgg;
goto   gooo;   gogo:   exit(   ottg); tggo:  ootg=   strlen(tgoo);
goto   tgog;   oogo: --ootg;   goto   togg;  gooo: ++ogto;  goto
oogo;  gott:   fclose( ogot);  goto   gtto;  otgg:   ogto=  ogto +3;
goto   ogtg;   tgog:   ootg-=4;goto   togg;  gtgo:   ogto=  tgoo;
goto   tggo;   ogoo:   ootg-=3;goto   gooo;  goog:   ogot=  fopen(
toog[  ottg],  "r");   goto    ogog;  ggto:  ogto=   tgoo;  goto
ggot;}
Fortran and Basic programs developed in the early days of computing were difficult to read and understand, somewhat similar to this example in terms of the choice in variable names (limit 6 characters in Fortran and 2 in Basic) and the frequent use of goto.

Block structured language:
A language that supports the local declaration of variables with a limited scope in a block or compound statement. For example, the following C fragment declares a temporary integer variable n to be used in the loop to copy a file from standard input to standard output:
{   int n;
    while ((n = getchar()) != EOF)
        putchar(n);
}
Variable n has a local scope limited to the block and its value is only accessible within the block.

In C, C++, Java, and C#, a block is opened with { and closed with }. Pascal and Ada use begin and end keywords to delimit a block.

The use of blocks is so common today that we don't tend to think of it as something special.
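
The effect of block scope can be illustrated with a nested block that declares its own variable. The function name is hypothetical; the sketch only shows that a declaration inside a block creates a new variable that hides, and does not change, a variable of the same name outside the block.

```c
/* Returns the value of the outer n after an inner block has declared
   and modified its own n. */
int outer_after_inner_block(void)
{
    int n = 1;
    {
        int n = 2;     /* a new variable; hides the outer n inside the block */
        n = n + 40;    /* changes only the inner n */
        (void)n;
    }
    return n;          /* the outer n is unchanged */
}
```

The function returns 1: the inner n ceases to exist when its block is closed.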


Object-oriented programming (OOP):
Object-oriented programming (OOP) is a programming style that puts objects and their interactions at the forefront rather than computation as the operation of a processor on a monolithic memory. Each object has an internal state and executable functions to manage that state. This programming style is naturally adopted in object-oriented programming languages but can also be adopted in procedural or functional languages. In fact, most object-oriented languages were designed as an extension of a procedural language (e.g. C++ and object-oriented Pascal dialects) and the concept of a class is largely based on the concept of an abstract data type.


Abstract data type (ADT):
The concept of an ADT is based on encapsulating data and a set of operations on the data. An ADT declaration is in some respects similar to a class declaration in an object-oriented programming language, except that the abstract data type is typically declared in a module. Like a class, an abstract data type has an internal state and a set of operations on its state. However, the state is global, i.e. only one "instance" exists at any one time. Inheritance is not supported.

The following Modula-2 example stack abstraction is from the textbook page 124:

CONST stack_size = ...
TYPE element = ...
...
MODULE stack;
IMPORT element, stack_size;
EXPORT push, pop;
TYPE
  stack_index = [1..stack_size];
VAR
  s   : ARRAY stack_index OF element;
  top : stack_index;

PROCEDURE error; ...

PROCEDURE push (elem : element);
BEGIN
  IF top = stack_size THEN
    error;
  ELSE
    s[top] := elem;
    top := top + 1;
  END;
END push;

PROCEDURE pop () : element;
BEGIN
  IF top = 1 THEN
    error;
  ELSE
    top := top - 1;
    RETURN s[top];
  END;
END pop;

BEGIN
  top := 1;
END stack;
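
For comparison, a similar module-style stack can be approximated in C by keeping the state in file-scope static variables, so that only the exported functions can reach it. This is a loose sketch, not a translation of the Modula-2 code: the capacity, the element type, and the pop_or function (which returns a fallback value instead of calling an error procedure) are assumptions made for the illustration.

```c
#include <stdbool.h>

#define STACK_SIZE 100            /* hypothetical capacity */
typedef int element;              /* hypothetical element type */

/* Hidden state: one stack for the whole program, as in the module. */
static element s[STACK_SIZE];
static int top = 0;               /* index of the next free slot */

/* Exported operation: returns false if the stack is full. */
bool push(element elem)
{
    if (top == STACK_SIZE)
        return false;
    s[top++] = elem;
    return true;
}

/* Exported operation: returns the fallback if the stack is empty. */
element pop_or(element fallback)
{
    return top == 0 ? fallback : s[--top];
}
```

As with the module, there is exactly one stack: a second, independent stack would require a second copy of the code.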


Class:
The concept of a class extends the notion of abstract data types (ADTs) with inheritance. ADTs are limited to packages that encapsulate a data type declaration with a set of operations.

The following is an example stack template class in C++:

template <class element> class stack
{ private:
    int count;     // number of elements currently on the stack
    element *s;    // dynamically allocated storage
  public:
    stack(int size)
    { s = new element[size]; count = 0; }
    ~stack()
    { delete[] s; }
    void push(element elem)
    { s[count++] = elem; }
    void pop(void)
    { count--; }
    element top(void)
    { return s[count-1]; }
};


Inheritance:
A class inherits structure and properties from a base class. Some object-oriented languages support multiple inheritance. Single inheritance is simpler to implement and avoids possible ambiguity problems caused by multiple inheritance. Therefore, newer languages such as Java and C# support only single inheritance of classes.
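
C has no inheritance, but single inheritance can be roughly emulated, which gives an idea of why it is simple to implement. In the sketch below (all names are hypothetical) the derived structure contains the base structure as its first member, so a pointer to a rectangle can also be used as a pointer to a shape, and a function pointer in the base selects the operation that belongs to the actual object.

```c
/* Base "class": every shape has a name and an area function. */
typedef struct shape {
    const char *name;
    double (*area)(const struct shape *self);
} shape;

/* Derived "class": a rectangle is a shape plus two extra fields.
   Placing the base first lets a rectangle* be used as a shape*. */
typedef struct {
    shape base;
    double width, height;
} rectangle;

static double rectangle_area(const shape *self)
{
    const rectangle *r = (const rectangle *)self;
    return r->width * r->height;
}

rectangle make_rectangle(double width, double height)
{
    rectangle r = { { "rectangle", rectangle_area }, width, height };
    return r;
}

/* Works for any shape: calls whatever area function the object carries. */
double area_of(const shape *s)
{
    return s->area(s);
}

double example_area(void)
{
    rectangle r = make_rectangle(3.0, 4.0);
    return area_of(&r.base);
}
```

With a single base the inherited fields always sit at offset zero. With several bases at most one of them can, which hints at the extra work multiple inheritance requires.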



Ada (Ada 83):
The development of the Ada language, primarily based on the design of Pascal, is history's largest language design effort. Over 40 organizations outside DoD with over 200 participants collaborated (and competed with different designs) in the final design of Ada. It was originally intended to be the standard language for all software commissioned by the US Department of Defense. Prototypes were designed by teams at several sites; the final '83 language was developed by a team at Honeywell's Systems and Research Center in Minneapolis and Alsys Corp. in France, led by Jean Ichbiah.

Example program in Ada:

with TEXT_IO;
use TEXT_IO;
procedure AVEX is
  package INT_IO is new INTEGER_IO (INTEGER);
  use INT_IO;
  type INT_LIST_TYPE is array (1..99) of INTEGER;
  INT_LIST : INT_LIST_TYPE;
  LIST_LEN, SUM, AVERAGE : INTEGER;
  begin
    SUM := 0;
    -- read the length of the input list
    GET (LIST_LEN);
    if (LIST_LEN > 0) and (LIST_LEN < 100) then
      -- read the input into an array
      for COUNTER in 1 .. LIST_LEN loop
        GET (INT_LIST(COUNTER));
        SUM := SUM + INT_LIST(COUNTER);
      end loop;
      -- compute the average
      AVERAGE := SUM / LIST_LEN;
      -- write the input values > average
      for COUNTER in 1 .. LIST_LEN loop
        if (INT_LIST(COUNTER) > AVERAGE) then
          PUT (INT_LIST(COUNTER));
          NEW_LINE;
        end if;
      end loop;
      end loop;
    else
      PUT_LINE ("Error in input list length");
    end if;
  end AVEX;

Ada 95:
Ada 95 is a revision developed under government contract by a team at Intermetrics, Inc. It fixes several subtle problems in the earlier language, and adds objects, shared-memory synchronization, and several other features.

Algol 60:
The original block-structured language. The design of Algol 60 is a landmark of clarity and conciseness and made the first use of Backus-Naur form (BNF) for formally defining the grammar. Most subsequent imperative programming languages are based on Algol 60, and these languages are sometimes referred to as "Algol-like" languages. Strangely, it lacks input/output statements and has no character set. Algol 60 never gained wide acceptance in the US, partly because of the entrenchment of Fortran and lack of support by IBM. Besides block structure, Algol 60 has recursion and stack-dynamic arrays.

Example Algol 60 program:

comment avex program;
begin
  integer array intlist [1:99];
  integer listlen, counter, sum, average;
  sum := 0;
  comment read the length of the input list;
  readint (listlen);
  if (listlen > 0) ∧ (listlen < 100) then
    begin
      comment read the input into an array;
      for counter := 1 step 1 until listlen do
        begin
          readint (intlist[counter]);
          sum := sum + intlist[counter]
        end;
      comment compute the average;
      average := sum / listlen;
      comment write the input values > average;
      for counter := 1 step 1 until listlen do
        if intlist[counter] > average then
          printint (intlist[counter])
    end
  else
    printstring ("Error in input list length")
end

Algol 68:
Introduced user-defined types with an attempt to design a language that is orthogonal: a few primitive types and structures can be combined to form new types and structures. The language also added new programming constructs to Algol 60 which were already available in other languages. Unfortunately, the Algol 68 documentation was unreadable and Algol 68 never gained widespread acceptance. Includes (among other things) structures and unions, expression-based syntax, reference parameters, a reference model of variables, and concurrency.


Algol W:
A smaller, simpler alternative to Algol 68, proposed by Niklaus Wirth and C. A. R. Hoare. The precursor to Pascal. Introduced the case statement.

APL:
A functional language designed by Kenneth Iverson in the late 1950's and early 1960's, primarily for the manipulation of numeric arrays. An extremely concise language with a powerful set of operators. It employs an extended character set to express operators with special symbols. Intended for interactive use and "throw-away programming" (quick programming of a solution that is not intended to be kept: the solution is hard to understand later).

Example APL program:

(2=+/[1]0=(⍳N)∘.|⍳N)/⍳N

This program computes prime numbers in the range 1 to N.


BASIC:
BASIC (Beginner's All-purpose Symbolic Instruction Code) is a simple imperative language that gained popularity because of its ease of use and its interpreted execution, despite the fact that the early versions lacked many language features found in modern languages (e.g. procedures). Many dialects exist. The most widely used version of BASIC today is Microsoft's Visual Basic. The structure of programs written in early Basic dialects resembles the structure of Fortran programs, with similar limitations.

Example QuickBasic program:

REM avex program
  DIM intlist(99)
  sum = 0
REM read the length of the input list
  INPUT listlen
  IF listlen > 0 AND listlen < 100 THEN
REM read the input into an array
    FOR counter = 1 TO listlen
      INPUT intlist(counter)
      sum = sum + intlist(counter)
    NEXT counter
REM compute the average
    average = sum / listlen
REM write the input values > average
    FOR counter = 1 TO listlen
      IF intlist(counter) > average THEN
        PRINT intlist(counter);
      END IF
    NEXT counter
  ELSE
    PRINT "Error in input list length"
  END IF
END

C:
C is one of the most successful imperative languages. It was originally defined as part of the development of the UNIX operating system. It is still considered a systems programming language, for which certain features such as pointers and the absence of dynamic semantic checks (e.g. array bound checking) are very useful to manipulate memory. Two notably different versions of C exist: the original K&R (Kernighan and Ritchie) version and ANSI C.

Example program in C:

#include <stdio.h>

int main(void)
{   int intlist[99], listlen, counter, sum, average;
    sum = 0;
    /* read the length of the list */
    scanf("%d", &listlen);
    if (listlen > 0 && listlen < 100)
    {   /* read the input into an array */
        for (counter = 0; counter < listlen; counter++)
        {   scanf("%d", &intlist[counter]);
            sum += intlist[counter];
        }
        /* compute the average */
        average = sum / listlen;
        /* write the input values > average */
        for (counter = 0; counter < listlen; counter++)
            if (intlist[counter] > average)
                printf("%d\n", intlist[counter]);
    }
    else
        printf("Error in input list length\n");
}

C++:
The most successful of several object-oriented successors of C. It is a large and fairly complex language, in part because it supports both procedural and object-oriented programming. The Standard Template Library (STL) is an important library with common compound data types and operations.

Example program in C++:

#include <iostream>
#include <vector>

int main()
{   std::vector<int> intlist;
    int listlen;
    /* read the length of the list */
    std::cin >> listlen;
    if (listlen > 0 && listlen < 100)
    {   int sum = 0;
        /* read the input into an STL vector */
        for (int counter = 0; counter < listlen; counter++)
        {   int value;
            std::cin >> value;
            intlist.push_back(value);
            sum += value;
        }
        /* compute the average */
        int average = sum / listlen;
        /* write the input values > average */
        for (std::vector<int>::const_iterator it = intlist.begin(); it != intlist.end(); ++it)
            if ((*it) > average)
                std::cout << (*it) << std::endl;
    }
    else
        std::cerr << "Error in input list length" << std::endl;
}


C#:
Pronounced "C sharp". A language developed by Microsoft that is very similar to Java. C# is part of Microsoft's VisualStudio.NET, a development environment for Internet-based computing. C# uses the Common Language Runtime (CLR) to manage objects that can be shared among different languages (C#, VisualBasic, C++, Haskell). Objects can be exchanged over the Web and remote methods can be invoked using SOAP (Simple Object Access Protocol).

COBOL:
COBOL (COmmon Business Oriented Language) was for a long time the most widely used programming language in the world. COBOL is intended primarily for business data processing with elaborate input/output facilities. It supports extensive numerical formatting features and a decimal number storage format. COBOL introduced the concept of records and nested selection statements. It is still the most widely used programming language for business applications on mainframes and minis. Originally developed by the Department of Defense. The language is very wordy and adopts English names for arithmetic operators. A COBOL program is structured into the following divisions:
      Division name    Contains
      IDENTIFICATION   Program identification.
      ENVIRONMENT      Types of computers used.
      DATA             Buffers, constants, work areas.
      PROCEDURE        The processing parts (program logic).
Example COBOL program to convert Fahrenheit to Celsius:
IDENTIFICATION DIVISION.
PROGRAM-ID.  EXAMPLE.

ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SOURCE-COMPUTER.   IBM-370.
OBJECT-COMPUTER.   IBM-370.

DATA DIVISION.
WORKING-STORAGE SECTION.
77 FAHR  PICTURE 999.
77 CENT  PICTURE 999.

PROCEDURE DIVISION.
DISPLAY 'Enter Fahrenheit ' UPON CONSOLE.
ACCEPT FAHR FROM CONSOLE.
COMPUTE CENT = (FAHR - 32) * 5 / 9.
DISPLAY 'Celsius is ' CENT UPON CONSOLE.
GOBACK.


CLOS:
The Common Lisp Object System is a set of object-oriented extensions to Common Lisp, now incorporated into the ANSI standard language (see Common Lisp). The leading notation for object-oriented functional programming.


Eiffel:
An object-oriented language developed by Bertrand Meyer and associates at the Société des Outils du Logiciel in Paris. Includes (among other things) multiple inheritance, automatic garbage collection, and powerful mechanisms for renaming of data members and methods in derived classes.


Euclid:
Imperative language developed by Butler Lampson and associates at the Xerox Palo Alto Research Center in the mid 1970's. Designed to eliminate many of the sources of common programming errors in Pascal, and to facilitate formal verification of programs. Has closed scopes and module types.

Fortran (I, II, IV, 77):
The first high-level programming language was Fortran (I) (FORmula TRANslator), developed in the mid-50s. It had a dramatic impact on computing in the early days, when most programming took place in machine code or assembly code for an assembler. It was originally designed to express mathematical formulas. Fortran 77 is still widely used for scientific, engineering, and numerical problems, mainly because very good compilers exist. These compilers are very effective in optimizing code, because of the maturity of the compilers and due to the lack of pointers and recursion in Fortran 77. Fortran 77 has limited type checking and lacks records, unions, dynamic allocation, case-statements, and while-loops. Variable names are upper case and the name length is limited to 6 characters. Fortran 77 is not structured and not object-oriented. More recent Fortran dialects such as Fortran 90 are better structured and support modern programming constructs.

Example Fortran 77 program:

      PROGRAM AVEX
      INTEGER INTLST(99)
C     variable names that start with I,J,K,L,N,M are integers
      ISUM = 0
C     read the length of the list
      READ (*, *) LSTLEN
      IF ((LSTLEN .GT. 0) .AND. (LSTLEN .LT. 100)) THEN
C     read the input in an array
      DO 100 ICTR = 1, LSTLEN
      READ (*, *) INTLST(ICTR)
      ISUM = ISUM + INTLST(ICTR)
100   CONTINUE
C     compute the average
      IAVE = ISUM / LSTLEN
C     write the input values > average
      DO 110 ICTR = 1, LSTLEN
      IF (INTLST(ICTR) .GT. IAVE) THEN
      WRITE (*, *) INTLST(ICTR)
      END IF
110   CONTINUE
      ELSE
      WRITE (*, *) 'ERROR IN LIST LENGTH'
      END IF
      END

Fortran (90, 95, HPF):
Fortran 90 is a major revision of the language. Recursion, pointers, records, dynamic allocation, a module facility, and new control flow constructs are added. Array operations that operate on arrays and array slices are also added. Array operations on distributed arrays in HPF can be parallelized.

Example Fortran 90 program:

      PROGRAM AVEX
      INTEGER INT_LIST(1:99)
      INTEGER LIST_LEN, COUNTER, AVERAGE
C     read the length of the list
      READ (*, *) LIST_LEN
      IF ((LIST_LEN > 0) .AND. (LIST_LEN < 100)) THEN
C     read the input in an array
      DO COUNTER = 1, LIST_LEN
      READ (*, *) INT_LIST(COUNTER)
      END DO
C     compute the average
      AVERAGE = SUM(INT_LIST(1:LIST_LEN)) / LIST_LEN
C     write the input values > average
      DO COUNTER = 1, LIST_LEN
      IF (INT_LIST(COUNTER) > AVERAGE) THEN
      WRITE (*, *) INT_LIST(COUNTER)
      END IF
      END DO
      ELSE
      WRITE (*, *) 'ERROR IN LIST LENGTH'
      END IF
      END

Haskell:
Haskell is currently the leading functional programming language. It is descended from Miranda and was designed by a committee of researchers beginning in 1987. It includes curried functions, higher-order functions, non-strict semantics, static polymorphic typing, pattern matching, list comprehensions, modules, monadic I/O, and layout (indentation)-based syntactic grouping.

Example Haskell program:

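import Prelude hiding (sum)  -- the Prelude already defines sum

sum []    = 0
sum (a:x) = a + sum x
avex []    = []
-- integer division, because length returns an Int
avex (a:x) = [n | n <- a:x, n > sum (a:x) `div` length (a:x)]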

Java:
Java is an object-oriented language based largely on C++, developed at Sun Microsystems. The language is intended for the construction of highly portable, machine-independent programs. Includes (among other things) a reference model of (class-typed) variables, mix-in inheritance, threads, and extensive pre-defined libraries for graphics, communication, etc. Heavily used for transmission of program fragments, called applets, over the Internet. The language is designed to be translated into intermediate Java byte code that can be transmitted over the Internet. Java byte code is executed by the Java virtual machine (JVM) or compiled into native machine code by a just-in-time (JIT) compiler. Java is a safe language.

Example Java program:

import java.io.*;
class Avex
{   public static void main(String args[]) throws IOException
    {   DataInputStream in = new DataInputStream(System.in);
        int listlen, counter, sum = 0, average;
        int [] intlist = new int[100];
        // read the length of the list
        listlen = Integer.parseInt(in.readLine());
        if (listlen > 0 && listlen < 100)
        {   // read the input into an array
            for (counter = 0; counter < listlen; counter++)
            {   intlist[counter] = Integer.parseInt(in.readLine());
                sum += intlist[counter];
            }
            // compute the average
            average = sum / listlen;
            // write the input values > average
            for (counter = 0; counter < listlen; counter++)
            {   if (intlist[counter] > average)
                    System.out.println(intlist[counter] + "\n");
            }
        }
        else
          System.out.println("Error in input length\n");
    }
}

Lisp:
Lisp (LISt Processing language) was developed by McCarthy as a realization of Church's lambda calculus. Many dialects exist, among which Common Lisp and Scheme are the most popular. Lisp is the dominant language used in Artificial Intelligence. The emphasis is on symbolic rather than numeric computation, and Lisp is very powerful for symbolic computation with lists. Because Lisp is a functional language, all control is performed by recursion and conditional expressions. Lisp was the first language with implicit memory management (automatic allocation and deallocation) by "garbage collection". Lisp heavily influenced functional programming languages (e.g. ML, Miranda, Haskell).



Miranda:
A purely functional language designed by David Turner in the mid 1980's. It resembles ML in several respects: it has type inference and automatic currying. Unlike ML, it provides list comprehensions and uses lazy evaluation for all arguments. Like Haskell, it uses indentation and line breaks for syntactic grouping.


ML:
A functional language with "Pascal-like" syntax. Originally designed in the mid to late 1970's by Robin Milner and associates at the University of Edinburgh as the meta-language for a program verification system. Pioneered aggressive compile-time type inference and polymorphism. ML has a few imperative features.


Modula-2:
Modula and Modula-2 are the immediate successors to Pascal, developed by Niklaus Wirth. The original Modula was an explicitly concurrent monitor-based language. Modula-2 was originally designed with coroutines, but no real concurrency. Both languages provide mechanisms for module-as-manager style data abstractions.


Modula-3:
A major extension to Modula-2 developed by Luca Cardelli, Jim Donahue, Mick Jordan, Bill Kalsow, and Greg Nelson at the Digital Systems Research Center and the Olivetti Research Center in the late 1980's. Intended to provide a level of support for large, reliable, and maintainable systems comparable to that of Ada, but in a simpler and more elegant form.


Oberon:
A deliberately minimal language designed by Niklaus Wirth. Essentially a subset of Modula-2, augmented with a mechanism for type extension.


Pascal:
A high-level programming language designed by Swiss professor Niklaus Wirth (Wirth is pronounced "Virt") in the late 60s and named after the French mathematician Blaise Pascal. It was designed largely in reaction to Algol 68, which was widely perceived as bloated. It is noted for its support of structured programming and was heavily used in the 70s and 80s, particularly for teaching. Pascal has had a strong influence on subsequent high-level languages, such as Ada, ML, Modula-2 and Modula-3.

Example Pascal program:

program avex(input, output);
  type
    intlisttype = array [1..99] of integer;
  var
    intlist : intlisttype;
    listlen, counter, sum, average : integer;
begin
  sum := 0;
  (* read the length of the input list *)
  readln(listlen);
  if ((listlen > 0) and (listlen < 100)) then
    begin
      (* read the input into an array *)
      for counter := 1 to listlen do
        begin
          readln(intlist[counter]);
          sum := sum + intlist[counter]
        end;
      (* compute the average *)
      average := sum div listlen;
      (* write the input values > average *)
      for counter := 1 to listlen do
        if (intlist[counter] > average) then
          writeln(intlist[counter])
    end
  else
    writeln('Error in input list length')
end.


PL/I:
Developed by IBM and intended to displace Fortran, COBOL, and Algol. A very complicated and poorly designed language that is kept alive by IBM. It was the first language to adopt exception handling and pointer types.

Example PL/I program:

AVEX: PROCEDURE OPTIONS (MAIN);
  DECLARE INTLIST (1:99) FIXED;
  DECLARE (LISTLEN, COUNTER, SUM, AVERAGE) FIXED;
  SUM = 0;
  /* read the input list length */
  GET LIST (LISTLEN);
  IF (LISTLEN > 0) & (LISTLEN < 100) THEN
    DO;
    /* read the input into an array */
    DO COUNTER = 1 TO LISTLEN;
      GET LIST (INTLIST(COUNTER));
      SUM = SUM + INTLIST(COUNTER);
    END;
    /* compute the average */
    AVERAGE = SUM / LISTLEN;
    /* write the input values > average */
    DO COUNTER = 1 TO LISTLEN;
      IF INTLIST(COUNTER) > AVERAGE THEN
        PUT LIST (INTLIST(COUNTER));
    END;
    END;
  ELSE
    PUT SKIP LIST ('ERROR IN INPUT LIST LENGTH');
END AVEX;

Prolog:
Prolog is the most popular logic programming language. Most Prolog systems conform to the ISO Prolog standard, but deviations make it hard to write Prolog programs that are portable between different Prolog systems. The language is based on formal logic and can be summarized as an intelligent database system that uses an inference process to establish the truth of given queries.

Example Prolog program:

avex(IntList, GreaterThanAveList) :-
    sum(IntList, Sum),
    length(IntList, ListLen),
    Average is Sum / ListLen,
    filtergreater(IntList, Average, GreaterThanAveList).
% sum(+IntList, -Sum)
% recursively sums integers of IntList
sum([Int | IntList], Sum) :-
    sum(IntList, ListSum),
    Sum is Int + ListSum.
sum([], 0).
% filtergreater(+IntList, +Int, -GreaterThanIntList)
% recursively remove integers smaller or equal to Int from IntList
filtergreater([AnInt | IntList], Int, [AnInt | GreaterThanIntList]) :-
    AnInt > Int, !,
    filtergreater(IntList, Int, GreaterThanIntList).
filtergreater([AnInt | IntList], Int, GreaterThanIntList) :-
    filtergreater(IntList, Int, GreaterThanIntList).
filtergreater([], Int, []).
The following example illustrates a more "traditional" use of Prolog to infer information from a database of facts:
rainy(rochester).                 % fact: rochester is rainy
rainy(seattle).                   % fact: seattle is rainy
cold(rochester).                  % fact: rochester is cold
snowy(X) :- rainy(X), cold(X).    % rule: X is snowy if X is rainy and cold
With this program loaded, we can query the system interactively:
?- rainy(X).        % user question
X = rochester       % system answer(s)
X = seattle
?- snowy(X).
X = rochester

Scheme:
Scheme is one of the most popular dialects of Lisp. Developed in the mid 1970's by Guy Steele and Gerald Sussman. Standardized by the IEEE and ANSI. Has static scoping and true first-class functions. Scheme is widely used for teaching.

Example Scheme program:

(DEFINE (avex lis)
  (filtergreater lis (/ (sum lis) (length lis)))
)
(DEFINE (sum lis)
  (COND
    ((NULL? lis) 0)
    (ELSE        (+ (CAR lis) (sum (CDR lis))))
  )
)
(DEFINE (filtergreater lis num)
  (COND
    ((NULL? lis)       '())
    ((> (CAR lis) num) (CONS (CAR lis) (filtergreater (CDR lis) num)))
    (ELSE              (filtergreater (CDR lis) num))
  )
)

Simula 67:
Designed at the Norwegian Computing Centre, Oslo, in the mid 1960's by Ole-Johan Dahl, Bjorn Myhrhaug, and Kristen Nygaard. Extends Algol 60 with classes and coroutines. The name of the language reflects its suitability for discrete-event simulation.

Smalltalk-80:
The first full implementation of an object-oriented language, and still considered the quintessential object-oriented language. It was developed at Xerox PARC and pioneered the use of graphical user interfaces.

Example Smalltalk-80 program:

class name                 Avex
superclass                 Object
instance variable names    intlist
"Class methods"
"Create an instance"
  new
    ^ super new
"Instance methods"
"Initialize"
  initialize
    intlist <- Array new: 0
"Add int to list"
  add: n | oldintlist |
    oldintlist <- intlist.
    intlist <- Array new: intlist size + 1.
    intlist replaceFrom: 1 to: oldintlist size with: oldintlist.
    ^ intlist at: intlist size put: n
"Calculate average"
  average | sum |
    sum <- 0.
    1 to: intlist size do:
      [:index | sum <- sum + (intlist at: index)].
    ^ sum // intlist size
"Filter greater than average"
  filtergreater: n | oldintlist i |
    oldintlist <- intlist.
    i <- 0.
    1 to: oldintlist size do:
      [:index | (oldintlist at: index) > n
          ifTrue: [i <- i + 1.
                   oldintlist at: i put: (oldintlist at: index)]].
    intlist <- Array new: i.
    intlist replaceFrom: 1 to: i with: oldintlist
Example Smalltalk-80 session:
av <- Avex new
av initialize
av add: 1
1
av add: 2
2
av add: 3
3
av filtergreater: av average
av at: 1
3

Lambda calculus:
A very simple algebraic model of computation designed by Alonzo Church in the 1930s. In its pure form, everything is a function (even primitive types such as numbers and compound data structures such as lists). While the syntax and the rewrite rules of lambda calculus are very primitive, it has been shown that lambda calculus provides a theoretical model of computation. This model is actually easier to program than a Turing machine, the other well-known model of computation. Lisp is a direct realization of lambda calculus as a programming language.

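A lambda expression is recursively defined as one of the following:

a name, such as f or a
a variable, such as v
a lambda abstraction λ v . e, where v is a variable and e is a lambda expression
an application e1 e2, where e1 and e2 are lambda expressions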

Only two rewrite rules on lambda expressions are necessary to model universal computation: alpha reduction (renaming the variable of an abstraction) and beta reduction (substituting the argument of an application for the variable of the abstraction being applied). The difference between a name and a variable becomes clear from the context in which they are used. A variable is used in a lambda abstraction to represent the input parameter, while a name is "inert", i.e. it has no value other than the name itself.

Some examples:

f a denotes the application of a function symbol f to an argument a.

λ v . v is a lambda abstraction that denotes the identity function: argument v takes a value and the function returns the (unchanged) value. The application of this function to an expression a, for example, is written (λ v . v) a and this evaluates to a (argument v takes a and the function returns the value of v).

λ v . f v is a lambda abstraction that, when applied to a value, applies f to it. For example, (λ v . f v) a results in f a.

λ f . f a is a lambda abstraction that, when applied to a function name, applies this function to a. For example, (λ f . f a) g results in g a.
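The three abstractions above can be mimicked with Python's own lambda expressions. This is only a rough sketch, not a lambda calculus interpreter: inert names are modelled as strings, and the helper apply_name is a hypothetical function introduced here to write down "name applied to argument" as text.

```python
def apply_name(fn_name, arg):
    # Hypothetical helper: applying an inert name cannot be reduced further,
    # so the application is simply written out as text.
    return f"{fn_name} {arg}"

identity = lambda v: v                    # lambda v . v
apply_f = lambda v: apply_name("f", v)    # lambda v . f v
apply_to_a = lambda f: f("a")             # lambda f . f a

g = lambda x: apply_name("g", x)          # stands in for the function name g

print(identity("a"))     # prints "a"
print(apply_f("a"))      # prints "f a"
print(apply_to_a(g))     # prints "g a"
```

Each call plays the role of one beta reduction: the argument is substituted for the variable of the abstraction.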


Coroutines:
A coroutine is a routine that runs concurrently with other coroutines, although coroutines do not necessarily run in parallel. A coroutine can temporarily relinquish control to another coroutine without involving the subroutine calling mechanism. That is, control in a coroutine can jump to another coroutine and back, possibly multiple times during the lifetime of a coroutine. Therefore, it appears as if the coroutines are operating concurrently.
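The hand-over of control can be illustrated with Python generators. This is a sketch under a stated assumption: generators are asymmetric coroutines, so control always returns to whoever resumed them, and the driver function run (a name chosen here) stands in for the direct transfer between two coroutines.

```python
def producer(log):
    for item in ("x", "y", "z"):
        log.append(f"produce {item}")
        yield item                  # suspend; local state is kept

def consumer(log):
    while True:
        item = yield                # suspend until a value is sent in
        log.append(f"consume {item}")

def run():
    log = []
    cons = consumer(log)
    next(cons)                      # advance the consumer to its first yield
    for item in producer(log):      # each iteration resumes the producer
        cons.send(item)             # ... and then resumes the consumer
    cons.close()
    return log

print(run())
```

The log shows the two routines interleaving: each one continues where it left off instead of restarting from the top, as a subroutine call would.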


Garbage collection:
A routine that searches memory for program segments or data that are no longer active or used in order to reclaim that space. It tries to make as much memory available on the heap as possible. Implicit garbage collection operates in the background of a running program to clean up unused heap space.


Heap:
An area in memory for the dynamic creation of data during the lifetime of a program. The heap contains application data that is not static or stack-allocated.


Stack:
A LIFO (last in, first out) structure to hold temporary data. The implementation of a programming language requires at least one stack data structure for subroutine calling (and object-oriented method invocation) in languages that support recursion. The stack holds the return address of the caller of a subroutine and the parameters passed to the subroutine. Local stack-allocated data for the subroutine is also pushed on the stack.


Regular expression:
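Regular expressions describe the tokens of a programming language. A regular expression is one of:

a character
the empty string, denoted ε
two regular expressions next to each other (concatenation)
two regular expressions separated by | (alternation)
a regular expression followed by * (zero or more repetitions)

For example, the regular expression describing an identifier in C/C++ is: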
identifier -> letter (letter | digit)*
digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
letter -> a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z
        | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
        | _
For compiler design, tools exist that generate efficient scanners automatically from regular expressions (e.g. flex).
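The identifier rule above can be tried out directly with Python's re module. This is a minimal sketch: the function name is_identifier is chosen here for illustration, and reserved words are not excluded, which a real scanner would handle separately.

```python
import re

# letter (letter | digit)*, where "_" counts as a letter as in the rules above
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")

def is_identifier(text):
    # fullmatch requires the whole string to match, not just a prefix
    return IDENTIFIER.fullmatch(text) is not None

for candidate in ("_tmp1", "x", "9lives", "a-b"):
    print(candidate, is_identifier(candidate))
```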

BNF:
Backus-Naur Form (BNF) is a notation for context-free grammars that is frequently used to describe the syntax of a programming language.

LL grammar:
An LL grammar is a grammar suitable for top-down parsing. If it is not possible to write a recursive descent parser for a grammar that selects each production by looking at one lookahead token, the grammar is not LL(1). An LL(n) grammar is a grammar suitable for top-down parsing using n lookahead tokens.

An LL grammar cannot have left-recursive productions, because a recursive descent parser would recursively call itself forever without consuming any input tokens.

The following grammar is not LL(1):

<A> -> <B> <C>
<A> -> a
<B> -> a b
<B> -> b
<C> -> c
It is not LL(1) because the subroutine for nonterminal A cannot decide which production to use when it sees an a on the input:
proc A
  if next_token="a"
    ?? cannot decide whether the first or second production for <A> applies here ??
The grammar is LL(2), because the token after the next token can be used to determine which production should be applied:
proc A
  if next_token="a" and token_after_next_token="b"
    B()
    C()
  else if next_token="b"
    B()
    C()
  else
    match("a");
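The pseudocode above can be turned into a small runnable recognizer. The sketch below assumes the input is already a list of one-character tokens; the class LL2Parser, the exception ParseError, and the function accepts are names introduced here for illustration.

```python
class ParseError(Exception):
    pass

class LL2Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self, k=0):
        # k = 0 is next_token, k = 1 is token_after_next_token
        i = self.pos + k
        return self.tokens[i] if i < len(self.tokens) else None

    def match(self, expected):
        if self.peek() != expected:
            raise ParseError(f"expected {expected!r}, found {self.peek()!r}")
        self.pos += 1

    def A(self):
        if self.peek() == "a" and self.peek(1) == "b":
            self.B()            # <A> -> <B> <C>, with <B> -> a b
            self.C()
        elif self.peek() == "b":
            self.B()            # <A> -> <B> <C>, with <B> -> b
            self.C()
        else:
            self.match("a")     # <A> -> a

    def B(self):
        if self.peek() == "a":
            self.match("a")
            self.match("b")
        else:
            self.match("b")

    def C(self):
        self.match("c")

def accepts(tokens):
    parser = LL2Parser(tokens)
    try:
        parser.A()
    except ParseError:
        return False
    return parser.pos == len(parser.tokens)   # all input must be consumed

print(accepts("abc"), accepts("bc"), accepts("a"), accepts("ac"))
```

Only procedure A needs the second lookahead token; B and C decide with one token, which is why the conflict is local to the productions for <A>.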

LR grammar:
An LR grammar is a grammar suitable for bottom-up parsing. An LR(n) grammar is a grammar suitable for bottom-up parsing using n lookahead tokens. The class of LR grammars includes the class of LL grammars.


Ambiguous grammar:
A grammar is ambiguous if a string exists that has more than one distinct derivation resulting in distinct parse trees. See also ambiguous if-then-else.

The grammar for simple expressions below is ambiguous:

<expression> -> identifier
              | unsigned_integer
              | - <expression>
              | ( <expression> )
              | <expression> <operator> <expression>
<operator> -> + | - | * | /
because we find two distinct (left-most) derivations for the string a-b+1:
<expression>
  => <expression> <operator> <expression>
  => <expression> <operator> <expression> <operator> <expression>
  => identifier <operator> <expression> <operator> <expression>
  => identifier - <expression> <operator> <expression>
  => identifier - identifier <operator> <expression>
  => identifier - identifier + <expression>
  => identifier - identifier + unsigned_integer
        (a)     -    (b)     +       (1)
and
<expression>
  => <expression> <operator> <expression>
  => identifier <operator> <expression>
  => identifier - <expression>
  => identifier - <expression> <operator> <expression>
  => identifier - identifier <operator> <expression>
  => identifier - identifier + <expression>
  => identifier - identifier + unsigned_integer
        (a)     -    (b)     +       (1)
The simple expression grammar below is unambiguous:
<expression> -> <term>
              | <expression> <add_op> <term>
<term> -> <factor>
        | <term> <mult_op> <factor>
<factor> -> identifier | unsigned_integer
          | - <factor> | ( <expression> )
<add_op> -> + | -
<mult_op> -> * | /
We find exactly one left-most derivation for every string in the language defined by the grammar. For example, the left-most derivation of a-b+1 is:
<expression>
  => <expression> <add_op> <term>
  => <expression> <add_op> <term> <add_op> <term>
  => <term> <add_op> <term> <add_op> <term>
  => <factor> <add_op> <term> <add_op> <term>
  => identifier <add_op> <term> <add_op> <term>
  => identifier - <term> <add_op> <term>
  => identifier - <factor> <add_op> <term>
  => identifier - identifier <add_op> <term>
  => identifier - identifier + <term>
  => identifier - identifier + <factor>
  => identifier - identifier + unsigned_integer
        (a)     -    (b)     +       (1)
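Why the ambiguity matters becomes visible when the two parse trees of a-b+1 are evaluated. In the sketch below the trees are written by hand as nested tuples, and the values chosen for a and b are arbitrary assumptions.

```python
def evaluate(tree, env):
    # A tree is either a leaf (identifier or integer) or (left, op, right).
    if isinstance(tree, tuple):
        left, op, right = tree
        lhs, rhs = evaluate(left, env), evaluate(right, env)
        return lhs - rhs if op == "-" else lhs + rhs
    return env.get(tree, tree)     # identifier lookup, else the integer itself

env = {"a": 10, "b": 4}
left_grouping = (("a", "-", "b"), "+", 1)     # first derivation:  (a-b)+1
right_grouping = ("a", "-", ("b", "+", 1))    # second derivation: a-(b+1)

print(evaluate(left_grouping, env))    # 7
print(evaluate(right_grouping, env))   # 5
```

The unambiguous grammar only admits the first grouping, which matches the usual left associativity of - and +.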

Ambiguous if-then-else:
A problem with the if-then-else construct of Pascal and C is formulating unambiguous grammar productions for it. The grammar below is ambiguous
<stmt> -> if <expr> then <stmt>
        | if <expr> then <stmt> else <stmt>
because we find two distinct derivations of the string
if C1 then if C2 then S1 else S2
(where C1 and C2 are some expressions, S1 and S2 are some statements):
<stmt>
  => if <expr> then <stmt>
  => if <expr> then if <expr> then <stmt> else <stmt>
and another derivation
<stmt>
  => if <expr> then <stmt> else <stmt>
  => if <expr> then if <expr> then <stmt> else <stmt>
An unambiguous grammar for if-then-else is (you don't need to memorize this):
<stmt> -> <balanced_stmt>
        | <unbalanced_stmt>
<balanced_stmt> -> if <expr> then <balanced_stmt> else <balanced_stmt>
                 | <other_stmt>
<unbalanced_stmt> -> if <expr> then <stmt>
                   | if <expr> then <balanced_stmt> else <unbalanced_stmt>
This is an LR grammar but not an LL grammar, so no pure top-down parser can be used to parse program fragments with the unambiguous if-then-else grammar.


Attribute grammar:
A grammar augmented with attributes for terminals and nonterminals and semantic rules that operate on the attribute values.


Semantic rule:
A rule attached to a grammar production that operates on the values of the attributes of the terminals and nonterminals in that production.

Example

grammar production               semantic rule
<number1> -> <number2> <digit>   number1.value := 10*number2.value + digit.value
<number>  -> <digit>             number.value := digit.value
<digit>   -> 0                   digit.value := 0
           | 1                   digit.value := 1
           | 2                   digit.value := 2
           | 3                   digit.value := 3
           | 4                   digit.value := 4
           | 5                   digit.value := 5
           | 6                   digit.value := 6
           | 7                   digit.value := 7
           | 8                   digit.value := 8
           | 9                   digit.value := 9
In this example, the nonterminals <number> and <digit> have an attribute 'value' that holds the value of the numeric representation defined by the grammar. When the semantic rules are applied to the syntactic representation of the input, the rules compute the value of the input. When the computed values are used to check the validity of the input, the semantic rules are used to enforce semantic checks. Note: the nonterminal <number> has subscripts in the first production to distinguish the nonterminal on the left-hand side from the one on the right-hand side of the production.
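The semantic rules in the table translate almost line by line into two recursive functions. This is a minimal sketch that works on a string of digit characters; the function names digit_value and number_value are chosen here for illustration.

```python
def digit_value(ch):
    # <digit> -> 0 | 1 | ... | 9        digit.value := 0 .. 9
    if len(ch) != 1 or ch not in "0123456789":
        raise ValueError(f"not a digit: {ch!r}")
    return "0123456789".index(ch)

def number_value(text):
    if not text:
        raise ValueError("a number needs at least one digit")
    if len(text) == 1:
        # <number> -> <digit>            number.value := digit.value
        return digit_value(text)
    # <number1> -> <number2> <digit>     number1.value := 10*number2.value + digit.value
    return 10 * number_value(text[:-1]) + digit_value(text[-1])

print(number_value("407"))   # 407
```

The recursion mirrors the left-recursive production: the value of the shorter prefix is computed first and then extended by one digit.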


Semantic checks:
Static semantic checks are performed by the compiler at compile time. Dynamic semantic checks are performed at run time. A compiler cannot always ensure at compile time that certain constraints on programming constructs are met, for example, whether the index value of an array is within bounds. A compiler may generate run time checks in the target code to enforce these constraints at run time.


Static semantic checks:
Static semantic checks are performed by a compiler at compile time to ensure that variables are declared before they are used, variables are typed correctly in expressions, labels have targets, etc.


Dynamic semantic checks:
A compiler may generate run time checks in the target code to enforce programming language specific constraints on programming constructs at run time. An interpreter or virtual machine may enforce constraints immediately while executing an instruction. Exceptions are raised when an error is detected.
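An array bounds check is a typical dynamic semantic check. The sketch below writes out by hand the kind of test a compiler might insert before an array access; checked_load is a hypothetical helper, and zero-based indexing is assumed.

```python
def checked_load(array, index):
    # The run time check: reject any index outside 0 .. len(array) - 1.
    if not 0 <= index < len(array):
        raise IndexError(f"index {index} out of bounds for length {len(array)}")
    return array[index]

values = [10, 20, 30]
print(checked_load(values, 1))      # 20
try:
    checked_load(values, 3)
except IndexError as error:
    print("exception raised:", error)
```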


Tokens:
Tokens are the indivisible units that the scanner of a compiler produces for further analysis by the parser. Example tokens are programming language keywords, operators, identifiers, numbers, and punctuation. Tokens are also called terminals in the context of grammars.


Terminals:
A terminal of a grammar of a programming language is a token, e.g. a keyword or operator.


Nonterminals:
A nonterminal of a grammar denotes a syntactic category of a language. For example, a programming language statement as a syntactic category can be one of many alternative statements.


Production:
BNF grammar productions are of the form
<nonterminal> -> sequence of (non)terminals
Productions provide descriptions of the syntax for a syntactic category denoted by a nonterminal.

A production is immediately left recursive if it is of the form

<A> -> <A> ...
and a production is immediately right recursive if it is of the form
<A> -> ... <A>
where <A> is some nonterminal.

Productions can be left or right recursive through other productions. For example

<A> -> <B> ...
<B> -> <A> ...



Derivation:


Parse tree:
A parse tree depicts a derivation as a tree: the internal nodes are the nonterminals, the children of a node are the symbols (terminals and nonterminals) of the right-hand side of a production for the nonterminal at the node, and the leaves are the terminals.

Given the grammar

<id_list> -> identifier <id_list_tail>

<id_list_tail> -> , identifier <id_list_tail>
                | ;
The parse tree of "A,B,C;" is

[Figure: parse tree example]
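The parse tree for "A,B,C;" can also be built programmatically from the two productions. The sketch below assumes a pre-tokenized input and represents each node as a tuple whose first element is the nonterminal; the function names are chosen here for illustration.

```python
def parse_id_list(tokens):
    # <id_list> -> identifier <id_list_tail>
    name = tokens[0]
    tail, rest = parse_id_list_tail(tokens[1:])
    return ("id_list", name, tail), rest

def parse_id_list_tail(tokens):
    if tokens and tokens[0] == ",":
        # <id_list_tail> -> , identifier <id_list_tail>
        tail, rest = parse_id_list_tail(tokens[2:])
        return ("id_list_tail", ",", tokens[1], tail), rest
    if tokens and tokens[0] == ";":
        # <id_list_tail> -> ;
        return ("id_list_tail", ";"), tokens[1:]
    raise SyntaxError("expected ',' or ';'")

def show(tree, depth=0):
    # One line per node, indented by depth; leaves are the terminals.
    lines = ["  " * depth + "<" + tree[0] + ">"]
    for child in tree[1:]:
        if isinstance(child, tuple):
            lines.extend(show(child, depth + 1))
        else:
            lines.append("  " * (depth + 1) + child)
    return lines

tree, rest = parse_id_list(["A", ",", "B", ",", "C", ";"])
print("\n".join(show(tree)))
```

Reading the leaves of the printed tree from top to bottom gives back the input A , B , C ;.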


Abstract syntax tree (AST):


Associative:
An operator is left associative if the operations are performed from left to right in an expression. Similarly, an operator is right associative if the operations are performed from right to left in an expression. For example, addition is left associative, and in the expression 1 + 2 + 3 the numbers 1 and 2 are added first, after which 3 is added. Note that for the addition of numbers the associativity of + does not matter mathematically, as the terms can be reordered in a formal system. However, limited numeric precision in a computer restricts this reordering, and an overflow may occur when the terms are reordered. Also, if the terms are functions with side effects, the result may be different after reordering. Arithmetic operators in a programming language are typically left associative, with the notable exception of exponentiation (^), which is right associative. However, this rule of thumb is not universal.

Associativity can be captured in a grammar. For a left associative binary operator op we have a production of the form

<expr> -> <term> | <expr> op <term>
and for a right associative operator op we have a production of the form
<expr> -> <term> | <term> op <expr>
Note that the production for a left associative operator is left recursive and therefore has to be rewritten for a recursive descent parser:
<expr> -> <term> <more_terms>
<more_terms> -> op <term> <more_terms> | ε
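The rewritten productions are right recursive, so a parser that follows them must carry the left operand along to keep the operator left associative. The sketch below shows one way to do this with a loop, next to a right associative parser for comparison. It assumes a flat token list of the form term op term op ..., and the function names are chosen here for illustration.

```python
def parse_left_assoc(tokens):
    # <expr> -> <term> <more_terms>
    tree = tokens[0]
    i = 1
    # <more_terms> -> op <term> <more_terms> | (empty), written as a loop
    while i < len(tokens):
        op, term = tokens[i], tokens[i + 1]
        tree = (tree, op, term)     # everything parsed so far becomes the left operand
        i += 2
    return tree

def parse_right_assoc(tokens):
    # <expr> -> <term> | <term> op <expr>
    if len(tokens) == 1:
        return tokens[0]
    return (tokens[0], tokens[1], parse_right_assoc(tokens[2:]))

tokens = [1, "-", 2, "-", 3]
print(parse_left_assoc(tokens))     # ((1, '-', 2), '-', 3)
print(parse_right_assoc(tokens))    # (1, '-', (2, '-', 3))
```

For subtraction the two groupings give different results: (1 - 2) - 3 is -4, while 1 - (2 - 3) is 2.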

Precedence:
The precedence of an operator indicates the priority of applying the operator relative to other operators. For example, multiplication has a higher precedence than addition, so a+b*c is evaluated by multiplying b and c first, after which a is added. That is, multiplication groups more tightly compared to addition. The rules of operator precedence vary from one programming language to another.

The relative precedences of operators can be captured in a grammar. A nonterminal is introduced for every group of operators with identical precedence. The nonterminal of the group of operators with the lowest precedence is the nonterminal for the expression as a whole. Productions for (left associative) binary operators from lowest to highest precedence are written in the form

<expr> -> <expr1> | <expr> <lowest_op> <expr1>
<expr1> -> <expr2> | <expr1> <one_but_lowest_op> <expr2>
...
<expr9> -> <term> | <expr9> <highest_op> <term>
<term> -> identifier | number | - <term> | ( <expr> )
where <lowest_op> is a nonterminal denoting all operators with the same lowest precedence, etc.
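The layering of nonterminals can be made concrete with a small evaluator that has one function per precedence level. This sketch assumes just two levels (+ and - below * and /), a pre-tokenized and well-formed input, and the left-recursive productions rewritten as loops; all names are chosen here for illustration.

```python
def evaluate(tokens):
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expr():
        # lowest precedence level: + and -
        nonlocal pos
        value = term()
        while peek() in ("+", "-"):
            op = tokens[pos]
            pos += 1
            rhs = term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term():
        # higher precedence level: * and /
        nonlocal pos
        value = factor()
        while peek() in ("*", "/"):
            op = tokens[pos]
            pos += 1
            rhs = factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor():
        # number | ( <expr> )
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = expr()
            pos += 1        # skip the closing ")"
            return value
        return token

    return expr()

print(evaluate([2, "+", 3, "*", 4]))             # 14
print(evaluate(["(", 2, "+", 3, ")", "*", 4]))   # 20
```

Because expr only ever combines complete terms, the multiplication is grouped first without any explicit precedence table.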


Scanner:
A scanner of a compiler breaks up the character stream of a source program into tokens. The process of scanning comprises the lexical analysis phase of a compiler. The purpose of scanning is to simplify the task of the parser of the compiler. Comments and white space are removed, keywords are recognized and represented as tokens, and identifiers for names of variables and functions are stored in a symbol table and tagged with source file and line numbers.

Example scanner written in Java:

import java.io.*;
public class Scanner
{ public static void main(String argv[]) throws IOException
  { FileInputStream stream = new FileInputStream(argv[0]);
    InputStreamReader reader = new InputStreamReader(stream);
    StreamTokenizer tokens = new StreamTokenizer(reader);
    int next = 0;
    while ((next = tokens.nextToken()) != StreamTokenizer.TT_EOF)
    { switch (next)
      { case StreamTokenizer.TT_WORD:
          System.out.println("WORD:   " + tokens.sval);
          break;
        case StreamTokenizer.TT_NUMBER:
          System.out.println("NUMBER: " + tokens.nval);
          break;
        default:
          switch ((char)next)
          { case '"':
              System.out.println("STRING: " + tokens.sval);
              break;
            case '\'':
              System.out.println("CHAR:   " + tokens.sval);
              break;
            default:
              System.out.println("PUNCT:  " + (char)next);
          }
      }
    }
    stream.close();
  }
}
Save the source with file name "Scanner.java", compile it with "javac Scanner.java", and run it with "java Scanner Scanner.java", which applies the scanner to its own source.


Parser:
A parser of a compiler builds a parse tree representation of a stream of tokens. The grammar of a programming language defines the parse tree structure produced by a parser given a syntactically valid program fragment.


Top-down parser:
Also called a predictive parser. This type of parser builds a parse tree from the root down. An example of a top-down parser is a recursive descent parser.


Bottom-up parser:
This type of parser builds a parse tree from the bottom up, that is, from the leaves towards the root.


Recursive descent parser:
A top-down parser based on recursive functions.

Consider, for example, the following LL(1) grammar:

<expr> -> <term> <term_tail>
<term_tail> -> <add_op> <term> <term_tail> | ε
<term> -> <factor> <factor_tail>
<factor_tail> -> <mult_op> <factor> <factor_tail> | ε
<factor> -> ( <expr> ) | - <factor> | identifier | unsigned_integer
<add_op> -> + | -
<mult_op> -> * | /
For this LL(1) grammar a recursive descent parser in Java is:
import java.io.*;
public class CalcParser
{ private static StreamTokenizer tokens;
  private static int ahead;
  public static void main(String argv[]) throws IOException
  { InputStreamReader reader = new InputStreamReader(System.in);
    tokens = new StreamTokenizer(reader);
    tokens.ordinaryChar('.');
    tokens.ordinaryChar('-');
    tokens.ordinaryChar('/');
    get();
    expr();
    if (ahead == (int)'$')
      System.out.println("Syntax ok");
    else
      System.out.println("Syntax error");
  }
  private static void get() throws IOException
  { ahead = tokens.nextToken();
  }
  private static void expr() throws IOException
  { term();
    term_tail();
  }
  private static void term_tail() throws IOException
  { if (ahead == (int)'+' || ahead == (int)'-')
    { add_op();
      term();
      term_tail();
    }
  }
  private static void term() throws IOException
  { factor();
    factor_tail();
  }
  private static void factor_tail() throws IOException
  { if (ahead == (int)'*' || ahead == (int)'/')
    { mult_op();
      factor();
      factor_tail();
    }
  }
  private static void factor() throws IOException
  { if (ahead == (int)'(')
    { get();
      expr();
      if (ahead == (int)')')
        get();
      else System.out.println("closing ) expected");
    }
    else if (ahead == (int)'-')
    { get();
      factor();
    }
    else if (ahead == tokens.TT_WORD)
      get();
    else if (ahead == tokens.TT_NUMBER)
      get();
    else System.out.println("factor expected");
  }
  private static void add_op() throws IOException
  { if (ahead == (int)'+' || ahead == (int)'-')
      get();
  }
  private static void mult_op() throws IOException
  { if (ahead == (int)'*' || ahead == (int)'/')
      get();
  }
}
This parser does not construct a parse tree; it only verifies whether a string terminated with a $ is an expression.


A recursive descent parser to evaluate simple expressions:
Get Java source

To run the example, save the source with file name "Calc.java", compile it with "javac Calc.java" and run it with "java Calc".


A recursive descent parser to translate simple expressions into Lisp expressions:
Get Java source of the CalcAST class
Get Java source of the AST class

To run the example, save the CalcAST class source with file name "CalcAST.java", save the AST class source with file name "AST.java", compile them with "javac AST.java CalcAST.java" and run the result with "java CalcAST".


Compiler:
A compiler translates source programs into assembly code, machine code, or code for a virtual machine.


Just-in-time compiler:
A translator of intermediate code (e.g. for a virtual machine) into machine code for a particular platform. The translation is done just before the program is executed. Just-in-time compilers are available for many types of machines to translate Java byte code into native machine code.



Virtual machine:


Interpreter:
An interpreter is a virtual machine for a high-level language.


Loader:
Because the memory addressing in older systems was typically flat, a loader was required to place a binary executable program in memory. The loader modified the absolute addresses used for jumps and static data within a program to reflect the change in addressing by the placement of the code at a particular address in memory.


Linker:
A linker merges object code files and static library routines together to produce a binary executable program.


Preprocessor:
A preprocessor applies macro expansion to a source program. In C and C++, for example, #define macros are expanded and header files are included in the source before the first phase of compiler analysis (lexical analysis by the scanner).



Macro:
A definition of a name for a text fragment. Macro expansion is a repeated process that replaces the occurrences of macro names in a text with the textual content of the macro.
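The repeated nature of macro expansion can be sketched in a few lines. This is a simplified illustration, not the C preprocessor's algorithm: macros take no parameters, names are matched as whole identifiers, and the pass limit max_passes is an assumption added here to stop self-referential macros.

```python
import re

NAME = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")

def expand(text, macros, max_passes=10):
    for _ in range(max_passes):
        # Replace every identifier that is a macro name by the macro's text.
        expanded = NAME.sub(lambda m: macros.get(m.group(0), m.group(0)), text)
        if expanded == text:        # nothing left to expand
            return text
        text = expanded
    raise RecursionError("macro expansion did not terminate")

macros = {"AREA": "PI * R * R", "PI": "3.14159"}
print(expand("x = AREA;", macros))   # x = 3.14159 * R * R;
```

The second pass is needed because the text of AREA itself contains the macro name PI.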


Exception:



Function:


Procedure:


Formal parameter:
A formal parameter is a parameter declared in a subroutine definition. It is an identifier referring to the value of an actual parameter when the subroutine is called. For example, in the C program fragment
int main(int argc, char *argv[])
{ ... }
both argc and argv are formal parameters in main's definition.

Also known as dummy arguments in Fortran.


Actual parameter:
A value or reference to an object which is passed to a function or procedure.



Scope:


Static scope:


Dynamic Scope:


Exceptions and exception handling:

Inheritance:


Software development environment:
An integrated software development environment (IDE) offers a source code editor, compiler, linker, and debugger.



Side effect:
A side effect is an intentional modification of the value of a location in memory to affect the global state of the machine. Side effects can change the behavior of a function across multiple function calls. A function that returns a value that depends solely on the values of the parameters passed to it, and that modifies no global state, is called side-effect free.

Example of a function with a side effect:

int sum = 0;
int accumulate(int value)
{ sum += value;
  return sum;
}




Referentially transparent:
A referentially transparent expression is composed of side-effect free functions and operators. No side effects may occur in the evaluation of the expression. As a result, the expression evaluates to a value that depends solely on the values of the variables used in the expression.

Example of a non-referentially transparent expression (where 'accumulate' and 'sum' are defined as in the side effect example) is:

accumulate(2) + sum
This expression uses a function with a side effect. The value of this expression is unspecified in C and C++, because these languages allow different operand evaluation orders, which means that the value of 'sum' may or may not have been updated through the 'accumulate' call when it is read.



Recursion:


Higher-order function:
Functions that take other functions as input parameters or return newly constructed functions.
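Both directions, taking a function as input and returning a newly constructed one, fit in a short sketch. The names compose, twice, and increment are chosen here for illustration; map is Python's built-in higher-order function.

```python
def compose(f, g):
    # Takes two functions and returns a new function x -> f(g(x)).
    def composed(x):
        return f(g(x))
    return composed

def twice(f):
    # Returns a function that applies f two times.
    return compose(f, f)

increment = lambda n: n + 1
add_two = twice(increment)

print(add_two(5))                                # 7
print(list(map(lambda n: n * n, [1, 2, 3])))     # [1, 4, 9]
```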