Copyright R.A. van Engelen, FSU Department of Computer Science, 2000

Blackboard

The classroom blackboard is under construction. Please be patient. Some descriptions are not available yet but will be added later.



Programming languages:
Programming languages are central to computer science. They reflect many aspects of computer science, such as syntax, semantics, theorem proving (for type inference), abstract and virtual machines, data structures, software engineering, and hardware issues. Unlike natural languages, programming languages follow simple syntactic conventions; see also BNF.

Programming languages are imperative or declarative and can be further subdivided into functional, logic, dataflow, procedural, and object-oriented languages.


Imperative programming languages:
Programs written in an imperative programming language describe exactly the computational steps necessary for the computer to obtain a result. In contrast, declarative languages allow the problem to be stated without all of the details of how the result should be obtained. The imperative languages are the procedural and object-oriented languages.


Declarative programming languages:
Programs written in a declarative programming language do not state all the details of how the computer should obtain the result. The declarative languages are the functional, logic, and dataflow languages.


Functional programming languages:
The underlying machinery of functional programming languages is based on Church's lambda calculus. The computational model is based on recursive functions and a program is considered a function mapping inputs to outputs. Through refinement, a program is defined in terms of simpler functions.

Example languages in this category are Lisp, ML, and Haskell.


Dataflow programming languages:


Logic programming languages:
Logic programming languages derive results by logical inference. Prolog, for example, is based on predicate logic. The computational model consists of an inference process on a database to find values that satisfy certain constraints and relationships.


Procedural ("von Neumann") programming languages:
Although object-oriented programming is gaining popularity, the procedural languages are the most familiar and successful languages. The basic mode of operation is the modification of variables, which is sometimes referred to as computing via side effects: procedural languages are based on statements that influence subsequent computation by changing the value of memory. The success of these languages can be attributed mainly to the efficiency of their implementation on current computer architectures, called von Neumann architectures. This common architecture consists of a central processing unit (CPU) and memory, which are connected by a bus.

Example procedural languages are Fortran 77, Basic, Pascal, Ada, and C.


Object-oriented programming languages:


Safe programming languages:
Strong typing is considered an important safety issue of a programming language.


Strong typing:
In a strongly typed programming language typing errors are always detected. The detection can be at compile time or at run time. A strongly typed language is considered safer, because it prevents operations from being applied to the wrong type of object. For example, Ada, Java, and Haskell are strongly typed languages. C and C++ are not: for example, a void pointer can point to an object of any type, which can then be manipulated without a type check. Pascal is "almost" strongly typed. The exception is the use of a variant record (union) without a discriminator. The variant record can hold alternative types of objects during the execution of a program.
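
Run-time detection of a typing error can be sketched as follows. Python is not one of the languages discussed in these notes and is used here only because it is compact; the helper name try_add is hypothetical.

```python
def try_add(a, b):
    """Return a + b, or a description of the typing error that was detected."""
    try:
        return a + b
    except TypeError as err:
        # The mismatch is detected at run time; the bits of one operand
        # are never silently reinterpreted as a value of another type.
        return "type error: " + str(err)

print(try_add(1, 2))     # integer addition succeeds
print(try_add("1", 2))   # adding a string and an integer is rejected
```

The second call is reported as an error instead of producing a meaningless result, which is the safety property described above.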



Relocatable:
When machine code is relocatable in memory it means that the code can be moved from one location to another to make room for new or modified routines in memory. Relative addressing is used in the code, or the absolute addresses in the code are converted before the code is executed.
 
 



Assembler:
Translator of assembly programs with mnemonic instructions to machine code.

Example MIPS assembly program (from textbook page 1):

   addiu    sp,sp,-32
   sw       ra,20(sp)
   jal      getint
   nop
   jal      getint
   sw       v0,28(sp)
   lw       a0,28(sp)
   move     v1,v0
   beq      a0,v0,D
   slt      at,v1,a0
A: beq      at,zero,B
   nop
   b        C
   subu     a0,a0,v1
B: subu     v1,v1,a0
C: bne      a0,v1,A
   slt      at,v1,a0
D: jal      putint
   nop
   lw       ra,20(sp)
   addiu    sp,sp,32
   jr       ra
   move     v0,zero
Example MIPS R4000 machine code of the above assembly program (from textbook page 1):
27bdffd0 afbf0014 0c1002a8 00000000 0c1002a8 afa2001c 8fa4001c
00401825 10820008 0064082a 10200003 00000000 10000002 00832023
00641823 1483fffa 0064082a 0c1002b2 00000000 8fbf0014 27bd0020
03e00008 00001025
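
The translation step can be sketched for two of the instructions in the listing above. This is a minimal illustration, not a real assembler: it assumes the standard MIPS I-type and R-type field layouts, and the names REG, encode_addiu, and encode_jr are hypothetical.

```python
# Register numbers from the MIPS register convention (only those needed here).
REG = {"zero": 0, "v0": 2, "sp": 29, "ra": 31}

def encode_addiu(rt, rs, imm):
    """I-type layout: opcode(6) rs(5) rt(5) immediate(16); addiu is opcode 9."""
    return (9 << 26) | (REG[rs] << 21) | (REG[rt] << 16) | (imm & 0xFFFF)

def encode_jr(rs):
    """R-type layout with opcode 0 and function code 8."""
    return (REG[rs] << 21) | 8

print(format(encode_addiu("sp", "sp", 32), "08x"))  # 27bd0020
print(format(encode_jr("ra"), "08x"))               # 03e00008
```

Both words appear near the end of the machine code listing, matching the instructions addiu sp,sp,32 and jr ra.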

Structured programming:
Considered a revolution in programming in the 70s (much like object-oriented programming in the late 80s and early 90s). A programming technique that emphasizes top-down design, modularization of code (large routines are broken down into smaller, modular routines), structured types (e.g. records, sets, pointers, and multi-dimensional arrays), descriptive variable and constant names, and extensive commenting conventions. The use of the GOTO statement is discouraged to avoid spaghetti code (code that exhibits criss-cross control flow at run time). Certain programming statements are indented in order to make loops and other program logic easier to follow.

Structured languages, such as Pascal and Ada, force the programmer to write a structured program. Unstructured languages such as Fortran 77, Cobol, and Basic, however, rely on the discipline of the programmer to follow structured programming conventions.

Here is an example of a non-structured program in C that counts the number of goto's in a file whose filename is given as an argument to the executable of this program. This program can be used to measure the "spaghettiness" of a C program:

#include <stdio.h>
#include <malloc.h>
main(togo,toog)
int togo;
char *toog[];
{char *ogto,   tgoo[80];FILE  *ogot;  int    oogt=0, ootg,  otog=79,
ottg=1;if (    togo==  ottg)   goto   gogo;  goto    goog;  ggot:
if (   fgets(  tgoo,   otog,   ogot)) goto   gtgo;   goto   gott;
gtot:  exit(); ogtg: ++oogt;   goto   ogoo;  togg:   if (   ootg > 0)
goto   oggt;   goto    ggot;   ogog:  if (  !ogot)   goto   gogo;
goto   ggto;   gtto:   printf( "%d    goto   \'s\n", oogt); goto
gtot;  oggt:   if (   !memcmp( ogto, "goto", 4))     goto   otgg;
goto   gooo;   gogo:   exit(   ottg); tggo:  ootg=   strlen(tgoo);
goto   tgog;   oogo: --ootg;   goto   togg;  gooo: ++ogto;  goto
oogo;  gott:   fclose( ogot);  goto   gtto;  otgg:   ogto=  ogto +3;
goto   ogtg;   tgog:   ootg-=4;goto   togg;  gtgo:   ogto=  tgoo;
goto   tggo;   ogoo:   ootg-=3;goto   gooo;  goog:   ogot=  fopen(
toog[  ottg],  "r");   goto    ogog;  ggto:  ogto=   tgoo;  goto
ggot;}
Fortran and Basic programs developed in the early days of computing were difficult to read and understand, somewhat similar to this example in terms of the choice in variable names (limit 6 characters in Fortran and 2 in Basic) and the frequent use of goto.

Block structured language:
A language that supports the local declaration of variables with a limited scope in a block or compound statement.


Object-oriented programming (OOP):



Ada (Ada 83):
The development of the Ada language, primarily based on the design of Pascal, is the largest language design effort in history. Over 40 organizations outside DoD with over 200 participants collaborated (and competed with different designs) in the final design of Ada. It was originally intended to be the standard language for all software commissioned by the Department of Defense.

Example program in Ada:

with TEXT_IO;
use TEXT_IO;
procedure AVEX is
  package INT_IO is new INTEGER_IO (INTEGER);
  use INT_IO;
  type INT_LIST_TYPE is array (1..99) of INTEGER;
  INT_LIST : INT_LIST_TYPE;
  LIST_LEN, SUM, AVERAGE : INTEGER;
  begin
    SUM := 0;
    -- read the length of the input list
    GET (LIST_LEN);
    if (LIST_LEN > 0) and (LIST_LEN < 100) then
      -- read the input into an array
      for COUNTER in 1 .. LIST_LEN loop
        GET (INT_LIST(COUNTER));
        SUM := SUM + INT_LIST(COUNTER);
      end loop;
      -- compute the average
      AVERAGE := SUM / LIST_LEN;
      -- write the input values > average
      for COUNTER in 1 .. LIST_LEN loop
        if (INT_LIST(COUNTER) > AVERAGE) then
          PUT (INT_LIST(COUNTER));
          NEW_LINE;
        end if;
      end loop;
    else
      PUT_LINE ("Error in input list length");
    end if;
  end AVEX;

Ada 95:


Algol 60:
The original block-structured language. The design of Algol 60 is a landmark of clarity and conciseness and made the first use of Backus-Naur form (BNF) for formally defining the grammar. Most subsequent imperative programming languages are based on Algol 60, and these languages are sometimes referred to as "Algol-like" languages. Strangely, it lacks input/output statements. Algol 60 never gained wide acceptance in the US, partly because of the entrenchment of Fortran and lack of support by IBM. Besides block structure, Algol 60 has recursion and stack-dynamic arrays.

Example Algol 60 program:

comment avex program;
begin
  integer array intlist [1:99];
  integer listlen, counter, sum, average;
  sum := 0;
  comment read the length of the input list;
  readint (listlen);
  if (listlen > 0) ∧ (listlen < 100) then
    begin
      comment read the input into an array;
      for counter := 1 step 1 until listlen do
        begin
          readint (intlist[counter]);
          sum := sum + intlist[counter]
        end;
      comment compute the average;
      average := sum / listlen;
      comment write the input values > average;
      for counter := 1 step 1 until listlen do
        if intlist[counter] > average then
          printint (intlist[counter])
    end
  else
    printstring ("Error in input list length")
end

Algol 68:
Introduced user-defined types in an attempt to design a language that is orthogonal: a few primitive types and structures can be combined to form new types and structures. The language also added new programming constructs to Algol 60 that were already available in other languages. Unfortunately, the Algol 68 documentation was unreadable and Algol 68 never gained widespread acceptance.


APL:


BASIC:
BASIC (Beginner's All-purpose Symbolic Instruction Code) is a simple imperative language that gained popularity because of its ease of use and its interpreted execution, despite the fact that the early versions lacked many language features found in modern languages (e.g. procedures). Many dialects exist. The most widely used version of BASIC today is Microsoft's Visual Basic.

Example QuickBasic program:

REM avex program
  DIM intlist(99)
  sum = 0
REM read the length of the input list
  INPUT listlen
  IF listlen > 0 AND listlen < 100 THEN
REM read the input into an array
    FOR counter = 1 TO listlen
      INPUT intlist(counter)
      sum = sum + intlist(counter)
    NEXT counter
REM compute the average
    average = sum / listlen
REM write the input values > average
    FOR counter = 1 TO listlen
      IF intlist(counter) > average THEN
        PRINT intlist(counter);
      END IF
    NEXT counter
  ELSE
    PRINT "Error in input list length"
  END IF
END

C:
C is one of the most successful imperative languages. It was originally defined as part of the development of the UNIX operating system. It is still considered a systems programming language, for which certain features such as pointers and the absence of dynamic semantic checks (e.g. array bound checking) are very useful to manipulate memory. Two notably different versions of C exist: the original K&R (Kernighan and Ritchie) version and ANSI C.

Example program in C:

#include <stdio.h>

int main(void)
{   int intlist[99], listlen, counter, sum, average;
    sum = 0;
    /* read the length of the list */
    scanf("%d", &listlen);
    if (listlen > 0 && listlen < 100)
    {   /* read the input into an array */
        for (counter = 0; counter < listlen; counter++)
        {   scanf("%d", &intlist[counter]);
            sum += intlist[counter];
        }
        /* compute the average */
        average = sum / listlen;
        /* write the input values > average */
        for (counter = 0; counter < listlen; counter++)
            if (intlist[counter] > average)
                printf("%d\n", intlist[counter]);
    }
    else
        printf("Error in input list length\n");
}

C++:
The most successful of several object-oriented successors of C. It is a large and fairly complex language, in part because it supports both procedural and object-oriented programming.


COBOL:
COBOL (COmmon Business Oriented Language) was for a long time the most widely used programming language in the world. COBOL is intended primarily for business data processing and has elaborate input/output facilities. It is still the most widely used programming language for business applications on mainframes and minis. It was originally developed under the sponsorship of the Department of Defense. The language is very wordy and adopts English names for arithmetic operators. A COBOL program is structured into the following divisions:
      Division name    Contains
      IDENTIFICATION   Program identification.
      ENVIRONMENT      Types of computers used.
      DATA             Buffers, constants, work areas.
      PROCEDURE        The processing (program logic).
Example COBOL program to convert Fahrenheit to Celsius:
IDENTIFICATION DIVISION.
PROGRAM-ID.  EXAMPLE.

ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SOURCE-COMPUTER.   IBM-370.
OBJECT-COMPUTER.   IBM-370.

DATA DIVISION.
WORKING-STORAGE SECTION.
77 FAHR  PICTURE 999.
77 CENT  PICTURE 999.

PROCEDURE DIVISION.
DISPLAY 'Enter Fahrenheit ' UPON CONSOLE.
ACCEPT FAHR FROM CONSOLE.
COMPUTE CENT = (FAHR- 32) * 5 / 9.
DISPLAY 'Celsius is ' CENT UPON CONSOLE.
GOBACK.




Fortran (I, II, IV, 77):
The first widely used high-level programming language was Fortran (I) (FORmula TRANslator), developed in the mid-50s. It was originally designed to express mathematical formulas. Fortran 77 is still widely used for scientific, engineering, and numerical problems, mainly because very good compilers exist. These compilers are very effective in optimizing code, because of their maturity and because of the lack of pointers and recursion in Fortran 77. Fortran 77 also lacks records, unions, dynamic allocation, case-statements, and while-loops. Variable names are upper case and the name length is limited to 6 characters.

Example Fortran 77 program:

      PROGRAM AVEX
      INTEGER INTLST(99)
C     variable names that start with I,J,K,L,N,M are integers
      ISUM = 0
C     read the length of the list
      READ (*, *) LSTLEN
      IF ((LSTLEN .GT. 0) .AND. (LSTLEN .LT. 100)) THEN
C     read the input in an array
      DO 100 ICTR = 1, LSTLEN
      READ (*, *) INTLST(ICTR)
      ISUM = ISUM + INTLST(ICTR)
100   CONTINUE
C     compute the average
      IAVE = ISUM / LSTLEN
C     write the input values > average
      DO 110 ICTR = 1, LSTLEN
      IF (INTLST(ICTR) .GT. IAVE) THEN
      WRITE (*, *) INTLST(ICTR)
      END IF
110   CONTINUE
      ELSE
      WRITE (*, *) 'ERROR IN LIST LENGTH'
      END IF
      END

Fortran (90, 95, HPF):
Fortran 90 is a major revision of the language. Recursion, pointers, records, dynamic allocation, a module facility, and new control flow constructs are added. Also array operations are added that operate on arrays and array slices. Array operations on distributed arrays in HPF can be parallelized.

Example Fortran 90 program:

      PROGRAM AVEX
      INTEGER INT_LIST(1:99)
      INTEGER LIST_LEN, COUNTER, AVERAGE
C     read the length of the list
      READ (*, *) LIST_LEN
      IF ((LIST_LEN > 0) .AND. (LIST_LEN < 100)) THEN
C     read the input in an array
      DO COUNTER = 1, LIST_LEN
      READ (*, *) INT_LIST(COUNTER)
      ENDDO
C     compute the average
      AVERAGE = SUM(INT_LIST(1:LIST_LEN)) / LIST_LEN
C     write the input values > average
      DO COUNTER = 1, LIST_LEN
      IF (INT_LIST(COUNTER) > AVERAGE) THEN
      WRITE (*, *) INT_LIST(COUNTER)
      END IF
      END DO
      ELSE
      WRITE (*, *) 'ERROR IN LIST LENGTH'
      END IF
      END

Haskell:
Haskell is currently the leading purely functional programming language.

Example Haskell program:

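import Prelude hiding (sum)

-- sum is redefined here, so the Prelude version is hidden
sum []    = 0
sum (a:x) = a + sum x
avex []    = []
-- integer division (div), because length returns an Int
avex (a:x) = [n | n <- a:x, n > sum (a:x) `div` length (a:x)]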

Java:
Java is an object-oriented language based largely on C++, developed at Sun Microsystems. The language is intended for the construction of highly portable, machine-independent programs. The language is designed to be translated into intermediate Java byte code that can be transmitted over the Internet as so-called applets. Java byte code is executed by the Java virtual machine (JVM). Java is a safe language.

Example Java program:

import java.io.*;
class Avex
{   public static void main(String args[]) throws IOException
    {   DataInputStream in = new DataInputStream(System.in);
        int listlen, counter, sum = 0, average;
        int [] intlist = new int[100];
        // read the length of the list
        listlen = Integer.parseInt(in.readLine());
        if (listlen > 0 && listlen < 100)
        {   // read the input into an array
            for (counter = 0; counter < listlen; counter++)
            {   intlist[counter] = Integer.valueOf(in.readLine()).intValue();
                sum += intlist[counter];
            }
            // compute the average
            average = sum / listlen;
            // write the input values > average
            for (counter = 0; counter < listlen; counter++)
            {   if (intlist[counter] > average)
                    System.out.println(intlist[counter] + "\n");
            }
        }
        else
          System.out.println("Error in input length\n");
    }
}

Lisp:
Lisp (LISt Processing language) was developed by McCarthy as a realization of Church's lambda calculus. Many dialects exist, among them Common Lisp and Scheme. Lisp is the dominant language used in Artificial Intelligence. The emphasis is on symbolic rather than numeric computation. Lisp processes data in lists. Because Lisp is a functional language, all control is performed by recursion and conditional expressions.
 
 



Pascal:
A high-level programming language designed by Swiss professor Niklaus Wirth (Wirth is pronounced "Virt") in the late 60s and named after the French mathematician Blaise Pascal. It was designed largely in reaction to Algol 68, which was widely perceived as bloated. It is noted for its support of structured programming and was heavily used in the 70s and 80s, particularly for teaching. Pascal has had a strong influence on subsequent high-level languages.

Example Pascal program:

program avex(input, output);
  type
    intlisttype = array [1..99] of integer;
  var
    intlist : intlisttype;
    listlen, counter, sum, average : integer;
begin
  sum := 0;
  (* read the length of the input list *)
  readln(listlen);
  if ((listlen > 0) and (listlen < 100)) then
    begin
      (* read the input into an array *)
      for counter := 1 to listlen do
        begin
          readln(intlist[counter]);
          sum := sum + intlist[counter]
        end;
      (* compute the average *)
      average := sum div listlen;
      (* write the input values > average *)
      for counter := 1 to listlen do
        if (intlist[counter] > average) then
          writeln(intlist[counter])
    end
  else
    writeln('Error in input list length')
end.


PL/I:
Developed by IBM and intended to displace Fortran, COBOL, and Algol. It is a very complicated language, widely regarded as poorly designed, that is kept alive by IBM. It was the first language to adopt exception handling and pointer types.

Example PL/I program:

AVEX: PROCEDURE OPTIONS (MAIN);
  DECLARE INTLIST (1:99) FIXED;
  DECLARE (LISTLEN, COUNTER, SUM, AVERAGE) FIXED;
  SUM = 0;
  /* read the input list length */
  GET LIST (LISTLEN);
  IF (LISTLEN > 0) & (LISTLEN < 100) THEN
    DO;
    /* read the input into an array */
    DO COUNTER = 1 TO LISTLEN;
      GET LIST (INTLIST(COUNTER));
      SUM = SUM + INTLIST(COUNTER);
    END;
    /* compute the average */
    AVERAGE = SUM / LISTLEN;
    /* write the input values > average */
    DO COUNTER = 1 TO LISTLEN;
      IF INTLIST(COUNTER) > AVERAGE THEN
        PUT LIST (INTLIST(COUNTER));
    END;
    END;
  ELSE
    PUT SKIP LIST ('ERROR IN INPUT LIST LENGTH');
END AVEX;

Prolog:
Prolog is the most popular logic programming language. Most Prolog systems conform to the ISO Prolog standard, but deviations make it hard to write Prolog programs that are portable between different Prolog systems. The language is based on formal logic and can be summarized as an intelligent database system that uses an inferencing process to infer the truth of given queries.

Example Prolog program:

avex(IntList, GreaterThanAveList) :-
    sum(IntList, Sum),
    length(IntList, ListLen),
    Average is Sum / ListLen,
    filtergreater(IntList, Average, GreaterThanAveList).
% sum(+IntList, -Sum)
% recursively sums integers of IntList
sum([Int | IntList], Sum) :-
    sum(IntList, ListSum),
    Sum is Int + ListSum.
sum([], 0).
% filtergreater(+IntList, +Int, -GreaterThanIntList)
% recursively remove integers smaller or equal to Int from IntList
filtergreater([AnInt | IntList], Int, [AnInt | GreaterThanIntList]) :-
    AnInt > Int, !,
    filtergreater(IntList, Int, GreaterThanIntList).
filtergreater([AnInt | IntList], Int, GreaterThanIntList) :-
    filtergreater(IntList, Int, GreaterThanIntList).
filtergreater([], Int, []).
The following example illustrates a more "traditional" use of Prolog to infer information:
rainy(rochester).                 % fact: rochester is rainy
rainy(seattle).                   % fact: seattle is rainy
cold(rochester).                  % fact: rochester is cold
snowy(X) :- rainy(X), cold(X).    % rule: X is snowy if X is rainy and cold
With this program loaded, we can query the system interactively:
?- rainy(X).        % user question
X = rochester       % system answer(s)
X = seattle
?- snowy(X).
X = rochester

Scheme:
Scheme is one of the most popular dialects of Lisp.

Example Scheme program:

(DEFINE (avex lis)
  (filtergreater lis (/ (sum lis) (length lis)))
)
(DEFINE (sum lis)
  (COND
    ((NULL? lis) 0)
    (ELSE        (+ (CAR lis) (sum (CDR lis))))
  )
)
(DEFINE (filtergreater lis num)
  (COND
    ((NULL? lis)       '())
    ((> (CAR lis) num) (CONS (CAR lis) (filtergreater (CDR lis) num)))
    (ELSE              (filtergreater (CDR lis) num))
  )
)

Simula 67:


Smalltalk-80:
The first full implementation of an object-oriented language, Smalltalk is still considered the quintessential object-oriented language. It was developed at Xerox PARC, where it pioneered the use of graphical user interfaces.

Example Smalltalk-80 program:

class name                 Avex
superclass                 Object
instance variable names    intlist
"Class methods"
"Create an instance"
  new
    ^ super new
"Instance methods"
"Initialize"
  initialize
    intlist <- Array new: 0
"Add int to list"
  add: n | oldintlist |
    oldintlist <- intlist.
    intlist <- Array new: intlist size + 1.
    intlist replaceFrom: 1 to: oldintlist size with: oldintlist.
    ^ intlist at: intlist size put: n
"Calculate average"
  average | sum |
    sum <- 0.
    1 to: intlist size do:
      [:index | sum <- sum + (intlist at: index)].
    ^ sum // intlist size
"Filter greater than average"
  filtergreater: n | oldintlist i |
    oldintlist <- intlist.
    i <- 1.
    1 to: oldintlist size do:
      [:index | (oldintlist at: index) > n
          ifTrue: [oldintlist at: i put: (oldintlist at: index)]]
    intlist <- Array new: oldintlist size.
    intlist replaceFrom: 1 to: oldintlist size with: oldintlist
Example Smalltalk-80 session:
av <- Avex new
av initialize
av add: 1
1
av add: 2
2
av add: 3
3
av filtergreater: av average
av at: 1
3

Lambda calculus:
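A very simple algebraic model of computation designed by Alonzo Church in the 1930s. In its pure form, everything is a function (even numbers). A lambda expression is recursively defined as a variable, an abstraction (a function of one parameter whose body is a lambda expression), or an application of one lambda expression to another. Only two rewrite rules on lambda expressions are necessary to model universal computation: alpha reduction (renaming of a bound variable) and beta reduction (substitution of an argument for a parameter).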
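
The idea that even numbers are functions can be sketched with Church numerals, written here with Python lambdas. The names zero, succ, add, and to_int are illustrative; to_int exists only to display the result and is not part of the calculus.

```python
# Church numerals: the number n is the function that applies f to x n times.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

def to_int(n):
    """Convert a Church numeral to a Python int by counting applications."""
    return n(lambda k: k + 1)(0)

one = succ(zero)
two = succ(one)
print(to_int(add(two)(one)))  # 3
```

Each call such as succ(zero) corresponds to a beta reduction: the argument is substituted for the parameter of the function.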


Garbage collection:
A routine that searches memory for program segments or data that are no longer active or used in order to reclaim that space. It tries to make as much memory available on the heap as possible. Implicit garbage collection operates in the background of a running program to clean up unused heap space.
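
One common way to find unused data is mark-and-sweep, which is only one of several collection techniques and is not named in these notes. The sketch below models the heap as a hypothetical dictionary from object names to the names they reference; the function name collect is also an assumption.

```python
def collect(heap, roots):
    """Delete every object not reachable from roots; return the reclaimed names."""
    marked = set()
    stack = list(roots)
    while stack:                      # mark phase: follow references from the roots
        name = stack.pop()
        if name not in marked:
            marked.add(name)
            stack.extend(heap[name])
    garbage = sorted(set(heap) - marked)
    for name in garbage:              # sweep phase: reclaim unmarked objects
        del heap[name]
    return garbage

heap = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"]}
print(collect(heap, roots=["a"]))     # ['c', 'd']
```

Objects c and d refer to each other but cannot be reached from the root, so both are reclaimed.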


Heap:


Stack:


Regular expression:
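Regular expressions describe the tokens of a programming language. A regular expression is one of: a character, the empty string, two regular expressions next to each other (concatenation), two regular expressions separated by | (alternation), or a regular expression followed by * (zero or more repetitions). For example, the regular expression describing an identifier in C/C++ is: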
identifier -> letter (letter | digit)*
digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
letter -> a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z
        | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
        | _
For compiler design, tools exist that generate efficient scanners automatically from regular expressions (e.g. flex).
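
The identifier definition above can be tried out with a regular expression library. The sketch below uses Python's standard re module; the names IDENTIFIER and is_identifier are hypothetical.

```python
import re

# letter (letter | digit)*, where letter includes the underscore
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")

def is_identifier(text):
    """True if the whole string matches the identifier expression."""
    return IDENTIFIER.fullmatch(text) is not None

print(is_identifier("list_len"))  # True
print(is_identifier("9lives"))    # False: starts with a digit
```

Note that keywords such as while also match this expression; a scanner typically separates keywords from identifiers in a later step.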

BNF:
Backus-Naur Form (BNF) is a form of context-free grammar frequently used to describe a programming language syntax.

LL grammar:
An LL grammar is a grammar suitable for top-down parsing. If it is not possible to write a recursive descent parser for a grammar, it is not LL(1). An LL(n) grammar is a grammar suitable for top-down parsing using n lookahead tokens.

An LL grammar cannot have left-recursive productions, because a recursive descent parser would recursively call itself forever without consuming any input characters.

The following grammar is not LL(1):

<A> -> <B> <C>
<A> -> a
<B> -> a b
<B> -> b
<C> -> c
It is not LL(1) because the subroutine for nonterminal A cannot decide which production to use when it sees an a on the input:
proc A
  if next_token="a"
    ?? cannot decide whether the first or second production for <A> applies here ??
The grammar is LL(2), because the token after next token can be used to determine which production should be applied:
proc A
  if next_token="a" and token_after_next_token="b"
    B()
    C()
  else if next_token="b"
    B()
    C()
  else
    match("a");
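
The LL(2) pseudocode above can be turned into a small runnable recognizer. This is a sketch under the assumption that the input is already a list of single-character tokens; the names parse_A, peek, and match are hypothetical.

```python
def parse_A(tokens):
    """Recognize a token list derived from <A>; raise SyntaxError otherwise."""
    pos = 0

    def peek(k):
        # Token k positions ahead, or None at the end of the input.
        return tokens[pos + k] if pos + k < len(tokens) else None

    def match(expected):
        nonlocal pos
        if peek(0) != expected:
            raise SyntaxError(f"expected {expected!r}, found {peek(0)!r}")
        pos += 1

    def B():
        if peek(0) == "a":      # <B> -> a b
            match("a")
            match("b")
        else:                   # <B> -> b
            match("b")

    def C():
        match("c")              # <C> -> c

    def A():
        # Two tokens of lookahead separate <A> -> <B> <C> from <A> -> a.
        if (peek(0) == "a" and peek(1) == "b") or peek(0) == "b":
            B()
            C()
        else:
            match("a")

    A()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return True

print(parse_A(["a", "b", "c"]))  # True
print(parse_A(["a"]))            # True
```

The grammar generates exactly three strings (a, abc, and bc); any other input raises SyntaxError.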

LR grammar:
An LR grammar is a grammar suitable for bottom-up parsing. An LR(n) grammar is a grammar suitable for bottom-up parsing using n lookahead tokens. The class of LR grammars includes the class of LL grammars.


Ambiguous grammar:
A grammar is ambiguous if a string exists that has more than one distinct derivation resulting in distinct parse trees. See also ambiguous if-then-else.

The grammar for simple expressions below is ambiguous:

<expression> -> identifier
              | unsigned_integer
              | - <expression>
              | ( <expression> )
              | <expression> <operator> <expression>
<operator> -> + | - | * | /
because we find two distinct (left-most) derivations for the string a-b+1:
<expression>
  => <expression> <operator> <expression>
  => <expression> <operator> <expression> <operator> <expression>
  => identifier <operator> <expression> <operator> <expression>
  => identifier - <expression> <operator> <expression>
  => identifier - identifier <operator> <expression>
  => identifier - identifier + <expression>
  => identifier - identifier + unsigned_integer
        (a)     -    (b)     +       (1)
and
<expression>
  => <expression> <operator> <expression>
  => identifier <operator> <expression>
  => identifier - <expression>
  => identifier - <expression> <operator> <expression>
  => identifier - identifier <operator> <expression>
  => identifier - identifier + <expression>
  => identifier - identifier + unsigned_integer
        (a)     -    (b)     +       (1)
The simple expression grammar below is unambiguous:
<expression> -> <term>
              | <expression> <add_op> <term>
<term> -> <factor>
        | <term> <mult_op> <factor>
<factor> -> identifier | unsigned_integer
          | - <factor> | ( <expression> )
<add_op> -> + | -
<mult_op> -> * | /
We find only one derivation for all strings in the language defined by the grammar. For example, the left-most derivation of a-b+1 is:
<expression>
  => <expression> <add_op> <term>
  => <expression> <add_op> <term> <add_op> <term>
  => <term> <add_op> <term> <add_op> <term>
  => <factor> <add_op> <term> <add_op> <term>
  => identifier <add_op> <term> <add_op> <term>
  => identifier - <term> <add_op> <term>
  => identifier - <factor> <add_op> <term>
  => identifier - identifier <add_op> <term>
  => identifier - identifier + <term>
  => identifier - identifier + <factor>
  => identifier - identifier + unsigned_integer
        (a)     -    (b)     +       (1)
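
Why the ambiguity matters can be sketched by enumerating the parse trees of a-b+1 and evaluating each one. The code below is a simplification: it covers only the productions for operands and binary operators of the ambiguous grammar, evaluates only + and -, and uses hypothetical names (parse_trees, evaluate) and hypothetical values for a and b.

```python
def parse_trees(tokens):
    """All parse trees of operand (op operand)* under E -> operand | E op E."""
    if len(tokens) == 1:
        return [tokens[0]]
    trees = []
    for i in range(1, len(tokens), 2):            # every operator position
        for left in parse_trees(tokens[:i]):
            for right in parse_trees(tokens[i + 1:]):
                trees.append((left, tokens[i], right))
    return trees

def evaluate(tree, env):
    """Evaluate a tree; leaves are identifiers (looked up in env) or integers."""
    if not isinstance(tree, tuple):
        return env.get(tree, tree)
    left, op, right = tree
    l, r = evaluate(left, env), evaluate(right, env)
    return l + r if op == "+" else l - r

trees = parse_trees(["a", "-", "b", "+", 1])
values = [evaluate(t, {"a": 10, "b": 4}) for t in trees]
print(len(trees), sorted(values))  # 2 [5, 7]
```

The two trees correspond to a-(b+1) and (a-b)+1, which yield different values. The unambiguous grammar admits only the second grouping.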

Ambiguous if-then-else:
The if-then-else construct of Pascal and C makes it difficult to formulate unambiguous grammar productions. The grammar below is ambiguous
<stmt> -> if <expr> then <stmt>
        | if <expr> then <stmt> else <stmt>
because we find two distinct derivations of the string
if C1 then if C2 then S1 else S2
(where C1 and C2 are some expressions, S1 and S2 are some statements):
<stmt>
  => if <expr> then <stmt>
  => if <expr> then if <expr> then <stmt> else <stmt>
and another derivation
<stmt>
  => if <expr> then <stmt> else <stmt>
  => if <expr> then if <expr> then <stmt> else <stmt>
An unambiguous grammar for if-then-else is (you don't need to memorize this):
<stmt> -> <balanced_stmt>
        | <unbalanced_stmt>
<balanced_stmt> -> if <expr> then <balanced_stmt> else <balanced_stmt>
                 | <other_stmt>
<unbalanced_stmt> -> if <expr> then <stmt>
                   | if <expr> then <balanced_stmt> else <unbalanced_stmt>
which is an LR grammar, but not an LL grammar, so no pure top-down parser can be used to parse program fragments with the unambiguous if-then-else grammar.


Attribute grammar:


Semantic rule:


Semantic checks:
Static semantic checks are performed by the compiler at compile time. Dynamic semantic checks are performed at run time. A compiler cannot always ensure at compile time that certain constraints on programming constructs are met, for example, that the index value of an array is within bounds. A compiler may generate run time checks in the target code to enforce these constraints at run time.


Static semantic checks:
Static semantic checks are performed by a compiler at compile time to ensure that variables are declared before they are used, that variables are typed correctly in expressions, that labels have targets, etc.


Dynamic semantic checks:
A compiler may generate run time checks in the target code to enforce programming language specific constraints on programming constructs at run time. An interpreter or virtual machine may enforce constraints immediately while executing an instruction. Exceptions are raised when an error is detected.
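
The kind of check that may be generated for an array access can be sketched as follows. The function name checked_load is hypothetical and stands for code a compiler might emit before each indexed load; it is not taken from any particular compiler.

```python
def checked_load(array, index):
    """Return array[index] after an explicit array bound check."""
    if not 0 <= index < len(array):
        # The constraint is violated: raise an exception instead of
        # reading memory outside the array.
        raise IndexError(f"index {index} out of bounds for length {len(array)}")
    return array[index]

print(checked_load([10, 20, 30], 1))  # 20
```

A call such as checked_load([10, 20, 30], 3) raises IndexError at run time, because the index can in general not be verified at compile time.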


Tokens:
Tokens are the indivisible units a scanner of a compiler produces for further analysis by the parser. Example tokens are programming language keywords, operators, identifiers, numbers, and punctuation. Tokens are also called terminals in the context of grammars.
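
A scanner that produces such tokens can be sketched with Python's standard re module. The token categories and the names TOKEN_SPEC, TOKEN_RE, and scan are assumptions made for this illustration; keywords are not distinguished from identifiers here.

```python
import re

TOKEN_SPEC = [
    ("number", r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("operator", r"[-+*/]"),
    ("punctuation", r"[(),;]"),
    ("skip", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def scan(text):
    """Split text into (category, lexeme) pairs, dropping white space."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r}")
        if m.lastgroup != "skip":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(scan("a - b+1"))
```

For the input a - b+1 the scanner yields an identifier, an operator, an identifier, an operator, and a number, which the parser then treats as terminals.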


Terminals:
A terminal of a grammar of a programming language is a token, e.g. a keyword or operator.


Nonterminals:
A nonterminal of a grammar denotes a syntactic category of a language. For example, a programming language statement as a syntactic category can be one of many alternative statements.


Production:
BNF grammar productions are of the form
<nonterminal> -> sequence of (non)terminals
Productions provide descriptions of the syntax for a syntactic category denoted by a nonterminal.

A production is immediately left recursive if it is of the form

<A> -> <A> ...
and a production is immediately right recursive if it is of the form
<A> -> ... <A>
where <A> is some nonterminal.

Productions can be left or right recursive through other productions. For example

<A> -> <B> ...
<B> -> <A> ...



Derivation:


Parse tree:
A parse tree depicts a derivation as a tree: the nodes are the nonterminals, the children of a node are the symbols (terminals and nonterminals) of a right-hand side of a production for the nonterminal at the node, and the leaves are the terminals.

Given the grammar

<id_list> -> identifier <id_list_tail>

<id_list_tail> -> , identifier <id_list_tail>
                | ;
The parse tree of "A,B,C;" is
 
[Figure: parse tree example]
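
The shape of this parse tree can also be sketched in text form. The Java fragment below is a minimal sketch, not part of the original notes: it follows the two productions of the grammar and prints each node as name(children). It assumes that identifiers are single letters, and the class name IdListTree and its methods are hypothetical.

```java
public class IdListTree {
    private final String input;  // e.g. "A,B,C;" (single-letter identifiers assumed)
    private int pos = 0;

    IdListTree(String input) { this.input = input; }

    // <id_list> -> identifier <id_list_tail>
    String idList() {
        String id = identifier();
        return "id_list(" + id + " " + idListTail() + ")";
    }

    // <id_list_tail> -> , identifier <id_list_tail> | ;
    String idListTail() {
        char c = next();
        if (c == ',') {
            String id = identifier();
            return "id_list_tail(, " + id + " " + idListTail() + ")";
        }
        if (c == ';') return "id_list_tail(;)";
        throw new IllegalArgumentException("',' or ';' expected at position " + (pos - 1));
    }

    // Consumes one letter and returns it as the identifier leaf.
    String identifier() {
        char c = next();
        if (!Character.isLetter(c))
            throw new IllegalArgumentException("identifier expected at position " + (pos - 1));
        return String.valueOf(c);
    }

    // Returns the next input character and advances.
    char next() {
        if (pos >= input.length())
            throw new IllegalArgumentException("unexpected end of input");
        return input.charAt(pos++);
    }

    static String parse(String s) { return new IdListTree(s).idList(); }

    public static void main(String[] args) {
        // prints id_list(A id_list_tail(, B id_list_tail(, C id_list_tail(;))))
        System.out.println(parse("A,B,C;"));
    }
}
```

Each nested id_list_tail(...) corresponds to one application of the second production, and the innermost node uses the alternative that ends the list with ";".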


Abstract syntax tree (AST):


Associative:
An operator is left associative if the operations are performed from the left to the right in an expression. Similarly, an operator is right associative if the operations are performed from the right to the left in an expression. For example, addition is left associative and in the expression 1 + 2 + 3 the numbers 1 and 2 are added first, after which 3 is added. Note that for the addition of numbers the associativity of + does not matter as the terms can be reordered in a formal system. However, limited numeric precision in a computer restricts this reordering and an overflow may occur when the terms are reordered. Also, if the terms are functions with side effects the result would be different after reordering. Arithmetic operators in a programming language are typically left associative with the notable exception of exponentiation (^) which is right associative. However, this rule of thumb is not universal.

Associativity can be captured in a grammar. For a left associative binary operator op we have a production of the form

<expr> -> <term> | <expr> op <term>
and for a right associative operator op we have a production of the form
<expr> -> <term> | <term> op <expr>
Note that the production for a left associative operator is left recursive and therefore has to be rewritten for a recursive descent parser:
<expr> -> <term> <more_terms>
<more_terms> -> op <term> <more_terms> | e
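
The rewritten grammar is right recursive, so a recursive descent parser has to take care to keep the left associative meaning. One way to do this, shown below as a minimal sketch, is to pass the value computed so far into the function for <more_terms>. The sketch assumes that op is the subtraction operator, that terms are integers, and that tokens are separated by white space; the class name LeftAssocEval and its methods are hypothetical.

```java
public class LeftAssocEval {
    private final String[] toks;  // white-space separated tokens, e.g. "10", "-", "4"
    private int pos = 0;

    LeftAssocEval(String s) { toks = s.trim().split("\\s+"); }

    // <expr> -> <term> <more_terms>
    int expr() { return moreTerms(term()); }

    // <more_terms> -> op <term> <more_terms> | e
    // 'left' carries the value computed so far, so a - b - c groups as (a - b) - c.
    int moreTerms(int left) {
        if (pos < toks.length && toks[pos].equals("-")) {
            pos++;                          // consume op
            return moreTerms(left - term());
        }
        return left;                        // the empty production e
    }

    // <term> is assumed to be a single integer in this sketch.
    int term() { return Integer.parseInt(toks[pos++]); }

    static int eval(String s) { return new LeftAssocEval(s).expr(); }

    public static void main(String[] args) {
        System.out.println(eval("10 - 4 - 3"));  // prints 3, i.e. (10 - 4) - 3
        System.out.println(10 - (4 - 3));        // prints 9, the right associative grouping
    }
}
```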

Precedence:
The precedence of an operator indicates the priority of applying the operator relative to other operators. For example, multiplication has a higher precedence than addition, so a+b*c is evaluated by multiplying b and c first, after which a is added. That is, multiplication groups more tightly compared to addition. The rules of operator precedence vary from one programming language to another.

The relative precedences between operators can be captured in a grammar. A nonterminal is introduced for every group of operators with identical precedence. The nonterminal of the group of operators with lowest precedence is the nonterminal for the expression as a whole. Productions for (left associative) binary operators with lowest to highest precedences are written in the form

<expr> -> <expr1> | <expr> <lowest_op> <expr1>
<expr1> -> <expr2> | <expr1> <one_but_lowest_op> <expr2>
...
<expr9> -> <term> | <expr9> <highest_op> <term>
<term> -> identifier | number | - <term> | ( <expr> )
where <lowest_op> is a nonterminal denoting all operators with the same lowest precedence, etc.
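
The layering of nonterminals can be sketched as a small evaluator with one function per precedence level. The Java fragment below is an illustration under several assumptions: it uses only two levels (+ and - below * and /), it handles integer numbers but no identifiers, and it replaces the left recursive productions by loops. The class name PrecedenceEval and its methods are hypothetical.

```java
public class PrecedenceEval {
    private final String src;
    private int pos = 0;

    PrecedenceEval(String src) { this.src = src.replace(" ", ""); }

    // <expr> -> <expr1> | <expr> <lowest_op> <expr1>, with <lowest_op> -> + | -
    int expr() {
        int v = expr1();
        while (peek() == '+' || peek() == '-') {
            char op = src.charAt(pos++);
            int r = expr1();
            v = (op == '+') ? v + r : v - r;
        }
        return v;
    }

    // <expr1> -> <term> | <expr1> <highest_op> <term>, with <highest_op> -> * | /
    int expr1() {
        int v = term();
        while (peek() == '*' || peek() == '/') {
            char op = src.charAt(pos++);
            int r = term();
            v = (op == '*') ? v * r : v / r;
        }
        return v;
    }

    // <term> -> number | - <term> | ( <expr> )
    int term() {
        if (peek() == '-') { pos++; return -term(); }
        if (peek() == '(') {
            pos++;
            int v = expr();
            if (peek() != ')') throw new IllegalArgumentException("closing ) expected");
            pos++;
            return v;
        }
        int start = pos;
        while (Character.isDigit(peek())) pos++;
        if (start == pos) throw new IllegalArgumentException("number expected at position " + pos);
        return Integer.parseInt(src.substring(start, pos));
    }

    // Returns the current character, or '\0' at the end of the input.
    char peek() { return pos < src.length() ? src.charAt(pos) : '\0'; }

    static int eval(String s) { return new PrecedenceEval(s).expr(); }

    public static void main(String[] args) {
        System.out.println(eval("2+3*4"));    // prints 14: * groups more tightly than +
        System.out.println(eval("(2+3)*4"));  // prints 20: parentheses override precedence
    }
}
```

Because expr calls expr1 for each of its operands, a multiplication is always completed before the surrounding addition is applied.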


Scanner:
A scanner of a compiler breaks up the character stream of a source program into tokens. The process of scanning constitutes the lexical analysis phase of a compiler. The purpose of scanning is to simplify the task of the parser of the compiler. Comments and white space are removed, keywords are recognized and represented as tokens, and identifiers for names of variables and functions are stored in a symbol table and tagged with source file and line numbers.

Example scanner written in Java:

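import java.io.*;
public class Scanner
{ public static void main(String argv[]) throws IOException
  { // Open the source file named by the first command line argument.
    FileInputStream stream = new FileInputStream(argv[0]);
    InputStreamReader reader = new InputStreamReader(stream);
    StreamTokenizer tokens = new StreamTokenizer(reader);
    int next = 0;
    // Read tokens until the end of the file is reached. The TT_* constants
    // are static fields of StreamTokenizer and are referenced through the
    // class name.
    while ((next = tokens.nextToken()) != StreamTokenizer.TT_EOF)
    { switch (next)
      { case StreamTokenizer.TT_WORD:
          System.out.println("WORD:   " + tokens.sval);
          break;
        case StreamTokenizer.TT_NUMBER:
          System.out.println("NUMBER: " + tokens.nval);
          break;
        default:
          // For quoted text the token type is the quote character itself
          // and the text between the quotes is available in sval.
          switch ((char)next)
          { case '"':
              System.out.println("STRING: " + tokens.sval);
              break;
            case '\'':
              System.out.println("CHAR:   " + tokens.sval);
              break;
            default:
              // Any other single character is reported as punctuation.
              System.out.println("PUNCT:  " + (char)next);
          }
      }
    }
    stream.close();
  }
}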
Save it with file name "Scanner.java", compile it with "javac Scanner.java", and run it with "java Scanner Scanner.java", where the scanner is applied to itself.


Parser:
A parser of a compiler builds a parse tree representation of a stream of tokens. The grammar of a programming language defines the parse tree structure produced by a parser given a syntactically valid program fragment.


Top-down parser:
Also called a predictive parser. This type of parser builds a parse tree from the root down. An example top-down parser is a recursive descent parser.


Bottom-up parser:
This type of parser builds a parse tree from the bottom up, that is, from the leaves towards the root.


Recursive descent parser:
A top-down parser based on recursive functions.

Consider for example, the following LL(1) grammar

<expr> -> <term> <term_tail>
<term_tail> -> <add_op> <term> <term_tail> | e
<term> -> <factor> <factor_tail>
<factor_tail> -> <mult_op> <factor> <factor_tail> | e
<factor> -> ( <expr> ) | - <factor> | identifier | unsigned_integer
<add_op> -> + | -
<mult_op> -> * | /
For this LL(1) grammar a recursive descent parser in Java is:
import java.io.*;
public class CalcParser
{ private static StreamTokenizer tokens;
  private static int ahead;
  public static void main(String argv[]) throws IOException
  { InputStreamReader reader = new InputStreamReader(System.in);
    tokens = new StreamTokenizer(reader);
    tokens.ordinaryChar('.');
    tokens.ordinaryChar('-');
    tokens.ordinaryChar('/');
    get();
    expr();
    if (ahead == (int)'$')
      System.out.println("Syntax ok");
    else
      System.out.println("Syntax error");
  }
  private static void get() throws IOException
  { ahead = tokens.nextToken();
  }
  private static void expr() throws IOException
  { term();
    term_tail();
  }
  private static void term_tail() throws IOException
  { if (ahead == (int)'+' || ahead == (int)'-')
    { add_op();
      term();
      term_tail();
    }
  }
  private static void term() throws IOException
  { factor();
    factor_tail();
  }
  private static void factor_tail() throws IOException
  { if (ahead == (int)'*' || ahead == (int)'/')
    { mult_op();
      factor();
      factor_tail();
    }
  }
  private static void factor() throws IOException
  { if (ahead == (int)'(')
    { get();
      expr();
      if (ahead == (int)')')
        get();
      else System.out.println("closing ) expected");
    }
    else if (ahead == (int)'-')
    { get();
      factor();
    }
    else if (ahead == StreamTokenizer.TT_WORD)
      get();
    else if (ahead == StreamTokenizer.TT_NUMBER)
      get();
    else System.out.println("factor expected");
  }
  private static void add_op() throws IOException
  { if (ahead == (int)'+' || ahead == (int)'-')
      get();
  }
  private static void mult_op() throws IOException
  { if (ahead == (int)'*' || ahead == (int)'/')
      get();
  }
}
This parser does not construct a parse tree but verifies whether a string terminated with a $ is an expression.


A recursive descent parser to evaluate simple expressions:
View Java source

To run the example, save the source with file name "Calc.java", compile it with "javac Calc.java" and run it with "java Calc".


A recursive descent parser to translate simple expressions into Lisp expressions:
View Java source of the CalcAST class
View Java source of the AST class

To run the example, save the CalcAST class source with file name "CalcAST.java", save the AST class source with file name "AST.java", compile it with "javac AST.java CalcAST.java" and run it with "java CalcAST".


Compiler:
A compiler translates source programs into assembly code, machine code, or code for a virtual machine.


Just-in-time compiler:
A translator of intermediate code (e.g. for a virtual machine) into machine code for a particular platform. The translation is done just before the program is executed. Just-in-time compilers are available for many types of machines to translate Java byte code into native machine code.


Interpreter:
An interpreter is a virtual machine for a high-level language.


Linker:
A linker merges object code and library routines into an executable program.


Preprocessor:
A preprocessor applies macro expansion to a source program. In C and C++ for example, #define macros are expanded and header files are included in the source before the first phase of compiler analysis (lexical analysis by the scanner).


Exception:

Function:


Procedure:


Formal parameter:
A formal parameter is a parameter declared with a subroutine definition. It is an identifier referring to the value of an actual parameter when the subroutine is called. For example, in the C program fragment
int main(int argc, char *argv[])
{ ... }
both argc and argv are formal parameters in main's definition.

Also known as dummy arguments in Fortran.


Actual parameter:
A value or reference to an object which is passed to a function or procedure.
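
The distinction between formal and actual parameters can be illustrated with a short Java sketch. The class name ParamDemo and the method area are hypothetical names used only for this example.

```java
public class ParamDemo {
    // 'width' and 'height' are the formal parameters of area.
    static int area(int width, int height) {
        return width * height;
    }

    public static void main(String[] args) {
        int w = 3;
        // The value of w and the value of the expression 2 + 2 are the
        // actual parameters passed to area in this call.
        System.out.println(area(w, 2 + 2));  // prints 12
    }
}
```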