Programming languages are imperative or declarative and can be further subdivided into functional, logic, dataflow, procedural, and object-oriented languages.
Example languages in this category are Lisp, ML, and Haskell.
Example procedural languages are Fortran 77, Basic, Pascal, Ada, and C.
Example MIPS assembly program (from textbook page 1):
   addiu sp,sp,-32
   sw ra,20(sp)
   jal getint
   nop
   jal getint
   sw v0,28(sp)
   lw a0,28(sp)
   move v1,v0
   beq a0,v0,D
   slt at,v1,a0
A: beq at,zero,B
   nop
   b C
   subu a0,a0,v1
B: subu v1,v1,a0
C: bne a0,v1,A
   slt at,v1,a0
D: jal putint
   nop
   lw ra,20(sp)
   addiu sp,sp,32
   jr ra
   move v0,zero
Example MIPS R4000 machine code of the above assembly program (from textbook page 1):
27bdffd0 afbf0014 0c1002a8 00000000 0c1002a8 afa2001c 8fa4001c 00401825 10820008 0064082a 10200003 00000000 10000002 00832023 00641823 1483fffa 0064082a 0c1002b2 00000000 8fbf0014 27bd0020 03e00008 00001025
Structured languages, such as Pascal and Ada, force the programmer to write a structured program. Unstructured languages, such as Fortran 77, Cobol, and Basic, instead rely on the programmer's discipline to keep a program structured.
Here is an example of a non-structured program in C that counts the number of goto's in a file whose filename is given as an argument to the executable of this program. This program can be used to measure the "spaghettiness" of a C program:
#include <stdio.h>
#include <malloc.h>
main(togo,toog)
int togo;
char *toog[];
{char *ogto, tgoo[80];FILE *ogot; int oogt=0, ootg, otog=79,
ottg=1;if ( togo== ottg) goto gogo; goto goog; ggot:
if ( fgets( tgoo, otog, ogot)) goto gtgo; goto gott;
gtot: exit(); ogtg: ++oogt; goto ogoo; togg: if ( ootg > 0)
goto oggt; goto ggot; ogog: if ( !ogot) goto gogo;
goto ggto; gtto: printf( "%d goto \'s\n", oogt); goto
gtot; oggt: if ( !memcmp( ogto, "goto", 4)) goto otgg;
goto gooo; gogo: exit( ottg); tggo: ootg= strlen(tgoo);
goto tgog; oogo: --ootg; goto togg; gooo: ++ogto; goto
oogo; gott: fclose( ogot); goto gtto; otgg: ogto= ogto +3;
goto ogtg; tgog: ootg-=4;goto togg; gtgo: ogto= tgoo;
goto tggo; ogoo: ootg-=3;goto gooo; goog: ogot= fopen(
toog[ ottg], "r"); goto ogog; ggto: ogto= tgoo; goto
ggot;}
Fortran and Basic programs developed in the early days of computing were
difficult to read and understand, somewhat similar to this example in terms
of the choice in variable names (limit 6 characters in Fortran and 2 in
Basic) and the frequent use of goto.
Example program in Ada:
with TEXT_IO;
use TEXT_IO;
procedure AVEX is
package INT_IO is new INTEGER_IO (INTEGER);
use INT_IO;
type INT_LIST_TYPE is array (1..99) of INTEGER;
INT_LIST : INT_LIST_TYPE;
LIST_LEN, SUM, AVERAGE : INTEGER;
begin
SUM := 0;
-- read the length of the input list
GET (LIST_LEN);
if (LIST_LEN > 0) and (LIST_LEN < 100) then
-- read the input into an array
for COUNTER in 1 .. LIST_LEN loop
GET (INT_LIST(COUNTER));
SUM := SUM + INT_LIST(COUNTER);
end loop;
-- compute the average
AVERAGE := SUM / LIST_LEN;
-- write the input values > average
for COUNTER in 1 .. LIST_LEN loop
if (INT_LIST(COUNTER) > AVERAGE) then
PUT (INT_LIST(COUNTER));
NEW_LINE;
end if;
end loop;
else
PUT_LINE ("Error in input list length");
end if;
end AVEX;
Example Algol 60 program:
comment avex program
begin
integer array intlist [1:99];
integer listlen, counter, sum, average;
sum := 0;
comment read the length of the input list
readint (listlen);
if (listlen > 0) ∧ (listlen < 100) then
begin
comment read the input into an array
for counter := 1 step 1 until listlen do
begin
readint (intlist[counter]);
sum := sum + intlist[counter]
end;
comment compute the average
average := sum / listlen;
comment write the input values > average
for counter := 1 step 1 until listlen do
if intlist[counter] > average then
printint (intlist[counter])
end
else
printstring ("Error in input list length")
end
Example QuickBasic program:
REM avex program
DIM intlist(99)
sum = 0
REM read the length of the input list
INPUT listlen
IF listlen > 0 AND listlen < 100 THEN
  REM read the input into an array
  FOR counter = 1 TO listlen
    INPUT intlist(counter)
    sum = sum + intlist(counter)
  NEXT counter
  REM compute the average
  average = sum / listlen
  REM write the input values > average
  FOR counter = 1 TO listlen
    IF intlist(counter) > average THEN PRINT intlist(counter);
  NEXT counter
ELSE
  PRINT "Error in input list length"
END IF
END
Example program in C:
#include <stdio.h>
int main(void)
{ int intlist[99], listlen, counter, sum, average;
sum = 0;
/* read the length of the list */
scanf("%d", &listlen);
if (listlen > 0 && listlen < 100)
{ /* read the input into an array */
for (counter = 0; counter < listlen; counter++)
{ scanf("%d", &intlist[counter]);
sum += intlist[counter];
}
/* compute the average */
average = sum / listlen;
/* write the input values > average */
for (counter = 0; counter < listlen; counter++)
if (intlist[counter] > average)
printf("%d\n", intlist[counter]);
}
else
printf("Error in input list length\n");
}
Division name     Contains
IDENTIFICATION    Program identification.
ENVIRONMENT       Types of computers used.
DATA              Buffers, constants, work areas.
PROCEDURE         The processing (program logic).
Example COBOL program to convert Fahrenheit to Celsius:
       IDENTIFICATION DIVISION.
       PROGRAM-ID. EXAMPLE.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SOURCE-COMPUTER. IBM-370.
       OBJECT-COMPUTER. IBM-370.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       77 FAHR PICTURE 999.
       77 CENT PICTURE 999.
       PROCEDURE DIVISION.
           DISPLAY 'Enter Fahrenheit ' UPON CONSOLE.
           ACCEPT FAHR FROM CONSOLE.
           COMPUTE CENT = (FAHR - 32) * 5 / 9.
           DISPLAY 'Celsius is ' CENT UPON CONSOLE.
           GOBACK.
Example Fortran 77 program:
      PROGRAM AVEX
      INTEGER INTLST(99)
C variable names that start with I,J,K,L,N,M are integers
      ISUM = 0
C read the length of the list
      READ (*, *) LSTLEN
      IF ((LSTLEN .GT. 0) .AND. (LSTLEN .LT. 100)) THEN
C read the input in an array
        DO 100 ICTR = 1, LSTLEN
          READ (*, *) INTLST(ICTR)
          ISUM = ISUM + INTLST(ICTR)
100     CONTINUE
C compute the average
        IAVE = ISUM / LSTLEN
C write the input values > average
        DO 110 ICTR = 1, LSTLEN
          IF (INTLST(ICTR) .GT. IAVE) THEN
            WRITE (*, *) INTLST(ICTR)
          END IF
110     CONTINUE
      ELSE
        WRITE (*, *) 'ERROR IN LIST LENGTH'
      END IF
      END
Example Fortran 90 program:
PROGRAM AVEX
  INTEGER INT_LIST(1:99)
  INTEGER LIST_LEN, COUNTER, AVERAGE
  ! read the length of the list
  READ (*, *) LIST_LEN
  IF ((LIST_LEN > 0) .AND. (LIST_LEN < 100)) THEN
    ! read the input in an array
    DO COUNTER = 1, LIST_LEN
      READ (*, *) INT_LIST(COUNTER)
    END DO
    ! compute the average
    AVERAGE = SUM(INT_LIST(1:LIST_LEN)) / LIST_LEN
    ! write the input values > average
    DO COUNTER = 1, LIST_LEN
      IF (INT_LIST(COUNTER) > AVERAGE) THEN
        WRITE (*, *) INT_LIST(COUNTER)
      END IF
    END DO
  ELSE
    WRITE (*, *) 'ERROR IN LIST LENGTH'
  END IF
END
Example Haskell program:
import Prelude hiding (sum)

sum [] = 0
sum (a:x) = a + sum x

avex [] = []
avex (a:x) = [n | n <- a:x, n > sum (a:x) `div` length (a:x)]
Example Java program:
import java.io.*;
class Avex
{ public static void main(String args[]) throws IOException
{ DataInputStream in = new DataInputStream(System.in);
int listlen, counter, sum = 0, average;
int [] intlist = new int[100];
// read the length of the list
listlen = Integer.parseInt(in.readLine());
if (listlen > 0 && listlen < 100)
{ // read the input into an array
for (counter = 0; counter < listlen; counter++)
{ intlist[counter] = Integer.valueOf(in.readLine()).intValue();
sum += intlist[counter];
}
// compute the average
average = sum / listlen;
// write the input values > average
for (counter = 0; counter < listlen; counter++)
{ if (intlist[counter] > average)
System.out.println(intlist[counter] + "\n");
}
}
else
System.out.println("Error in input length\n");
}
}
Example Pascal program:
program avex(input, output);
type
intlisttype = array [1..99] of integer;
var
intlist : intlisttype;
listlen, counter, sum, average : integer;
begin
sum := 0;
(* read the length of the input list *)
readln(listlen);
if ((listlen > 0) and (listlen < 100)) then
begin
(* read the input into an array *)
for counter := 1 to listlen do
begin
readln(intlist[counter]);
sum := sum + intlist[counter]
end;
(* compute the average *)
average := sum div listlen;
(* write the input values > average *)
for counter := 1 to listlen do
if (intlist[counter] > average) then
writeln(intlist[counter])
end
else
writeln('Error in input list length')
end.
Example PL/I program:
AVEX: PROCEDURE OPTIONS (MAIN);
DECLARE INTLIST (1:99) FIXED;
DECLARE (LISTLEN, COUNTER, SUM, AVERAGE) FIXED;
SUM = 0;
/* read the input list length */
GET LIST (LISTLEN);
IF (LISTLEN > 0) & (LISTLEN < 100) THEN
DO;
/* read the input into an array */
DO COUNTER = 1 TO LISTLEN;
GET LIST (INTLIST(COUNTER));
SUM = SUM + INTLIST(COUNTER);
END;
/* compute the average */
AVERAGE = SUM / LISTLEN;
/* write the input values > average */
DO COUNTER = 1 TO LISTLEN;
IF INTLIST(COUNTER) > AVERAGE THEN
PUT LIST (INTLIST(COUNTER));
END;
END;
ELSE
PUT SKIP LIST ('ERROR IN INPUT LIST LENGTH');
END AVEX;
Example Prolog program:
avex(IntList, GreaterThanAveList) :-
    sum(IntList, Sum),
    length(IntList, ListLen),
    Average is Sum / ListLen,
    filtergreater(IntList, Average, GreaterThanAveList).

% sum(+IntList, -Sum)
% recursively sums integers of IntList
sum([Int | IntList], Sum) :-
    sum(IntList, ListSum),
    Sum is Int + ListSum.
sum([], 0).

% filtergreater(+IntList, +Int, -GreaterThanIntList)
% recursively remove integers smaller or equal to Int from IntList
filtergreater([AnInt | IntList], Int, [AnInt | GreaterThanIntList]) :-
    AnInt > Int, !,
    filtergreater(IntList, Int, GreaterThanIntList).
filtergreater([AnInt | IntList], Int, GreaterThanIntList) :-
    filtergreater(IntList, Int, GreaterThanIntList).
filtergreater([], Int, []).
The following example illustrates a more "traditional" use of Prolog to infer information:
rainy(rochester).               % fact: rochester is rainy
rainy(seattle).                 % fact: seattle is rainy
cold(rochester).                % fact: rochester is cold
snowy(X) :- rainy(X), cold(X).  % rule: X is snowy if X is rainy and cold
With this program loaded, we can query the system interactively:
?- rainy(X).     % user question
X = rochester    % system answer(s)
X = seattle
?- snowy(X).
X = rochester
Example Scheme program:
(DEFINE (avex lis)
  (filtergreater lis (/ (sum lis) (length lis))))

(DEFINE (sum lis)
  (COND
    ((NULL? lis) 0)
    (ELSE (+ (CAR lis) (sum (CDR lis))))))

(DEFINE (filtergreater lis num)
  (COND
    ((NULL? lis) '())
    ((> (CAR lis) num) (CONS (CAR lis) (filtergreater (CDR lis) num)))
    (ELSE (filtergreater (CDR lis) num))))
Example Smalltalk-80 program:
class name                 Avex
superclass                 Object
instance variable names    intlist

"Class methods"
"Create an instance"
new
    ^ super new

"Instance methods"
"Initialize"
initialize
    intlist <- Array new: 0

"Add int to list"
add: n
    | oldintlist |
    oldintlist <- intlist.
    intlist <- Array new: intlist size + 1.
    intlist replaceFrom: 1 to: oldintlist size with: oldintlist.
    ^ intlist at: intlist size put: n

"Calculate average"
average
    | sum |
    sum <- 0.
    1 to: intlist size do:
        [:index | sum <- sum + (intlist at: index)].
    ^ sum // intlist size

"Filter greater than average"
filtergreater: n
    | oldintlist i |
    oldintlist <- intlist.
    i <- 1.
    1 to: oldintlist size do:
        [:index | (oldintlist at: index) > n
            ifTrue: [oldintlist at: i put: (oldintlist at: index).
                     i <- i + 1]].
    intlist <- Array new: i - 1.
    intlist replaceFrom: 1 to: i - 1 with: oldintlist
Example Smalltalk-80 session:
av <- Avex new
av initialize
av add: 1
1
av add: 2
2
av add: 3
3
av filtergreater: av average
av at: 1
3
identifier -> letter (letter | digit)*
digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
letter -> a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | _
For compiler design, tools exist that generate efficient scanners.
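The identifier definition above translates directly into a regular expression. The Python sketch below is only an illustration: the names IDENTIFIER and is_identifier are hypothetical, and Python's re module stands in for a generated scanner.

```python
import re

# identifier -> letter (letter | digit)*
# letter     -> a..z | A..Z | _
# digit      -> 0..9
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")


def is_identifier(text):
    """Return True if the whole string is derived from 'identifier'."""
    return IDENTIFIER.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["list_len", "_x1", "9lives", ""]:
        print(repr(candidate), is_identifier(candidate))
```

The first character class corresponds to letter and the starred class to (letter | digit)*, so a string that starts with a digit is rejected.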
An LL grammar cannot have left-recursive productions, because a recursive descent parser would recursively call itself forever without consuming any input characters.
The following grammar is not LL(1):
<A> -> <B> <C>
<A> -> a
<B> -> a b
<B> -> b
<C> -> c
It is not LL(1) because the subroutine for nonterminal A cannot decide which production to use when it sees an a on the input:
proc A
if next_token="a"
?? cannot decide whether the first or second production for <A> applies here ??
The grammar is LL(2), because the token after next token can be used to determine which production should be applied:
proc A
if next_token="a" and token_after_next_token="b"
B()
C()
else if next_token="b"
B()
C()
else
match("a");
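The pseudocode for proc A can be turned into a small runnable recognizer. The following Python sketch is one possible rendering, not the notes' own implementation: the class name Parser, the helper accepts, and the use of single characters as tokens are assumptions made for illustration.

```python
class ParseError(Exception):
    """Raised when the input does not match the grammar."""


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self, k=0):
        """Return the token k positions ahead, or '$' past the end."""
        i = self.pos + k
        return self.tokens[i] if i < len(self.tokens) else "$"

    def match(self, expected):
        if self.peek() != expected:
            raise ParseError(f"expected {expected!r}, found {self.peek()!r}")
        self.pos += 1

    def parse_A(self):
        # <A> -> <B> <C> | a ; two tokens of lookahead resolve the choice
        if self.peek() == "a" and self.peek(1) == "b":
            self.parse_B()
            self.parse_C()
        elif self.peek() == "b":
            self.parse_B()
            self.parse_C()
        else:
            self.match("a")

    def parse_B(self):
        # <B> -> a b | b
        if self.peek() == "a":
            self.match("a")
            self.match("b")
        else:
            self.match("b")

    def parse_C(self):
        # <C> -> c
        self.match("c")


def accepts(text):
    """Return True if text (one character per token) is derived from <A>."""
    parser = Parser(text)
    try:
        parser.parse_A()
    except ParseError:
        return False
    return parser.pos == len(parser.tokens)
```

The grammar generates only the strings abc, bc, and a, so accepts returns True for exactly those three inputs.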
The grammar for simple expressions below is ambiguous:
<expression> -> identifier | unsigned_integer | - <expression> | ( <expression> ) | <expression> <operator> <expression>
<operator> -> + | - | * | /
because we find two distinct (left-most) derivations for the string a-b+1:
<expression>
=> <expression> <operator> <expression>
=> <expression> <operator> <expression> <operator> <expression>
=> identifier <operator> <expression> <operator> <expression>
=> identifier - <expression> <operator> <expression>
=> identifier - identifier <operator> <expression>
=> identifier - identifier + <expression>
=> identifier - identifier + unsigned_integer
   (a) - (b) + (1)
and
<expression>
=> <expression> <operator> <expression>
=> identifier <operator> <expression>
=> identifier - <expression>
=> identifier - <expression> <operator> <expression>
=> identifier - identifier <operator> <expression>
=> identifier - identifier + <expression>
=> identifier - identifier + unsigned_integer
   (a) - (b) + (1)
The simple expression grammar below is unambiguous:
<expression> -> <term> | <expression> <add_op> <term>
<term> -> <factor> | <term> <mult_op> <factor>
<factor> -> identifier | unsigned_integer | - <factor> | ( <expression> )
<add_op> -> + | -
<mult_op> -> * | /
We find only one left-most derivation for each string in the language defined by the grammar. For example, the left-most derivation of a-b+1 is:
<expression>
=> <expression> <add_op> <term>
=> <expression> <add_op> <term> <add_op> <term>
=> <term> <add_op> <term> <add_op> <term>
=> <factor> <add_op> <term> <add_op> <term>
=> identifier <add_op> <term> <add_op> <term>
=> identifier - <term> <add_op> <term>
=> identifier - <factor> <add_op> <term>
=> identifier - identifier <add_op> <term>
=> identifier - identifier + <term>
=> identifier - identifier + <factor>
=> identifier - identifier + unsigned_integer
   (a) - (b) + (1)
<stmt> -> if <expr> then <stmt> | if <expr> then <stmt> else <stmt>
This grammar is ambiguous, because we find two distinct derivations of the string
if C1 then if C2 then S1 else S2
(where C1 and C2 are some expressions, S1 and S2 are some statements):
<stmt> => if <expr> then <stmt> => if <expr> then if <expr> then <stmt> else <stmt>
and another derivation
<stmt> => if <expr> then <stmt> else <stmt> => if <expr> then if <expr> then <stmt> else <stmt>
An unambiguous grammar for if-then-else is (you don't need to memorize this):
<stmt> -> <balanced_stmt> | <unbalanced_stmt>
<balanced_stmt> -> if <expr> then <balanced_stmt> else <balanced_stmt> | <other_stmt>
<unbalanced_stmt> -> if <expr> then <stmt> | if <expr> then <balanced_stmt> else <unbalanced_stmt>
which is an LR grammar.
<nonterminal> -> sequence of (non)terminals
Productions provide descriptions of the syntax for a syntactic category denoted by a nonterminal.
A production is immediately left recursive if it is of the form
<A> -> <A> ...
and a production is immediately right recursive if it is of the form
<A> -> ... <A>
where <A> is some nonterminal.
Productions can be left or right recursive through other productions. For example
<A> -> <B> ...
<B> -> <A> ...
Given the grammar
<id_list> -> identifier <id_list_tail>
<id_list_tail> -> , identifier <id_list_tail> | ;
The parse tree of "A,B,C;" is
[figure: parse tree of "A,B,C;"]
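The parse tree for "A,B,C;" can also be built mechanically from the two productions. The Python sketch below is a hedged illustration: the function names, the nested-tuple tree representation, and the one-character tokens are all assumptions, not part of the original notes.

```python
def expect_identifier(tokens, pos):
    """Return the identifier at tokens[pos] or raise SyntaxError."""
    if pos < len(tokens) and tokens[pos].isidentifier():
        return tokens[pos]
    raise SyntaxError("identifier expected")


def parse_id_list(tokens, pos=0):
    """<id_list> -> identifier <id_list_tail>"""
    name = expect_identifier(tokens, pos)
    tail, end = parse_id_list_tail(tokens, pos + 1)
    return ("id_list", name, tail), end


def parse_id_list_tail(tokens, pos):
    """<id_list_tail> -> , identifier <id_list_tail> | ;"""
    if pos < len(tokens) and tokens[pos] == ";":
        return ("id_list_tail", ";"), pos + 1
    if pos < len(tokens) and tokens[pos] == ",":
        name = expect_identifier(tokens, pos + 1)
        tail, end = parse_id_list_tail(tokens, pos + 2)
        return ("id_list_tail", ",", name, tail), end
    raise SyntaxError("',' or ';' expected")


if __name__ == "__main__":
    # Each character of the input is treated as one token.
    tree, end = parse_id_list(list("A,B,C;"))
    print(tree)
```

Each tuple is one interior node of the parse tree: its first element names the nonterminal and the remaining elements are its children in left-to-right order, so the tree grows down its right edge as the tail production is applied repeatedly.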
Associativity can be captured in a grammar. For a left associative binary operator op we have a production of the form
<expr> -> <term> | <expr> op <term>
and for a right associative operator op we have a production of the form
<expr> -> <term> | <term> op <expr>
Note that the production for a left associative operator is left recursive, so it cannot be used as is in an LL grammar. An equivalent form without left recursion is
<expr> -> <term> <more_terms>
<more_terms> -> op <term> <more_terms> | e
where e denotes the empty string.
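The effect of associativity is easiest to see by evaluating an expression. The Python sketch below is an illustration under stated assumptions: terms are restricted to integers, "-" plays the role of a left associative op and "^" the role of a right associative op, and all function names are hypothetical. The left associative case uses the <term> <more_terms> form and passes the value computed so far into more_terms, which keeps the grouping to the left even though the production itself is not left recursive.

```python
def parse_term(tokens, pos):
    """<term>: a single integer token in this sketch."""
    return int(tokens[pos]), pos + 1


def parse_more_terms(tokens, pos, left):
    """<more_terms> -> op <term> <more_terms> | e, with op fixed to '-'."""
    if pos < len(tokens) and tokens[pos] == "-":
        right, pos = parse_term(tokens, pos + 1)
        # Combine with the value so far before continuing: left associative.
        return parse_more_terms(tokens, pos, left - right)
    return left, pos


def eval_left(tokens):
    """<expr> -> <term> <more_terms>"""
    value, pos = parse_term(tokens, 0)
    value, _ = parse_more_terms(tokens, pos, value)
    return value


def parse_right(tokens, pos=0):
    """<expr> -> <term> | <term> op <expr>, with op fixed to '^'."""
    left, pos = parse_term(tokens, pos)
    if pos < len(tokens) and tokens[pos] == "^":
        right, pos = parse_right(tokens, pos + 1)
        # The whole rest of the expression is the right operand.
        return left ** right, pos
    return left, pos


def eval_right(tokens):
    value, _ = parse_right(tokens)
    return value
```

With these definitions, eval_left("10 - 4 - 3".split()) groups as (10 - 4) - 3, while eval_right("2 ^ 3 ^ 2".split()) groups as 2 ^ (3 ^ 2).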
The relative precedences between operators can be captured in a grammar. A nonterminal is introduced for every group of operators with identical precedence. The nonterminal of the group of operators with lowest precedence is the nonterminal for the expression as a whole. Productions for (left associative) binary operators with lowest to highest precedences are written of the form
<expr> -> <expr1> | <expr> <lowest_op> <expr1>
<expr1> -> <expr2> | <expr1> <one_but_lowest_op> <expr2>
...
<expr9> -> <term> | <expr9> <highest_op> <term>
<term> -> identifier | number | - <term> | ( <expr> )
where <lowest_op> is a nonterminal denoting all operators with the same lowest precedence, etc.
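The layered productions above map onto one function per precedence level. The Python sketch below assumes only two levels (+ and - lowest, * and / highest), integer operands, and integer division; the class name Evaluator and the helpers tokenize and evaluate are hypothetical names chosen for illustration. Each left-recursive production is written as a loop, which gives the same left associative grouping.

```python
import re


def tokenize(text):
    """Split text into integers, operators and parentheses.
    Characters outside this set are silently skipped in this sketch."""
    return re.findall(r"\d+|[-+*/()]", text)


class Evaluator:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # <expr> -> <expr1> | <expr> <lowest_op> <expr1>
        value = self.expr1()
        while self.peek() in ("+", "-"):
            op = self.advance()
            rhs = self.expr1()
            value = value + rhs if op == "+" else value - rhs
        return value

    def expr1(self):
        # <expr1> -> <term> | <expr1> <highest_op> <term>
        value = self.term()
        while self.peek() in ("*", "/"):
            op = self.advance()
            rhs = self.term()
            value = value * rhs if op == "*" else value // rhs
        return value

    def term(self):
        # <term> -> number | - <term> | ( <expr> )
        tok = self.advance()
        if tok == "-":
            return -self.term()
        if tok == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("closing ) expected")
            return value
        if tok is not None and tok.isdigit():
            return int(tok)
        raise SyntaxError("term expected")


def evaluate(text):
    evaluator = Evaluator(text)
    value = evaluator.expr()
    if evaluator.peek() is not None:
        raise SyntaxError("unexpected token " + evaluator.peek())
    return value
```

Because expr calls expr1 for its operands, a multiplication is always fully evaluated before the surrounding addition sees it, which is exactly what placing * and / at a deeper level of the grammar expresses.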
Example scanner written in Java:
import java.io.*;
public class Scanner
{ public static void main(String argv[]) throws IOException
{ FileInputStream stream = new FileInputStream(argv[0]);
InputStreamReader reader = new InputStreamReader(stream);
StreamTokenizer tokens = new StreamTokenizer(reader);
int next = 0;
while ((next = tokens.nextToken()) != tokens.TT_EOF)
{ switch (next)
{ case StreamTokenizer.TT_WORD:
System.out.println("WORD: " + tokens.sval);
break;
case StreamTokenizer.TT_NUMBER:
System.out.println("NUMBER: " + tokens.nval);
break;
default:
switch ((char)next)
{ case '"':
System.out.println("STRING: " + tokens.sval);
break;
case '\'':
System.out.println("CHAR: " + tokens.sval);
break;
default:
System.out.println("PUNCT: " + (char)next);
}
}
}
stream.close();
}
}
Save it with file name "Scanner.java", compile it with "javac Scanner.java",
and run it with "java Scanner Scanner.java", where the scanner is applied to
itself.
Consider for example, the following LL(1) grammar
<expr> -> <term> <term_tail>
<term_tail> -> <add_op> <term> <term_tail> | e
<term> -> <factor> <factor_tail>
<factor_tail> -> <mult_op> <factor> <factor_tail> | e
<factor> -> ( <expr> ) | - <factor> | identifier | unsigned_integer
<add_op> -> + | -
<mult_op> -> * | /
For this LL(1) grammar a recursive descent parser in Java is:
import java.io.*;
public class CalcParser
{ private static StreamTokenizer tokens;
private static int ahead;
public static void main(String argv[]) throws IOException
{ InputStreamReader reader = new InputStreamReader(System.in);
tokens = new StreamTokenizer(reader);
tokens.ordinaryChar('.');
tokens.ordinaryChar('-');
tokens.ordinaryChar('/');
get();
expr();
if (ahead == (int)'$')
System.out.println("Syntax ok");
else
System.out.println("Syntax error");
}
private static void get() throws IOException
{ ahead = tokens.nextToken();
}
private static void expr() throws IOException
{ term();
term_tail();
}
private static void term_tail() throws IOException
{ if (ahead == (int)'+' || ahead == (int)'-')
{ add_op();
term();
term_tail();
}
}
private static void term() throws IOException
{ factor();
factor_tail();
}
private static void factor_tail() throws IOException
{ if (ahead == (int)'*' || ahead == (int)'/')
{ mult_op();
factor();
factor_tail();
}
}
private static void factor() throws IOException
{ if (ahead == (int)'(')
{ get();
expr();
if (ahead == (int)')')
get();
else System.out.println("closing ) expected");
}
else if (ahead == (int)'-')
{ get();
factor();
}
else if (ahead == tokens.TT_WORD)
get();
else if (ahead == tokens.TT_NUMBER)
get();
else System.out.println("factor expected");
}
private static void add_op() throws IOException
{ if (ahead == (int)'+' || ahead == (int)'-')
get();
}
private static void mult_op() throws IOException
{ if (ahead == (int)'*' || ahead == (int)'/')
get();
}
}
This parser does not construct a parse tree but verifies if a string terminated
with a $ is an expression.
To run the example, save the source with file name "CalcParser.java", compile
it with "javac CalcParser.java" and run it with "java CalcParser".
To run the example, save the CalcAST class source with file name "CalcAST.java",
save the AST class source with file name "AST.java", compile it with "javac
AST.java CalcAST.java" and run it with "java CalcAST".
main(int argc, char *argv[])
{ ... }
both argc and argv are formal parameters in main's
definition.
Also known as dummy arguments in Fortran.
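The same distinction can be shown in a few lines of Python. This is only an illustrative sketch: the function average and its parameter names are hypothetical, and the term "actual parameters" for the values supplied at the call site is the usual counterpart of formal parameters rather than something defined in these notes.

```python
import inspect


def average(total, count):
    """total and count are the formal parameters of average."""
    return total / count


values = [2, 4, 9]
# sum(values) and len(values) are the actual parameters of this call.
result = average(sum(values), len(values))

# The formal parameter names belong to the definition, not to any call.
formal_names = list(inspect.signature(average).parameters)
```

The names in formal_names are fixed by the definition, while the values bound to them change with every call.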