Perl History

"PERL" stands for "Practical Extraction and Report Language"

Alternatively, there is also "Pathologically Eclectic Rubbish Lister"

It was created by Larry Wall and became widely known in the 1990s

It was available both from ucbvax and via Usenet

Perl is released under both the Artistic License and the GNU GPL.

Advantages of Perl

  • Perl 5 is a pleasant language to program in
  • It fills a niche between shell scripts and conventional languages
  • It is very efficient for system administration scripts
  • It is very useful for text processing
  • It is a high-level language with nice support for objects. A Perl program will often take far less space than an equivalent C/C++ program.

Perl is Interpreted

  • Perl is first "compiled" into bytecodes; those bytecodes are then interpreted. Ruby, Python, and Java all have modes along these lines, although of course there are other options that make other tradeoffs.
  • This is much faster than shell interpretation, particularly inside loops. However, interpretation is still slower than standard compilation.
  • On machines that I have tested over the years, example times include: 1 million iterations of an empty loop in bash took 34 seconds; in Perl, around 0.5 seconds; in C, roughly 0.001 seconds.
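
Timings like these depend heavily on the machine and the Perl version. A minimal sketch for repeating the Perl measurement, assuming the core Time::HiRes module is available (the variable names are illustrative, not from the original notes):

```perl
use strict;
use warnings;
use Time::HiRes qw(time);   # core module: wall-clock time with sub-second resolution

my $iterations = 1_000_000;

my $start = time;
for (my $i = 0; $i < $iterations; $i++) {
    # empty loop body: only the loop overhead is measured
}
my $elapsed = time - $start;

printf "%d iterations took %.3f seconds\n", $iterations, $elapsed;
```

The number printed will vary from run to run; only the order of magnitude is meaningful.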

A simple Perl program

#!/usr/bin/perl -w
# this is a comment
use strict;
print "Hello World!\n";
exit 0;      
  • The first line indicates that we are to actually execute the program /usr/bin/perl. (The "-w" indicates that the interpreter should warn about any questionable constructs in the program.)
  • The second line is a comment.
  • The third line makes it mandatory to declare variables. (Notice that statements are terminated with a semicolon).
  • The fourth line prints "Hello World", and the fifth line terminates the program with a "normal" exit code of 0.

Basic Perl Concepts

  • There is no explicit main() function, but you can create functions (often called "subroutines" in the Perl world).
  • Features are taken from a large variety of languages, but especially shells and C.
  • It is very easy to write short programs that pack a lot of punch.

Similarities to C

  • Many operators are identical
  • Many control structures are quite close in syntax
  • Supports formatted i/o very similar to C's stdio library syntax
  • Can access command line arguments easily
  • Supports access to i/o streams including stdin, stdout, and stderr

Similarities to shell programming

  • Comment syntax of an initial #
  • $variables
  • Interpolation of variables inside of quoting
  • Support command line arguments
  • Support for regular expressions
  • Some control structures
  • Many operators similar to shell commands and Unix command syntax

Scalars

Scalars represent a single value:

my $var1 = "some string";
my $var2 = 23;      

Scalars are strings, integers, or floating point numbers.

There are also "magic" scalars. The most common one is $_, the "default" variable, which is used when you call print with no argument or loop over the contents of a list without naming a loop variable. In such a loop, the "current" item is referred to by $_.
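
As a small sketch of the default variable (the variable $shouted is made up for illustration), the loop below never names $_, yet both print and uc operate on it:

```perl
use strict;
use warnings;

my $shouted = '';
for ('hello', 'world') {   # no loop variable, so each item lands in $_
    print;                 # print with no argument prints $_
    print "\n";
    $shouted .= uc;        # uc with no argument upper-cases $_
}
print "$shouted\n";
```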

Numbers

Both integers and floating point numbers are actually stored as double-precision values, unless you invoke the "use integer" pragma:

#!/usr/bin/perl -w
use strict;
use integer;
my $w = 100;
my $x = 3;
print "w / x = " . $w/$x . "\n";
[langley@sophie]$ ./prog
w / x = 33      
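
For comparison, here is a sketch that computes the same quotient with and without the pragma. It relies on "use integer" being lexically scoped, so it can be confined to one block; the variable names are illustrative:

```perl
use strict;
use warnings;

my ($w, $x) = (100, 3);

my $float_quotient = $w / $x;     # default: floating point division

my $int_quotient;
{
    use integer;                  # in effect only inside this block
    $int_quotient = $w / $x;      # integer division
}

printf "without use integer: %.4f\n", $float_quotient;
print  "with use integer:    $int_quotient\n";
```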

Floating point literals

  • Floating point literals are similar to those of C.
  • All three of these literals represent the same value:
    12345.6789
    123456789e-4
    123.456789E2      

Integer decimal literals

  • Similar to C:
    0    -99   1001          
  • Can use underscore as a visual separator:
    2_333_444_555_666          

Other integral literals

  • Hexadecimal:
    0xff12     0x991b
  • Octal:
    01245     07611
  • Binary:
    0b1010111
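
A quick way to check what a literal means is to print it, since Perl prints numbers in decimal. The sketch below does that for some of the literals above; the variable names are illustrative:

```perl
use strict;
use warnings;

my $hex    = 0xff12;
my $octal  = 01245;               # a leading 0 means octal, not decimal
my $binary = 0b1010111;
my $big    = 2_333_444_555_666;   # underscores are ignored

print "0xff12            = $hex\n";
print "01245             = $octal\n";
print "0b1010111         = $binary\n";
print "2_333_444_555_666 = $big\n";

# Three spellings of the same floating point value
my @floats = (12345.6789, 123456789e-4, 123.456789E2);
print "$_\n" for @floats;
```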

C-like operators

Operator      Meaning
=             Assignment
+ - * / %     Arithmetic
& | << >>     Bitwise operators
> < >= <=     Relationals returning "boolean" values
&& || !       Logicals returning "boolean" values
+= -= *=      Binary assignment
++ --         Increment/Decrement
? :           Ternary
,             Scalar binary operator that takes on the rhs value

 

Also, see man perlop

Operators not similar to C operators

Operator              Meaning
**                    Exponentiation
<=>                   Numeric comparison (returns -1, 0, or 1)
x                     String repetition
.                     String concatenation
eq ne lt gt ge le     String relations
cmp                   String comparison (returns -1, 0, or 1)
=>                    Like the comma operator, but forces the word to its left to be treated as a string

 

Again, see man perlop
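
A brief sketch exercising several of these operators (all variable names are illustrative):

```perl
use strict;
use warnings;

my $power    = 2 ** 10;                    # exponentiation: 1024
my $num_cmp  = 10 <=> 9;                   # numeric comparison: 1, since 10 > 9
my $str_cmp  = "10" cmp "9";               # string comparison: -1, since "1" sorts before "9"
my $rule     = "-" x 20;                   # repetition: a row of 20 dashes
my $greeting = "Hello" . ", " . "World";   # concatenation
my @pair     = (color => "red");           # => quotes the bare word to its left

print "$power $num_cmp $str_cmp\n";
print "$rule\n";
print "$greeting\n";
print "@pair\n";
```

Note that 10 and 9 compare differently as numbers and as strings, which is why Perl keeps two separate sets of comparison operators.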

Strings

Strings are a base type in Perl.

Strings can either be quoted to allow interpolation (both metacharacters and variables), or quoted so as not to be. Double quotes allow interpolation, single quotes prevent it.

Single quoted strings

  • Single quoted strings are not subject to most interpolation.
  • However, there are two escape sequences to be aware of:
    1. Use \' to indicate a literal single quote inside of a single quoted string that was defined with '. (You can avoid this with the q// syntax.)
    2. Use \\ to insert a backslash; other \SOMECHAR are not interpolated inside of single quoted strings

Double quoted strings

You can specify special characters in double quoted strings easily:

print "This is an end of line\n";
print "there \t tabs \t embedded \t here\n";
print "embedding double quotes \" are easy \n";
print "that costs \$1000\n";
print "the variable \$variable\n";  
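
To make the contrast between the two quoting styles concrete, here is a small sketch; the variable names and the sample path are illustrative only:

```perl
use strict;
use warnings;

my $name = "World";

my $single = 'Hello $name\n';   # no interpolation: a literal $name and a literal \n
my $double = "Hello $name\n";   # $name is interpolated; \n becomes a newline
my $quote1 = 'It\'s';           # \' gives a literal single quote
my $quote2 = q{It's};           # q{} avoids the need for the backslash
my $path   = 'C:\\temp';        # \\ gives a single backslash

print $single, "\n";
print $double;
print "$quote1 $quote2 $path\n";
```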

String operators

  • The period (".") is used to indicate string concatenation.
  • The "x" operator is used to indicate string repetition:
     "abc " x 4 
      yields
    "abc abc abc abc "

Implicit conversions between numbers and strings

Perl will silently convert numbers and strings where appropriate.

"5" x "10"  yields   "5555555555"

"2" + "2"   yields   4

"2 + 2" . 4 yields  "2 + 24"

Scalars

  • Ordinary scalar variables begin with $
  • Variable names correspond to the regular expression $[a-zA-Z_][a-zA-Z0-9_]*
  • Scalars can hold integers, strings, or floating point numbers

Declaring scalars

  • I recommend that you always use the pragma use strict
  • When you do so, you will have to explicitly declare all of your variables before using them. Use my to declare your variables.
  • You can declare and initialize one or more variables with my:
    my $x;
    my ($x,$y);
    my $x="value";
    my ($x,$y) = ("x","y"); 

Variable interpolation

You can use the special form ${VARIABLENAME} when you need to interpolate a variable surrounded by non-whitespace:

[langley@sophie]$ perl
$a = 12;
print "abc${a}abc\n";
abc12abc

undef value

  • A variable has the special value undef when it is first created (it can also be set with the special function undef() and can be tested with the special function defined()).
  • An undef variable is treated as zero if it is used numerically.
  • An undef variable is treated as an empty string if it is used as a string value.

However, if you run Perl with the -w option, the interpreter will also alert you that an undef variable is being evaluated.
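
The following sketch walks through these rules. It switches off the "uninitialized" warning inside one block so the demonstration runs quietly; in ordinary code you would usually leave that warning on. Variable names are illustrative:

```perl
use strict;
use warnings;

my $count;                               # declared but never assigned: undef
my $was_defined = defined($count) ? "yes" : "no";

my ($as_number, $as_string);
{
    no warnings 'uninitialized';         # for this demonstration only
    $as_number = $count + 1;             # undef acts as 0
    $as_string = "[" . $count . "]";     # undef acts as the empty string
}

$count = 5;
undef($count);                           # set it back to undef

print "defined at first? $was_defined\n";
print "as number: $as_number, as string: $as_string\n";
print "defined after undef()? ", (defined($count) ? "yes" : "no"), "\n";
```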

The print function

  • The print function can print a list of expressions, such as strings, variables, or a combination of operands and operators.
  • By default, it prints to stdout.
  • The general form is
    print [expression [, expression]* ];

The line input operator <STDIN>

  • You can use <STDIN> to read a single line of input:
    $a = <STDIN>;
  • You can test for end of input with defined($a).

The chomp function

You can remove a newline from a string with chomp:


$line = <STDIN>;
chomp($line);

chomp($line = <STDIN>);   # the same thing in one step

The chomp function

$ perl
chomp($line = <STDIN>);
print $line;
abcdefghijk
abcdefghijk $ 
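
Putting the line input operator, defined(), and chomp together, here is a sketch of a typical read loop. As an assumption for illustration, it reads from an in-memory string through a filehandle named $in rather than from STDIN, so it runs without any typing; with <STDIN> the loop would look the same:

```perl
use strict;
use warnings;

# Assumption: sample input held in a string instead of typed at the keyboard
my $text = "first line\nsecond line\nlast line without newline";
open(my $in, '<', \$text) or die "cannot open in-memory file: $!";

my $count = 0;
my $last  = '';
while (defined(my $line = <$in>)) {   # undef signals end of input
    chomp($line);                     # removes the trailing newline, if there is one
    $count++;
    $last = $line;
    print "[$line]\n";
}
close($in);

print "read $count lines\n";
```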

String relational operators

The string relational operators are eq, ne, gt, lt, ge, and le.

Examples:

100 lt 2       # true: compared as strings, "100" sorts before "2"
"x" le "y"     # true

String length

You can use the length function to give the number of characters in a string.
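
A short sketch contrasting string and numeric comparison, followed by a call to length (the variable names are illustrative):

```perl
use strict;
use warnings;

my $as_strings = (100 lt 2)     ? "true" : "false";   # "100" sorts before "2"
my $as_numbers = (100 <  2)     ? "true" : "false";   # 100 is not less than 2
my $letters    = ("x" le "y")   ? "true" : "false";
my $num_equal  = ("1.0" == 1)   ? "true" : "false";   # equal as numbers
my $str_equal  = ("1.0" eq "1") ? "true" : "false";   # different as strings

my $word_length = length("Hello World!");

print "100 lt 2:     $as_strings\n";
print "100 <  2:     $as_numbers\n";
print "x le y:       $letters\n";
print "1.0 == 1:     $num_equal\n";
print "1.0 eq 1:     $str_equal\n";
print "length:       $word_length\n";
```

Picking the wrong family of operators is a common source of bugs, because Perl converts the operands silently either way.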

Scalar values "typecast" to boolean values

Many of Perl's control structures look for a boolean value. Perl doesn't have an explicit "boolean" type, so instead we use the following "typecasting" rules for scalar values:

  • If a scalar is a number, then 0 is treated as false, and any other value is treated as true.
  • If a scalar is a string, then "0" and the empty string are treated as false, and any other value as true.
  • If a scalar is not defined, it is treated as false.
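
These rules can be checked with a small helper. The subroutine truth below is hypothetical and exists only for this sketch; note that strings such as "0.0" and "00" are true, because only the exact string "0" and the empty string are false:

```perl
use strict;
use warnings;

# Hypothetical helper: reports how Perl treats a value in boolean context
sub truth {
    my ($value) = @_;
    return $value ? "true" : "false";
}

print "0      is ", truth(0),     "\n";
print "42     is ", truth(42),    "\n";
print "\"0\"    is ", truth("0"),   "\n";
print "\"\"     is ", truth(""),    "\n";
print "\"0.0\"  is ", truth("0.0"), "\n";
print "\"00\"   is ", truth("00"),  "\n";
print "\" \"    is ", truth(" "),   "\n";
print "undef  is ", truth(undef), "\n";
```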

If elsif else

Note that both elsif and else are optional, but curly brackets are never optional, even if the block contains only one statement:

if (COND)
  {
  }
[elsif (COND)
  {
  }]*
[else
  {
  }]

if example:

if($answer == 12)
{
   print "Right -- one year has twelve months!\n";
}

if/else example:

if($answer == 12)
{
   print "Right -- one year has twelve months!\n";
}
else
{
   print "No, one year has twelve months!\n";
}

if-elsif-else examples

if($answer < 12)
{
   print "Need more months!\n";
}
elsif($answer > 12)
{
   print "Too many months!\n";
}
else
{
   print "Right -- one year has twelve months!\n";
}

if-elsif-else examples

if-elsif-elsif example:

if($answer eq "struct")
{
}
elsif($answer eq "const")
{
}
elsif($answer ne "virtual")
{
}

defined()

You can test to see if a variable has a defined value with defined():

if(!defined($x))
{
  print "Use of undefined value is not wise!\n";
}

The while construction

while(<boolean>)
{
  <statement list>
}

As with if-elsif-else, the curly brackets are not optional.

while examples

while(<STDIN>)
{
  print;
}

You might note that we are using the implicit variable $_ in this fragment.

until control structure

until(<boolean>)
{
  <statement list> 
}

The until construction is the opposite of the while construction since it executes the <statement list> until the <boolean> test becomes true.

until example

#!/usr/bin/perl -w
use strict;
my $line;
until (! ($line=<STDIN>) )
{
  print $line;
}
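
Since until is simply while with the test negated, any until loop can be rewritten as a while loop. A sketch with a countdown (the variables $n and $trace are illustrative):

```perl
use strict;
use warnings;

my $n     = 3;
my $trace = '';

until ($n == 0) {        # equivalent to: while ($n != 0)
    $trace .= $n;        # record each value the loop sees
    $n--;
}

print "$trace\n";        # prints 321
```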

for control structure

for(<init>; <boolean test>; <increment>)
{
  <statement list> 
}

for example

for($i = 0; $i < 10; $i++)
{
  print "$i * $i = " . $i*$i . "\n";
}

Lists and arrays

  • A list in Perl is an ordered collection of scalars.
  • An array in Perl is a variable that contains an ordered collection of scalars.

List literals

  • Can represent a list of scalar values
  • General form:
    ( <scalar1>, <scalar2> ... )

List literals

Examples:

(0,1,5)          # a list of three scalars that are numbers
('abc','def')    # a list of two scalars that are strings
(1,'abc',3)      # a list of mixed values
($x,$y)          # values can be determined at runtime
()               # empty list

Using qw syntax

You can also use the "quoted words" (qw) syntax to specify list literals:

('apples', 'oranges', 'bananas')
qw/ apples oranges bananas /
qw! apples oranges bananas !
qw( apples oranges bananas )
qw< apples oranges bananas >

List literals, continued

  • You can use the range operator ".." to create list elements.
  • Examples:
    (0..5)        # the list (0, 1, 2, 3, 4, 5)
    (0.1 .. 5.1)  # same since truncated (not floor()!)
    (5..0)        # evaluates to empty list
    (1,0..5,'x' x 10) # can use with other values
    ($m .. $n)    # can be evaluated at runtime
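
A sketch of the range operator in use. The arrays exist only to hold the results, and reverse is shown as one way to count down, since a descending range is empty:

```perl
use strict;
use warnings;

my @up    = (0 .. 5);
my @trunc = (-1.5 .. 1.5);      # endpoints are truncated toward zero
my @empty = (5 .. 0);           # a descending range yields no elements
my @down  = reverse(0 .. 5);    # count down by reversing an ascending range

print "up:    @up\n";
print "trunc: @trunc\n";
print "empty has ", scalar(@empty), " elements\n";
print "down:  @down\n";
```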
    

Array variables

  • Arrays are declared with the "@" character.
    my @arr;
    my @arr = ('a', 'b', 'c');
  • Notice that you don't have to declare an array's size.

Arrays and scalars

  • Arrays and scalars are in separate name spaces, so $x and @x can coexist as two different variables.
  • Mnemonically, the "$" looks like the "S" in scalar, and the "@" resembles the "a" in array.
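
A small sketch of the separate name spaces. It also uses element access ($x[0]) and scalar(@x) for the element count, neither of which has been introduced above, so treat those as a preview:

```perl
use strict;
use warnings;

my $x = "a scalar";
my @x = qw( apples oranges bananas );

my $first = $x[0];        # an element of @x; written with $ because one element is a scalar
my $count = scalar(@x);   # number of elements in @x

print "$x\n";             # the scalar $x, untouched by anything done to @x
print "@x\n";
print "$first $count\n";
```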