Shells: starting and automating

Unix shells have long had the idea of a "start-up" file: a file containing some number of commands that each new shell executes as it starts.

Other start-up files

Programs other than shells also use "dot" files (i.e., files whose names begin with a period) to control their initial behavior; examples are ".emacs" and ".vimrc".

Bash start-up files

The rules for each shell are different; here are the rules for bash from the "man pages":

       When bash is invoked as an interactive login shell, or as a
       non-interactive shell with the --login option, it first reads and
       executes commands from the file /etc/profile, if that file exists.
       After reading that file, it looks for ~/.bash_profile,
       ~/.bash_login, and ~/.profile, in that order, and reads and
       executes commands from the first one that exists and is readable.
       The --noprofile option may be used when the shell is started to
       inhibit this behavior.

       When a login shell exits, bash reads and  executes  commands  from
       the file ~/.bash_logout, if it exists.

       When  an  interactive  shell that is not a login shell is started,
       bash  reads  and  executes  commands  from  /etc/bash.bashrc   and
       ~/.bashrc,  if  these files exist.  This may be inhibited by using
       the --norc option.  The --rcfile file option will  force  bash  to
       read  and  execute  commands from file instead of /etc/bash.bashrc
       and ~/.bashrc.
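
Since these rules depend on whether a shell is "interactive" and whether it is a "login" shell, it can help to ask bash which kind it is. The following is a small sketch, not part of the man page; the variable names "kind" and "login" are chosen only for this illustration. It relies on two bash features: the special parameter $- contains the letter i in an interactive shell, and the read-only shell option login_shell is set in a login shell.

```shell
# Classify the shell that is running this code.
# "kind" and "login" are names made up for this sketch.
case $- in
  *i*) kind="interactive" ;;
  *)   kind="non-interactive" ;;
esac

if shopt -q login_shell
then
  login="login"
else
  login="non-login"
fi

echo "this is a $kind, $login shell"
```

Pasting these lines at a prompt and running them from a script file should give different answers, which is a quick way to see which start-up files a given shell would have read.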

What's the upshot of the bash start-up rules?

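If you want a shell setting to take effect in every interactive shell, such as resetting your prompt, put the commands in your .bashrc. Note from the rules above that bash itself reads .bashrc only for interactive shells that are not login shells; for this reason, many set-ups have .bash_profile (or .profile) source .bashrc, so that login shells pick up the same settings.
If you want a command to execute only in a login shell, such as showing /etc/motd (the message of the day), then put it in your .profile (or in your .bash_profile if you have one, since bash then never reads .profile).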
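
One common arrangement, sketched here as an assumption about how you might organize your own files rather than as anything bash requires, is a short ~/.bash_profile that pulls in ~/.bashrc and then performs the login-only actions:

```shell
# Sketch of a ~/.bash_profile.
# Assumption: everyday settings (prompt, aliases, ...) live in ~/.bashrc.
if [ -r "$HOME/.bashrc" ]
then
  . "$HOME/.bashrc"      # run the commands a non-login interactive shell would read
fi

# Login-only action taken from the text above: show the message of the day, if any.
if [ -r /etc/motd ]
then
  cat /etc/motd
fi
```

With this in place, settings in .bashrc reach both kinds of interactive shell, while the message of the day appears only at login.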

What kind of things could you put in your bash start-up files?

Path settings: export PATH=$PATH:/usr/local/bin
Prompt settings: export PS1='% '
Default editor settings: export EDITOR=emacs
Setting bash history: export HISTSIZE=100
Creating aliases: alias rm="rm -i"

Comments

Comments are indicated by a '#':

# this is a comment
# another comment
export HISTSIZE=100
# another comment

Listing your current variables

As mentioned in Lecture 4, you can list variables in a variety of ways:

env
printenv
In bash, you can also use set, which additionally lists shell variables that aren't actually in the process's environment.
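
To see the difference between these commands, here is a small sketch. The two variable names are made up for the demonstration; one is exported and the other is not.

```shell
# Names below are invented for this demonstration.
lecture_local_demo="shell only"
export lecture_exported_demo="shell and environment"

# env reports only what is in the environment: one matching line
env | grep '^lecture_.*_demo='

# set reports shell variables as well: two matching lines
set | grep '^lecture_.*_demo='
```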

Shell "scripting"

As is clear from the previous section, Unix shells understand "scripting", that is, automating a sequence of commands. While start-up scripts run at a predetermined time, it is possible to write scripts that can be run at any time.

One very convenient thing to note about bash: it has a full reference manual available online at:

http://www.gnu.org/software/bash/manual/bash.html

Execution

Perhaps the most distinctive part of Unix shell scripting is the very first line of each script:

#!/bin/bash
#
#

The first line is a comment to bash, but it isn't to the kernel.

If the kernel sees that a file being executed begins with the characters #! (the "shebang"), followed by a path to another executable such as /bin/bash, then that program is executed instead, and the original filename is passed to it as an argument (the first one, unless the shebang line supplies an optional argument of its own).

From the man page execve(2):

   Interpreter scripts
       An  interpreter  script is a text file that has execute permission
       enabled and whose first line is of the form:

           #! interpreter [optional-arg]

       The interpreter must be a valid pathname for an  executable  which
       is  not  itself  a  script.   If the filename argument of execve()
       specifies an interpreter script, then interpreter will be  invoked
       with the following arguments:

           interpreter [optional-arg] filename arg...

       where  arg...  is the series of words pointed to by the argv argu‐
       ment of execve().
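
The mechanism described above can be watched in action with a throwaway script. This is only a sketch: the file name hello.sh and the variable names are hypothetical, and it assumes that /bin/bash exists and that the temporary directory allows files to be executed.

```shell
# Create a tiny interpreter script in a temporary directory.
# hello.sh, tmpdir and shebang_output are names invented for this sketch.
tmpdir=$(mktemp -d)
cat > "$tmpdir/hello.sh" <<'EOF'
#!/bin/bash
echo "script path: $0, first argument: $1"
EOF
chmod +x "$tmpdir/hello.sh"       # the kernel requires execute permission

# The kernel reads the #! line and in effect runs: /bin/bash <path>/hello.sh world
shebang_output=$("$tmpdir/hello.sh" world)
echo "$shebang_output"

rm -r "$tmpdir"
```

The path printed as $0 is the script's filename, which is exactly the "filename" argument that the kernel handed to the interpreter.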

Shell scripting capabilities

Shells as scripting languages have the typical abilities of any programming language:

Variables, to keep named state.
Input/output functions (often called "i/o" functions), to receive and send state.
Conditional expressions, to express alternation among code paths.
Repetition structures, to express repetition of blocks of code.

Variables

Shells have long distinguished between "local" variables and "environment" variables. Local variables are simply those local to the current shell process, and are not inherited by child processes.

"Environment" variables, by contrast, are inherited by child processes. Environment variables are an explicit part of every process in a Unix system, so you can see the environment that your current shell passes on (or that of any other process you own, for that matter) through a kernel "window", the /proc interface:

% cat /proc/self/environ    # "self" resolves to the process doing the read: here cat, a child that inherited the shell's environment
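
The practical difference between the two kinds of variable shows up as soon as a child process is started. In this sketch the variable names are made up; the child is a separate sh process.

```shell
lecture_local="stays here"                 # local: not exported
export lecture_env="travels to children"   # environment: exported

# Single quotes stop this shell from expanding the variables itself;
# the child sh expands them using only what it inherited.
child_view=$(sh -c 'echo "local=[$lecture_local] env=[$lecture_env]"')
echo "$child_view"      # local=[] env=[travels to children]
```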

Positional parameters

Like environment variables, the arguments passed to a process are an explicit part of every process in a Unix system. You can also see these from /proc:

% cat /proc/$$/cmdline

Inside a shell, you can refer to these via $0, $1, $2, ... $9 (later ones need braces, as in ${10}). Note that, just as shown in /proc/$$/cmdline, $0 refers to the command, $1 refers to the command's first argument, and so forth. You can also refer to all of the arguments with $* (note that this does not include $0, the command). You can refer to the number of arguments with $#.
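
A shell function receives its own positional parameters, which makes for a convenient way to experiment. The function name show_args is made up for this sketch. It uses "$@", a close relative of $* that, when quoted, keeps each argument intact even if it contains spaces.

```shell
# show_args is a name invented for this sketch.
show_args() {
  echo "count: $#"
  echo "first: $1"
  for arg in "$@"        # "$@" expands to each argument as a separate word
  do
    echo "arg: $arg"
  done
}

show_args alpha "beta gamma" delta
```

Here the count is 3, not 4, because the quoted "beta gamma" is a single argument.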

More on bash variables

In addition to ordinary "scalar" (single value) variables, bash also supports "array" variables. From the bash man page:

       Bash provides one-dimensional indexed and associative array  vari-
       ables.   Any variable may be used as an indexed array; the declare
       builtin will explicitly declare an array.   There  is  no  maximum
       limit on the size of an array, nor any requirement that members be
       indexed or assigned contiguously.  Indexed arrays  are  referenced
       using  integers  (including arithmetic expressions)  and are zero-
       based; associative arrays are referenced using arbitrary strings.

Setting variables

In bash, setting variables is easy enough:

x=`uuidgen`          # this sets the local (scalar) variable "x"
export y=`uuidgen`   # this sets the environment variable "y"
declare -a xx        # create indexed array "xx"
xx[7]=`uuidgen`      # this sets the eighth element of indexed array "xx"
declare -A yy        # create associative array "yy"
yy["2013-02-19"]=cloudy   # this sets the value for element 2013-02-19 of the associative array "yy"

Printing variables

In bash, printing variables is easy enough. Assuming the variables from the previous slide:

echo $x             # this prints the local (scalar) variable "x"
echo $y             # this prints the environmental variable "y"
echo ${xx[7]}       # this prints the eighth element of indexed array "xx"
echo ${yy["2013-02-19"]}  # this prints the value for element 2013-02-19 of the associative array "yy"

Alternation

In bash, alternation (if-then[-else]) is easy enough to express:

if [ "$x" = "string1" ]     # "=" compares strings; "-eq" is for numbers only
then
  echo "$x is ready"
fi

if [ "$x" = "string1" ]
then
  echo "$x is ready"
else
  echo "$x is not ready"
fi
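
When there are more than two code paths, bash also accepts "elif" clauses between "then" and "else". A minimal sketch follows; the function name describe_count and its thresholds are invented for the illustration.

```shell
# describe_count is a name invented for this sketch.
describe_count() {
  if [ "$1" -eq 0 ]
  then
    echo "none"
  elif [ "$1" -lt 10 ]
  then
    echo "a few"
  else
    echo "many"
  fi
}

describe_count 0     # none
describe_count 3     # a few
describe_count 42    # many
```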

Tests

The important boolean 2-ary (aka "binary") test operators for bash are:

$x -eq $y     # numerical equality test
$x -ne $y     # numerical inequality test
$x -lt $y     # numerical less than
$x -gt $y     # numerical greater than
$x -le $y     # numerical less than or equal to
$x -ge $y     # numerical greater than or equal to
$x == $y      # string equality
$x != $y      # string inequality
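$x < $y       # string "less than" (write \< inside [ ], or use [[ ]], so "<" is not taken as a redirection)
$x > $y       # string "greater than" (write \> inside [ ], or use [[ ]], so ">" is not taken as a redirection)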
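
The distinction between the numerical and the string operators matters whenever two values look different but denote the same number. A short sketch, with variable names made up for the purpose:

```shell
a="01"
b="1"

# Numerically, 01 and 1 are the same value.
if [ "$a" -eq "$b" ]; then numeric="equal"; else numeric="different"; fi

# As strings, "01" and "1" are not the same.
if [ "$a" == "$b" ]; then textual="equal"; else textual="different"; fi

echo "numeric: $numeric, string: $textual"   # numeric: equal, string: different

# String ordering is easiest with [[ ]], where "<" needs no escaping.
if [[ "apple" < "banana" ]]; then order="apple sorts before banana"; fi
echo "$order"
```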

Repetition

In bash, repetition is easy enough to express. There are three main forms:

for NAME in WORD ... ; do LIST ; done
for (( EXPR1 ; EXPR2 ; EXPR3 )) ; do LIST ; done
while CMDLIST ; do LIST ; done

The basic "for" loop

Like every version of Bourne shell, bash supports the basic "for" loop:

for name in *.tex
do
  pdflatex "$name"    # quotes keep a filename containing spaces in one piece
done
for name in *.pdf
do
  lpr "$name"
done
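
Loops over filename patterns have two common pitfalls: filenames containing spaces, and patterns that match nothing (in which case bash, by default, leaves the pattern itself as the single word). The sketch below guards against both in a scratch directory; all file and variable names are invented, and no real tool such as pdflatex is run.

```shell
# workdir and file_count are names invented for this sketch.
workdir=$(mktemp -d)
touch "$workdir/notes.tex" "$workdir/final report.tex"

file_count=0
for name in "$workdir"/*.tex
do
  [ -e "$name" ] || continue          # skip the literal pattern when nothing matches
  echo "would process: ${name##*/}"   # ${name##*/} strips the directory part
  file_count=$((file_count + 1))
done
echo "$file_count files"              # 2 files

rm -r "$workdir"
```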

The iterative "for" loop

In addition to the basic "for" loop, bash also supports a very C-like iterative form:

for (( x=0 ; x<10; x++ ))
do
  echo $x
done
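
The three expressions behave as in C: the first runs once, the second is tested before each pass, and the third runs after each pass. As a small sketch, here is the sum of the integers from 1 to 10, with variable names chosen for the example:

```shell
sum=0
for (( i = 1 ; i <= 10 ; i++ ))
do
  (( sum += i ))     # arithmetic context: no $ needed on variable names
done
echo "$sum"          # 55
```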

The "while" loop

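You can use a command's return code (its exit status) as a loop test with the "while" loop; the loop continues for as long as the command succeeds:

while true    # infinite loop: the command "true" always succeeds (no backquotes needed)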
do
  echo xyz
done
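
Most "while" loops are meant to stop eventually, so the test is usually a command that fails at some point. Two sketches with made-up variable names follow: a counter tested with [ ], and the read builtin, which fails once its input runs out.

```shell
# A bounded loop: [ ... ] fails once attempts reaches 3.
attempts=0
while [ "$attempts" -lt 3 ]
do
  attempts=$((attempts + 1))
  echo "attempt $attempts"
done

# Reading input line by line: read fails at end of input, ending the loop.
line_count=0
while IFS= read -r line
do
  line_count=$((line_count + 1))
done <<'EOF'
first line
second line
EOF
echo "$line_count lines"    # 2 lines
```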