
COP4610: Operating Systems & Concurrent Programming


Unix Process Management

Unix/Linux Views of Process Control

Abstract conceptual view
Shell view
API view
Internal view

Conceptual View of Unix/Linux Process Control

every process has an ID, a parent, and possibly children
fork clones entire process
wait allows parent to wait for child to terminate
exec loads new program to be executed by process
kill sends a signal to a process
receipt of a signal by a process can have different effects:
- terminate the process
- stop (suspend) the process
- continue (unsuspend) the process
- be ignored
- cause execution of a handler procedure
- . . . (and other special cases)
sleep allows a process to sleep for a period of time

Effect of fork() on buffered output

New process is a complete clone of the parent,
including the output buffers
If the parent process has written output to a stream but the content of the stream's buffer has not yet been written to the hardware device, the buffered output will be copied into the child process.
That is why the output of the program simple_fork.c (see previous notes) has replicated output lines.

Shell View of Process Control (Bourne Shell)

Shell is a program that reads in commands and interprets them to create and manage processes.

Some commands are built into the shell. Others require the shell to fork off a child process to exec a program that does the command.

A shell command language can be quite general, allowing a person to write sophisticated programs that are executed directly by the shell.

An Example

An abbreviated version of simple_fork_shell_script:

#!/bin/sh
echo the process number of this shell is $$
if [ $# -eq 0 ]; then 
   echo command has no parameters
else 
   echo command has $# parameters: $@
fi
if [  \( $# -eq 1 \) -a \( "$1" = "dofork" \) ]; then
   echo doing recursive call of this script, in background
   ./simple_fork_shell_script child&
   echo waiting for child $!
   wait $! 
elif [  \( $# -eq 1 \) -a \( "$1" = "child" \) ]; then
   echo child is about to sleep for one second
   sleep 1
   echo child is done sleeping
fi
if [  \( $# -eq 1 \) -a \( "$1" = "comments" \) ]; then
cat - > tmpfile <<EOF
# Try this script with parameter "dofork", and then again
# with parameter "comments".
# Does the script work equally well on Linux and Solaris?
EOF
cat tmpfile; rm tmpfile
fi

New Shell Features Used in the Example

shell command-line arguments ($#, $@, $1)
running a program in "background"
getting the process id of the last backgrounded command ($!)
waiting for a process
redirection of shell text to a file
sleep program (not built-in)

The notation $# designates the number of command-line arguments (like argc in the API, except that $# does not count the command name itself). The notation $1 designates the first argument. The notation $@ designates the entire list of arguments.

Normally, when the shell executes a program it waits for the program to complete before going on to the next line of shell command input. You can cause the shell to go on without waiting by putting the command into the "background". The character '&' at the end of the command "./simple_fork_shell_script child&" puts the child process into the background. You can also do this interactively, in a shell with job control, by suspending the running program using the Ctrl-Z combination at the keyboard, and then executing the shell command "bg".

The notation $! gives the process id of the most recent command started in the background. It is used here in "wait $!" to wait for the backgrounded child process.

The program cat just copies from its standard input file to its standard output file.

The notation "cat - > tmpfile <<EOF" (a "here document") tells the shell to run the program cat and supply, as the standard input of cat, everything in the shell input stream up to the next line containing only the delimiter, "EOF". The net effect is to create a file named tmpfile containing the portion of the shell script between this command and the delimiter line.

Make certain you understand everything in this example. When you have finished, you should look at another example, test1_sh. This shell script was written by the instructor of a course for testing student submissions of a program simple_fork.c which were submitted using the shell-script submit1_sh.

API View of Process Control

simple_fork1.c - example using fork() without fflush() and waitpid()
simple_fork2.c - example using fflush()
simple_fork3.c - example using waitpid()
print_child_status.c - code to examine status value returned from waitpid()
fork_wait.c - example using waitpid() and kill() with job-control signals
child.c - simple program for use with execve()
fork_exec.c - example using execve()

The first four examples were already covered, though not in exactly the same form, in previous class meetings. In particular, the example fork_wait.c was covered in the Friday recitation class. I plan to review these examples, briefly, in class. We will then look at the example fork_exec.c in detail.

The last example introduces the execve() system call.

The execve() Operation

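/* On success execve() does not return; the new program takes over. */
if (execve ("child", argument_list, environ) == -1) {
   perror ("execve failed");
}
/* Reached only if execve() failed. */
fprintf (stderr, "execution should never reach here\n");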

execve() is the most general one of several forms of exec
all forms have the same basic effect
cause the current process to execute a new program, from the beginning
the new program is loaded from a file
the file is specified by a pathname
this form allows specification of the argument list (argv) and the environment variables (environ) of the new program

The execve() function is the most general of several different forms of exec call. It allows the caller to specify the file to be executed, the arguments to be passed, and the environment variable values to be passed. All of the exec operations cause the current process (i.e., the one that requests exec) to load and start executing a new program, from the beginning. If the operation succeeds, there is no return from the call, since the old program is replaced by a new one. The operation can fail if the new program file cannot be found, or if the caller does not have permission to execute it.

The fork() and exec() operations are used together to create a new process that executes a new program. This is at the heart of the implementation of a shell program. For example, consider what happens when a user enters a command "ls -a". The shell forks off a child, and the child execs the program named "ls" with the argument list {"ls", "-a", NULL}.

The waitpid() Operation

#include <sys/types.h>
#include <sys/wait.h>
pid_t wait(int *status);
pid_t waitpid(pid_t pid, int *status, int options);

e.g.,

waitpid (child, &status, 0);

waitpid() is the more general of the two forms of wait
causes the current process to block (unless WNOHANG is specified)
until a child process terminates (or stops, if WUNTRACED is specified)
the status of the child process is returned via the variable pointed to by status
the options parameter is an integer that is interpreted as a vector of bits
option WNOHANG means don't block
option WUNTRACED means also report child processes that are stopped

Macros for use with waitpid() Status

WIFEXITED(status)
true (non-zero) iff the child exited normally
WEXITSTATUS(status)
only valid if WIFEXITED(status) is true
return code of the child, provided by exit() or a return statement in the main program
WIFSIGNALED(status)
true iff the child process was terminated by a signal
WTERMSIG(status)
only valid if WIFSIGNALED(status) is true
the signal that caused the child to terminate
WIFSTOPPED(status)
true iff the child process is currently stopped
this is only possible if the call was done using WUNTRACED
WSTOPSIG(status)
only valid if WIFSTOPPED(status) is true
the signal that caused the child to stop

These macros have two forms, which I'll call WIFxxxxx and Wyyyyyyy, and they occur in pairs. The WIFxxxxx macro returns the C equivalent of a Boolean value, suitable for use in an if-statement. (Hence the "IF" in the name.) You use it to determine whether the status is of the xxxxx kind. If the value is true, you use the paired Wyyyyyyy macro to extract the appropriate component of the status.

Specifically, the value of WEXITSTATUS(x) is valid if and only if WIFEXITED(x) returns a nonzero value. If it is valid, the value returned by WEXITSTATUS(x) is the low-order 8 bits of whatever value the main program returned with a return statement, or of whatever value was passed to the terminating call to exit(). For example, if x is the status returned by waitpid() for a process that terminated by calling exit (2), WIFEXITED(x) should return a true value, and WEXITSTATUS(x) should return 2. Because only the low-order 8 bits are kept, a process that calls exit (-2) is reported with WEXITSTATUS(x) equal to 254, not -2. If x is the status returned by waitpid() for a process that terminates due to some other reason than a "normal exit" (meaning return from the main function or a call to exit or _exit), such as a segmentation violation, WIFEXITED(x) should be false.

Linux Internal View of Process Control (kernel version 2.4.7)

The file sched.h contains the declarations of some of the important data structures used to implement processes
The conceptual "process control block" is implemented by the structure task_struct.
The file sched.c contains the Linux dispatcher (called the scheduler in Linux terminology)
The dispatcher is implemented by the function schedule. The version shown here has had the conditional code for SMP (symmetric multiprocessing) deleted, to make the logic more readable.

You should look through these files after reading about the process control block and dispatcher in Chapter 3 of Stallings' textbook. You should be able to recognize or guess the meaning of some of the names. For example, try to find in task_struct each of the items listed in the slide below.

You can click on each item to find the declaration highlighted in red, but please do this only after you have first tried looking through the unmarked file to find the answer.

In some of these cases, the information is scattered over more than one component. For example, the process state is more than just the one component highlighted.

Process Control Block Components in Linux

In the declaration of struct task_struct

Process state
Process priority (this is actually only used for real-time processes)
Identifier of this process
Identifier of the process that created this process (parent process)
User identifier
Links to next process in queue

By browsing this code you will see that writing a real operating system requires attention to quite a few more details than are mentioned in operating systems texts, and get a feeling for the real level of complexity. However, don't be overwhelmed. After a few months of study, any computer science student -- including you -- can learn to read and navigate this code.

T. P. Baker.