linking.html

COP4610: Operating Systems & Concurrent Programming

Linking & Loading

Context: Programming Language Implementation

Direct Execution
- done by hardware
- program must be translated into machine instructions, in main memory
- example: compilation, linking, and loading of C program
Interpretation
- done by software
- program is never translated all the way to machine instructions
- example: interpretation of shell scripts by a shell program
- example: interpretation of Java byte code by a JVM
- ultimately depends on a directly executed intepreter program

There are various strategies for implementing programming languages. One important division is between direct execution and interpretation. The operating system provides the support for direct execution. Interpreters can be built on that.

Compilation, Assembly, Linking and Loading

Before a program can be executed, it must be translated from the programming language in which it was written (source code) to the machine instructions of the hardware that is to execute it, and the executable program must be loaded into main memory.

The first phase of the process is translation from source code to machine code. This is done by a compiler or assembler. Of course, the compilation process itself generally is broken into phases. Of these a C programmer needs to be aware of at least three of the phases, because they may detect different kinds of errors.

The C preprocessor handles the directives that begin with "#", such as #define, #include, and #ifdef. It expands the #includes and the macro calls. The output is still C source code, but without the preprecessor directives and macro calls. If you use the -E option with gcc, only the preprocessing phase is performed, and the expanded source code is put out. On most Unix systems, the preprocessor can also be invoked using the name cpp.

The central part of the compiler (which itself has several internal phases, translates the source code to assembly language. If you use the -S options with gcc, processing stops at the end of this phase, and the assembly language code is put out.

The assembler translates the assembly code to machine code, in the form of an object module. An object module is not an entire program. It is a translation of whatever is fed to the compiler (including preprocessor, compiler and assembler). If the compiler is fed just one function subprogram, the object modules contains just the machine code for that one function subprogram. It the gcc compiler is given assembly code as input, it will invoke just the assembly phase. (Using this and the -S option earlier, a person can hand-optimize the translation.) In Unix systems, the assembler usually can also be invoked using the name as.

Compilation, Linking and Loading

In order to make a complete program, various separately compiled (and assembled) modules must be combined. This is done by the linker. Finally, the linked program is loaded into memory by the loader.

In a Unix operating system, a process invokes the loader by calling one of the exec functions. The loader reads the load module from the specified file into memory, performing any necessary dynamic linkage (see further below) with libraries, and the kernel causes the process that called exec to start executing the new program.

The linker is also known as the "linkage editor", because it modifies (edits) the code to link the modules together.

In reality, this process is more complicated. In order to be able to share and reuse copies of common code, the system supports object libraries. The archiver (called "ar" in Unix) combines individual object files together into a library. The library is organized with an index, that allows the linker to find individual subprograms quickly.

Libraries

statically linked
dynamically linked
shareable

Libraries have evolved from a static model to a very dynamic model. The types of libraries correspond to the types of linkers and loaders that operate on them. We will cover each type of library in the context of the corresponding type of linking and/or loading.

Static Linker Functions

combine object modules into a single load module
linkage = replacement of symbolic addresses (relative to other module origins)
- by absolute addresses
  if used with an absolute loader
- by offsets relative to the new load module origin
  if used with a relocating loader
relocation = replacement of addresses relative to local module origin
- by absolute addresses
  if used with an absolute loader
- by offsets relative to the new load module origin
  if used with a relocating loader

The linker does these two main things. The details depend on whether it is preparing a relocatable module or an absolute load module.

Example of Static Linkage & Link-Time Relocation: Source Code

f.c:

int f (int x) {
  if (x <= 1) return x;
  return x * f (x - 1);
}

main.c:

#include <stdio.h>
extern int f  (int x);
int i = 2;
char format[] = "f (%d) = %d\n";
int main (int argc, char const *argv[]) {
  int j;
  j = f (i);
  printf (format, i, j);
  return 0;
}

Suppose these two source files are each compiled into separate object files. We will look at what the linker has to do to combine them into a single load module.

Example of Static Linkage & Link-Time Relocation: Linker Actions

The linker decides to lay out the code for main and f contiguously, one immediately following the other in memory. This means the address of all the instructions in f need to be relocated by the length of the code for main, which is 59 bytes.

The function f contains a recursive call to itself. The entry address of f relative to its own code is zero, but after the relocation it is 60. Thus, the zero in the address field of the call instruction in the code for f must be replaced by 60. This adjustment is necessary because of the relocation of the code for f relative to the original object module that contained it.

The function main also contains a call to f. At the time main was compiled, that address of f was unknown. The linker now knows the address of f is 60 relative to the origin of the new object module, so it replaces the address field of the call instruction for f in main by 60. Thus, the zero in the address field of the call instruction in the code for f must be replaced by 60. This adjustment is necessary to link of the code for f to the code for main.

Complete Example

In directory examples/linkage:

Makefile
main.c - C source code
main.E - expanded source code
main.s - assembly code (i386)
main.o - object code
main.obj - human-readable object code dump
main.nm - dump of link names
f.c - C source code
f.E - expanded source code
f.s - assembly code
f.o - object code
f.obj - human-readable object code dump
f.nm - dump of link names
linkex - loadable main program
linkex.obj - human-readable object code dump
linkex.out - verbose output of ld when linking the program

Not all the linker actions were shown in the previous figure. It needed to relocate references to the variable i, and the string format. It also needed to link in the various references to libraries and C run time system code. The latter code includes the "wrapper" that initializes global data, calls "main", and on return passes the return value back to the operating system.

So that you can see all the details, links to the source code and files produced at various stages of the process are provided here and in the diagrams below.

Compilation Phases & Intermediate Files

	main.c main.s main.E main.obj main.nm f.c f.s f.E f.obj f.nm

Using various options for the C compiler, we can view the results of the intermediate phases of the compilation process, including the expanded object code, the assembly code, and the individual object files.

Since object files are in binary format, it is not possible to print them out or display them directly. The utility program provided for humans to view the contents of object files is objdump. The objdump output provided here is just one of many forms of report that that objdump utility can produce. It has many parameters and options, that allow an experienced person to extract and present the various data components of an object file in many other different ways.

Linking All the (Static) Components

By selecting the linker options "-Xlinker -verbose" and "-Xlinker -print-map" we can see what the linker, which is invoked as the final stage of gcc, did to produce the final executable program. The verbose linker informational output (which one does not ordinarily see) wasredirected to file linkex.out. That output shows that the linker opened the following object and library files:

LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crt1.o
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crti.o
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/crtbegin.o
LOAD main.o
LOAD f.o
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/libgcc.a
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../libc.so
LOAD /lib/libc.so.6
LOAD /usr/lib/libc_nonshared.a
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/libgcc.a
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/crtend.o
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crtn.o

The files with "crt" in their name are components of the C-language run time system. The files with "lib" in their name are libraries.

Function f Before Relocation & Linking

Offset	Hex Instr	Symbolic Instruction
0:	55	push	%ebp
1:	89 e5	mov	%esp,%ebp
3:	83 7d 08 01	cmpl	$0x1,0x8(%ebp)
7:	7f 07	jg	10 <f+0x10>
9:	8b 45 08	mov	0x8(%ebp),%eax
c:	89 c0	mov	%eax,%eax
e:	eb 18	jmp	28 <f+0x28>
10:	83 ec 0c	sub	$0xc,%esp
13:	8b 45 08	mov	0x8(%ebp),%eax
16:	48	dec	%eax
17:	50	push	%eax
18:	e8 fc ff ff ff	call	19 <f+0x19> [f]
1d:	83 c4 10	add	$0x10,%esp
20:	89 c0	mov	%eax,%eax
22:	0f af 45 08	imul	0x8(%ebp),%eax
26:	89 c0	mov	%eax,%eax
28:	c9	leave
29:	c3	ret
2a:	89 f6	mov	%esi,%esi

The first column of the table contains offsets, relative to the beginning of the object module. The second column of the table displays the actual code on the instruction starting at the given offset, as hexadecimal bytes. The third column displays the op-code of the instruction, in symbolic format. The fourth column displays the operand(s) of the instruction, in symbolic format.

This function is very short and simple. The only address to which the linker and loader need to pay attention is the reference to the entry address of f itself (shown in red), in the recursive call. The actual value of this address will not be known until the program is loaded.

Function main Before Relocation & Linking

Offset	Hex Instr	Symbolic Instruction
0:	55	push	%ebp
1:	89 e5	mov	%esp,%ebp
3:	83 ec 08	sub	$0x8,%esp
6:	83 ec 0c	sub	$0xc,%esp
9:	ff 35 00 00 00 00	pushl	0x0 [i]
f:	e8 fc ff ff ff	call	10 <main+0x10> [f]
14:	83 c4 10	add	$0x10,%esp
17:	89 c0	mov	%eax,%eax
19:	89 45 fc	mov	%eax,0xfffffffc(%ebp)
1c:	83 ec 04	sub	$0x4,%esp
1f:	ff 75 fc	pushl	0xfffffffc(%ebp)
22:	ff 35 00 00 00 00	pushl	0x0 [i]
28:	68 00 00 00 00	push	$0x0 [format]
2d:	e8 fc ff ff ff	call	2e <main+0x2e> [printf]
32:	83 c4 10	add	$0x10,%esp
35:	b8 00 00 00 00	mov	$0x0,%eax
3a:	c9	leave
3b:	c3	ret

This function contains more relocatable addresses, plus some external references. The address references to which the linker and loader need to pay attention are all shown in red. They include references to f, i, format, and printf.

Relocations of Symbols

Offset	Symbol
0x08048400	main
0x0804843c	f
0x080494ec	format
0x080494e8	i

The linker must decide on an order in which to lay out the objects declared in the two source modules, as they will appear in the new load module. The table shows the addresses that it chose for main, f, format, and i. This information came from the output of the linkage phase of gcc, which is in the file linkex.out.

After Relocation and Linking

Offset	Hex Instr	Symbolic Instruction
- main -
0x08048400	55	push	%ebp
0x08048401	89 e5	mov	%esp,%ebp
0x08048403	83 ec 08	sub	$0x8,%esp
0x08048406	83 ec 0c	sub	$0xc,%esp
0x08048409	ff 35 08 04 94 e8	pushl	0x0 [i]
0x08048400	e8 08 04 84 3c	call	10 <main+0x10> [f]
0x08048414	83 c4 10	add	$0x10,%esp
0x08048417	89 c0	mov	%eax,%eax
0x08048419	89 45 fc	mov	%eax,0xfffffffc(%ebp)
0x0804841c	83 ec 04	sub	$0x4,%esp
0x0804841f	ff 75 fc	pushl	0xfffffffc(%ebp)
0x08048422	ff 35 08 04 94 e8	pushl	0x0 [i]
0x08048428	68 08 04 94 ec	push	$0x0 [format]
0x0804842d	e8 fc ff ff ff	call	2e <main+0x2e> [printf]
0x08048432	83 c4 10	add	$0x10,%esp
0x08048435	b8 00 00 00 00	mov	$0x0,%eax
0x0804843a	c9	leave
0x0804843b	c3	ret
- f -
0x0804843c	55	push	%ebp
0x0804843d	89 e5	mov	%esp,%ebp
0x0804843f	83 7d 08 01	cmpl	$0x1,0x8(%ebp)
0x08048443	7f 07	jg	10 <f+0x10>
0x08048445	8b 45 08	mov	0x8(%ebp),%eax
0x08048448	89 c0	mov	%eax,%eax
0x0804844a	eb 18	jmp	28 <f+0x28>
0x0804844c	83 ec 0c	sub	$0xc,%esp
0x0804844f	8b 45 08	mov	0x8(%ebp),%eax
0x08048452	48	dec	%eax
0x08048453	50	push	%eax
0x08048454	e8 08 04 84 3c	call	19 <f+0x19> [f]
0x08048459	83 c4 10	add	$0x10,%esp
0x0804845c	89 c0	mov	%eax,%eax
0x0804845d	0f af 45 08	imul	0x8(%ebp),%eax
0x08048461	89 c0	mov	%eax,%eax
0x08048463	c9	leave
0x08048464	c3	ret
0x08048466	89 f6	mov	%esi,%esi
. . .
0x080494e8	00 00 00 00	i
0x080494ec	66 20 28 25 64 29 20 3d 20 25 64 0a 00	"f (%d) = %d\n"

The above table shows the code for main and f after static linking and relocation, into a single load module. Note that the reference to printf (in red) is not shown as resolved. That is because the linker found printf in the (dynamic) shareable object library libc.so. That library will be loaded at run time, if and when it is needed. Filling in the entry address of printf needs to be deferred until that time. In practice, the static loader will replaced by an address that will call the dynamic loader the first time it is referenced. The dynamic loader will load the code for printf (possibly along with other code) or find a copy of printf already in the system memory, and patch the address so that subsequent calls go directly to the loaded copy of printf.

Loading

Before a program can execute as a process, the program load module must be loaded into main memory.

Types of Loaders

absolute loader
relocating loader
dynamic relocating loader
dynamic linking & relocating loader

Loaders have evolved in complexity, and the object file formats have evolved in parallel. We will look at several points in the evolution of loaders and object files.

Absolute Loader

oldest and most primitive
load module always goes into the same location in main memory
location is predetermined at link time
addresses in code are all absolute
addresses are bound at link time

The first loaders only loaded (absolute) code into memory.

An absolute loader requires that the allocation of memory to load modules (not just processes) be fixed at the time the load modules are produced. In a modern operating system this limitation is not acceptable.

However, absolute loaders still have a use, for the first stage (or first few stages) of loading the operating system. This process, known as bootstrap loading, brings the first module (or few modules) of the operating system into main memory.

Relocating Loader

allows module to be loaded into any location in main memory
but not moved after initial loading
location is determined at load time
addresses in code are all relative
load module format must identify locations of addresses that need relocation,
and the bases to which they are relative
the latter is called the relocation dictionary
addresses are bound at load time

This type of loader was the first big advance in memory management. It allowed programs to be loaded at different locations in memory.

Dynamic Relocating Loader

allows module to be moved after initial loading
needed to support swapping
addresses are kept in relative form
base is added to address at run time
special hardware is required
1. base & bounds register model
2. multiple segment/base register model (e.g. IBM 360, Intel x86)

This type of loader was the next big advance in memory management. It allowed programs to be loaded at one location, swapped out and then reloaded at a different location in memory. This function was essential for the development of variable-partition multiprogramming. With the development of hardware support for virtual memory swapping could be performed without asking the loader to do relocation. However, relocation still needs to be done by modern loaders, because of the next step in evolution.

Dynamic Linking Loader

Performs linking on demand
May be done at load time, all at once
May be done at run time, incrementally
- can save main memory space
- can save time
- can defer binding of installation-dependent details
- can make system updates and bug-fixes much simpler
- can break working software
- can also be a security hole
Requires more information in object files (loader tables)

The main motivation for dynamic linking is to allow libraries to be updated without recompiling the load modules of the programs that use them. In order to provide this extra functionality, the loader is extended to include the functionality of a linker. In addition to loading, it must do both linkage and relocation of library modules. This kind of loader was originally developed in non-virtual memory systems, but the functionality is still needed with virtual memory. The only difference is that the relocation and linkage are done in terms of virtual addresses rather than (absolute) physical addresses.

Dynamically linked libraries are very powerful. They allows deferral of installation-dependent details to execution time. For example, see the Solaris/Linux file /etc/nsswitch.conf, which allows the system administrator to modify the functions that are used to look up network-related information, such as the host-to-IP-address mapping. The line of test in file nsswitch.conf shown in red specifies which source of information to use for the mapping between hostnames and IP addresses. Depending on what is set in this configuration file, different versions of functions like gethostbyname are called at run time.

This is also very important as a mechanism for OS vendors to provide bug-fixes and upgrades to a library without forcing relinking of every program that uses the library. You can see how painful it would be if you were in the business of selling licensed application software, and every time the OS vendor updated a library you had to relink your software and send updated copies to all your customers.

A down side of this is that an application that is working fine may stop working when a new library version is installed, because the application depends on an incidental feature of the library, that has changed in the new version. In theory, this should not happen if the application program was written carefully, in strict adherence to the library API. In practice, it still does happen. One way to reduce the chance of this happening is for the compiler/linker to specify the version of the shared object library on which a load module depends, and for the OS to keep around both old and new versions. In this way, and application that wants to protect against library upgrades can do so, at the loss of whatever benefits those upgrades provide.

Another big risk is that if a malefactor manages to replace a shared object library by one with intentional security holes (e.g., back door, trojan horse, time bomb), a trusted application may be compromised. This is especially true because a system with shared object libraries will provide a way to dynamically specify the search path that should be used to find shareable object libraries (in Unix, the environment variable LD_LIBRARY_PATH is used). By changing an environment variable, a malefactor could trick an application program into executing functions from a compromised library created by the malefactor, even without permission to write in the directories normally searched by the dynamic loader.

For the latter reason, system programs that carry with them special privilege (such the "ps" command in Unix, which can read the process table, and the "lpr" program, which can write to the print spool directory), are always statically linked. Take a look in the Unix directory "/sbin" for examples of statically linked programs.

Shareable Object Libraries

several processes can share one copy in memory
code must be reentrant
enabled by dynamic linking and virtual memory (to come)
allows code sharing between processes
allows OS to make better use of main memory
allows OS to reduce the amount of disk I/O needed for loading and swapping processes

Shareable object libraries are now the norm for dynamically linked libraries, because of the obvious advantages. In the Unix environment, a shareable object library is recognizable by the customary filename ending in ".so".

Static vs Dynamic Linking and Loading

In a system that supports dynamic loading and shareable object libraries, the linking and loading is divided into two parts. The static linker links some object modules together to form the initial load image of a program. It does not link in the libraries that are in a shareable format. Once the program is running, when it makes the first call to a function in a shareable library, the dynamic loader brings the necessary module(s) into main memory, and links them with the existing code. If a module has already been loaded, for a different process, and is still in memory, the dynamic loader just links it in. (This asssumes that shareable code modules are not self-modifying, and in memory they are protected against modification by processes, so both processes can share execute-only access to the same copy of the shareable code.)

Review of Binding Times for Functions

Programming/coding
Compilation/assembly
Static linking (load module creation)
Program load
Run time (dynamic loading on first reference)

With all these different options for linking and loading, which are under the shared control of the original application programmer, the operating system, the system administrator and the user, it is possible to bind details that affect various aspects of a program's behavior at a variety of times. Choosing the time at which a given detail is bound is an important part of the design of software. Later binding gives more flexibility to the system administrator or end user, along with more risk of unforseen effects from incorrect bindings.

T. P. Baker. ($Id)