COP4610: Operating Systems & Concurrent Programming up ↑

Linking & Loading

 

Context: Programming Language Implementation


There are various strategies for implementing programming languages. One important division is between direct execution and interpretation. The operating system provides the support for direct execution. Interpreters can be built on that.


Compilation, Assembly, Linking and Loading

*

Before a program can be executed, it must be translated from the programming language in which it was written (source code) to the machine instructions of the hardware that is to execute it, and the executable program must be loaded into main memory.

The first phase of the process is translation from source code to machine code. This is done by a compiler or assembler. Of course, the compilation process itself generally is broken into phases. Of these a C programmer needs to be aware of at least three of the phases, because they may detect different kinds of errors.

The C preprocessor handles the directives that begin with "#", such as #define, #include, and #ifdef. It expands the #includes and the macro calls. The output is still C source code, but without the preprecessor directives and macro calls. If you use the -E option with gcc, only the preprocessing phase is performed, and the expanded source code is put out. On most Unix systems, the preprocessor can also be invoked using the name cpp.

The central part of the compiler (which itself has several internal phases, translates the source code to assembly language. If you use the -S options with gcc, processing stops at the end of this phase, and the assembly language code is put out.

The assembler translates the assembly code to machine code, in the form of an object module. An object module is not an entire program. It is a translation of whatever is fed to the compiler (including preprocessor, compiler and assembler). If the compiler is fed just one function subprogram, the object modules contains just the machine code for that one function subprogram. It the gcc compiler is given assembly code as input, it will invoke just the assembly phase. (Using this and the -S option earlier, a person can hand-optimize the translation.) In Unix systems, the assembler usually can also be invoked using the name as.


Compilation, Linking and Loading

*

In order to make a complete program, various separately compiled (and assembled) modules must be combined. This is done by the linker. Finally, the linked program is loaded into memory by the loader.

In a Unix operating system, a process invokes the loader by calling one of the exec functions. The loader reads the load module from the specified file into memory, performing any necessary dynamic linkage (see further below) with libraries, and the kernel causes the process that called exec to start executing the new program.

The linker is also known as the "linkage editor", because it modifies (edits) the code to link the modules together.

In reality, this process is more complicated. In order to be able to share and reuse copies of common code, the system supports object libraries. The archiver (called "ar" in Unix) combines individual object files together into a library. The library is organized with an index, that allows the linker to find individual subprograms quickly.


Libraries


Libraries have evolved from a static model to a very dynamic model. The types of libraries correspond to the types of linkers and loaders that operate on them. We will cover each type of library in the context of the corresponding type of linking and/or loading.


Static Linker Functions


The linker does these two main things. The details depend on whether it is preparing a relocatable module or an absolute load module.


Example of Static Linkage & Link-Time Relocation: Source Code

f.c:

int f (int x) {
  if (x <= 1) return x;
  return x * f (x - 1);
}

main.c:

#include <stdio.h>
extern int f  (int x);
int i = 2;
char format[] = "f (%d) = %d\n";
int main (int argc, char const *argv[]) {
  int j;
  j = f (i);
  printf (format, i, j);
  return 0;
}

Suppose these two source files are each compiled into separate object files. We will look at what the linker has to do to combine them into a single load module.


Example of Static Linkage & Link-Time Relocation: Linker Actions

*

The linker decides to lay out the code for main and f contiguously, one immediately following the other in memory. This means the address of all the instructions in f need to be relocated by the length of the code for main, which is 59 bytes.

The function f contains a recursive call to itself. The entry address of f relative to its own code is zero, but after the relocation it is 60. Thus, the zero in the address field of the call instruction in the code for f must be replaced by 60. This adjustment is necessary because of the relocation of the code for f relative to the original object module that contained it.

The function main also contains a call to f. At the time main was compiled, that address of f was unknown. The linker now knows the address of f is 60 relative to the origin of the new object module, so it replaces the address field of the call instruction for f in main by 60. Thus, the zero in the address field of the call instruction in the code for f must be replaced by 60. This adjustment is necessary to link of the code for f to the code for main.


Complete Example

In directory examples/linkage:


Not all the linker actions were shown in the previous figure. It needed to relocate references to the variable i, and the string format. It also needed to link in the various references to libraries and C run time system code. The latter code includes the "wrapper" that initializes global data, calls "main", and on return passes the return value back to the operating system.

So that you can see all the details, links to the source code and files produced at various stages of the process are provided here and in the diagrams below.


Compilation Phases & Intermediate Files

*
*

Using various options for the C compiler, we can view the results of the intermediate phases of the compilation process, including the expanded object code, the assembly code, and the individual object files.

Since object files are in binary format, it is not possible to print them out or display them directly. The utility program provided for humans to view the contents of object files is objdump. The objdump output provided here is just one of many forms of report that that objdump utility can produce. It has many parameters and options, that allow an experienced person to extract and present the various data components of an object file in many other different ways.


Linking All the (Static) Components

*

By selecting the linker options "-Xlinker -verbose" and "-Xlinker -print-map" we can see what the linker, which is invoked as the final stage of gcc, did to produce the final executable program. The verbose linker informational output (which one does not ordinarily see) wasredirected to file linkex.out. That output shows that the linker opened the following object and library files:

LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crt1.o
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crti.o
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/crtbegin.o
LOAD main.o
LOAD f.o
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/libgcc.a
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../libc.so
LOAD /lib/libc.so.6
LOAD /usr/lib/libc_nonshared.a
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/libgcc.a
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/crtend.o
LOAD /usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crtn.o

The files with "crt" in their name are components of the C-language run time system. The files with "lib" in their name are libraries.


Function f Before Relocation & Linking

Offset Hex Instr Symbolic Instruction
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 7d 08 01 cmpl $0x1,0x8(%ebp)
7: 7f 07 jg 10 <f+0x10>
9: 8b 45 08 mov 0x8(%ebp),%eax
c: 89 c0 mov %eax,%eax
e: eb 18 jmp 28 <f+0x28>
10: 83 ec 0c sub $0xc,%esp
13: 8b 45 08 mov 0x8(%ebp),%eax
16: 48 dec %eax
17: 50 push %eax
18: e8 fc ff ff ff call 19 <f+0x19> [f]
1d: 83 c4 10 add $0x10,%esp
20: 89 c0 mov %eax,%eax
22: 0f af 45 08 imul 0x8(%ebp),%eax
26: 89 c0 mov %eax,%eax
28: c9 leave  
29: c3 ret  
2a: 89 f6 mov %esi,%esi

The first column of the table contains offsets, relative to the beginning of the object module. The second column of the table displays the actual code on the instruction starting at the given offset, as hexadecimal bytes. The third column displays the op-code of the instruction, in symbolic format. The fourth column displays the operand(s) of the instruction, in symbolic format.

This function is very short and simple. The only address to which the linker and loader need to pay attention is the reference to the entry address of f itself (shown in red), in the recursive call. The actual value of this address will not be known until the program is loaded.


Function main Before Relocation & Linking

Offset Hex Instr Symbolic Instruction
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 ec 08 sub $0x8,%esp
6: 83 ec 0c sub $0xc,%esp
9: ff 35 00 00 00 00 pushl 0x0 [i]
f: e8 fc ff ff ff call 10 <main+0x10> [f]
14: 83 c4 10 add $0x10,%esp
17: 89 c0 mov %eax,%eax
19: 89 45 fc mov %eax,0xfffffffc(%ebp)
1c: 83 ec 04 sub $0x4,%esp
1f: ff 75 fc pushl 0xfffffffc(%ebp)
22: ff 35 00 00 00 00 pushl 0x0 [i]
28: 68 00 00 00 00 push $0x0 [format]
2d: e8 fc ff ff ff call 2e <main+0x2e> [printf]
32: 83 c4 10 add $0x10,%esp
35: b8 00 00 00 00 mov $0x0,%eax
3a: c9 leave  
3b: c3 ret  

This function contains more relocatable addresses, plus some external references. The address references to which the linker and loader need to pay attention are all shown in red. They include references to f, i, format, and printf.


Relocations of Symbols

Offset Symbol
0x08048400 main
0x0804843c f
0x080494ec format
0x080494e8 i

The linker must decide on an order in which to lay out the objects declared in the two source modules, as they will appear in the new load module. The table shows the addresses that it chose for main, f, format, and i. This information came from the output of the linkage phase of gcc, which is in the file linkex.out.


After Relocation and Linking

Offset Hex Instr Symbolic Instruction
- main -
0x08048400 55 push %ebp
0x08048401 89 e5 mov %esp,%ebp
0x08048403 83 ec 08 sub $0x8,%esp
0x08048406 83 ec 0c sub $0xc,%esp
0x08048409 ff 35 08 04 94 e8 pushl 0x0 [i]
0x08048400 e8 08 04 84 3c call 10 <main+0x10> [f]
0x08048414 83 c4 10 add $0x10,%esp
0x08048417 89 c0 mov %eax,%eax
0x08048419 89 45 fc mov %eax,0xfffffffc(%ebp)
0x0804841c 83 ec 04 sub $0x4,%esp
0x0804841f ff 75 fc pushl 0xfffffffc(%ebp)
0x08048422 ff 35 08 04 94 e8 pushl 0x0 [i]
0x08048428 68 08 04 94 ec push $0x0 [format]
0x0804842d e8 fc ff ff ff call 2e <main+0x2e> [printf]
0x08048432 83 c4 10 add $0x10,%esp
0x08048435 b8 00 00 00 00 mov $0x0,%eax
0x0804843a c9 leave  
0x0804843b c3 ret  
- f -
0x0804843c 55 push %ebp
0x0804843d 89 e5 mov %esp,%ebp
0x0804843f 83 7d 08 01 cmpl $0x1,0x8(%ebp)
0x08048443 7f 07 jg 10 <f+0x10>
0x08048445 8b 45 08 mov 0x8(%ebp),%eax
0x08048448 89 c0 mov %eax,%eax
0x0804844a eb 18 jmp 28 <f+0x28>
0x0804844c 83 ec 0c sub $0xc,%esp
0x0804844f 8b 45 08 mov 0x8(%ebp),%eax
0x08048452 48 dec %eax
0x08048453 50 push %eax
0x08048454 e8 08 04 84 3c call 19 <f+0x19> [f]
0x08048459 83 c4 10 add $0x10,%esp
0x0804845c 89 c0 mov %eax,%eax
0x0804845d 0f af 45 08 imul 0x8(%ebp),%eax
0x08048461 89 c0 mov %eax,%eax
0x08048463 c9 leave  
0x08048464 c3 ret  
0x08048466 89 f6 mov %esi,%esi
. . .
0x080494e8 00 00 00 00 i
0x080494ec 66 20 28 25 64 29 20 3d 20 25 64 0a 00 "f (%d) = %d\n"

The above table shows the code for main and f after static linking and relocation, into a single load module. Note that the reference to printf (in red) is not shown as resolved. That is because the linker found printf in the (dynamic) shareable object library libc.so. That library will be loaded at run time, if and when it is needed. Filling in the entry address of printf needs to be deferred until that time. In practice, the static loader will replaced by an address that will call the dynamic loader the first time it is referenced. The dynamic loader will load the code for printf (possibly along with other code) or find a copy of printf already in the system memory, and patch the address so that subsequent calls go directly to the loaded copy of printf.


Loading

*

Before a program can execute as a process, the program load module must be loaded into main memory.


Types of Loaders


Loaders have evolved in complexity, and the object file formats have evolved in parallel. We will look at several points in the evolution of loaders and object files.


Absolute Loader


The first loaders only loaded (absolute) code into memory.

An absolute loader requires that the allocation of memory to load modules (not just processes) be fixed at the time the load modules are produced. In a modern operating system this limitation is not acceptable.

However, absolute loaders still have a use, for the first stage (or first few stages) of loading the operating system. This process, known as bootstrap loading, brings the first module (or few modules) of the operating system into main memory.


Relocating Loader

  • allows module to be loaded into any location in main memory
  • but not moved after initial loading
  • location is determined at load time
  • addresses in code are all relative
  • load module format must identify locations of addresses that need relocation,
    and the bases to which they are relative
  • the latter is called the relocation dictionary
  • addresses are bound at load time
*

This type of loader was the first big advance in memory management. It allowed programs to be loaded at different locations in memory.


Dynamic Relocating Loader


This type of loader was the next big advance in memory management. It allowed programs to be loaded at one location, swapped out and then reloaded at a different location in memory. This function was essential for the development of variable-partition multiprogramming. With the development of hardware support for virtual memory swapping could be performed without asking the loader to do relocation. However, relocation still needs to be done by modern loaders, because of the next step in evolution.


Dynamic Linking Loader


The main motivation for dynamic linking is to allow libraries to be updated without recompiling the load modules of the programs that use them. In order to provide this extra functionality, the loader is extended to include the functionality of a linker. In addition to loading, it must do both linkage and relocation of library modules. This kind of loader was originally developed in non-virtual memory systems, but the functionality is still needed with virtual memory. The only difference is that the relocation and linkage are done in terms of virtual addresses rather than (absolute) physical addresses.

Dynamically linked libraries are very powerful. They allows deferral of installation-dependent details to execution time. For example, see the Solaris/Linux file /etc/nsswitch.conf, which allows the system administrator to modify the functions that are used to look up network-related information, such as the host-to-IP-address mapping. The line of test in file nsswitch.conf shown in red specifies which source of information to use for the mapping between hostnames and IP addresses. Depending on what is set in this configuration file, different versions of functions like gethostbyname are called at run time.

This is also very important as a mechanism for OS vendors to provide bug-fixes and upgrades to a library without forcing relinking of every program that uses the library. You can see how painful it would be if you were in the business of selling licensed application software, and every time the OS vendor updated a library you had to relink your software and send updated copies to all your customers.

A down side of this is that an application that is working fine may stop working when a new library version is installed, because the application depends on an incidental feature of the library, that has changed in the new version. In theory, this should not happen if the application program was written carefully, in strict adherence to the library API. In practice, it still does happen. One way to reduce the chance of this happening is for the compiler/linker to specify the version of the shared object library on which a load module depends, and for the OS to keep around both old and new versions. In this way, and application that wants to protect against library upgrades can do so, at the loss of whatever benefits those upgrades provide.

Another big risk is that if a malefactor manages to replace a shared object library by one with intentional security holes (e.g., back door, trojan horse, time bomb), a trusted application may be compromised. This is especially true because a system with shared object libraries will provide a way to dynamically specify the search path that should be used to find shareable object libraries (in Unix, the environment variable LD_LIBRARY_PATH is used). By changing an environment variable, a malefactor could trick an application program into executing functions from a compromised library created by the malefactor, even without permission to write in the directories normally searched by the dynamic loader.

For the latter reason, system programs that carry with them special privilege (such the "ps" command in Unix, which can read the process table, and the "lpr" program, which can write to the print spool directory), are always statically linked. Take a look in the Unix directory "/sbin" for examples of statically linked programs.


Shareable Object Libraries


Shareable object libraries are now the norm for dynamically linked libraries, because of the obvious advantages. In the Unix environment, a shareable object library is recognizable by the customary filename ending in ".so".


Static vs Dynamic Linking and Loading

*

In a system that supports dynamic loading and shareable object libraries, the linking and loading is divided into two parts. The static linker links some object modules together to form the initial load image of a program. It does not link in the libraries that are in a shareable format. Once the program is running, when it makes the first call to a function in a shareable library, the dynamic loader brings the necessary module(s) into main memory, and links them with the existing code. If a module has already been loaded, for a different process, and is still in memory, the dynamic loader just links it in. (This asssumes that shareable code modules are not self-modifying, and in memory they are protected against modification by processes, so both processes can share execute-only access to the same copy of the shareable code.)


Review of Binding Times for Functions


With all these different options for linking and loading, which are under the shared control of the original application programmer, the operating system, the system administrator and the user, it is possible to bind details that affect various aspects of a program's behavior at a variety of times. Choosing the time at which a given detail is bound is an important part of the design of software. Later binding gives more flexibility to the system administrator or end user, along with more risk of unforseen effects from incorrect bindings.

T. P. Baker. ($Id)