| Linux Kernel & Device Driver Programming |
Examples of User-space solution:
Advantages to user-space solution:
Drawbacks/limitations of user-space solution:
These all boil down primarily to performance and secondarily to security. That is, you can create kernel services to intermediate or otherwise export the required memory and device access to user space, but these mechanisms add overhead (time delay and code size). They also must be protected carefully, since they can be abused (just as kernel modules can abuse direct access to kernel internals).
See example "Hello, World" module hello.c. Review features of this module.
See how preprocessor symbols are defined in Makefile.
With earlier kernel versions I would have suggested reading this Makefile to see and understand the make features used, the style, and the techniques. This still may be a good idea, but it has grown so complicated that to fully understand it would take more time and effort than can be expected in this course.
Do you understand what happens with namespace pollution?
What if two modules define and export the same symbol?
The following older-style module macros are deprecated, but you are still likely to see them used:
Beware that the terms "kernel space" and "user space" are used sometimes broadly and sometimes narrowly.
The broad use corresponds to hardware execution modes. Kernel code is executed in "kernel mode" (a.k.a. "supervisor mode" or "privileged mode"), which means it has access to a larger set of hardware instructions and a larger range of addresses than user code.
The narrow use corresponds to virtual address spaces. The Unix/Linux convention is to divide the mapped portion of a process's virtual address space into two parts: "kernel memory" and "user memory". In kernel mode both parts are accessible, but in user mode only the user memory is accessible.
Because Unix/Linux links access to kernel memory with kernel execution mode, the terms "kernel space", "kernel mode", and "kernel memory" are interchangeable in many contexts.
Are you clear on synchronous versus asynchronous transfers into the OS kernel?
Are you clear on what reentrancy means?
Are you clear on the relationships and limitations of the different forms of concurrency control: nonpreemption, interrupt masking, and spin locks?
printk ("The process is \"%s\" (pid %i)\n", current->comm, current->pid);
Why is current a macro?
See one definition of this macro in linux/include/asm-i386/current.h.
The actual implementation has evolved, and is likely to continue to evolve. It started out as a simple global variable, but that did not work when Linux was extended to SMP systems, since each processor has its own current process. In the SMP version of Kernel 2.4 it was implemented via a hardware instruction that identified the current CPU and then used that as an index to find the right current process/task in an array. In Kernel 2.6.11 it seems the implementation has evolved further, based on a convention that the kernel stack space of each thread (the term that has replaced process and task) has a fixed (small) size. The task descriptor of each thread is laid out contiguous with the kernel stack of the thread, so it can be found by looking at the high-order bits of the stack pointer register value. At least this is what is done on the i386 architecture. It will vary on other architectures. For example, on the SPARC there is a dedicated register for this purpose.
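The stack-pointer trick can be tried out in user space. The following is only a sketch under stated assumptions: THREAD_SIZE, struct task_info, alloc_thread_block, and current_from_sp are names made up for this illustration, and the 8 KB block size is assumed. It allocates one aligned block with the descriptor at the bottom, then recovers the descriptor from an arbitrary address inside the block by clearing the low-order bits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define THREAD_SIZE 8192u  /* assumed block size; must be a power of two */

/* Hypothetical stand-in for the kernel's per-thread descriptor. */
struct task_info {
    int pid;
};

/* Allocate one THREAD_SIZE-aligned block: the descriptor sits at the
 * bottom, and the rest would serve as that thread's stack.
 * Returns NULL if the allocation fails. */
static struct task_info *alloc_thread_block(int pid)
{
    struct task_info *ti = aligned_alloc(THREAD_SIZE, THREAD_SIZE);

    if (ti != NULL)
        ti->pid = pid;
    return ti;
}

/* Recover the descriptor from any address inside the block by keeping
 * only the high-order bits of that address. */
static struct task_info *current_from_sp(const void *sp)
{
    uintptr_t base;

    assert(sp != NULL);
    base = (uintptr_t)sp & ~((uintptr_t)THREAD_SIZE - 1);
    return (struct task_info *)base;
}
```

Passing any address between the start of the block and its last byte to current_from_sp yields the pointer that alloc_thread_block returned. This only works because the block is aligned to its own size, which is why the fixed stack size matters.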
For information on how to read (and write) gcc inline assembly code, see the GCC Inline Assembly Howto or the gcc info pages.
In Kernel 2.4 it was fairly easy to compile kernel modules independently from the kernel source tree.
In Kernel 2.6 module compilation is done relative to a kernel source tree, and the kernel Makefile takes care of these details for you.
The misc-modules Makefile shows how you can invoke that makefile from a location outside the kernel source tree to compile code outside the source tree.
Observe that this makefile is called recursively. That is, one first calls make with this makefile; inside, there is a second call to make, using the kernel makefile; the kernel makefile then calls/includes this makefile back again to find out the set of modules to be compiled in the current directory.
If you try to compile a kernel module independently, you need to do the set-up work that is done by the kernel Makefile, including the following:
#include <linux/config.h>
#ifdef CONFIG_SMP
#define __SMP__
#endif
See also the file Documentation/CodingStyle for Linus' recommendations on coding style. Not everyone agrees with all of it. (I happen to agree with all but the rule about always indenting 8 spaces and the rule about putting start-function braces on a new line, and would add to it the rule to *never* use tabs.) In any case, when you are maintaining somebody else's code you need to preserve the established coding conventions. If you ever want your kernel module to be included in the baseline Linux distribution, you would be wise to follow Linus' style guidelines.
The introduction of a new ".ko" format for kernel object files came after kernel 2.4. With the 2.4 kernel the convention was to use the ".o" format for kernel modules.
These are usually used with preprocessor conditionals (#ifdef) to write module code that will compile and run with both older and newer kernel versions.
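As a rough illustration of version-conditional code, here is a stand-alone sketch. It assumes the macros in question are LINUX_VERSION_CODE and KERNEL_VERSION from linux/version.h, and it defines MY_ prefixed copies so that it compiles outside a kernel tree. The 2.6.11 version value and the 2.6.0 cut-off for the ".ko" suffix are assumptions chosen for the example.

```c
#include <assert.h>

/* Same encoding as the classic KERNEL_VERSION macro in <linux/version.h>;
 * prefixed MY_ because this is a stand-alone illustration. */
#define MY_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Assumed value for illustration; a real module gets LINUX_VERSION_CODE
 * from the kernel headers it is compiled against. */
#define MY_LINUX_VERSION_CODE MY_KERNEL_VERSION(2, 6, 11)

/* Select something at compile time, the way a module might select
 * between an older and a newer kernel interface. */
#if MY_LINUX_VERSION_CODE >= MY_KERNEL_VERSION(2, 6, 0)
#define MODULE_SUFFIX ".ko"
#else
#define MODULE_SUFFIX ".o"
#endif

static const char *module_suffix(void)
{
    /* The encoding orders versions correctly only while the minor and
     * patch numbers each fit in 8 bits. */
    assert(MY_KERNEL_VERSION(2, 4, 255) < MY_KERNEL_VERSION(2, 5, 0));
    return MODULE_SUFFIX;
}
```

Because the comparison is done by the preprocessor, the branch for the other kernel series is never compiled, so it may refer to interfaces that do not exist in the kernel being built against.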
VERSIONFILE = $(INCLUDEDIR)/linux/version.h
VERSION = $(shell awk -F\" '/REL/ {print $$2}' $(VERSIONFILE))
INSTALLDIR = /lib/modules/$(VERSION)/misc
...
install:
	install -d $(INSTALLDIR)
	install -c $(OBJS) $(INSTALLDIR)
The modules go into a subdirectory of /lib/modules whose name matches the kernel version.
int __init my_init (void) {
    int err;
    /* registration takes a pointer and a name */
    err = register_this (ptr1, "skull");
    if (err) goto fail_this;
    err = register_that (ptr2, "skull");
    if (err) goto fail_that;
    err = register_those (ptr3, "skull");
    if (err) goto fail_those;
    return 0; /* success */
fail_those:
    unregister_that (ptr2, "skull");
fail_that:
    unregister_this (ptr1, "skull");
fail_this:
    return err; /* propagate the error */
}
"__init" tells gcc to put the code into a special section of the load module, which the kernel may unload after the code executes.
Standard error codes are defined in <linux/errno.h>.
void __exit my_cleanup (void) {
    unregister_those (ptr3, "skull");
    unregister_that (ptr2, "skull");
    unregister_this (ptr1, "skull");
}
It is customary, but not required, to unregister in reverse order of registration.
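To watch the unwinding order, the goto pattern can be exercised in user space. This is a sketch only: do_register, do_unregister, the trace buffer, and the fail_at knob are invented for the illustration and stand in for real registration calls.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

static char trace[64];  /* records calls in order, e.g. "+this-this" */
static int fail_at;     /* hypothetical knob: which step fails (0 = none) */

/* Pretend to register something; fails with -EBUSY at step fail_at. */
static int do_register(int step, const char *tag)
{
    assert(strlen(trace) + strlen(tag) + 2 <= sizeof trace);
    if (step == fail_at)
        return -EBUSY;
    strcat(trace, "+");
    strcat(trace, tag);
    return 0;
}

/* Pretend to undo a registration. */
static void do_unregister(const char *tag)
{
    assert(strlen(trace) + strlen(tag) + 2 <= sizeof trace);
    strcat(trace, "-");
    strcat(trace, tag);
}

/* Same shape as the kernel example: each label undoes only the steps
 * that had already succeeded, in reverse order. */
static int my_init(void)
{
    int err;

    err = do_register(1, "this");
    if (err) goto fail_this;
    err = do_register(2, "that");
    if (err) goto fail_that;
    err = do_register(3, "those");
    if (err) goto fail_those;
    return 0;

fail_those:
    do_unregister("that");
fail_that:
    do_unregister("this");
fail_this:
    return err;
}
```

With fail_at set to 3, the trace ends up as "+this+that-that-this": the two successful registrations are undone in reverse order, and the failed one is never undone.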
The "__exit" in the example above tells gcc to put the code into a special section of the load module, which does not need to be loaded if the module is statically linked into the kernel. In the example below there is no "__exit" because the function my_cleanup may be called during module initialization.
struct something *item1;
struct somethingelse *item2;
int stuff_ok;
void my_cleanup (void) {
    if (item1) release_thing (item1);
    if (item2) release_thing2 (item2);
    if (stuff_ok) unregister_stuff ();
    return;
}
int __init my_init (void) {
    int err = -ENOMEM;
    item1 = allocate_thing (arguments);
    item2 = allocate_thing2 (arguments2);
    if (!item1 || !item2) goto fail;
    err = register_stuff (item1, item2);
    if (!err) stuff_ok = 1; else goto fail;
    return 0; /* success */
fail:
    my_cleanup ();
    return err;
}
Also, look out if you find yourself wanting to write co-dependent modules. "Therein lies madness".
What happens with double decrement?
Take a look at the /proc/modules file while some modules are in the kernel.
autofs 13700 0 (autoclean) (unused)
3c59x 31312 1
iptable_filter 2412 0 (autoclean) (unused)
ip_tables 15864 1 [iptable_filter]
mousedev 5688 0 (unused)
keybdev 2976 0 (unused)
hid 22404 0 (unused)
input 6240 0 [mousedev keybdev hid]
usb-ohci 22088 0 (unused)
usbcore 80512 1 [hid usb-ohci]
ext3 72960 3
jbd 56752 3 [ext3]
raid1 16300 3
0000-001f : dma1
0020-003f : pic1
0040-005f : timer
0060-006f : keyboard
0070-007f : rtc
0080-008f : dma page reg
00a0-00bf : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : ide1
01f0-01f7 : ide0
02f8-02ff : serial(auto)
0376-0376 : ide1
03c0-03df : vga+
03f6-03f6 : ide0
03f8-03ff : serial(auto)
0cf8-0cff : PCI conf1
1000-103f : 3Com Corporation 3c905 100BaseTX [Boomerang]
  1000-103f : 00:0c.0
1050-1053 : Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] System Controller
2000-2fff : PCI Bus #01
  2000-20ff : ATI Technologies Inc 3D Rage Pro AGP 1X/2X
f000-f00f : Advanced Micro Devices [AMD] AMD-766 [ViperPlus] IDE
  f000-f007 : ide0
  f008-f00f : ide1
int check_region(unsigned long start, unsigned long len);
struct resource *request_region(unsigned long start, unsigned long len, char *name);
void release_region(unsigned long start, unsigned long len);
This is a portable simplification. In kernel version 2.6.11 linux/ioport.h these are actually macros:
#define request_region(start,n,name) __request_region(&ioport_resource, (start), (n), (name))
extern struct resource * __request_region(struct resource *, unsigned long start, unsigned long n, const char *name);
#define check_region(start,n) __check_region(&ioport_resource, (start), (n))
extern int __check_region(struct resource *, unsigned long, unsigned long);
#define release_region(start,n) __release_region(&ioport_resource, (start), (n))
extern void __release_region(struct resource *, unsigned long, unsigned long);
#include <linux/ioport.h>
#include <linux/errno.h>
static int skull_detect (unsigned int port, unsigned int range)
{
    int err;
    if ((err = check_region (port, range)) < 0) return err; /* busy */
    if (skull_probe_hw (port, range) != 0) return -ENODEV; /* not found */
    request_region (port, range, "skull"); /* "can't fail" */
    return 0;
}
static void skull_release (unsigned int port, unsigned int range)
{
    release_region (port, range);
}
00000000-0009f7ff : System RAM
0009f800-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000c7fff : Video ROM
000dc000-000dcfff : Advanced Micro Devices [AMD] AMD-766 [ViperPlus] USB
  000dc000-000dcfff : usb-ohci
000e0000-000effff : Extension ROM
000f0000-000fffff : System ROM
00100000-3ffeffff : System RAM
  00100000-0026b019 : Kernel code
  0026b01a-0037b9c3 : Kernel data
3fff0000-3ffffbff : ACPI Tables
3ffffc00-3fffffff : ACPI Non-volatile Storage
f4001000-f4001fff : Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] System Controller
f4100000-f41fffff : PCI Bus #01
  f4100000-f4100fff : ATI Technologies Inc 3D Rage Pro AGP 1X/2X
f5000000-f5ffffff : PCI Bus #01
  f5000000-f5ffffff : ATI Technologies Inc 3D Rage Pro AGP 1X/2X
f8000000-fbffffff : Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] System Controller
fec00000-fec0ffff : reserved
fee00000-fee00fff : reserved
fff80000-ffffffff : reserved
int check_mem_region (unsigned long start, unsigned long len);
int request_mem_region (unsigned long start, unsigned long len, char * name);
int release_mem_region (unsigned long start, unsigned long len);
if (check_mem_region (mem_addr, mem_size)) {
    printk ("drivername: memory already in use\n"); return -EBUSY;
}
request_mem_region (mem_addr, mem_size, "drivername");
Is there a dangerous race condition here?
declared in linux/ioport.h:
struct resource {
    const char *name;
    unsigned long start, end;
    unsigned long flags;
    struct resource *parent, *sibling, *child;
};
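To see how the parent, sibling, and child pointers can represent a tree of non-overlapping ranges, here is a user-space sketch. It is not the kernel's code: claim_resource is a name made up for this illustration, the flags field is left out, and there is no locking, so the sketch is only safe single-threaded. The point to notice is that the overlap test and the insertion happen in one pass, so the claiming call itself can serve as the availability test.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Trimmed-down copy of the fields used here; flags omitted. */
struct resource {
    const char *name;
    unsigned long start, end;   /* inclusive range */
    struct resource *parent, *sibling, *child;
};

/* Insert res under root, keeping the children sorted by start.
 * Returns 0 on success, -EBUSY if res overlaps an existing child,
 * -EINVAL if the range is malformed or falls outside root. */
static int claim_resource(struct resource *root, struct resource *res)
{
    struct resource **p;

    assert(root != NULL && res != NULL);
    if (res->end < res->start ||
        res->start < root->start || res->end > root->end)
        return -EINVAL;

    /* Skip children that lie entirely below the new range. */
    p = &root->child;
    while (*p != NULL && (*p)->end < res->start)
        p = &(*p)->sibling;

    /* The next child, if any, must start above the new range. */
    if (*p != NULL && (*p)->start <= res->end)
        return -EBUSY;

    res->sibling = *p;
    res->parent = root;
    res->child = NULL;
    *p = res;
    return 0;
}
```

Claiming 0x280-0x28f and then 0x288-0x297 under the same root makes the second call return -EBUSY, while a later claim of 0x040-0x05f succeeds and is linked in ahead of the first entry.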
/*
 * port range: the device can reside between 0x280 and 0x300, in steps of 0x10.
 * It uses 0x10 ports.
 */
#define SKULL_PORT_FLOOR 0x280
#define SKULL_PORT_CEIL 0x300
#define SKULL_PORT_RANGE 0x010
/*
 * the following function performs autodetection, unless a specific
 * value was assigned by insmod to "skull_port_base"
 */
static int skull_port_base = 0; /* 0 forces autodetection */
MODULE_PARM (skull_port_base, "i");
MODULE_PARM_DESC (skull_port_base, "Base I/O port for skull");
static int skull_find_hw (void) /* returns the # of devices */
{
    /* base is either the load-time value or the first trial */
    int base = skull_port_base ? skull_port_base : SKULL_PORT_FLOOR;
    int result = 0;
    /* loop one time if value assigned; try them all if autodetecting */
    do {
        if (skull_detect (base, SKULL_PORT_RANGE) == 0) {
            skull_init_board (base); result++;
        }
        base += SKULL_PORT_RANGE; /* prepare for next trial */
    } while (skull_port_base == 0 && base < SKULL_PORT_CEIL);
    return result;
}
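The loop in skull_find_hw can be checked in user space by swapping the hardware probe for a stub. In this sketch fake_detect, fake_board_ports, fake_board_count, and find_hw are made-up stand-ins, and the board positions are whatever the caller stores in the array; only the three SKULL_PORT constants come from the code above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKULL_PORT_FLOOR 0x280
#define SKULL_PORT_CEIL  0x300
#define SKULL_PORT_RANGE 0x010

static unsigned int fake_board_ports[2];  /* hypothetical: where boards "exist" */
static int fake_board_count;

/* Stand-in for skull_detect(): 0 if a fake board sits at this base. */
static int fake_detect(unsigned int base)
{
    for (int i = 0; i < fake_board_count; i++)
        if (fake_board_ports[i] == base)
            return 0;
    return -ENODEV;
}

/* Same loop shape as skull_find_hw(): one trial if port_base is set,
 * otherwise every slot from FLOOR up to, but not including, CEIL.
 * The number of probes made is stored through trials. */
static int find_hw(unsigned int port_base, int *trials)
{
    unsigned int base = port_base ? port_base : SKULL_PORT_FLOOR;
    int result = 0;

    assert(trials != NULL);
    *trials = 0;
    do {
        (*trials)++;
        if (fake_detect(base) == 0)
            result++;
        base += SKULL_PORT_RANGE;  /* prepare for next trial */
    } while (port_base == 0 && base < SKULL_PORT_CEIL);
    return result;
}
```

With autodetection (port_base of 0) the loop probes the eight bases 0x280, 0x290, and so on through 0x2f0. With a nonzero port_base it probes exactly once, whether or not a board is found there.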
The above concepts are illustrated in the example module skull.
| © 2003, 2004, 2005 T. P. Baker. ($Id: ch2.html,v 1.1 2010/06/07 14:29:15 baker Exp baker $) |