|COP 4610: OPERATING SYSTEMS & CONCURRENT PROGRAMMING|
As we study memory management, we will focus mainly on the original problem for which memory management techniques and algorithms were developed: the allocation of main (originally core) memory to processes. As we will see, work on that problem led to the development of hierarchical paged virtual memory, which has replaced the earlier-developed techniques for allocating main memory to processes.
We study these techniques in the context of the original problem to provide historical context and motivation for the developments that came later, but that is far from the only reason. All of the techniques, data structures, and algorithms developed for allocating main memory to processes still have other important applications, outside of the allocation of main memory to processes.
These outline notes are based largely on the presentation in Chapter 7 of Stallings' book. Please see the book for the full-text explanation of most of the outline points. Detailed notes are provided here on just a few points, where we want to go beyond Stallings.
The complexity of memory management has grown as the functional complexity of operating systems has grown. The original problem was limited to allocation of main memory.
Over time, the search for faster performance and lower cost has led to the development of hierarchical memory implementations. This has extended the scope of the allocation problem, both to the different levels individually and to the interactions between levels. For the moment, we limit ourselves to one level of memory.
Stallings quotes an earlier textbook (Lister & Eager) as characterizing the memory management problem in terms of five requirements, each of which we will consider in some detail.
The figure shows a schematic "process image", i.e. all the memory-resident system data structures, the executable code of the process, the execution stack (assuming a single-threaded process), and the global static and dynamically allocated data areas. It is shown as a contiguous region of memory for simplicity, not to imply that this must be so. The arrows from point to point within the process image indicate several different ways in which the code or data may depend on the location where the process image is loaded into memory. In particular, the red arrows indicate cases of addresses embedded in code, which will need adjustment to reflect the (re)location of the code in memory.
In order to fully understand the relocation requirement, one needs to understand the program compilation, assembly, linkage, and loading process. To keep the flow of the presentation on memory management going, Stallings places his explanation of this material in an appendix. We also choose to cover that material in a separate set of notes, but choose to provide a bit more detail than Stallings. Please see the separate notes on Linking and Loading.
We will review the historical evolution of memory management in operating systems, leading up to the virtual memory organization found in most present-day systems.
Decide which of the free memory partitions to allocate to fill a request
In order to discuss dynamic relocation precisely, we must distinguish several different kinds of addresses.
The hardware support for dynamic relocation shown in the figure includes a base register and a bounds register, which define the starting and ending addresses of the region of memory addressable by the process. These values are set when the process is loaded and again whenever the process is swapped in. The value of the base register is added (by the hardware) to a relative address to produce an absolute address. The resulting address is compared with the value in the bounds register. If the address is not within bounds, an interrupt is generated to the operating system.
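As a minimal sketch of what the hardware does on each memory reference, the base-plus-bounds check can be modeled in software like this (the struct and function names here are hypothetical, chosen only for illustration):

```c
#include <stdbool.h>
#include <stdint.h>

/* Software model of the relocation hardware: base and bounds hold
 * absolute addresses, set at load time and at swap-in. */
typedef struct {
    uint32_t base;   /* absolute start of the process's region */
    uint32_t bounds; /* absolute end (exclusive) of the region */
} relocation_regs;

/* Translate a relative address to an absolute one.  Returns false to
 * model the addressing-error interrupt to the operating system. */
bool translate(const relocation_regs *r, uint32_t rel, uint32_t *abs_out)
{
    uint32_t abs = r->base + rel;  /* hardware adds the base register */
    if (abs >= r->bounds)          /* hardware compares against bounds */
        return false;              /* -> interrupt to the OS */
    *abs_out = abs;
    return true;
}
```

Note that because only the two register values change when the process is moved, the process image itself needs no adjustment at swap-in, which is the whole point of dynamic relocation.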
Is there a good reason for the base register to point to the location just past the process control block?
Is there a good reason for the bounds register to point to a location before the stack?
These are three obvious simple algorithms that an operating system could apply when it must decide which of the currently free blocks of memory to allocate to a given process.
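The three simple placement algorithms are first fit, best fit, and next fit. Two of them can be sketched over a singly linked free list as follows (the `block` type and function names are hypothetical, for illustration only; next fit is the same loop as first fit but resuming from a roving pointer left by the previous allocation):

```c
#include <stddef.h>

/* A hypothetical free-block list node: size of the free block,
 * plus a link to the next free block. */
typedef struct block {
    size_t size;
    struct block *next;
} block;

/* First fit: take the first free block large enough for the request. */
block *first_fit(block *free_list, size_t request)
{
    for (block *b = free_list; b != NULL; b = b->next)
        if (b->size >= request)
            return b;
    return NULL;  /* no block large enough */
}

/* Best fit: scan the entire list for the smallest block that is still
 * large enough.  This costs a full scan and tends to leave behind
 * fragments too small to be useful. */
block *best_fit(block *free_list, size_t request)
{
    block *best = NULL;
    for (block *b = free_list; b != NULL; b = b->next)
        if (b->size >= request && (best == NULL || b->size < best->size))
            best = b;
    return best;
}
```

A real allocator would also split the chosen block, returning the requested portion and leaving the remainder on the free list.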
The implementation of C malloc() and C++ new must solve the same problems for memory management within the virtual address space of a single process as the memory management portions of operating systems did for management of memory between processes before the advent of virtual memory. In other words, understanding the available algorithms and the performance issues here will help you understand performance of programs and classes/libraries that use dynamic memory allocation. One big difference is that within a process we generally cannot compact free memory (Why not?), so external fragmentation is a more serious matter.
The same issues come up again inside the OS kernel, for allocation of data structures that are used to keep track of processes, files, etc. An understanding of issues such as internal/external fragmentation and runtime overheads of placement algorithms here will help you understand why kernel-internal data structures are mostly equal-sized, and pre-allocated.
When is replacement needed with dynamic memory partitioning? That is, under what circumstances would an OS need to swap out a process before the process has completed execution?
What are some reasonable policies for replacement with dynamic memory partitioning?
Replacement is discussed in more detail under virtual memory systems.
For comparison, see the boundary tag scheme for storage management, explained in the notes on memory allocation.
The figure shows how a buddy system would work on a storage area of size 64K, through a series of three allocations and two deallocations. Notice that any request for a block whose size is not a power of two results in some wasted storage (fragmentation). This is usually classified as internal fragmentation, since we effectively allocate the entire block of size 2^k.
Observe that the pairs of buddies form a conceptual binary tree, and that the starting address of a block can be read off the path from the root of the tree.
There is no need to build an explicit tree using pointers.
The address of the buddy of a block of size 2^k is obtained by "flipping" (i.e., changing from zero to one or from one to zero) the k-th bit of the block's address.
For example, the size 4 (2^2) buddy of the block with address 01000 is the block with address 01100, and vice versa.
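The bit-flip rule is a one-line exclusive-or. As a minimal sketch (the function name is hypothetical):

```c
#include <stdint.h>

/* Address of the buddy of a block of size 2^k: flip bit k of the
 * block's starting address.  XOR with the mask (1 << k) flips exactly
 * that bit and leaves all others unchanged. */
uint32_t buddy_of(uint32_t addr, unsigned k)
{
    return addr ^ (1u << k);
}
```

Applied to the example above: `buddy_of(0x08, 2)` (binary 01000, size 2^2) yields 0x0C (binary 01100), and applying it again returns the original address, since XOR is its own inverse.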
See the code in fsuthreads/malloc.
The GNU implementation of malloc() is an example of a full-fledged implementation of a storage management system. This version is not the latest, but it should still be useful as an example. You may benefit from reading it, in particular, free.h and malloc.c.
Paging is a technique that solves the placement problem by dividing memory up into equal-sized chunks, called pages. These days, it is applied along with virtual memory (coming up in Chapter 8), but it can also be done without virtual memory. Take care to be sure you understand the difference between paging and virtual memory.
The figure shows how (external) fragmentation is no longer a problem, without any need for compaction. When 5-page process D needs to be brought in, it can use the three pages from process B, plus two more pages. They do not need to be contiguous.
To implement paging, the system needs to maintain a page table for each process, which records, for each page of the process's logically contiguous address space, the location in memory of the frame that holds that page.
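The translation itself splits a logical address into a page number (the upper bits) and an offset within the page (the lower bits), then looks up the page's frame in the page table. A minimal sketch, assuming hypothetical 1 KiB pages and a simple array-of-frame-numbers page table:

```c
#include <stdint.h>

#define PAGE_SHIFT 10               /* assume 1 KiB pages: 10-bit offset */
#define PAGE_SIZE  (1u << PAGE_SHIFT)

/* page_table[i] holds the frame number in which page i resides. */
uint32_t paged_translate(const uint32_t *page_table, uint32_t logical)
{
    uint32_t page   = logical >> PAGE_SHIFT;      /* upper bits: page number   */
    uint32_t offset = logical & (PAGE_SIZE - 1);  /* lower bits: page offset   */
    return (page_table[page] << PAGE_SHIFT) | offset;
}
```

Because the page size is a power of two, the split is pure bit manipulation, which is what makes this cheap enough for hardware to do on every reference.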
|© 2002, 2005 T. P. Baker & Florida State University. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means without written permission. (Last updated by $Author: cop4610 $ on $Date: 2002/09/26 20:06:06 $.)|