servers.html

++ servers.html

Aperiodic Scheduling

These notes make use of mathematical symbols that may not be visible with the default settings on some versions of some browsers.

Symbol	Symbol Name	Symbol	Symbol Name	Symbol	Symbol Name
∼	sim	∑	sum	⌈	lceil
⌉	rceil	⌊	lfloor	⌋	rfloor
→	rarr	⇒	rArr	∞	infin
<	lt	≤	le	>	gt
≥	ge	≠	ne	δ	delta
Δ	Delta	∂	part	τ	tau
π	pi	φ	phi	∞	infin

If a blank space appears to the left of any of the names in the table above, or the symbol looks strange, try changing the font settings on your browser or using a different browser.

Two Kinds of Non-Periodic Tasks

sporadic: released randomly, with no useful upper bound on interarrival time, but has a minimum interarrival time
aperiodic: released randomly, with no useful upper or lower bound on interarrival time

Usage of the terms "sporadic" and "aperiodic" is not uniform across all authors in real time systems.

In these notes, I use the term "aperiodic task" with the above meaning, and specify whether we have hard or soft deadlines if that is important. Note, however, that Jane Liu's terminology is different, and diverges from that of most authors on this subject. She classifies as "periodic task" what we are calling a sporadic task, classifies as "sporadic task" what we are calling an aperiodic task with hard deadlines, and classifies as "aperiodic task" what we are calling an aperiodic task with soft deadlines.

Some writers use "aperiodic" as a synonym for "not periodic", in which case sporadic tasks would be a subset of aperiodic tasks.

Note also that the "sporadic server" scheduling algorithm, which we will cover below is not specifically linked to any of the above definitions of "sporadic task".

By a "useful" lower bound I mean a bound that can be used effectively to analyze the task's schedulability as a periodic task. For example, we may know that the minimum interarrival time for an aperiodic task is ten microseconds, because that is the minimum hardware interrupt latency, but that too short to be a useful lower bound for schedulability analysis. It would be better to analyze that task as of type 2.

Aperiodic Tasks

are not periodic
therefore, cannot be guaranteed to be served within a hard deadline — the deadlines must be soft
arrivals and executions times are generally viewed as random
if we know the arrival and service time distributions we may be able to predict the response time distribution
in particular, we may be able to determine the average response time

How to Determine Average Case Response Times

Build a system, instrument it, and test it
Build a software simulation model, instrument it, and run it
Build a mathematical model, and analyze it (Queueing Theory)

Projected Versus Actual Response Time

Projecting performance from limited experimental or simulation data can be inaccurate. The figure (from William Stallings' Operating Systems text) compares actual response time of a system against a projection made from actual data up to a load of 0.5. The projection is based on fitting a 3rd-order polynomial to the known data.

Queueing Theory

The statistical performance of an aperiodic server can be analyzed using techniques from queueing theory.

(Please see separate notes on Queueing Theory for background.)

In particular, all of the aperiodic server algorithms approximate the performance of an ideal processor sharing model in which the aperiodic server executes on its own dedicated processor at speed e_s/p_s.

This approximation can be used to estimate the values of e_s and p_s required to achieve a given average response time.

In general, an aperiodic server cannot do better than such an ideal queueing theory model over a long time interval. However, certain aperiodic server scheduling techniques can do a little bit better locally, since during a burst of server activity the server is actually executing at speed 1, rather than e_s/p_s.

Aperiodic Server Scheduling Algorithms

General (compatible with RM or EDF scheduling of periodics)
- Background server
- Polling server
- Deferrable server
- Sporadic server
Other deadline based (compatible with EDF scheduling of periodics)
- Constant bandwidth server
- Total bandwidth server

The basic idea of aperiodic server scheduling algorithms is for the aperiodic service requests to be executed by a server thread. The server is a sort of virtual processor, which is allocated a limited fraction of the processor bandwidth. Aperiodic service requests (aperiodic jobs) are put onto a queue (buffer) for the server. The server takes jobs from its queue and executes them, as fast as it can using its available CPU time. How long it takes for the server to get around to serving a given job depends on the server's bandwidth and the amount of other work that is ahead of it in the server's queue.

The differences in algorithms have to do with how the processor bandwidth is allocated to the server. In all cases, the server is supposed to be able to be treated (more or less) as if it were a periodic task, insofar as its effects of the server's workload on the scheduling of other tasks in the system.

We will look at several of the aperiodic server scheduling algorithms in more detail.

Basic Idea of Aperiodic Server

Jobs of an aperiodic task are not required to complete before the next job is released. Jobs are queued up, and executed by a server (a schedulable entity, e.g., thread) as soon as the server can get to them.

The server is allocated a budget, which is consumed when aperiodic jobs are executed, and replenished from time to time. When the server's budget is down to zero, or the server is preempted by some higher priority task, requests must wait to be served. Different server scheduling schemes use different rules to determine when the server budget is replenished, when the server budget can be carried forward, and the priority at which the server executes.

The ideal is that the server be scheduled in a way that allows us to model the server's impact on the schedulability of other tasks as if the server were a periodic task. In particular, we would like to have it be true that in any time interval of length p_s the server cannot preempt other tasks for longer than e_s execution time.

Kinds of Aperiodic Servers

We can use the aperiodic server concept with:

fixed-priority (e.g., rate or deadline monotonic) scheduling
each server has a fixed priority
deadline-based scheduling
each server has a virtual deadline, which is updated dynamically
(this is not the deadline of the jobs executed by the server)

Deadline server principle: In a deadline scheduling environment, each allocation of server budget comes with an associated deadline, which governs the scheduling of the server while it is using that portion of its budget.

Background Server

Aperiodic requests are served at lowest priority. They do not interfere with any periodic task, but aperiodic response times are quite variable and generally long.

Polling (Periodic) Server

Unused budget is given up whenever all requests are served.

Budget can never exceed e_s.

Scheduling interference on other tasks is no worse than that of periodic task with period p_s and e_s.

EDF Polling Server Example

In this example, the server polls with period 5 and execution budget 2, for a total utilization of 40%. There are two periodic tasks. &tau₁ (light grey) has period 10, execution time 2, and utilization 20%. τ₂ (dark grey) has period 15, execution time 6, and utilization 40%. Ties in deadline are resolved in favor of the polling server.

Fixed Priority and EDF Versions of Polling Server

fixed priority: server is a periodic fixed-priority task
the priority of the server determines its effect on schedulability of other tasks
deadline: server is a periodic task, scheduled by EDF
the execution time, period, and relative deadline of the server determine is effect on schedulability of other tasks

In both cases, the effect of the server on the delays encountered by other tasks is no worse than as if the server were replaced by a normal periodic task with the same period and execution time.

Polling Server Problem

Response time may be nearly a whole period later, even under very light load.

This comparatively long response time is one of the factors that motivated the invention of other server scheduling models, including the deferrable servers.

Polling Server Response Time vs. Period

The segmented lines are average response times of a simulated polling server, with average interarrival time 3600 and 69% periodic load, for several different server periods.

The lower smooth curve is the expected performance of a processor 100% dedicated to serving the aperiodic load, with no periodic load, as predicted by the M/M/1 queueing theory model.

The limiting case of a time-sliced server, with infinitesimal time slicing interval and the same processor utilization as the polling server, as predicted by the M/M/0.31 queueing theory model, is shown by the higher one of the two smooth curves.

The test involved the following set of ten periodic tasks:

i	p_i	e_i
1	5400	600
2	14400	600
3	24000	500
4	43200	6400
5	54000	3000
6	67500	4500
7	72000	7200
8	90000	3600
9	108000	2000
10	120000	10500

Deferrable Server

This is a form of "bandwidth preserving" server. The unused budget is retained until the next replenishment point. This allows requests to be served earlier.

The deferrable server may be scheduled with a fixed priority, or scheduled according to EDF. In the latter case, the deadline is incremented by p_s at each budget replenishment.

Deferrable Server Problem

Workload of server may be as high as 2e_s in a single interval of length p_s. This means impact of server on schedulability of other tasks is greater than that of a periodic task with period p_s and e_s.

The deferrable server model falls short of our ideal for schedulability analysis, in that we need to allow for the possibility of the above case, even though it probably will not happen very often. This reduces the schedulable utilization (i.e., the utilization at which we know all tasks are schedulable) of the system.

Fixed Priority Deferrable Server Analysis

The worst-case interference caused by the server occurs when there are back-to-back server executions at the end of the time interval.

w_i(t) = e_i + e_s +

⌈

t-e_s

p_s

⌉

e_s+

i-1

&sum

k=1

⌈

p_k

⌉

e_k

EDF Deferrable Server Analysis

The main difference is that the worst-case deadline DS time demand is not quite as high as the fixed-priority DS time demand.

Fixed Priority						Deadline
e_s +	⌈	t-e_s p_s	⌉	e_s		e_s +	⌊	t-e_s p_s	⌋	e_s

That is, a job with deadline past the interval will not have high enough priority to preempt a job with deadline inside the interval.

Deadline Deferrable Server example

In this example, the server polls with period 5 and execution budget 1.63, for a total utilization of 32.6%. There are two periodic tasks. &tau₁ (light grey) has period 10, execution time 2, and utilization 20%. τ₂ (dark grey) has period 15, execution time 6, and utilization 40%. Ties in deadline are resolved in favor of the deferrable server.

Deadline Deferrable Schedulability Test (Ghazalie & Baker)

A set of periodic hard-deadline tasks is schedulable by EDF with a DDS server of period p_s and replenishment e_s if the following holds for each task τ_k.

∑

i=1

(

e_i

min(d_i,p_i)

)+(1+

p_s-e_s

d_k

)

e_s

t_s

≤ 1

This follows from the interfering demand bound above, using the fact that &lfloor x ⌋ ≤ x, and that for task τ_k to miss a deadline the interfering demand within any busy interval of length d_k must exceed e_k.

Maximum allowable DDS utilization related to server period

These are simulation results for three specific periodic task sets, at three different utilization levels. The the server utilization level is that above which periodic tasks start to miss deadlines.

Note that the effect of back-to-back server executions, reflected in the term (p_s-e_s)/d_k) of the schedulability test, requires a reduction of the server utilization for large server periods to prevent missed periodic deadlines.

Sporadic Server

Replenishments are done in chunks, delayed according to when the chunks were used. Load never exceeds e_s in any interval of length p_s, so impact on schedulability of other tasks is no greater than that of a collection of periodic tasks with period p_s and total execution time e_s.

The above figure shows what J. Liu calls the SpSL sporadic server variant. There are several other variants. One reason for the variations is that the original version, which was published by Sprunt, Sha, and Lehoczky (SpSL), had a defect not caught by the referees, which (under certain circumstances) would allow the server interference to exceed that of a periodic task with the same period and budget, due to early replenishment. The POSIX variant also a defect with a similar effect.

Note that the term "sporadic" in Sporadic Server scheduling does not imply the method is limited to scheduling sporadic tasks, or that the serve behaves exactly like a sporadic task. Unfortunately, this does add to the confusion over exactly what "sporadic task" means.

Variations on Fixed-Priority Sporadic Server

J. Liu's "Simple" Sporadic Server
Sporadic/Background Server
Cumulative Replenishment
Sprunt's (SpSL) Sporadic Server (the original, but defective)
J. Liu's corrected SpSLSporadic Server
IEEE POSIX Sporadic Server

"Simple" Fixed-Priority Sporadic Server

The server only executes when it has a nonzero budget
The server starts out with budget e_s
The server retains its full budget from the point of replenishment until the next time the server starts to execute
While the server executes it consumes its budget at the rate of real time
The server retains the unused portion of its budget so long as the system is executing at or above the server priority, i.e., when it is executing the server or any task with higher priority
After the server has executed at least once since its last replenishment, it consumes its budget at the rate of real time whenever the system is running below the server priority
t_r = the latest actual replenishment time
(Every time the scheduler replenishes the server's budget, the budget is set to e_s and t_r is set to the current time.)
t_f = first instant after t_r at which server begins to execute
(Every time the server first begins to execute after a replenishment t_f is set to the current time and a replenishment is scheduled for time t_e+p_s
t_e = latest effective replenishment time
This is set to at time t_f.
It is set to t_f, unless the system was running at priority ≥ just before t_f.
In the latter case it is set to max (BEGIN, t_r) if END = t_f, where:
- BEGIN = the beginning of the most recent level-s busy interval, meaning the system is busy running jobs with priority ≥ the server.
  This will be a time at which the system goes from being idle or executing a task with lower priority than the server to executing the server or a higher priority task.
- END = the end of the most recent level-s busy interval, or ∞ if the interval has not yet ended
Replenishment is done early, at the start of the next busy period, if the system becomes idle before a scheduled replenishment.
A scheduled replenishment is postponed, until the server's remaining budget is consumed, if the scheduled replenishment time occurs while the server still retains budget.

J. Liu calls this server "simple" because there is never more than one replenishment pending and the replenishment always brings the budget all the way up to e_s.

However, as she says later, this "simple" version is actually more complicated to implement.

For this reason, and because we have limited classroom time, I may decide to skip over this version of the algorithm in class.

Warning: Since I found Jane's description rather complicated and difficult to follow, I tried to rewrite her description in my own words, above. I hope I have not changed the meaning. If in doubt, please refer to the textbook for her exact description.

Also please see the textbook for the informal proof of correctness of this algorithm, i.e., the proof that there can be no time interval of length p_s in which the server uses more than e_s execution time.

Background Simple Sporadic Server

Uses available background time. Same rules apply as for SSS, except:

As long as the periodic task set is idle, the server keeps an execution budget of e_s
(i.e., so long as it can use background time it does not consume any of its budget)
The budget is replenished at the start of every idle interval, and t_r is set at the end of the idle interval.

This makes sense when there is only one sporadic server in the system.

Cumulative Replenishment

J. Liu explains another variation on the SS, in which the server is allowed to accumulate replenishments, up to a point. The rules are subtle, to prevent the problem we had with the deferrable server. We will not cover these rules in class. You should read them in the text.

Sprunt, Sha & Lehoczky (SpSL) Sporadic Server

The basic idea is to allow the server to consume and replenish its budget in chunks of less than e_s.

As J. Liu explains, the original published description of the algorithm was over-simplified, to the point of being incorrect. This is a valuable lesson.

The server has a capacity of e_s, with initial replenishment time t_r = 0

Consumption:

The server consumes its budget whenever it executes.

Replenishment:

The time for the next replenishment to occur is set when:
1. the priority level of the server becomes active, if the server has replenishment time available, or
2. the server budget is replenished, if the server's priority level is active
Replenishment is scheduled whenever:
1. the server's priority level becomes inactive, or
2. the server's budget has been exhausted
The amount of the replenishment is the amount of time consumed by the server since the last time a replenishment was scheduled for it

An anomaly with the SpSL algorithm

What Sprunt, Sha & Lehoczky probably had in mind

How to Correct the Definition?

The server budget is maintained in chunks, each of which has an associated availability time.
1. If the availability time is the future, the chunk is pending
2. If the availability time is in the present or past, the chunk is available
Initially, there is only one chunk of budget, of size e_s and with availability time t_r = 0
The server only executes when all of the following are true:
1. It has one or more non-zero chunks of budget available
2. It has pending work
3. It is not preempted by some higher priority task.
The server is said to be active whenever conditions a and b are satisfied.
While the server executes, the processor time it uses is charged against the chunk of available budget with the earliest availability time, which we call the current chunk.
Whenever the charged time equals the chunk size
1. The availability time of the chunk is postponed by p_s
2. The size of the chunk remains unchanged
3. If the server has another available chunk, it continues execution using that chunk; otherwise, it stops executing until a chunk becomes available
Whenever the server completes all its pending work it stops executing until new work arrives.
1. If the amount of time left in the chunk currently being used by the server is below a specified threshold or the number of chunks has reached a specified limit, the availablity time of the chunk is advanced by p_s
  This rule is solely to reduce proliferation of small chunks.
2. Otherwise, the chunk is split into two chunks
  1. A new chunk is created with size equal to the the amount of time X consumed from the current chunk,and the size of the current chunk is reduced by X.
  2. The availability time of new chunk is postponed by p_s and the availability time of the current chunk is unchanged
    This guarantees that the server will not be able to use this portion of the budget again until the earliest time that a periodic task with execution time X and period p_s could execute again.
Whenever the active priority of the system climbs from below the server priority to a level ≥ the server priority (because the server or a task with higher priority has begun to execute), all available chunks are consolidated into one chunk, if necessary, and the availability time is set equal to the current time.
This guarantees that the server cannot use any chunks that would have been used earlier by a periodic task with the same priority.
Whenever the system becomes entirely idle all future chunks are consolidated into a single chunk of size e_s with availability time equal to the current time.
This helps to compensate for fragmentation of the budget caused by rule 6b above, and can only improve response times for aperiodic jobs. It is safe for analyses based on the Critical Zone Theorem, since it allows us to consider the schedule between any two idle points as a separate, independent, schedule.

Jane Liu shows some ways to correct the SpSL sporadic server anomaly, including her "simple" sporadic server. The above is my own attempt to correct the SpSL anomaly. Does it ensure the property we want? That is, is it true that the interference caused by the server in any busy time interval can never exceed that of a perodic task with periodic p_s and execution time e_s?

IEEE POSIX Sporadic Server

The following is from the IEEE POSIX and Open Group Unix 98 standards documents:

If _POSIX_SPORADIC_SERVER or _POSIX_THREAD_SPORADIC_SERVER is defined, the implementation shall include a scheduling policy identified by the value SCHED_SPORADIC.

The sporadic server policy is based primarily on two parameters: the replenishment period and the available execution capacity. The replenishment period is given by the sched_ss_repl_period member of the sched_param structure. The available execution capacity is initialized to the value given by the sched_ss_init_budget member of the same parameter. The sporadic server policy is identical to the SCHED_FIFO policy with some additional conditions that cause the thread's assigned priority to be switched between the values specified by the sched_priority and sched_ss_low_priority members of the sched_param structure.

The priority assigned to a thread using the sporadic server scheduling policy is determined in the following manner: if the available execution capacity is greater than zero and the number of pending replenishment operations is strictly less than sched_ss_max_repl, the thread is assigned the priority specified by sched_priority; otherwise, the assigned priority shall be sched_ss_low_priority. If the value of sched_priority is less than or equal to the value of sched_ss_low_priority, the results are undefined. When active, the thread shall belong to the thread list corresponding to its assigned priority level, according to the mentioned priority assignment. The modification of the available execution capacity and, consequently of the assigned priority, is done as follows:

When the thread at the head of the sched_priority list becomes a running thread, its execution time shall be limited to at most its available execution capacity, plus the resolution of the execution time clock used for this scheduling policy. This resolution shall be implementation-defined.

Each time the thread is inserted at the tail of the list associated with sched_priority because as a blocked thread it became runnable with priority sched_priority or because a replenishment operation was performed the time at which this operation is done is posted as the activation_time.

When the running thread with assigned priority equal to sched_priority becomes a preempted thread, it becomes the head of the thread list for its priority, and the execution time consumed is subtracted from the available execution capacity. If the available execution capacity would become negative by this operation, it shall be set to zero.

When the running thread with assigned priority equal to sched_priority becomes a blocked thread, the execution time consumed is subtracted from the available execution capacity, and a replenishment operation is scheduled, as described in 6 and 7. If the available execution capacity would become negative by this operation, it shall be set to zero.

When the running thread with assigned priority equal to sched_priority reaches the limit imposed on its execution time, it becomes the tail of the thread list for sched_ss_low_priority, the execution time consumed is subtracted from the available execution capacity (which becomes zero), and a replenishment operation is scheduled, as described in 6 and 7.

Each time a replenishment operation is scheduled, the amount of execution capacity to be replenished, replenish_amount, is set equal to the execution time consumed by the thread since the activation_time. The replenishment is scheduled to occur at activation_time plus sched_ss_repl_period. If the scheduled time obtained is before the current time, the replenishment operation is carried out immediately. Several replenishments may be pending at the same time, each of which will be serviced at its time. With the above rules, the number of replenishment pending for a given thread that is scheduled under the sporadic server policy shall not be greater than sched_ss_max_repl.

A replenishment operation consists of adding the corresponding available execution capacity at the scheduled time. If, as a consequence of this operation, the execution capacity would become larger than sched_ss_initial_budget, it shall be rounded down to a value equal to sched_ss_initial_budget. Additionally, if the thread was runnable or running, and had assigned priority equal to sched_ss_priority, then it becomes the tail of the thread list for sched_priority.

Execution time is defined in Section 2.2.2 (on page 462).

For this policy, changing the value of a CPU-time clock via clock_settime( ) shall have no effect on its behavior.

For this policy, valid priorities shall be within the range returned by the and sched_get_priority_max( ) functions when SCHED_SPORADIC is provided as the parameter. Conforming implementations shall provide a priority range of at least 32 distinct priorities for this policy.

POSIX Sporadic Server in J. Liu's Notation

Consumption:

The server consumes its budget only when it executes at the server priority (sched_priority). Otherwise, the budget is retained.
When the server is not suspended and has a nonempty budget, it is scheduled at the server (higher) priority.
When the server is not suspended but has an empty budget, it is scheduled at a lower priority (sched_ss_low_priority).

Replenishment:

When the server becomes runnable at the server priority (because it wakes up with a nonempty budget, or was already awake but had a zero budget replenished) t_e is set to the current time.
When the server suspends itself (e.g., the server job queue becomes empty) or runs out of execution budget a replenishment is scheduled.
Whenever a replenishment is scheduled, the scheduled replenishment amount is the total execution time consumed by the server since t_e and the replenishment time t_r = t_e + p_s.
If t_r ≤ now, the replenishment is done immediately

In addition, the POSIX version of the Sporadic Server algorithm has a limit on the number of replenishments that a thread may have outstanding at any instant. If this bound is reached, the remaining execution time budget of the thread is given up and added to the amount of the last replenishment. This helps to combat fragmentation of the budget.

Q: Does the POSIX sporadic scheduling policy suffer from the defect of the original publication SpSL version?

Deadline Sporadic Server

Consumption:

The server consumes its budget when and only when it executes.
The server budget at any time consists of a set of chunks χ_i,...χ_j, where each chunk is a pair consisting of an execution time and a replenishment time. Chunks must be consumed in order of replenishment time.
The effective server deadline is t_r+p_s, where t_r is the replenishment time of the earliest available server budget chunk and p_s is the server period.

Replenishment:

Initially, the server has one chunk χ₀ = (e_s, 0)
When the server finishes consuming a chunk χ_m or its queue becomes empty, the δ units of χ_m that the server has consumed is split off and scheduled for replenishment
The replenishment time is the current deadline t_s + p_s of the server at the time the replenishment is scheduled

Server priority activation time:

The most recent server priority activation time t_s is defined as follows:
1. If t_s is undefined, and the server becomes eligible to execute, t_s := Now
2. If t_s is undefined, and the processor starts to execute a task with deadline d ≤ Now + p_s, t_s := Now
3. If t_s is defined, and the processor starts to execute a task with deadline d such that t_s + p_s < d ≤ Now + p_s, t_s := d - p_s
4. If t_s is defined, and the processor becomes idle or starts to execute a task with deadline d such that Now + p_s < d, t_s := undefined
5. If t_s is defined, and the server begins a new chunk χ_m with replenishment time ρ_m > t_s, t_s := ρ_m

Coalescing chunks:

At any time t that the server deadline becomes undefined, (i.e., the server priority becomes inactive) coalesce all the available chunks into a single chunk with replenishment time t.

The server has an initial budget e_s and a period p_s, which define the server's utilization (a.k.a. size), u_s = e_s/p_s.

The server is "eligible to execute" when its queue and budget are both non-empty.

The chunk χ_m is the available chunk with the earliest replenishment time. The execution time used by the server is charged to χ_m.

When the server consumes all of the chunk χ_m or completes all the queued requests, the portion of χ_m that the server has consumed is split off and scheduled for replenishment. Suppose the server has used δ execution time. A new chunk, χ_m', is created, with size σ_m' = δ and replenishment time ρ_m' = t_s, where t_s is the current deadline of the server (which is defined further below). The size of χ_m is reduced by δ. If the new size is zero, the chunk χ_m is deleted, and its role is then played by the remaining chunk with earliest replenishment time.

In each case of the definition of t_s the variable Now denotes the time at which the condition becomes true. Assume there is an idle-task, with deadline ∞, that executes when no other task is eligible to execute; by the fifth case above, t_s is undefined when the idle task is running.

Stated informally, t_s is p_s later than the most recent time at which the (current) server priority became active. However, we are forced define it without reference to the server priority, to avoid circularity in the definition, since the priority depends on the deadline, which depends on t_s.

Note that t_s is always defined when the server is eligible to execute, and is never later than the time the server last became eligible to to execute. Sometimes it may be earlier, if the processor was continuously occupied with higher priority tasks just before the server became eligible to execute.

For any time t when t_s is defined, it is true that in the interval [t_s,t] the processor is continuously busy executing tasks with deadlines ≤ t_s+p_s. It follows that the effect on the schedulability of other tasks is the same as if the server had become eligible to execute at t_s and remained eligible to execute during the entire interval [t_s,t].

Note that the replenishment method of the DSS algorithm forces the server not to re-use a chunk of execution time until at least p_s time units later than the server last could have become eligible to use that chunk. Thus, the execution time of the server within any time interval cannot exceed the amount that would be used by a periodic task with the same period and execution time.

The chunk merging step is to prevent the execution time budget from being fragmented into smaller and smaller chunks. This cannot affect the scheduling outcome, since ρ_i is only used in updating t_s (and computing the server deadline) when ρ_i > t_s, and we know t_s ≥ t.

Based on the arguments above, we have the following lemma, which characterizes the deadline sporadic server load.

DSS Processor Demand

Lemma: For every time interval [t',t] such that t is a missed hard deadline, t' ≤ t-e_s and there is no task with deadline later than t executing in the interval, an upper bound of the deadline sporadic server's demand for CPU time in the interval is

⌊

t - t'

p_s

⌋

e_s

The following theorem provides a sufficient condition for feasibility of a set of (periodic and aperiodic) tasks under the DSS algorithm.

DSS Sufficient Schedulability Condition

Theorem: A set of hard-deadline tasks is schedulable by EDF scheduling and the DSS algorithm if

n
∀k
k=1

(

k
∑
i=1

e_i

min {D_i,p_i}

)

+ u_s ≤ 1

where u_s = e_s/p_s is the aperiodic server utilization.

Deadline Sporadic Server example

Deadline Exchange Server (DXS)

Consumption:

The server consumes its budget when and only when it executes.
The server budget at any time consists of one chunk, of size b.
The effective server deadline is t_r+p_s, where t_r is the last replenishment time and p_s is the server period.

Replenishment:

Initially, the server has budget b=e_s
When the server budget becomes zero or the server queue becomes empty, b := 0 and a replenishment is scheduled to occur at time t_s + p_s, the current deadline of the server at the time the replenishment is scheduled

Server priority activation time:

The most recent server priority activation time t_s is defined as follows:
1. If t_s is undefined, and the server becomes eligible to execute, t_s := Now
2. If t_s is undefined, and the processor starts to execute a task with deadline d ≤ Now + p_s, t_s := Now
3. If t_s is defined, and the processor starts to execute a task with deadline d such that t_s + p_s < d ≤ Now + p_s, t_s := d - p_s
4. If t_s is defined, and the processor becomes idle or starts to execute a task with deadline d such that Now +p_s < d, t_s := undefined

The server deadline is computed the same as for DSS.

The replenishment is not done in chunks. Instead, each time t that the server suspends (either because its queue is empty or because it has consumed all of its budget) the budget is set to zero and a single replenishment of size e_s is scheduled.

The time of this replenishment is t + (c_s - r)p_s/e_s, where r is the amount of the server budget that is remaining at time t.

This algorithm is much simpler than the DSS because it does not have to manage more than one replenishment event, but the performance is generally about as good.

Do not be misled by the work "exchange" to assume that the Deadline Exchange Server is the same idea as the Priority Exchange Server of Lehoczky, Sha, and Strosnider or the Dynamic Exchange Algorithm of Spuri and Buttazzo.

Deadline Exchange Server example

The following graphs show the comparative performance of four different deadline-based aperiodic server scheduling algorithms, based on a simulation.

The simulations used three different simulated periodic task sets. The aperiodic task arrivals and execution times were generated pseudo-randomly, according to an inverse exponential distribution.

The M/M/1 curve indicates the response time if the aperiodic tasks were served by a dedicated CPU of their own. The M/M/(1-U) curve indicates the response time if the aperiodic tasks were served by a dedicated CPU with speed U, where U is the bandwidth allocated to the aperiodic server.

There is something wrong about the labeling on some of the plots below, in particular where the periodic load is claimed to be 69% and the M/M/0.12 server is claimed for comparison. I should re-run these experiments to see what is the real problem.

Mean aperiodic IAT 1800, 40% periodic load

Mean aperiodic IAT 5400, 40% periodic load

Mean aperiodic IAT 1800, 88% periodic load

Mean aperiodic IAT 5400, 69% periodic load

Mean aperiodic IAT 1800, 88% periodic load

Mean aperiodic IAT 5400, 88% periodic load

DDS response time versus period, IAT 3600, 69% periodic load

I am no longer satisfied with the explanation in the DSS paper about why the response time at low load levels suddenly jumps up between period 43200 and 86400. This deserves further explanation.

DSS response time versus period, IAT 3600, 69% periodic load

DXS response time versus period, IAT 3600, 69% periodic load

RM versus deadline sporadic servers, IAT 1800

RM versus deadline sporadic servers, IAT 3600

RM versus deadline sporadic servers, IAT 5400

The work on the EDF deferrable and sporadic servers was done by an FSU student, Teguh Ghazalie, with my supervision and assistance, as part of a master's thesis around 1993-1994.

Generalized EDF Schedulability Theorem

Theorem: A system of independent preemptable sporadic jobs is schedulable according to the EDF algorithm if the total density of all active jobs in the system is no greater than 1 at all times.

Here the density of sporadic job J_i with release time r_i maximum execution time e_i and absolute deadline d_i is e_i/(d_i-r_i).

Variations on definition of density:

This refers to jobs, not tasks.
The term "density" is define here for a job.

The mixure of notation is a bit confusing, since in this context d_i - r_i is the relative deadline, and for a task the realtive deadline is d_i.
The density of a sporadic task is defined to be e_i/min d_i if we assume d_i ≤ p_i, or e_i/min (d_i,p_i) in general.

Constant Utilization Server

Consumption Rules:
- Server consumes budget only when it executes
Replenishment Rules:
- R1: initially, e_s = d_s = 0
- R2: when an aperiodic job with execution time e arrives at time t to an empty server queue
  1. if t < d_s, do nothing
  2. if t ≥ d_s, e_s := e and d_s := t + e_s/u_s
- R3: at the deadline d_s of the server
  1. if the server is idle, do nothing
  2. if the server is backlogged, e_s := e and d_s := d_s+ e_s/u_s, where e is the execution time of the top job in the server queue

A CUS is always given enough budget to complete the job at the head of its queue each time its budget is replenished. The deadline is adjusted so that the instantaneous utlization (density) is always equal to u_s.

Here we deviated from J. Liu's notation by omitting the tilde over the 'u' in the notation u_s for the instantaneous utilization of the server, because it is difficult to produce in HTML.

Question in class: Why not combine all the queued jobs into one big job, and set the budget big enough to complete them all?

Answer: This could be done, but to do so would lower the priority (postpone the deadline) of all but the last job in the group. Thus, we would have potentially poorer reponse time for all but the last job in the group.

Constant Utilization Server with Unknown Exec Time

On page 223, Jane Liu mentions a way of removing the assumption that the execution times of jobs are known at the times of their arrivals. The wording is a bit cryptic, but I believe what she intends is the following.

I believe this is less effective than the DXS, since with the DXS replenishments occur sooner.

For example, suppose we have server budget u and use a fixed nominal aperiodic task execution time (i.e., quantum) e. With the modified CBS when a task with execution time e₁ arrives at time t₁ and finds an empty system, the server is initialized with budget e and deadline t+e/u. If e_i ≤ e, the server will complete the job no later than the deadline, say at time t'₁. If by the deadline there is another job in the server queue, with execution time e₂, the server budget will be replenished to e and the deadline will be adjusted to t + (e₁ + e)/u.

With the DXS, the server also starts out with budget e and (if a lower priority task was executing) deadline t+e/u. If e_i ≤ e, the server will complete the job no later than the deadline, say at time t'₁. At that time the budget will be set to zero and a replenishment will be scheduled for time t+e₁/u. At time t+e₁/u, the replenishment will occur and the budget will be set to e. This replenishment will occur earlier than with the CBS.

Consider in class going through the algorithm to see what happens with this algorithm if we don't know the execution times of aperiodic jobs, and simply use a fixed constant for the budget e.

Total Bandwidth Server

Replenishment Rules:

R1: Initially, e_s = 0 and d_s = 0
R2: When an aperiodic job with exec. time e arrives at time t to an empty aperiodic job queue, e_s := e; d_s := max{d_s,t}+e_s/u_s
R3: When the server completes the current aperiodic job, the job is removed from its queue.
1. if the server is backlogged, e_s := e; d_s := d_s + e_s/u_s
2. if the server is idle, do nothing

Note that the main difference between this algorithm (TBS) and the preceding one (CBS) is that replenishments can occur before the period has expired. When a job completes before the server deadline, and there is another job waiting, the two are effectively combined into one. That is, the server is immediately replenished, but the deadline is dropped back to what it would have been if the two jobs had arrived together and been treated as a single job.

Jane Liu mentions that her comment on the Constant Bandwidth Server (about adapting it to a timeslicing model for situations where the execution times of aperiodic jobs are not known a priori) also applies to the TBS.

Reconsidering our earlier, suppose we have server budget u and use a fixed nominal aperiodic task execution time (i.e., quantum) e. With the modified TBS when a task with execution time e₁ arrives at time t₁ and finds an empty system, the server is initialized with budget e and deadline t+e/u. If e_i ≤ e, the server will complete the job no later than the deadline, say at time t'₁. If at that time there is another job in the server queue, with execution time e₂, the server budget will be replenished to e and the deadline will be adjusted to t + (e₁ + e)/u.

Basic Principles of Servers

Reclaiming deadline, when job completes before exhausing server budget —
It seems this is always a win.
Using start of busy interval, rather than job arrival time, to base server deadline —
It seems this is always a win.
Allowing server to start new job immediately, at the cost of a later deadline (TBS), versus waiting until expiration of server period and getting a shorter relative deadline (CBS) — It seems the former is always a win, since it makes better use of idle time, and the absolute deadline ends up being no later.
Giving server smaller budget in order to get shorter deadline, versus larger budget in order to complete without needing replenishment —
It seems the former allows faster response time if the job is short, but pays a price in the overhead of rescheduling if the job does not complete within the budget. Therefore, adjustment of the server budget and period needs to be done to fit the job execution time distribution.

It seems that there are too many different aperiodic server algorithms, which no clear guidance which one to use. Is one of these "best" overall? I'm not sure anyone has actually yet described the best algorithm, or done a good job of comparing the merits of the algorithms that have already been described. The above is my attempt to cut through this confusion.

Based on these principles, I proposed the following "ideal" aperiodic server algorithm.

Baker's Ideal Aperiodic Server Scheduling Algorithm

The server has a fixed budget quantum q_s, a fixed utilization u_s. Define p_s = q_s/u_s, which we call the "period" of the server.

R1: Initially the server queue is empty and the server is inactive, so e_s = 0 and d_s = ∞.
R2: When an aperiodic job arrives at time t
1. if the server queue is empty the server activates, e_s := q_s and d_s := d_s + p_s.
2. if the server queue is nonempty, no action is required
R3: If the server uses up its budget (i.e, e_s becomes zero) before the server's current job is done, e_s := q_s and d_s := d_s + p_s
R4: If the server completes its current job, the job is removed from its queue and
1. if the server queue is nonempty the server stays active, and begins the next job in its queue, with the same deadline and remaining budget
2. if the server queue is empty, the server deadline is adjusted backward to account for the unused portion of the budget, and the budget is set to zero. That is, if e_s is the remaining server budget the server deadline is set back as d_s := d_s - p_s * e_s / q_s, and then e_s :=0.
R5: If the system becomes idle, the server deadline is reset to d_s := now + p_s.

The proposal above is intended as a challenge. It may or may not describe a valid EDF server scheduling policy, in the sense that it may or may not allow the server to cause more interference for other tasks than could a sporadic task with the same utilization. In other words, no one has proved that the EDF utilization bound result would not apply with this server scheduling policy.

I also don't know for certain that the policy described above will not turn out to be equivalent to one of the published algorithms, and (assuming it is correct and different) I certainly don't know how the performance compares. I'm suggesting that finding the answer these questions as a final student project for the course.

Note that the version of the algorithm above is a modification of a different proposal made in the previous offering of this course. One student took up the challenge, and gave evidence that led to discovery that the previous proposal was overly aggressive. The new proposal also has not been proven, and so may also suffer from such a bug.

Imagine what would happen with the above algorithm, or with the other bandwidth preserving algorithms, if we hold the server utilization constant and decrease the server period (and budget quantum) down toward zero?

I believe the response time for short jobs, and the average response time should improve. (Why do I believe this? Do you believe it?) On the other hand, the scheduling overhead will eventually become larger than the useful work that is done.

Fairness

Consider whether the above are "fair", using examples.

Ideal Processor Sharing (PS)

a.k.a. fluid-flow processor sharing
a.k.a. bit-by-bit or instruction-by-instruction round-robin
each of the n currently backlogged servers gets 1/n of CPU capacity
modeled by infinitesimally fine time slicing between the N currently backlogged servers

Generalized Processor Sharing (GPS)

instead of equal slices, the servers are given slices proportional to the server size
instead of 1/n, the kth of the n currently backlogged servers gets

u_k / ( n
∑
i=1 u_i)

of CPU capacity

Preemptive Weighted Fair Queueing (WFQ)

preemptive and non-preemptive versions are both used
- non-preemptive version is used for packet scheduling^*
- preemptive version is used here, for CPU scheduling
replenishmment rules are like Total Bandwidth server
- WFQ server consumes budget only when it executes
- budget replenished when server first becomes backlogged after being idle
  (i.e., when a job arrives and finds the serverqueue empty)
- budget also replenished whenever a job is completed
- replenishment amount equals exec. time of job at the head of the queue
deadlines are set differently
- emulates effect of GPS
- conceptually, deadline of server k is time server would use up its budget with GPS
- doing this directly would require recomputing deadlines every time a job arrives or completes
  (since the number of ways processor is shared changes)
- actual rules avoid recomputation by using "finish numbers" rather than actual completion times
- finish numbers are a form of "virtual time", i.e., the finish times are not the same as the completion times, but they stand in the same ordering relationship

^* Nonpreemptive GPS for packet scheduling is covered in Chapter 11, along with some of the the rest of the huge amount of research that has been done on real-time data communication and networking. We will probably not have time to get to that this term, but you should be aware that it exists.

The effect of the WFQ deadline mechanism is to achieve fairness among the processors that are active at any instant.

T. P. Baker. ($Id: servers.html,v 1.1 2008/08/25 11:18:48 baker Exp baker $)