Topics
- Correct use of locks
- Common locking errors
- Choosing the right lock
- Lock implementations
- Lock usage examples in the kernel
Most of these apply equally to mutexes, spinlocks, semaphores, etc. We will point out cases
where the type of lock makes a difference.
Image from http://www.flickr.com/photos/hectic/1018015429/
Pattern: Basic Critical Section
unsigned long flags;
...
spin_lock_irqsave(&task_group_lock, flags);
for_each_possible_cpu(i)
    unregister_fair_sched_group(tg, i);
list_del_rcu(&tg->siblings);
spin_unlock_irqrestore(&task_group_lock, flags);
- Lock and unlock operations are close enough to be visibly paired
- No flow of control can escape section without unlock
- Prevents failure to unlock, which leads to deadlock
Pattern: Error Recovery in Critical Section
mutex_lock(&shares_mutex);
if (tg->shares == shares) goto done;
...several lines of code...
done:
mutex_unlock(&shares_mutex);
return 0;
- Be sure to release lock on all exit paths from critical section
- Check all exceptional control transfers,
including goto, break, continue, return, and longjmp
- Prevents deadlocks
- Use of "goto done" in this case seems clearer than "unlock and return"
Pattern: Associating Locks with Objects
struct kset {
struct list_head list;
spinlock_t list_lock;
struct kobject kobj;
struct kset_uevent_ops *uevent_ops;
};
- The data protected by the lock are grouped into one structure
- The association of lock with the data it protects is clear
Pattern: Associating Locks with Modules
static DEFINE_MUTEX(mm_all_locks_mutex);
- The data protected by the lock are grouped into one module
- All static data in the module is protected by the same lock
- The association of lock with the data it protects is clear
Pattern: Associating Locks with Lists
/* add the kobject to its kset's list */
static void kobj_kset_join(struct kobject *kobj)
{
    if (!kobj->kset) return;
    kset_get(kobj->kset);
    spin_lock(&kobj->kset->list_lock);
    list_add_tail(&kobj->entry, &kobj->kset->list);
    spin_unlock(&kobj->kset->list_lock);
}
- Use the lock of the list head to protect changes to the list structure
- All operations on the list use this lock
- The association of lock with the data it protects is clear
This generalizes to other data structures, such as trees.
Pattern: Moving Objects Between Lists
- Problem:
- Suppose different threads may try to move an object onto different lists, concurrently
- With only list locks, these operations may interfere
- Solution:
- Put a lock into each object, as well as each list head
- To move an object
- Lock the object
- Lock the list
- Remove/add object to list
- Unlock the list
- Unlock the object
Pattern: Ordered Locking
mutex_lock(&rt_constraints_mutex);
read_lock(&tasklist_lock);
ret = __rt_schedulable(NULL, 0, 0);
read_unlock(&tasklist_lock);
mutex_unlock(&rt_constraints_mutex);
- Lock-unlock pairs consistently nested
- In a consistent order
- Spinlocks with irq-save inside ordinary spinlocks
- Spinlocks inside blocking locks (e.g. mutexes)
- Consistent ordering within individual locks of each group
- Prevents deadlock
Pattern: Document Locking Policies
/*
* This is the main, per-CPU runqueue data structure.
*
* Locking rule: those places that want to lock multiple runqueues
* (such as the load balancing or the thread migration code), lock
* acquire operations must be ordered by ascending &runqueue.
*/
- Document what the lock protects, and rules for its usage
- Place the comments where the lock and the protected structure are declared
Pattern: Complete Example
static int tg_set_bandwidth(struct task_group *tg,
        u64 rt_period, u64 rt_runtime)
{
    int i, err = 0;

    mutex_lock(&rt_constraints_mutex);
    read_lock(&tasklist_lock);
    err = __rt_schedulable(tg, rt_period, rt_runtime);
    if (err)
        goto unlock;

    spin_lock_irq(&tg->rt_bandwidth.rt_runtime_lock);
    tg->rt_bandwidth.rt_period = ns_to_ktime(rt_period);
    tg->rt_bandwidth.rt_runtime = rt_runtime;

    for_each_possible_cpu(i) {
        struct rt_rq *rt_rq = tg->rt_rq[i];

        spin_lock(&rt_rq->rt_runtime_lock);
        rt_rq->rt_runtime = rt_runtime;
        spin_unlock(&rt_rq->rt_runtime_lock);
    }
    spin_unlock_irq(&tg->rt_bandwidth.rt_runtime_lock);
unlock:
    read_unlock(&tasklist_lock);
    mutex_unlock(&rt_constraints_mutex);
    return err;
}
Anti-Pattern: Releasing Lock Too Early
mutex_lock(&some_lock);
result = needs_service(&some_object);
mutex_unlock(&some_lock);
if (result) perform_service(&some_object);
- Programmer holds lock to test for condition
- Then releases lock before taking action
- Another thread detects same condition, and takes concurrent action
- The two actions conflict
- Race causes data corruption
Anti-Pattern: Holding Lock While Sleeping
mutex_lock(&somelock);
...
wait_event(some_wait_queue, some_test(some_object));
...
mutex_unlock(&somelock);
- Programmer performs operation that causes process to sleep while holding lock
- Other threads cannot obtain the lock during the sleep (bad enough)
- The thread responsible for the wakeup can't get the lock to cause the wakeup (worse)
- Long delays, possible deadlock
Anti-Pattern: Not Documenting Lock Coverage
- Programmer relies on memory to know which locks protect what, and locking order rules
- Programmer forgets, or new programmer starts to work on code
- Reasoning behind correct locking is not followed
- Leads to data corruption, deadlock, and other lock problems
Anti-Pattern: Forgetting to Lock
- You add code that touches an object that is protected by a lock
- The new code fails to obtain the proper lock(s) before operating on it
- Leads to data corruption
Anti-Pattern: Blocking Lock in Handler
void handler(void * arg) {
    ...
    mutex_lock(m);
    ...
}
- Handler interrupts thread holding lock
- Handler attempts to lock same lock
- Handler blocks interrupted thread (by not returning)
- Leads to deadlock
- Also applies to spinlocks taken without irq masking, and to sleeping in a handler
- Antidotes:
- Use spin_lock_irqsave() and spin_unlock_irqrestore()
- Move work that might require sleeping to a server thread or work queue item (served by a thread)
Anti-Pattern: Too-late Initialization
- Code that requires lock executes before lock is initialized
- Lock fails
- Data is corrupted, possibly also the lock
- Antidote:
- Do all initialization of locks before exporting any functions that use them
- Initialize dependent modules hierarchically, callee-first
- Don't register a service until all initialization of locks is done
Anti-Pattern: Splitting Critical Sections Arbitrarily
- Programmer is concerned that a critical section looks too long
- Decides to break it into sections, by releasing and re-acquiring lock(s)
- May result in error if races and interleaving effects not taken into account
- May increase run-time overhead unnecessarily
- Antidote:
- Release locks at logical end of a critical section
- Before sleeping
- Before starting a long computation
- When a transaction is complete
- Otherwise, only think about splitting if the critical section is really long
- When lock(s) have been released and reacquired, don't forget to re-check all
preconditions. You must reason exactly as if the process had been sleeping for a long time.
Anti-Pattern: Unnecessary Locking
mutex_lock(&some_lock);
some_int_variable = some_constant_value;
mutex_unlock(&some_lock);
- Programmer uses lock to protect an operation that can be implemented atomically
- Or that does not need protection for some other reason
- Antidote:
- Use atomic types, where they make sense
- If the protected code is very simple, such as a single assignment statement,
think hard about it. There is a good chance that either you do not need the
lock at all, or you need to hold the lock over a larger region of code.
See also the introductory page on
patterns and the page on miscellaneous patterns.