Lock


Basic Concept

A lock protects an ordinary variable update (the critical section):

lock_t mutex;
lock(&mutex);
balance = balance + 1; // Critical Section
unlock(&mutex);

Since a lock is a variable, it must be initialized before use.
A lock has two states:

  • Available/Unlocked/Free: no thread holds the lock.
  • Acquired: exactly one thread (the owner) holds the lock and is inside the critical section.

A lock gives the programmer a minimal degree of control over scheduling.
Generally, the programmer creates threads and the OS schedules them; a lock hands part of that control back to the programmer.

Lock Implementation

Key question: how can we build an efficient lock? What hardware support does it need? What OS support?

Lock Evaluation

  1. Does it provide mutual exclusion?
    1. The basic function: prevent multiple threads from entering the critical section.
  2. Performance, especially lock overheads:
    1. No contention: what is the overhead when a single thread locks/unlocks?
    2. Contention: multiple threads compete for the lock on one CPU.

Interrupt Control

Mutual exclusion can be implemented by disabling interrupts within the critical section.

  1. Pro: very easy; the code in the critical section executes atomically (on a single CPU).
  2. Cons:
    1. Requires privilege - enabling/disabling interrupts.
      1. We must trust that the thread will use the privilege solely for its intended purpose.
      2. e.g., a greedy program takes the "lock" and monopolizes the processor, or enters an infinite loop.
    2. Does not work on multiprocessors.
      1. When threads run on multiple CPUs, each may try to enter the same critical section.
      2. Disabling interrupts on one processor has no effect on the other processors, so mutual exclusion is not guaranteed.
    3. Important interrupts may be missed.
      1. What if the CPU misses the interrupt signaling that a read from storage has completed? When would we wake the process waiting for that data?
    4. Inefficient - on modern CPUs, enabling/disabling interrupts tends to take longer than regular instructions.
  3. Usage:
    1. Therefore, interrupt disabling must be confined to limited contexts.
    2. e.g., the OS disables interrupts for atomic execution on its internal data structures.
    3. Inside the OS there is no trust/privilege issue, so the OS may freely use this technique.
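The interrupt-based lock above can be sketched as follows. DisableInterrupts/EnableInterrupts are hypothetical stand-ins for privileged instructions (e.g., cli/sti on x86) and are stubbed out here so the sketch compiles:

```c
#include <assert.h>

// Hypothetical primitives: on real hardware these would be privileged
// instructions that mask/unmask interrupts; here they are empty stubs.
static void DisableInterrupts(void) { /* privileged: mask interrupts */ }
static void EnableInterrupts(void)  { /* privileged: unmask interrupts */ }

static int balance = 0;

void lock(void)   { DisableInterrupts(); }
void unlock(void) { EnableInterrupts(); }

void deposit(int amount) {
    lock();                     // on a single CPU, nothing can preempt us now
    balance = balance + amount; // critical section
    unlock();
}
```

On a single CPU this is atomic because no timer interrupt can trigger a context switch mid-update; on a multiprocessor it provides no protection at all, as the notes explain.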

Spin Lock

Test-And-Set (Atomic Exchange)

As discussed, disabling interrupts is pointless on multiprocessors. Hence, multiprocessor systems provide hardware support instead.

The simplest hardware primitive is test-and-set, also known as atomic exchange.

Without hardware support

typedef struct __lock_t {
    int flag;
} lock_t;

void init(lock_t *mutex) {
    // 0 -> lock is available, 1 -> held
    mutex->flag = 0;
}

void lock(lock_t *mutex) {
    while (mutex->flag == 1) // TEST the flag
        ; // spin-wait (do nothing)
    // An interrupt here, after leaving the while loop, breaks mutual exclusion
    mutex->flag = 1; // now SET it!
}

void unlock(lock_t *mutex) {
    mutex->flag = 0;
}

The code above has two problems:

  1. Correctness: if an interrupt arrives at the wrong moment (between the test and the set), both threads can see flag == 0 and both set it to 1 -> mutual exclusion fails.
  2. Performance: the spin-wait endlessly re-checks the flag, wasting CPU time until the lock is released by the other thread -> very wasteful, especially on a single processor.

With hardware support

// TestAndSet executes atomically
int TestAndSet(int *old_ptr, int new) {
    int old = *old_ptr; // fetch old value at old_ptr
    *old_ptr = new;     // store 'new' into old_ptr
    return old;         // return the old value
}

typedef struct __lock_t {
    int flag;
} lock_t;

void init(lock_t *lock) {
    // 0: lock is available, 1: lock is held
    lock->flag = 0;
}

void lock(lock_t *lock) {
    while (TestAndSet(&lock->flag, 1) == 1)
        ; // someone else holds the lock: spin-wait (do nothing)
}

void unlock(lock_t *lock) {
    lock->flag = 0;
}

This is a spin lock: it keeps burning CPU cycles until it acquires the lock -> on a single CPU it only works with a preemptive scheduler that can trigger a timer interrupt; otherwise a spinning thread would never give up the CPU.
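The TestAndSet shown above is only pseudocode; a real build needs an actual atomic instruction. As a sketch, GCC's __sync_lock_test_and_set builtin (an atomic store-and-return-old) can play that role; the thread count and iteration count below are arbitrary demo choices:

```c
#include <assert.h>
#include <pthread.h>

typedef struct { volatile int flag; } spinlock_t;

void spin_init(spinlock_t *l) { l->flag = 0; }

void spin_lock(spinlock_t *l) {
    // Atomically store 1 and return the old value: a real test-and-set
    // (the plain C version in the notes is not atomic).
    while (__sync_lock_test_and_set(&l->flag, 1) == 1)
        ; // spin-wait
}

void spin_unlock(spinlock_t *l) {
    __sync_lock_release(&l->flag); // atomically store 0
}

static spinlock_t m;
static long counter = 0;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        spin_lock(&m);
        counter++;          // critical section
        spin_unlock(&m);
    }
    return NULL;
}

long run_demo(void) {
    pthread_t t[4];
    spin_init(&m);
    for (int i = 0; i < 4; i++) pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++) pthread_join(t[i], NULL);
    return counter; // with a correct lock: 4 * 100000
}
```

Without the lock, the four threads' read-modify-write updates would interleave and the final count would usually fall short of 400000.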

Spin Lock Evaluation

  1. Mutual exclusion - works.
  2. Fairness - does every waiting thread eventually get its chance?
    1. No. A spinning thread may spin forever under contention; simple spin locks (as discussed so far) are not fair and may lead to starvation.
  3. Performance
    1. Single CPU: spin-lock overhead can be significant (a waiting thread wastes its whole time slice spinning).
    2. Multiple CPUs: works reasonably well.

Compare-And-Swap

int CompareAndSwap(int *ptr, int expected, int new) {
    int actual = *ptr;
    if (actual == expected)
        *ptr = new;
    return actual;
}

CompareAndSwap is a stronger primitive than TestAndSet; it can be used to build wait-free synchronization.

When used to build a spin lock, however, it behaves exactly like TestAndSet.
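A CAS-based spin lock can be sketched with GCC's __sync_val_compare_and_swap builtin, which atomically does what the pseudocode above describes (the function names lock_acquire/lock_release are illustrative, not from the notes):

```c
#include <assert.h>

typedef struct { volatile int flag; } lock_t;

void init(lock_t *lock) { lock->flag = 0; }

void lock_acquire(lock_t *lock) {
    // Atomic CAS: if flag == 0, store 1; either way, return the old value.
    // Getting back 0 means we transitioned the lock from free to held.
    while (__sync_val_compare_and_swap(&lock->flag, 0, 1) == 1)
        ; // lock was held: spin-wait
}

void lock_release(lock_t *lock) { lock->flag = 0; }
```

As the notes say, this acquires the lock exactly as TestAndSet does; CAS's extra strength only shows up in lock-free algorithms that need the "compare" part.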

Load-Linked & Store-Conditional

int LoadLinked(int *ptr) {
    return *ptr;
}

int StoreConditional(int *ptr, int value) {
    if (/* no update to *ptr since the LoadLinked */) {
        *ptr = value;
        return 1; // success
    } else {
        return 0; // failed to update
    }
}

Load-linked behaves like an ordinary load instruction.

void lock(lock_t *lock) {
    while (LoadLinked(&lock->flag) || !StoreConditional(&lock->flag, 1))
        ; // spin until the flag is 0 and we set it atomically
}

void unlock(lock_t *lock) {
    lock->flag = 0;
}

As expected, StoreConditional may fail when an interrupt (or another thread's store) intervenes -> but that very condition, "no update since the LoadLinked", is what guarantees that only one thread sets the flag, and thus holds the lock, at a time.

Fetch-And-Add

int FetchAndAdd(int *ptr) {
    int old = *ptr;
    *ptr = old + 1;
    return old;
}

This supports the ticket lock. Since FetchAndAdd is atomic, each arriving thread atomically takes a ticket number, and a shared turn counter decides which thread's turn it is to hold the lock.

Once a thread is done with the lock, it increments the turn number, which lets exactly one thread (the holder of the matching ticket) exit its while loop and acquire the lock.

  • The important difference from the previous lock methods is that threads acquire the lock in arrival order, hence there is no starvation.
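The ticket/turn scheme can be sketched as follows; GCC's __sync_fetch_and_add builtin stands in for the atomic FetchAndAdd above:

```c
#include <assert.h>

typedef struct {
    volatile int ticket; // next ticket number to hand out
    volatile int turn;   // which ticket may enter the critical section now
} ticket_lock_t;

void ticket_init(ticket_lock_t *l) {
    l->ticket = 0;
    l->turn = 0;
}

void ticket_lock(ticket_lock_t *l) {
    // Atomically take the next ticket (this is FetchAndAdd).
    int myturn = __sync_fetch_and_add(&l->ticket, 1);
    while (l->turn != myturn)
        ; // spin until our number is called
}

void ticket_unlock(ticket_lock_t *l) {
    l->turn = l->turn + 1; // pass the lock to the next ticket holder
}
```

Because tickets are handed out in arrival order and turn advances one at a time, waiting threads proceed FIFO: no thread can be overtaken indefinitely.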

How can we avoid wasting CPU time spinning?

First Approach: Yield

The approach is simple: while waiting for the lock, yield the allocated CPU to another thread instead of spinning.

  • It works fine when there are few threads.
  • If many threads exist:
    • Say the scheduling policy is round-robin: a waiting thread may run-and-yield through n-1 time slices before the lock holder runs again.
    • This is better than spinning away n-1 full time slices, but it still pays context-switch overhead every time (problematic when context switches are expensive).
    • Starvation is still possible - a thread may keep yielding and miss its chance while the other threads use the CPU.

Second Approach: Queue - sleep instead of spinning

The fundamental problem with the methods discussed so far is that they leave too much to luck: whether spinning or yielding, they are wasteful and cannot prevent starvation.

Hence we must explicitly control which thread gets the lock next -> this needs support from the OS and a queue of waiting threads.

Implement it in two stages: a guard and the lock. Cases:

  • Guard held, lock unavailable: add itself to the queue and sleep (releasing the guard) - before the critical section.
  • Guard and lock both acquired: enter the critical section (releasing the guard).
  • Lock held, guard reacquired to unlock: wake the next thread in the queue, then release the guard.

The guard is still a spinning implementation, but it is far more efficient because it protects only a few instructions (queue manipulation), never the critical section itself.
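The guard/queue scheme can be sketched as below. park()/unpark() (as in Solaris) are stubbed as no-ops, TestAndSet is written non-atomically, and the fixed-size queue is a toy, so this runs single-threaded only as an illustration of the control flow:

```c
#include <assert.h>

#define QMAX 16
typedef struct {
    int flag;                // is the lock held?
    int guard;               // spin lock protecting flag and the queue
    int q[QMAX], head, tail; // toy queue of waiting thread IDs
} lock_t;

// Assumed atomic on real hardware; plain C here for the sketch.
static int TestAndSet(int *p, int v) { int old = *p; *p = v; return old; }
// Stand-ins for OS primitives: a real park() sleeps the caller,
// a real unpark(tid) wakes thread tid.
static void park(void) { }
static void unpark(int tid) { (void)tid; }
static int  my_tid(void) { return 0; } // hypothetical caller ID

void lock(lock_t *m) {
    while (TestAndSet(&m->guard, 1) == 1)
        ; // spin: the guard covers only a few instructions
    if (m->flag == 0) {
        m->flag = 1;                 // acquired the lock
        m->guard = 0;
    } else {
        m->q[m->tail++ % QMAX] = my_tid(); // join the wait queue
        m->guard = 0;
        park();                      // sleep until unlock() wakes us
    }
}

void unlock(lock_t *m) {
    while (TestAndSet(&m->guard, 1) == 1)
        ;
    if (m->head == m->tail)
        m->flag = 0;                 // no waiters: release the lock
    else
        unpark(m->q[m->head++ % QMAX]); // hand the lock directly to next waiter
    m->guard = 0;
}
```

Note that unlock() does not clear flag when it wakes a waiter: the lock is passed directly to the woken thread, which is exactly how starvation is avoided.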

Linux Lock - Futex

Futex:

  • Associated with a particular physical memory address.
  • Each futex has its own internal kernel wait queue.
  • Callers invoke futex to sleep or wake threads as needed.
    • futex_wait(address, expected): sleep if the address still holds the expected value; futex_wake(address): wake one waiting thread.

Two-Phase Lock

  • First phase: spin for a while, hoping the lock will soon be released.
  • Second phase: go to sleep, and wake up only after the lock is released.
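A minimal two-phase lock over futex might look like the sketch below. This is Linux-specific: the futex syscall has no glibc wrapper, so it is invoked via syscall(2); the spin count of 100 is an arbitrary tuning choice, not from the notes:

```c
#include <assert.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

static int futex_wait(int *addr, int expected) {
    // Sleep only if *addr still equals expected (checked atomically in-kernel).
    return syscall(SYS_futex, addr, FUTEX_WAIT, expected, NULL, NULL, 0);
}
static int futex_wake(int *addr) {
    return syscall(SYS_futex, addr, FUTEX_WAKE, 1, NULL, NULL, 0);
}

static int flag = 0; // 0 = free, 1 = held

void lock(void) {
    // Phase 1: spin briefly, hoping the holder releases the lock soon.
    for (int i = 0; i < 100; i++)
        if (__sync_val_compare_and_swap(&flag, 0, 1) == 0)
            return; // fast path: got the lock without sleeping
    // Phase 2: sleep in the kernel; retry the CAS each time we are woken.
    while (__sync_val_compare_and_swap(&flag, 0, 1) != 0)
        futex_wait(&flag, 1); // returns immediately if flag is no longer 1
}

void unlock(void) {
    flag = 0;
    futex_wake(&flag); // wake one sleeper, if any
}
```

Real implementations (e.g., glibc's pthread mutex) pack more state into the flag word so that unlock can skip the futex_wake syscall when no thread is sleeping; that optimization is omitted here for clarity.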