[LINUX] vlocks for Bare-Metal Mutual Exclusion (1/2)

https://www.kernel.org/doc/html/latest/arm/vlocks.html

vlocks for Bare-Metal Mutual Exclusion

Voting Locks, or “vlocks” provide a simple low-level mutual exclusion mechanism, with reasonable but minimal requirements on the memory system.

Voting locks, or "vlocks", provide a simple, low-level mutual exclusion mechanism that places reasonable but minimal requirements on the memory system.

These are intended to be used to coordinate critical activity among CPUs which are otherwise non-coherent, in situations where the hardware provides no other mechanism to support this and ordinary spinlocks cannot be used.

They are intended to coordinate critical activity among CPUs that are otherwise non-coherent, in situations where ordinary spinlocks cannot be used and the hardware provides no other mechanism for this.

vlocks make use of the atomicity provided by the memory system for writes to a single memory location. To arbitrate, every CPU “votes for itself”, by storing a unique number to a common memory location. The final value seen in that memory location when all the votes have been cast identifies the winner.

vlocks rely on the atomicity that the memory system provides for writes to a single memory location. To arbitrate, every CPU "votes for itself" by writing a unique number to a shared memory location. Once all votes have been cast, the final value seen in that location identifies the winner.

In order to make sure that the election produces an unambiguous result in finite time, a CPU will only enter the election in the first place if no winner has been chosen and the election does not appear to have started yet.

To ensure that the election finishes in finite time with an unambiguous result, a CPU enters the election only if no winner has been chosen and the election does not appear to have started yet.

Algorithm

The easiest way to explain the vlocks algorithm is with some pseudo-code:

The simplest way to explain the vlock algorithm is with pseudo-code.

int currently_voting[NR_CPUS] = { 0, };
int last_vote = -1; /* no votes yet */

bool vlock_trylock(int this_cpu)
{
        /* signal our desire to vote */
        currently_voting[this_cpu] = 1;
        if (last_vote != -1) {
                /* another CPU has already volunteered itself */
                currently_voting[this_cpu] = 0;
                return false; /* not us */
        }

        /* no winner yet: propose ourselves */
        last_vote = this_cpu;
        currently_voting[this_cpu] = 0;

        /* then wait until everyone else is done voting */
        for_each_cpu(i) {
                while (currently_voting[i] != 0)
                        /* wait */;
        }

        /* result: we won only if our vote was the last one written */
        if (last_vote == this_cpu)
                return true; /* we won */
        return false;
}

void vlock_unlock(void)
{
        last_vote = -1;
}
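
The pseudo-code above can be turned into compilable C. The following is a minimal sketch, not the kernel's implementation: it assumes a cache-coherent host and uses C11 sequentially consistent atomics in place of the uncached accesses and barriers that a real bare-metal version would need. The names struct vlock, vlock_init and VLOCK_NO_OWNER, and the value chosen for NR_CPUS, are illustrative assumptions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4               /* assumption: small, fixed CPU count */
#define VLOCK_NO_OWNER (-1)     /* same meaning as last_vote == -1 above */

struct vlock {
    atomic_int currently_voting[NR_CPUS];
    atomic_int last_vote;
};

void vlock_init(struct vlock *v)
{
    for (int i = 0; i < NR_CPUS; i++)
        atomic_init(&v->currently_voting[i], 0);
    atomic_init(&v->last_vote, VLOCK_NO_OWNER);
}

bool vlock_trylock(struct vlock *v, int this_cpu)
{
    assert(this_cpu >= 0 && this_cpu < NR_CPUS);

    /* signal our desire to vote */
    atomic_store(&v->currently_voting[this_cpu], 1);
    if (atomic_load(&v->last_vote) != VLOCK_NO_OWNER) {
        /* another CPU has already volunteered itself */
        atomic_store(&v->currently_voting[this_cpu], 0);
        return false;
    }

    /* no winner yet: propose ourselves */
    atomic_store(&v->last_vote, this_cpu);
    atomic_store(&v->currently_voting[this_cpu], 0);

    /* wait until every other CPU has finished voting */
    for (int i = 0; i < NR_CPUS; i++)
        while (atomic_load(&v->currently_voting[i]) != 0)
            ; /* spin */

    /* the last vote written identifies the winner */
    return atomic_load(&v->last_vote) == this_cpu;
}

void vlock_unlock(struct vlock *v)
{
    atomic_store(&v->last_vote, VLOCK_NO_OWNER);
}
```

Called one after another from a single thread, vlock_trylock(&v, 0) succeeds on a freshly initialised lock, a following vlock_trylock(&v, 1) fails, and after vlock_unlock(&v) CPU 1 can take the lock.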

The currently_voting[] array provides a way for the CPUs to determine whether an election is in progress, and plays a role analogous to the “entering” array in Lamport’s bakery algorithm [1].

The currently_voting[] array lets the CPUs determine whether an election is in progress, and plays a role similar to the "entering" array in Lamport's bakery algorithm [1].

However, once the election has started, the underlying memory system atomicity is used to pick the winner. This avoids the need for a static priority rule to act as a tie-breaker, or any counters which could overflow.

Once the election has started, however, the winner is picked using the atomicity of the underlying memory system. This removes the need for a static priority rule as a tie-breaker, or for counters that could overflow.

As long as the last_vote variable is globally visible to all CPUs, it will contain only one value that won’t change once every CPU has cleared its currently_voting flag.

As long as the last_vote variable is globally visible to all CPUs, it holds exactly one value, and that value no longer changes once every CPU has cleared its currently_voting flag.

Features and limitations

・ Vlocks are not intended to be fair. In the contended case, it is the last CPU which attempts to get the lock which will be most likely to win.

vlocks are therefore best suited to situations where it is necessary to pick a unique winner, but it does not matter which CPU actually wins.

-- vlocks are not intended to be fair. Under contention, the last CPU to attempt to take the lock is the most likely to win. vlocks are therefore best suited to situations where a unique winner must be picked but it does not matter which CPU wins.

・ Like other similar mechanisms, vlocks will not scale well to a large number of CPUs.

vlocks can be cascaded in a voting hierarchy to permit better scaling if necessary, as in the following hypothetical example for 4096 CPUs:

vlocks can be cascaded in a voting hierarchy to scale better when needed, for example in a hypothetical system with 4096 CPUs.
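
One way to picture such a hierarchy is three levels of 16-way vlocks (16 x 16 x 16 = 4096). The sketch below is an illustration under stated assumptions, not code taken from the kernel documentation: the names towns, states, country, hier_trylock and hier_unlock, the fan-out of 16, and the encoding of last_vote (0 means free, otherwise slot + 1, so that zero-initialised static storage starts out unlocked) are all hypothetical choices. It again uses C11 atomics on a coherent host.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define FANOUT 16   /* voters per vlock (assumption) */

struct vlock {
    atomic_int currently_voting[FANOUT];
    atomic_int last_vote;       /* 0 = free, otherwise winning slot + 1 */
};

struct vlock towns[256];        /* level 1: one lock per 16 CPUs */
struct vlock states[16];        /* level 2: one lock per 16 towns */
struct vlock country;           /* level 3: single top-level lock */

bool vlock_trylock(struct vlock *v, int slot)
{
    atomic_store(&v->currently_voting[slot], 1);
    if (atomic_load(&v->last_vote) != 0) {
        /* this level already has a winner */
        atomic_store(&v->currently_voting[slot], 0);
        return false;
    }
    atomic_store(&v->last_vote, slot + 1);
    atomic_store(&v->currently_voting[slot], 0);
    for (int i = 0; i < FANOUT; i++)
        while (atomic_load(&v->currently_voting[i]) != 0)
            ; /* wait for the other voters at this level */
    return atomic_load(&v->last_vote) == slot + 1;
}

void vlock_unlock(struct vlock *v)
{
    atomic_store(&v->last_vote, 0);
}

/* Returns true only if this_cpu (0..4095) won all three elections. */
bool hier_trylock(int this_cpu)
{
    struct vlock *town = &towns[this_cpu >> 4];
    struct vlock *state = &states[this_cpu >> 8];

    /* first level: vote among the 16 CPUs of our town */
    if (!vlock_trylock(town, this_cpu & 0xf))
        return false;
    /* second level: the town winner votes among 16 towns */
    if (!vlock_trylock(state, (this_cpu >> 4) & 0xf)) {
        vlock_unlock(town);
        return false;
    }
    /* third level: the state winner votes among 16 states */
    if (!vlock_trylock(&country, (this_cpu >> 8) & 0xf)) {
        vlock_unlock(state);
        vlock_unlock(town);
        return false;
    }
    return true;
}

/* Must be called only by the CPU for which hier_trylock returned true. */
void hier_unlock(int this_cpu)
{
    vlock_unlock(&country);
    vlock_unlock(&states[this_cpu >> 8]);
    vlock_unlock(&towns[this_cpu >> 4]);
}
```

Each CPU spins over at most 16 flags per level instead of 4096 in a flat lock. In this sketch a CPU that wins a lower level but loses a higher one releases the lower-level locks it holds; whether losers should release or simply give up depends on the use case.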

The original text is part of the Linux kernel source code, so this article is treated as GPLv2 (my understanding is that it should be).

https://www.kernel.org/doc/html/latest/index.html

Licensing documentation

The following describes the license of the Linux kernel source code (GPLv2), how to properly mark the license of individual files in the source tree, as well as links to the full license text.

https://www.kernel.org/doc/html/latest/process/license-rules.html#kernel-licensing