[LINUX] Reading "Clock sources, Clock events, sched_clock() and delay timers"

The text quoted here was originally part of the Linux kernel source code, so I treat it as GPLv2 (that is my understanding of how it should be handled).


Licensing documentation

The following describes the license of the Linux kernel source code (GPLv2), how to properly mark the license of individual files in the source tree, as well as links to the full license text.


I will read https://www.kernel.org/doc/html/latest/timers/timekeeping.html.

Clock sources, Clock events, sched_clock() and delay timers

This document tries to briefly explain some basic kernel timekeeping abstractions. It partly pertains to the drivers usually found in drivers/clocksource in the kernel tree, but the code may be spread out across the kernel.

This document briefly describes some basic timekeeping abstractions of the kernel. It partly relates to the drivers usually found in drivers/clocksource in the kernel tree, although the code may be spread across the whole kernel.

If you grep through the kernel source you will find a number of architecture-specific implementations of clock sources, clockevents and several likewise architecture-specific overrides of the sched_clock() function and some delay timers.

If you grep the kernel source, you will find a number of architecture-specific implementations of clock sources and clock events. You will also find architecture-specific overrides of the sched_clock() function and some delay timers.

To provide timekeeping for your platform, the clock source provides the basic timeline, whereas clock events shoot interrupts on certain points on this timeline, providing facilities such as high-resolution timers. sched_clock() is used for scheduling and timestamping, and delay timers provide an accurate delay source using hardware counters.

To provide timekeeping for your platform, the clock source provides the basic timeline. Clock events fire interrupts at specific points on this timeline, providing facilities such as high-resolution timers. sched_clock() is used for scheduling and timestamping. Delay timers use hardware counters to provide an accurate delay source.

Clock sources

The purpose of the clock source is to provide a timeline for the system that tells you where you are in time. For example issuing the command 'date' on a Linux system will eventually read the clock source to determine exactly what time it is.

The purpose of the clock source is to provide a timeline for the system that tells you where you are in time. For example, when the 'date' command is executed on a Linux system, it will eventually read the clock source to determine the exact current time.

Typically the clock source is a monotonic, atomic counter which will provide n bits which count from 0 to (2^n)-1 and then wraps around to 0 and start over. It will ideally NEVER stop ticking as long as the system is running. It may stop during system suspend.

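Typically the clock source is a monotonic, atomic counter. With n bits it counts from 0 to (2^n)-1, then wraps around to 0 and starts over. Ideally it never stops ticking while the system is running, although it may stop during system suspend.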

The clock source shall have as high resolution as possible, and the frequency shall be as stable and correct as possible as compared to a real-world wall clock. It should not move unpredictably back and forth in time or miss a few cycles here and there.

The clock source should have the highest resolution possible, and its frequency should be as stable and accurate as possible compared to a real-world wall clock. It should not jump unpredictably back and forth in time or miss a few cycles here and there.

It must be immune to the kind of effects that occur in hardware where e.g. the counter register is read in two phases on the bus lowest 16 bits first and the higher 16 bits in a second bus cycle with the counter bits potentially being updated in between leading to the risk of very strange values from the counter.

It must be immune to hardware effects such as the following: the counter register is read in two phases on the bus, the lower 16 bits first and the upper 16 bits in a second bus cycle, and the counter bits may be updated in between. In that case there is a risk of reading very strange values from the counter.
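To make the two-phase read problem concrete for myself, here is a minimal user-space sketch. The counter is simulated (every register access advances it by one tick), and the names read_lo16(), read_hi16(), read_counter_naive() and read_counter_safe() are my own hypothetical helpers, not kernel functions. The "read high, read low, read high again and retry" pattern is one common way drivers avoid the problem; it is not necessarily what any particular driver does.

```c
#include <stdint.h>

/* Simulated 32-bit counter that is only readable as two 16-bit halves.
 * Every register access advances it by one tick, so a read that is split
 * across two bus cycles can straddle a carry into the upper half. */
static uint32_t fake_counter;

static uint16_t read_lo16(void)
{
    uint16_t v = (uint16_t)(fake_counter & 0xFFFFu);
    fake_counter++;
    return v;
}

static uint16_t read_hi16(void)
{
    uint16_t v = (uint16_t)(fake_counter >> 16);
    fake_counter++;
    return v;
}

/* Naive read: low half first, then high half. Unsafe across a carry. */
static uint32_t read_counter_naive(void)
{
    uint32_t lo = read_lo16();
    uint32_t hi = read_hi16();
    return (hi << 16) | lo;
}

/* Safer read: sample high, low, high again; retry if the high half moved. */
static uint32_t read_counter_safe(void)
{
    uint32_t hi, lo, hi2;

    do {
        hi = read_hi16();
        lo = read_lo16();
        hi2 = read_hi16();
    } while (hi != hi2);

    return (hi << 16) | lo;
}
```

Starting the simulated counter at 0x0001FFFF, the naive read combines a stale low half with a fresh high half and reports a value far in the future, while the retrying read returns a value that lies between the start and end of the read.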

When the wall-clock accuracy of the clock source isn't satisfactory, there are various quirks and layers in the timekeeping code for e.g. synchronizing the user-visible time to RTC clocks in the system or against networked time servers using NTP, but all they do basically is update an offset against the clock source, which provides the fundamental timeline for the system. These measures does not affect the clock source per se, they only adapt the system to the shortcomings of it.

When the wall-clock accuracy of the clock source is not satisfactory, the timekeeping code has various quirks and layers to deal with it, for example synchronizing the user-visible time to the RTC clocks in the system or to networked time servers using NTP. However, all they basically do is update an offset against the clock source, which provides the fundamental timeline of the system. These measures do not affect the clock source itself; they only adapt the system to its shortcomings.

The clock source struct shall provide means to translate the provided counter into a nanosecond value as an unsigned long long (unsigned 64 bit) number.

The clock source struct must provide a way to convert the counter value into a nanosecond value as an unsigned long long (unsigned 64-bit) number.

Since this operation may be invoked very often, doing this in a strict mathematical sense is not desirable: instead the number is taken as close as possible to a nanosecond value using only the arithmetic operations multiply and shift, so in clocksource_cyc2ns() you find:

This operation may be called very often, so performing it in a strict mathematical sense is not desirable. Instead, the value is brought as close as possible to nanoseconds using only multiplication and shift. In clocksource_cyc2ns() you therefore find:

ns ~= (clocksource * mult) >> shift
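As a check on my understanding, the formula can be tried out in user space. The function below only mirrors the shape of the formula; the name cyc2ns() and the sample mult/shift values are my own choices for illustration, assuming a 24 MHz counter and a shift of 24.

```c
#include <stdint.h>

/* One multiply and one shift, as in the formula above.
 * The caller must keep cycles * mult within 64 bits. */
static uint64_t cyc2ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
    return (cycles * mult) >> shift;
}
```

With mult = 699050667 and shift = 24 (my assumed values for 24 MHz), one cycle converts to 41 ns although the exact value is about 41.67 ns, which is why the formula is written with "~=". Over a full second of cycles (24,000,000) the result comes out at 1,000,000,000 ns.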

You will find a number of helper functions in the clock source code intended to aid in providing these mult and shift values, such as clocksource_khz2mult(), clocksource_hz2mult() that help determine the mult factor from a fixed shift, and clocksource_register_hz() and clocksource_register_khz() which will help out assigning both shift and mult factors using the frequency of the clock source as the only input.

In the clock source code you will find a number of helper functions intended to help provide these mult and shift values. For example, clocksource_khz2mult() and clocksource_hz2mult() help determine the mult factor from a fixed shift. clocksource_register_hz() and clocksource_register_khz() assign both the shift and mult factors using the frequency of the clock source as the only input.
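To see what "determine the mult factor from a fixed shift" means numerically, here is a user-space sketch modeled on my reading of clocksource_hz2mult(). It is a re-implementation for illustration, so the name hz2mult() and the details (such as rounding to nearest) should be treated as assumptions rather than as the kernel's exact code.

```c
#include <assert.h>
#include <stdint.h>

/* mult such that (cycles * mult) >> shift is approximately nanoseconds:
 * mult = (10^9 << shift) / hz, rounded to nearest.
 * The result is truncated to 32 bits, so hz and shift must be chosen
 * so that it fits. */
static uint32_t hz2mult(uint32_t hz, uint32_t shift)
{
    uint64_t tmp;

    assert(hz != 0 && shift < 32);
    tmp = (uint64_t)1000000000u << shift;
    tmp += hz / 2;              /* round to nearest before dividing */
    return (uint32_t)(tmp / hz);
}
```

For example, a 1 MHz counter with shift 10 gives mult = 1024000 (exactly 1000 ns per cycle), and a 24 MHz counter with shift 24 gives mult = 699050667.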

For real simple clock sources accessed from a single I/O memory location there is nowadays even clocksource_mmio_init() which will take a memory location, bit width, a parameter telling whether the counter in the register counts up or down, and the timer clock rate, and then conjure all necessary parameters.

For really simple clock sources accessed from a single I/O memory location, there is now even clocksource_mmio_init(). Given a memory location, a bit width, a parameter telling whether the counter in the register counts up or down, and the timer clock rate, it derives all the necessary parameters.

Since a 32-bit counter at say 100 MHz will wrap around to zero after some 43 seconds, the code handling the clock source will have to compensate for this.

A 32-bit counter running at, say, 100 MHz wraps around to zero after about 43 seconds (2^32 / 100,000,000 is roughly 42.9), so the code that handles the clock source has to compensate for this.

That is the reason why the clock source struct also contains a 'mask' member telling how many bits of the source are valid. This way the timekeeping code knows when the counter will wrap around and can insert the necessary compensation code on both sides of the wrap point so that the system timeline remains monotonic.

This is why the clock source struct also has a 'mask' member, which tells how many bits of the source are valid. With it, the timekeeping code knows when the counter will wrap around and can insert the necessary compensation code on both sides of the wrap point so that the system timeline remains monotonic.
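The role of the mask becomes clearer with a small calculation. The sketch below is my own user-space illustration: counter_mask(), cycles_delta() and wrap_seconds() are hypothetical names (the kernel has a similar CLOCKSOURCE_MASK() macro for building the mask), and the masked subtraction is the usual way to get the elapsed cycles across a single wrap.

```c
#include <assert.h>
#include <stdint.h>

/* All-ones mask covering the valid bits of an n-bit counter. */
static uint64_t counter_mask(unsigned int bits)
{
    assert(bits >= 1 && bits <= 64);
    return bits == 64 ? ~(uint64_t)0 : (((uint64_t)1 << bits) - 1);
}

/* Cycles elapsed between two reads; correct across at most one wrap. */
static uint64_t cycles_delta(uint64_t now, uint64_t last, uint64_t mask)
{
    return (now - last) & mask;
}

/* Whole seconds until an n-bit counter running at rate_hz wraps. */
static uint64_t wrap_seconds(unsigned int bits, uint64_t rate_hz)
{
    assert(rate_hz != 0);
    return counter_mask(bits) / rate_hz;
}
```

For a 32-bit counter at 100 MHz, wrap_seconds() gives 42 whole seconds, matching the "some 43 seconds" in the text. If the previous read was 0xFFFFFFF0 and the current read is 0x10, cycles_delta() with a 32-bit mask gives 0x20, so the wrap does not disturb the timeline as long as reads happen more often than the wrap period.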

Clock events

Clock events are the conceptual reverse of clock sources: they take a desired time specification value and calculate the values to poke into hardware timer registers.

Clock events are conceptually the reverse of clock sources. They take a desired time specification and calculate the values to write into the hardware timer registers.

Clock events are orthogonal to clock sources. The same hardware and register range may be used for the clock event, but it is essentially a different thing. The hardware driving clock events has to be able to fire interrupts, so as to trigger events on the system timeline. On an SMP system, it is ideal (and customary) to have one such event driving timer per CPU core, so that each core can trigger events independently of any other core.

Clock events are orthogonal to clock sources. The same hardware and register range may be used for the clock event, but it is essentially a different thing. The hardware driving clock events has to be able to fire interrupts so that it can trigger events on the system timeline. On an SMP system, it is ideal (and customary) to have one such event-driving timer per CPU core, so that each core can trigger events independently of the other cores.

You will notice that the clock event device code is based on the same basic idea about translating counters to nanoseconds using mult and shift arithmetic, and you find the same family of helper functions again for assigning these values. The clock event driver does not need a 'mask' attribute however: the system will not try to plan events beyond the time horizon of the clock event.

The clock event device code is based on the same basic idea of translating counters to nanoseconds using mult and shift arithmetic, and the same family of helper functions is available for assigning these values. However, the clock event driver does not need a 'mask' attribute: the system will not try to plan events beyond the time horizon of the clock event.
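For clock events the conversion goes in the opposite direction, from a nanosecond interval to timer ticks. The following user-space sketch is my own illustration: struct evt_dev_sketch, evt_mult() and ns_to_ticks() are hypothetical names, and clamping the request to a minimum and maximum interval is my way of modelling the "time horizon" mentioned above, not a copy of the kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for a clock event device. */
struct evt_dev_sketch {
    uint32_t mult;          /* ns -> ticks multiplier */
    uint32_t shift;         /* ns -> ticks shift */
    uint64_t min_delta_ns;  /* shortest programmable interval */
    uint64_t max_delta_ns;  /* time horizon of the device */
};

/* mult for the ns -> ticks direction: (hz << shift) / 10^9.
 * Assumes hz is below 1 GHz so that the result fits in 32 bits. */
static uint32_t evt_mult(uint32_t hz, uint32_t shift)
{
    assert(hz < 1000000000u && shift <= 32);
    return (uint32_t)(((uint64_t)hz << shift) / 1000000000u);
}

/* Clamp the request to what the device can do, then convert to ticks. */
static uint64_t ns_to_ticks(const struct evt_dev_sketch *dev, uint64_t delta_ns)
{
    if (delta_ns < dev->min_delta_ns)
        delta_ns = dev->min_delta_ns;
    if (delta_ns > dev->max_delta_ns)
        delta_ns = dev->max_delta_ns;
    return (delta_ns * dev->mult) >> dev->shift;
}
```

With an assumed 1 MHz timer and shift 32, a request for 1 ms converts to roughly 1000 ticks (truncation in mult can cost one tick), and a request beyond max_delta_ns is simply programmed as max_delta_ns, which is why no wrap handling via a mask is needed here.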


sched_clock()

In addition to the clock sources and clock events there is a special weak function in the kernel called sched_clock(). This function shall return the number of nanoseconds since the system was started. An architecture may or may not provide an implementation of sched_clock() on its own. If a local implementation is not provided, the system jiffy counter will be used as sched_clock().

In addition to clock sources and clock events, the kernel has a special weak function called sched_clock(). This function returns the number of nanoseconds since the system was started. An architecture may or may not provide its own implementation of sched_clock(). If no local implementation is provided, the system jiffy counter is used as sched_clock().

As the name suggests, sched_clock() is used for scheduling the system, determining the absolute timeslice for a certain process in the CFS scheduler for example. It is also used for printk timestamps when you have selected to include time information in printk for things like bootcharts.

As the name implies, sched_clock() is used for scheduling the system. For example, it is used to determine the absolute timeslice for a certain process in the CFS scheduler. It is also used for printk timestamps when you have chosen to include time information in printk, for things like bootcharts.

Compared to clock sources, sched_clock() has to be very fast: it is called much more often, especially by the scheduler. If you have to do trade-offs between accuracy compared to the clock source, you may sacrifice accuracy for speed in sched_clock(). It however requires some of the same basic characteristics as the clock source, i.e. it should be monotonic.

Compared to clock sources, sched_clock() has to be very fast: it is called much more often, especially by the scheduler. If you have to trade off accuracy against speed, sched_clock() may sacrifice accuracy for speed. However, it requires some of the same basic characteristics as a clock source; for example, it should be monotonic.

The sched_clock() function may wrap only on unsigned long long boundaries, i.e. after 64 bits. Since this is a nanosecond value this will mean it wraps after circa 585 years. (For most practical systems this means "never".)

The sched_clock() function may wrap only on unsigned long long boundaries, that is, after 64 bits. Since this is a nanosecond value, it wraps after about 585 years (on most practical systems this means "never").
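The 585-year figure can be checked with a little arithmetic. This is just my own calculation; ns_wrap_years() is a hypothetical name, and I assume a 365-day year.

```c
#include <stdint.h>

/* Whole years until a 64-bit nanosecond counter wraps,
 * assuming 365-day years. */
static uint64_t ns_wrap_years(void)
{
    const uint64_t ns_per_year = 365ULL * 24 * 3600 * 1000000000ULL;

    return UINT64_MAX / ns_per_year;
}
```

The result is 584 whole years (about 584.9 before truncation), which agrees with the "circa 585 years" in the text.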

If an architecture does not provide its own implementation of this function, it will fall back to using jiffies, making its maximum resolution 1/HZ of the jiffy frequency for the architecture. This will affect scheduling accuracy and will likely show up in system benchmarks.

If an architecture does not provide its own implementation of this function, it falls back to using jiffies. In that case its maximum resolution is 1/HZ, the jiffy period of the architecture. This affects scheduling accuracy and will likely show up in system benchmarks.
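To get a feeling for how coarse the jiffies fallback is, here is a small user-space sketch of the idea. It is my own illustration of "jiffies since boot multiplied by the length of one jiffy"; jiffies_to_sched_ns() is a hypothetical name, and hz stands for the kernel's HZ configuration value.

```c
#include <assert.h>
#include <stdint.h>

/* Nanoseconds since boot when the only time base is the jiffy counter.
 * The value can only advance in steps of 10^9 / hz nanoseconds. */
static uint64_t jiffies_to_sched_ns(uint64_t jiffies_since_boot, unsigned int hz)
{
    assert(hz != 0);
    return jiffies_since_boot * (1000000000ULL / hz);
}
```

With HZ = 250, each step is 4,000,000 ns (4 ms), so three jiffies after boot the fallback reports 12,000,000 ns and nothing in between. With HZ = 100 the step grows to 10 ms.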

The clock driving sched_clock() may stop or reset to zero during system suspend/sleep. This does not matter to the function it serves of scheduling events on the system. However it may result in interesting timestamps in printk().

During system suspend/sleep, the clock driving sched_clock() may stop or be reset to zero. This does not matter for its purpose of scheduling events on the system. However, it may result in interesting timestamps in printk().

The sched_clock() function should be callable in any context, IRQ- and NMI-safe and return a sane value in any context.

The sched_clock() function should be callable in any context, be IRQ-safe and NMI-safe, and return a sane value in any context.

Some architectures may have a limited set of time sources and lack a nice counter to derive a 64-bit nanosecond value, so for example on the ARM architecture, special helper functions have been created to provide a sched_clock() nanosecond base from a 16- or 32-bit counter. Sometimes the same counter that is also used as clock source is used for this purpose.

Some architectures have a limited set of time sources and may lack a suitable counter from which to derive a 64-bit nanosecond value. For example, on the ARM architecture, special helper functions have been created to provide a nanosecond base for sched_clock() from a 16-bit or 32-bit counter. Sometimes the same counter that is used as the clock source is also used for this purpose.
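How a narrow counter can feed a 64-bit nanosecond value is easier to see in code. The sketch below is my own simplified user-space model of the idea (remember an epoch, add the masked cycle delta converted to nanoseconds, and refresh the epoch before the counter can wrap twice). The struct and function names are hypothetical and the real helper functions are more involved.

```c
#include <stdint.h>

/* Hypothetical state for extending a narrow counter to 64-bit ns. */
struct sched_clock_sketch {
    uint64_t epoch_ns;   /* nanoseconds at the last epoch update */
    uint64_t epoch_cyc;  /* raw counter value at the last epoch update */
    uint64_t mask;       /* valid bits of the raw counter */
    uint32_t mult;       /* cycles -> ns multiplier */
    uint32_t shift;      /* cycles -> ns shift */
};

/* Nanoseconds for a raw counter reading taken after the last epoch. */
static uint64_t sc_read(const struct sched_clock_sketch *s, uint64_t cyc)
{
    uint64_t delta = (cyc - s->epoch_cyc) & s->mask;

    return s->epoch_ns + ((delta * s->mult) >> s->shift);
}

/* Move the epoch forward; must run at least once per wrap period. */
static void sc_update(struct sched_clock_sketch *s, uint64_t cyc)
{
    s->epoch_ns = sc_read(s, cyc);
    s->epoch_cyc = cyc & s->mask;
}
```

As an example, take a 16-bit counter at 1 MHz (mask 0xFFFF, mult 1024000, shift 10, values I picked for illustration). After an epoch update at raw value 60000, a later raw reading of 4464 (the counter has wrapped) still yields 70,000,000 ns, so the nanosecond value keeps increasing even though the hardware counter started over.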

On SMP systems, it is crucial for performance that sched_clock() can be called independently on each CPU without any synchronization performance hits. Some hardware (such as the x86 TSC) will cause the sched_clock() function to drift between the CPUs on the system. The kernel can work around this by enabling the CONFIG_HAVE_UNSTABLE_SCHED_CLOCK option. This is another aspect that makes sched_clock() different from the ordinary clock source.

On SMP systems, it is crucial for performance that sched_clock() can be called independently on each CPU without any synchronization performance hits. On some hardware (such as the x86 TSC), the sched_clock() function drifts between the CPUs on the system. The kernel can work around this by enabling the CONFIG_HAVE_UNSTABLE_SCHED_CLOCK option. This is another aspect in which sched_clock() differs from the ordinary clock source.

Delay timers (some architectures only)

On systems with variable CPU frequency, the various kernel delay() functions will sometimes behave strangely. Basically these delays usually use a hard loop to delay a certain number of jiffy fractions using a "lpj" (loops per jiffy) value, calibrated on boot.

On systems with variable CPU frequency, the various kernel delay() functions sometimes behave strangely. Basically, these delays usually use a hard loop to delay for a certain number of jiffy fractions, based on an "lpj" (loops per jiffy) value calibrated at boot.

Let's hope that your system is running on maximum frequency when this value is calibrated: as an effect when the frequency is geared down to half the full frequency, any delay() will be twice as long. Usually this does not hurt, as you're commonly requesting that amount of delay or more. But basically the semantics are quite unpredictable on such systems.

Let's hope that your system is running at maximum frequency when this value is calibrated. As a consequence, when the frequency is geared down to half of the full frequency, any delay() takes twice as long. Usually this does no harm, because you are commonly requesting that amount of delay or more. But basically the semantics are quite unpredictable on such systems.
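The "twice as long" effect follows directly from the loop-count arithmetic. The sketch below is a deliberately simplified model of my own (the kernel's real delay code uses different fixed-point math); delay_loops() and actual_delay_us() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Loop iterations for a requested delay, given loops-per-jiffy (lpj)
 * and the jiffy frequency hz: loops per second is lpj * hz. */
static uint64_t delay_loops(uint64_t usecs, uint64_t lpj, unsigned int hz)
{
    return usecs * lpj * hz / 1000000;
}

/* Real duration of that loop if lpj was calibrated at cal_khz but the
 * CPU currently runs at cur_khz (one loop takes a fixed cycle count). */
static uint64_t actual_delay_us(uint64_t usecs, uint64_t cal_khz, uint64_t cur_khz)
{
    assert(cur_khz != 0);
    return usecs * cal_khz / cur_khz;
}
```

If lpj was calibrated at 1 GHz and the CPU is later geared down to 500 MHz, a requested 100 µs delay actually lasts 200 µs in this model, because the same number of loops is executed at half the speed.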

Enter timer-based delays. Using these, a timer read may be used instead of a hard-coded loop for providing the desired delay.

This is where timer-based delays come in. With them, a timer read can be used instead of a hard-coded loop to provide the desired delay.

This is done by declaring a struct delay_timer and assigning the appropriate function pointers and rate settings for this delay timer.

This can be achieved by declaring a struct delay_timer and setting the appropriate function pointer and rate for this delay timer.
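To picture what "function pointers and rate settings" could look like, here is a user-space sketch. The struct layout (a read callback plus a frequency) follows my recollection of the ARM header and should be treated as an assumption; the struct name delay_timer_sketch, the fake counter and timer_udelay() are all hypothetical and only simulate the idea.

```c
#include <stdint.h>

/* Assumed shape of a delay timer: a counter read callback and its rate. */
struct delay_timer_sketch {
    unsigned long (*read_current_timer)(void);
    unsigned long freq;  /* counter ticks per second */
};

/* Simulated free-running counter: advances by one tick per read. */
static unsigned long fake_ticks;

static unsigned long fake_read(void)
{
    return fake_ticks++;
}

static const struct delay_timer_sketch fake_timer = {
    .read_current_timer = fake_read,
    .freq = 1000000,     /* pretend the counter runs at 1 MHz */
};

/* Busy-wait until at least `usecs` worth of ticks have passed.
 * Returns the number of ticks actually waited. */
static unsigned long timer_udelay(const struct delay_timer_sketch *t,
                                  unsigned long usecs)
{
    unsigned long target = (unsigned long)((uint64_t)usecs * t->freq / 1000000);
    unsigned long start = t->read_current_timer();
    unsigned long waited;

    do {
        /* Unsigned subtraction stays correct across a counter wrap. */
        waited = t->read_current_timer() - start;
    } while (waited < target);

    return waited;
}
```

Because the wait is measured against the timer rather than against a calibrated loop count, the delay no longer depends on the current CPU frequency: timer_udelay(&fake_timer, 50) waits for 50 ticks of the 1 MHz counter regardless of how fast the loop itself spins.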

This is available on some architectures like OpenRISC or ARM.

This is available on some architectures, such as OpenRISC and ARM.
