[LINUX] Freezing of tasks (1/2)

This text is originally part of the Linux kernel source code, so it is treated here as GPLv2 (as far as I understand).

https://www.kernel.org/doc/html/latest/index.html

Licensing documentation

The following describes the license of the Linux kernel source code (GPLv2), how to properly mark the license of individual files in the source tree, as well as links to the full license text.

https://www.kernel.org/doc/html/latest/process/license-rules.html#kernel-licensing

https://www.kernel.org/doc/html/latest/power/freezing-of-tasks.html


Freezing of tasks 2007 Rafael J. Wysocki <rjw@sisk.pl>, GPL

I. What is the freezing of tasks? (What is task freezing?)

The freezing of tasks is a mechanism by which user space processes and some kernel threads are controlled during hibernation or system-wide suspend (on some architectures).

Task freezing is a mechanism for controlling user space processes and some kernel threads during hibernation or system-wide suspend (on some architectures).

II. How does it work? (How does it work?)

There are three per-task flags used for that, PF_NOFREEZE, PF_FROZEN and PF_FREEZER_SKIP (the last one is auxiliary).

Three per-task flags are used for this: PF_NOFREEZE, PF_FROZEN and PF_FREEZER_SKIP (the last one is auxiliary).

The tasks that have PF_NOFREEZE unset (all user space processes and some kernel threads) are regarded as ‘freezable’ and treated in a special way before the system enters a suspend state as well as before a hibernation image is created

Tasks for which PF_NOFREEZE is not set (all user space processes and some kernel threads) are considered "freezable". They are treated in a special way before the system enters a suspend state and before a hibernation image is created.

(in what follows we only consider hibernation, but the description also applies to suspend).

(In what follows, only hibernation is considered, but the description also applies to suspend.)

Namely, as the first step of the hibernation procedure the function freeze_processes() (defined in kernel/power/process.c) is called.

That is, the freeze_processes() function (defined in kernel/power/process.c) is called as the first step of the hibernation procedure.

A system-wide variable system_freezing_cnt (as opposed to a per-task flag) is used to indicate whether the system is to undergo a freezing operation.

The system-wide variable system_freezing_cnt (as opposed to a per-task flag) indicates whether the system is about to undergo a freezing operation.

And freeze_processes() sets this variable.

freeze_processes() sets this variable.

After this, it executes try_to_freeze_tasks() that sends a fake signal to all user space processes, and wakes up all the kernel threads.

After this, try_to_freeze_tasks() is executed. It sends a fake signal to all user space processes and wakes up all kernel threads.

All freezable tasks must react to that by calling try_to_freeze(), which results in a call to __refrigerator() (defined in kernel/freezer.c), which sets the task’s PF_FROZEN flag, changes its state to TASK_UNINTERRUPTIBLE and makes it loop until PF_FROZEN is cleared for it.

All freezable tasks must react by calling try_to_freeze(), which results in a call to __refrigerator() (defined in kernel/freezer.c). This sets the task's PF_FROZEN flag, changes the task's state to TASK_UNINTERRUPTIBLE, and makes it loop until PF_FROZEN is cleared.

Then, we say that the task is ‘frozen’ and therefore the set of functions handling this mechanism is referred to as ‘the freezer’

The task is then said to be "frozen", which is why the set of functions handling this mechanism is called "the freezer".

(these functions are defined in kernel/power/process.c, kernel/freezer.c & include/linux/freezer.h).

(These functions are defined in kernel/power/process.c, kernel/freezer.c and include/linux/freezer.h.)
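
The flag handling described above can be sketched as a small userspace model. This is only an illustration of the logic, not kernel code: the flag values are placeholders, `struct task` is a hypothetical stand-in for the kernel's task structure, and the real __refrigerator() additionally puts the task to sleep and loops until PF_FROZEN is cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit values for illustration; not the kernel's definitions. */
#define PF_NOFREEZE 0x1u
#define PF_FROZEN   0x2u

/* Hypothetical stand-in for the kernel's per-task structure. */
struct task {
    unsigned int flags;
};

/* Models the system-wide system_freezing_cnt variable. */
static int system_freezing_cnt;

/* A task is due to freeze only while a freezing operation is in
 * progress and PF_NOFREEZE is unset for it. */
static bool freezing(const struct task *t)
{
    assert(t != NULL);
    return system_freezing_cnt > 0 && !(t->flags & PF_NOFREEZE);
}

/* Models try_to_freeze(): if the task is due to freeze, mark it frozen
 * and report that it entered the (modelled) refrigerator. */
static bool try_to_freeze(struct task *t)
{
    if (!freezing(t))
        return false;
    t->flags |= PF_FROZEN;
    return true;
}
```

In this model, a task with PF_NOFREEZE set never becomes frozen, and no task freezes unless system_freezing_cnt has been raised first.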

User space processes are generally frozen before kernel threads

User space processes are generally frozen before kernel threads.

__refrigerator() must not be called directly.

__refrigerator() must not be called directly.

Instead, use the try_to_freeze() function (defined in include/linux/freezer.h), that checks if the task is to be frozen and makes the task enter __refrigerator().

Instead, use the try_to_freeze() function (defined in include/linux/freezer.h), which checks whether the task is to be frozen and, if so, makes it enter __refrigerator().

For user space processes try_to_freeze() is called automatically from the signal-handling code, but the freezable kernel threads need to call it explicitly in suitable places or use the wait_event_freezable() or wait_event_freezable_timeout() macros (defined in include/linux/freezer.h) that combine interruptible sleep with checking if the task is to be frozen and calling try_to_freeze().

For user space processes, try_to_freeze() is called automatically from the signal-handling code. Freezable kernel threads, however, must call it explicitly in suitable places, or use the wait_event_freezable() or wait_event_freezable_timeout() macros (defined in include/linux/freezer.h). These macros combine an interruptible sleep with a check of whether the task is to be frozen and a call to try_to_freeze().

The main loop of a freezable kernel thread may look like the following one:

The main loop of a freezable kernel thread may look like this:

set_freezable();
do {
        hub_events();
        wait_event_freezable(khubd_wait,
                        !list_empty(&hub_event_list) ||
                        kthread_should_stop());
} while (!kthread_should_stop() || !list_empty(&hub_event_list));
(from drivers/usb/core/hub.c::hub_thread()).

If a freezable kernel thread fails to call try_to_freeze() after the freezer has initiated a freezing operation, the freezing of tasks will fail and the entire hibernation operation will be cancelled.

If a freezable kernel thread fails to call try_to_freeze() after the freezer has started a freezing operation, the freezing of tasks fails and the entire hibernation operation is cancelled.

For this reason, freezable kernel threads must call try_to_freeze() somewhere or use one of the wait_event_freezable() and wait_event_freezable_timeout() macros.

For this reason, freezable kernel threads must either call try_to_freeze() somewhere or use one of the wait_event_freezable() and wait_event_freezable_timeout() macros.

After the system memory state has been restored from a hibernation image and devices have been reinitialized, the function thaw_processes() is called in order to clear the PF_FROZEN flag for each frozen task. Then, the tasks that have been frozen leave __refrigerator() and continue running.

After the system memory state has been restored from the hibernation image and devices have been reinitialized, thaw_processes() is called. It clears the PF_FROZEN flag for each frozen task. The frozen tasks then leave __refrigerator() and continue running.

Rationale behind the functions dealing with freezing and thawing of tasks

The rationale behind the functions that freeze and thaw tasks is as follows.

freeze_processes():

freezes only userspace tasks

Freezes only user space tasks.

freeze_kernel_threads():

freezes all tasks (including kernel threads) because we can’t freeze kernel threads without freezing userspace tasks

Freezes all tasks (including kernel threads), because kernel threads cannot be frozen without freezing user space tasks.

thaw_kernel_threads():

thaws only kernel threads; this is particularly useful if we need to do anything special in between thawing of kernel threads and thawing of userspace tasks, or if we want to postpone the thawing of userspace tasks

Thaws only kernel threads. This is particularly useful if something special has to be done between thawing kernel threads and thawing user space tasks, or if the thawing of user space tasks should be postponed.

thaw_processes():

thaws all tasks (including kernel threads) because we can’t thaw userspace tasks without thawing kernel threads

Thaws all tasks (including kernel threads), because user space tasks cannot be thawed without thawing kernel threads.
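
The ordering implied by these four functions can be sketched with a toy userspace model. It is an assumption-laden illustration, not the kernel implementation: `struct task` and the `set_frozen()` helper are hypothetical, and the boolean fields only stand in for the PF_NOFREEZE and PF_FROZEN flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task record used only for this model. */
struct task {
    bool is_kthread;  /* kernel thread (true) or user space task (false) */
    bool freezable;   /* models PF_NOFREEZE being unset */
    bool frozen;      /* models PF_FROZEN being set */
};

/* Hypothetical helper: set the frozen state of every freezable task
 * of the selected kind (kernel threads or user space tasks). */
static void set_frozen(struct task *t, size_t n, bool kthreads, bool frozen)
{
    for (size_t i = 0; i < n; i++)
        if (t[i].freezable && t[i].is_kthread == kthreads)
            t[i].frozen = frozen;
}

/* Freezes only user space tasks. */
static void freeze_processes(struct task *t, size_t n)
{
    set_frozen(t, n, false, true);
}

/* Freezes the freezable kernel threads; user space must already be
 * frozen, so afterwards all freezable tasks are frozen. */
static void freeze_kernel_threads(struct task *t, size_t n)
{
    for (size_t i = 0; i < n; i++)
        assert(t[i].is_kthread || !t[i].freezable || t[i].frozen);
    set_frozen(t, n, true, true);
}

/* Thaws only kernel threads; user space tasks stay frozen. */
static void thaw_kernel_threads(struct task *t, size_t n)
{
    set_frozen(t, n, true, false);
}

/* Thaws all tasks, kernel threads first and then user space tasks. */
static void thaw_processes(struct task *t, size_t n)
{
    set_frozen(t, n, true, false);
    set_frozen(t, n, false, false);
}
```

The assertion in the modelled freeze_kernel_threads() encodes the stated constraint that kernel threads are frozen only after user space tasks have been frozen.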

III. Which kernel threads are freezable? (Which kernel threads are freezable?)

Kernel threads are not freezable by default.

Kernel threads are not freezable by default.

However, a kernel thread may clear PF_NOFREEZE for itself by calling set_freezable() (the resetting of PF_NOFREEZE directly is not allowed).

However, a kernel thread can clear PF_NOFREEZE for itself by calling set_freezable(). (Resetting PF_NOFREEZE directly is not allowed.)

From this point it is regarded as freezable and must call try_to_freeze() in a suitable place.

From this point on, the thread is considered freezable and must call try_to_freeze() in a suitable place.
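
As a rough illustration of this default, the following userspace sketch models a kernel thread that starts out non-freezable and opts in. The flag value, `struct task`, and the `new_kernel_thread()` and `freezable()` helpers are assumptions made for the example. The real set_freezable() takes no argument and acts on the calling thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit value for illustration; not the kernel's definition. */
#define PF_NOFREEZE 0x1u

/* Hypothetical stand-in for the kernel's per-task structure. */
struct task {
    unsigned int flags;
};

/* Hypothetical helper: kernel threads start with PF_NOFREEZE set,
 * that is, they are not freezable by default. */
static struct task new_kernel_thread(void)
{
    struct task t = { PF_NOFREEZE };
    return t;
}

/* Models set_freezable(): clears PF_NOFREEZE for the given thread.
 * The explicit argument replaces the kernel's implicit "current". */
static void set_freezable(struct task *self)
{
    assert(self != NULL);
    self->flags &= ~PF_NOFREEZE;
}

/* A task is freezable when PF_NOFREEZE is unset. */
static bool freezable(const struct task *t)
{
    return !(t->flags & PF_NOFREEZE);
}
```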

IV. Why do we do that? (Why do we do that?)

Generally speaking, there is a couple of reasons to use the freezing of tasks:

Generally speaking, there are a couple of reasons to use the freezing of tasks:

The principal reason is to prevent filesystems from being damaged after hibernation.

The main reason is to prevent damage to the file system after hibernation.

At the moment we have no simple means of checkpointing filesystems, so if there are any modifications made to filesystem data and/or metadata on disks, we cannot bring them back to the state from before the modifications.

At this time, there is no easy way to set a file system checkpoint. Therefore, if you make changes to the data or metadata on the disk, you cannot restore it to its original state.

At the same time each hibernation image contains some filesystem-related information that must be consistent with the state of the on-disk data and metadata after the system memory state has been restored from the image

At the same time, each hibernation image contains filesystem-related information that must be consistent with the state of the on-disk data and metadata after the system memory state has been restored from the image.

(otherwise the filesystems will be damaged in a nasty way, usually making them almost impossible to repair).

(Otherwise, the filesystems are badly damaged, usually in a way that makes them almost impossible to repair.)

We therefore freeze tasks that might cause the on-disk filesystems’ data and metadata to be modified after the hibernation image has been created and before the system is finally powered off.

We therefore freeze tasks that might modify the on-disk filesystem data and metadata after the hibernation image has been created and before the system is finally powered off.

The majority of these are user space processes, but if any of the kernel threads may cause something like this to happen, they have to be freezable.

Most of these are user space processes, but any kernel thread that might cause such modifications also has to be freezable.

Next, to create the hibernation image we need to free a sufficient amount of memory (approximately 50% of available RAM) and we need to do that before devices are deactivated, because we generally need them for swapping out.

Second, to create the hibernation image we need to free a sufficient amount of memory (roughly 50% of available RAM), and this must be done before devices are deactivated, because they are generally needed for swapping out.

Then, after the memory for the image has been freed, we don’t want tasks to allocate additional memory and we prevent them from doing that by freezing them earlier.

Then, once the memory for the image has been freed, tasks should not allocate additional memory. Freezing them earlier prevents that.

[Of course, this also means that device drivers should not allocate substantial amounts of memory from their .suspend() callbacks before hibernation, but this is a separate issue.]

[Of course, this also means that device drivers should not allocate substantial amounts of memory from their .suspend() callbacks before hibernation, but that is a separate issue.]

The third reason is to prevent user space processes and some kernel threads from interfering with the suspending and resuming of devices.

The third reason is to prevent user space processes and some kernel threads from interfering with the suspending and resuming of devices.

A user space process running on a second CPU while we are suspending devices may, for example, be troublesome and without the freezing of tasks we would need some safeguards against race conditions that might occur in such a case.

For example, a user space process running on a second CPU while devices are being suspended may cause trouble. Without the freezing of tasks, safeguards would be needed against the race conditions that might occur in such a case.

Although Linus Torvalds doesn’t like the freezing of tasks, he said this in one of the discussions on LKML (http://lkml.org/lkml/2007/4/27/608):

Although Linus Torvalds does not like the freezing of tasks, he said the following in one of the discussions on LKML (http://lkml.org/lkml/2007/4/27/608):

“RJW:> Why we freeze tasks at all or why we freeze kernel threads?” (Why do we freeze tasks at all, or why do we freeze kernel threads?)

Linus: In many ways, ‘at all’. (In many ways, "at all".) I do realize the IO request queue issues, and that we cannot actually do s2ram with some devices in the middle of a DMA.

I do understand the I/O request queue issues, and that we cannot actually do s2ram with some devices in the middle of a DMA.

So we want to be able to avoid that, there’s no question about that.

So we want to be able to avoid that; there is no question about it.

And I suspect that stopping user threads and then waiting for a sync is practically one of the easier ways to do so.

And I suspect that stopping user threads and then waiting for a sync is, in practice, one of the easier ways to do so.

So in practice, the ‘at all’ may become a ‘why freeze kernel threads?’ and freezing user threads I don’t find really objectionable.”

So in practice, the "at all" may become "why freeze kernel threads?", and I do not find freezing user threads really objectionable.

Still, there are kernel threads that may want to be freezable.

Still, there are kernel threads that may want to be freezable.

For example, if a kernel thread that belongs to a device driver accesses the device directly, it in principle needs to know when the device is suspended, so that it doesn’t try to access it at that time.

For example, if a kernel thread that belongs to a device driver accesses the device directly, it in principle needs to know when the device is suspended so that it does not try to access it at that time.

However, if the kernel thread is freezable, it will be frozen before the driver’s .suspend() callback is executed and it will be thawed after the driver’s .resume() callback has run, so it won’t be accessing the device while it’s suspended.

However, if the kernel thread is freezable, it is frozen before the driver's .suspend() callback is executed and thawed after the driver's .resume() callback has run, so it does not access the device while it is suspended.

Another reason for freezing tasks is to prevent user space processes from realizing that hibernation (or suspend) operation takes place.

Another reason to freeze tasks is to keep user space processes from noticing that a hibernation (or suspend) operation is taking place.

Ideally, user space processes should not notice that such a system-wide operation has occurred and should continue running without any problems after the restore (or resume from suspend).

Ideally, user space processes should not notice that such a system-wide operation has occurred and should continue running without problems after the restore (or resume from suspend).

Unfortunately, in the most general case this is quite difficult to achieve without the freezing of tasks.

Unfortunately, in the most general case this is quite difficult to achieve without the freezing of tasks.

Consider, for example, a process that depends on all CPUs being online while it’s running.

Consider, for example, a process that depends on all CPUs being online while it is running.

Since we need to disable nonboot CPUs during the hibernation, if this process is not frozen, it may notice that the number of CPUs has changed and may start to work incorrectly because of that.

Because nonboot CPUs have to be disabled during hibernation, a process like this that is not frozen may notice that the number of CPUs has changed and start to work incorrectly.
