[LINUX] Step into the darkness of msync

Summary

- msync(MS_ASYNC) does nothing (baffling)

Step into the darkness of msync

https://www.oreilly.co.jp/community/blog/2010/09/buffer-cache-and-aio-part2.html

In this case, msync(2) is what tells the kernel that a memory region needs to be written back to disk. If you specify MS_ASYNC, the system call returns after only making that notification. If you specify MS_SYNC, then in addition to the notification, processing equivalent to fsync(2) is performed internally and the data is written back to disk.
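As a minimal sketch of the call described in the quote, the following helper writes a string into a file through a shared mapping and then calls msync() with whichever flag the caller passes. The helper name write_via_mmap and the file path are assumptions made for illustration; only open(), ftruncate(), mmap(), msync(), and munmap() are real APIs.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Write `msg` into `path` through a MAP_SHARED mapping, then call msync()
 * with `msync_flags` (MS_SYNC or MS_ASYNC).
 * Returns 0 on success, -1 on error.
 * write_via_mmap is a hypothetical helper name, not part of any API.
 */
int write_via_mmap(const char *path, const char *msg, int msync_flags)
{
	assert(path != NULL && msg != NULL);

	size_t len = strlen(msg);
	if (len == 0)
		return -1;	/* mmap() rejects zero-length mappings */

	int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	/* The file must be at least as long as the region we write to. */
	if (ftruncate(fd, (off_t)len) < 0) {
		close(fd);
		return -1;
	}

	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		close(fd);
		return -1;
	}

	memcpy(p, msg, len);	/* this store dirties the mapped page */

	/* MS_SYNC waits for writeback; MS_ASYNC returns without waiting. */
	int rc = msync(p, len, msync_flags);

	munmap(p, len);
	close(fd);
	return rc;
}
```

Calling write_via_mmap("/tmp/demo.bin", "hello\n", MS_SYNC) and then the same with MS_ASYNC should both return 0; the difference is only in whether the call waits for the data to be written back.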

Let's read the source code around this.

Linux Kernel Comments

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/mm/msync.c

MS_SYNC syncs the entire file, including mappings. MS_ASYNC does not start I/O (it used to, up to 2.5.67). Nor does it mark the relevant pages dirty (it used to, up to 2.6.17). Dirty pages are now tracked properly, so MS_ASYNC does not do anything at all.

The application can run fsync() to write out the dirty pages, wait for the writeout, and check the result. Alternatively, it can run fadvise(FADV_DONTNEED) against the fd to start asynchronous writeout immediately. So by not starting I/O in MS_ASYNC, the kernel gives applications complete flexibility.
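The two options named in the kernel comment can be sketched from userspace as follows. This is only an illustration under assumptions: flush_and_wait and flush_async are hypothetical helper names, and the userspace spelling of fadvise(FADV_DONTNEED) is taken to be posix_fadvise() with POSIX_FADV_DONTNEED. Both helpers take the file descriptor that backs the mapping.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Option 1: write out the file's dirty pages and wait for completion.
 * Returns 0 on success, -1 on error (errno is set by fsync()).
 * flush_and_wait is a hypothetical helper name.
 */
int flush_and_wait(int fd)
{
	assert(fd >= 0);
	return fsync(fd);
}

/*
 * Option 2: hint that the cached pages are no longer needed, which the
 * kernel comment describes as starting asynchronous writeout.
 * offset 0 and len 0 mean "the whole file".
 * Returns 0 on success or an error number (posix_fadvise() does not
 * set errno). flush_async is a hypothetical helper name.
 */
int flush_async(int fd)
{
	assert(fd >= 0);
	return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}
```

Note that POSIX_FADV_DONTNEED is also a hint to drop the file's cached pages, so it is not a pure "start writing" request; whether that side effect is acceptable depends on the application.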

Take a look at the actual source code

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/mm/msync.c#n61

Extracting only the important parts:

	/*
	 * If the interval [start,end) covers some unmapped address ranges,
	 * just ignore them, but return -ENOMEM at the end.
	 */
	down_read(&mm->mmap_sem);
	vma = find_vma(mm, start);
	for (;;) {

<Omitted>

		start = vma->vm_end;
		if ((flags & MS_SYNC) && file &&
				(vma->vm_flags & VM_SHARED)) {
			get_file(file);
			up_read(&mm->mmap_sem);
			error = vfs_fsync_range(file, fstart, fend, 1);
			fput(file);
			if (error || start >= end)
				goto out;
			down_read(&mm->mmap_sem);
			vma = find_vma(mm, start);
		} else {
			if (start >= end) {
				error = 0;
				goto out_unlock;
			}
			vma = vma->vm_next;
		}
	}

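So, if MS_SYNC is specified and the VMA is a shared, file-backed mapping, the data is written back by vfs_fsync_range(); get_file() and fput() only hold and release a reference to the file around that call. In every other case, including MS_ASYNC, the loop just moves on to the next VMA and really does nothing.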

So when MS_ASYNC is specified, when does the data reach the disk? The pages are simply treated as ordinary dirty pages and are written back by pdflush, which by default wakes up every 5 seconds.
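The 5-second figure corresponds to the vm.dirty_writeback_centisecs sysctl, whose default is 500 (hundredths of a second). As a rough sketch, the value on a running system can be read from /proc like this; writeback_interval_seconds is a hypothetical helper name, and the path is passed in by the caller so that a missing file is handled gracefully.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/*
 * Read a sysctl file containing a value in centiseconds, such as
 * /proc/sys/vm/dirty_writeback_centisecs, and return it in seconds.
 * Returns -1.0 if the file cannot be opened or parsed.
 * writeback_interval_seconds is a hypothetical helper name.
 */
double writeback_interval_seconds(const char *sysctl_path)
{
	assert(sysctl_path != NULL);

	FILE *f = fopen(sysctl_path, "r");
	if (f == NULL)
		return -1.0;

	long centisecs = 0;
	int matched = fscanf(f, "%ld", &centisecs);
	fclose(f);

	if (matched != 1)
		return -1.0;

	return (double)centisecs / 100.0;
}
```

On a system with the default setting, writeback_interval_seconds("/proc/sys/vm/dirty_writeback_centisecs") would be expected to return 5.0, but the value is tunable, so the result depends on the machine.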

https://naoya-2.hatenadiary.org/entry/20070523/1179938637

The patch that cleaned this up

Apparently, once dirty status could be tracked automatically, setting the flag in msync had become wasted work, so the code was refactored to remove it. I see.

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/mm/msync.c?id=204ec841fbea3e5138168edbc3a76d46747cc987

[PATCH] mm: msync() cleanup With the tracking of dirty pages properly done now, msync doesn't need to scan the PTEs anymore to determine the dirty status.

From: Hugh Dickins <hugh@veritas.com>

In looking to do that, I made some other tidyups: can remove several #includes, and sys_msync loop termination not quite right.

Most of those points are criticisms of the existing sys_msync, not of your patch.
In particular, the loop termination errors were introduced in 2.6.17: I did notice this shortly before it came out, but decided I was more likely to get it wrong myself, and make matters worse if I tried to rush a last-minute fix in. And it's not terribly likely to go wrong, nor disastrous if it does go wrong (may miss reporting an unmapped area; may also fsync file of a following vma).

In other words: dirty page tracking is now done properly, so msync no longer needs to scan the PTEs to determine the dirty status.

From: Hugh Dickins <hugh@veritas.com>

While working on that, I made some other cleanups: several unnecessary #includes can be removed, and the loop termination condition of sys_msync was not quite right.

Most of these points are criticisms of the existing sys_msync, not of your patch. In particular, the loop termination errors were introduced in 2.6.17. I noticed them shortly before that release came out, but decided that I was more likely to get it wrong myself and make matters worse if I rushed in a last-minute fix. It is not very likely to go wrong, and not disastrous if it does (it may fail to report an unmapped area, and it may also fsync the file of a following vma).
