What I want to say

When writing important files, you need to account for unexpected OS shutdowns and power loss. If you don't, you can end up with half-written or empty files, which can be fatal at system startup or for other systems that consume those files.

The examples use C, Java, Python, and JavaScript (Node.js), but the same precautions are needed in almost every language.

Background

A fatal problem occurred: software running in production would not start.

When I collected and analyzed the logs and config files, I found that the config file was completely corrupted.

The config file is read at startup, but it may also be written when needed. Following the code, I noticed that the file could be left half-written if the process was forcibly terminated during the write.

The power is simply cut when the device is done being used, and by an unlucky coincidence that must have happened in the middle of a write.

First thing I tried

If you write directly to the config file, there is a window, however short, in which the file is half-written (for example, only 1 of the 10 characters you meant to write is there). So write to config.yml.tmp first and then rename it to config.yml: the file is either updated or not, so the update is atomic!
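As a minimal sketch of this first attempt, assuming Python and a hypothetical helper name `write_then_rename` (neither the name nor the `.tmp` suffix handling comes from the original production code), the idea looks roughly like this. It flushes and closes the temporary file but does nothing more, which is exactly the state described in the next section.

```python
import os


def write_then_rename(path: str, text: str) -> None:
    """First attempt: write to a temporary file, then rename it over the target."""
    tmp_path = path + ".tmp"  # hypothetical naming, mirroring config.yml.tmp
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write(text)
        f.flush()  # hands the data to the OS; it is not necessarily on disk yet
    # os.replace() swaps the new file in atomically (same file system assumed).
    os.replace(tmp_path, path)
```

Readers of config.yml see either the old or the new content, never a mixture, as long as the machine keeps running.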

Result 1

I released this fix with full confidence, but the failure to start happened again. This time the config file was 0 KB. If writing the tmp file fails, the rename never happens, so according to the program flow the file should never be 0 KB. The flush was explicit and the file was closed properly.

Then I came across a certain section while flipping through the O'Reilly book Linux System Programming that I had at hand.

- **fsync and fdatasync**

Wait, what? Write the dirty buffers to disk...?

At this point I finally realized both the mistake and the solution. Actually, I had learned this before, but the knowledge did not come to mind when I needed it.

Dirty buffer

Even when a program performs a write, the data is not written to disk immediately; it is held temporarily in memory as a dirty buffer.

Writing to disk is extremely slow, so if every write went to disk immediately, the program would slow down dramatically (by a factor of 100 or more). To avoid this, write buffering is used in virtually all modern file systems. With it, the process hands the slow write to the OS and moves on. When the OS actually writes to disk depends on the file system in use. In some cases it can take 10 seconds or more, which is more than enough time for a file to be corrupted.

If the OS goes down in this state because of an unexpected power failure, the result is a broken or empty file, even though the program flushed and closed it. In other words, config.yml.tmp itself was incomplete, so it was still incomplete after being renamed to config.yml.

Calling fsync() writes the dirty buffers to disk immediately and does not return until the write has completed, so the program only moves on once the data is on disk.
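Applied to the write-then-rename approach, this could look something like the sketch below. It assumes Python; `write_durably` is a hypothetical name chosen for illustration, not the actual production code.

```python
import os


def write_durably(path: str, text: str) -> None:
    """Write to a temporary file, force it to disk, then rename it into place."""
    tmp_path = path + ".tmp"  # hypothetical naming, mirroring config.yml.tmp
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write(text)
        f.flush()              # Python's buffer -> OS (still a dirty buffer)
        os.fsync(f.fileno())   # dirty buffer -> disk; blocks until done
    # Only rename once the temporary file's contents are known to be on disk.
    os.replace(tmp_path, path)
```

The important part is the order: the rename happens only after fsync() has returned, so config.yml never points at a file whose contents are still sitting in memory. On Linux, making the rename itself survive a power failure may additionally require syncing the containing directory; this sketch leaves that out.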

fsync() and fdatasync()

On Linux, a file consists of the following two kinds of data.

- Metadata, stored in the inode
- The data of the file itself

The inode holds information about the file, such as its size and last-modified time.

fsync() writes both; fdatasync() writes the file data and only the metadata needed to read it back, such as the file size.

If the file is not updated very often, or performance is not a concern, use fsync().

Use fdatasync() if you can accept that metadata such as the last-modified time may not be up to date in the worst case, or if the file is updated frequently and performance is a concern.
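For a frequently updated file, a sketch along these lines might be a starting point. It assumes Python, where `os.fdatasync` exists on Linux but is not available on every platform (macOS and Windows lack it), so the sketch falls back to `os.fsync`. The helper names `sync_data` and `append_record` are made up for this example.

```python
import os


def sync_data(fd: int) -> None:
    """Use fdatasync() where the platform offers it, otherwise fall back to fsync()."""
    fdatasync = getattr(os, "fdatasync", None)
    if fdatasync is not None:
        fdatasync(fd)  # file data plus only the metadata needed to read it back
    else:
        os.fsync(fd)   # file data and all metadata


def append_record(path: str, record: str) -> None:
    """Append one line to a frequently updated file and force it to disk."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(record + "\n")
        f.flush()
        sync_data(f.fileno())
```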

Countermeasures

When generating important files, force them to be written to disk so that you are safe even in an unexpected shutdown. On Linux, shutting down through the regular procedure causes no problem.

Here are some concrete code examples. Some error handling is omitted.

`sample.c`


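#include <stdio.h>
#include <unistd.h>

int main(void) {
    const char* tmp_file_path = "./fsync_test.txt";
    FILE* fp = fopen(tmp_file_path, "w");
    if (fp == NULL) {
        return 1;
    }
    int fd = fileno(fp);
    fputs("fsync() test\n", fp);
    fflush(fp);  // Move data from the stdio buffer to the OS (still a dirty buffer)

    // This is the point!! Force the dirty buffer to disk.
    int fsync_ret = fsync(fd);

    fclose(fp);
    return fsync_ret;
}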

`FsyncTest.java`


import java.io.File;
import java.io.FileOutputStream;

public class FsyncTest {
    public static void main(String[] args) throws Exception {
        File file = new File("./fsync.txt");
        try (FileOutputStream output = new FileOutputStream(file)) {
            output.write("Fsync() test\n".getBytes("UTF-8"));

            // This is the point!! Force the written data to disk.
            output.getFD().sync();
        }
    }
}

`sample.py`


import os

with open('./fsync_test.txt', 'w') as f:
    f.write('fsync() test')
    f.flush()  # This alone only hands the data to the OS (dirty buffer)

    # This is the point!! Force the dirty buffer to disk.
    os.fsync(f.fileno())

`fsync.js`


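const http = require('http');
const fs = require('fs');

const server = http.createServer((request, response) => {
    fs.open('./fsync_test.txt', 'w', (openErr, fd) => {
        if (openErr) {
            response.writeHead(500, {'Content-Type': 'text/plain'});
            response.end('Open failed\n');
            return;
        }
        fs.write(fd, 'fsync() test\n', (writeErr) => {
            // This is the point!! Force the written data to disk before replying.
            if (!writeErr) {
                fs.fsyncSync(fd);
            }

            response.writeHead(writeErr ? 500 : 200, {'Content-Type': 'text/plain'});
            response.end(writeErr ? 'Write failed\n' : 'Write success\n');
            fs.close(fd, () => {});
        });
    });
});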

server.listen(9000);

Result 2

The file is no longer corrupted.

Incidentally, data can stay in a dirty buffer for quite a long time. Depending on the file system, it may be about 30 seconds. Before applying the fix, on the Windows 7 machine I had at hand, I wrote a file, opened it in a text editor, confirmed that the contents had been written, and pulled the power plug 15 seconds later; after rebooting, the file was corrupted. When I tried the same thing on CentOS 6, the result was almost the same.

After the fix, the file was intact even when power was cut immediately after writing.

Conclusion

Regardless of the language, fsync() or an equivalent is essential when generating important files. However, it forces a very slow synchronous disk write, which can seriously hurt performance under some conditions. Calling it indiscriminately just because it is safer is a bad idea.
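To get a feel for the cost on your own system, a rough sketch like the following compares writing with and without an fsync after every write. `timed_write` and its parameters are names invented for this illustration. The measured times depend entirely on the hardware and file system, so no expected numbers are given here.

```python
import os
import time


def timed_write(path: str, lines: int, sync_each: bool) -> float:
    """Write `lines` short lines to `path` and return the elapsed seconds."""
    start = time.perf_counter()
    with open(path, "w", encoding="utf-8") as f:
        for i in range(lines):
            f.write(f"line {i}\n")
            if sync_each:
                f.flush()
                os.fsync(f.fileno())  # one synchronous disk write per line
    return time.perf_counter() - start
```

Calling it once with `sync_each=False` and once with `sync_each=True` on the same path, with the same line count, shows how much a per-write fsync costs on that machine. If the difference is large, syncing once per important file rather than once per write is usually the more reasonable trade-off.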

Supplement

Related issues

Readers provided links in the comments that are highly relevant to this article.

Firefox issue

https://www.atmarkit.co.jp/flinux/rensai/watch2009/watch05a.html

This is a problem caused by slow fsync(). The article explains that using fdatasync() is faster because the inode is not updated.

PostgreSQL issue

https://masahikosawada.github.io/2019/02/17/PostgreSQL-fsync-issue/

The problem is that if fsync() fails, you cannot simply call fsync() again to retry. It seems PostgreSQL was changed so that when fsync() fails, it crashes the database and recovers from the transaction log (WAL).

I wondered whether fsync() ever fails in the first place, but apparently it happens easily on SAN and NFS. In an ordinary application, you should keep the data you passed to write() and start over from write().
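The "start over from write()" idea can be sketched as follows, assuming Python and assuming the caller still holds the complete data in memory. `write_with_retry`, the `attempts` parameter, and the injectable `sync` argument are all hypothetical, added here only so the failure path can be exercised. Whether a retry is actually safe depends on the storage involved.

```python
import os
from typing import Callable


def write_with_retry(path: str, data: bytes, attempts: int = 3,
                     sync: Callable[[int], None] = os.fsync) -> int:
    """Rewrite the whole file from `data` until the sync succeeds.

    Returns the number of the attempt that succeeded.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            # Start over from write(): reopen, truncate, and rewrite everything.
            with open(path, "wb") as f:
                f.write(data)
                f.flush()
                sync(f.fileno())
            return attempt
        except OSError:
            if attempt == attempts:
                raise  # give up and let the caller decide what to do
```

The point of the design is that a failed sync is never followed by a bare second sync; every retry rewrites the data from the copy the program kept.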

Support on Windows

fsync() is a POSIX function and is not part of the Windows API. On Windows, the same effect can be achieved with the following API.

BOOL FlushFileBuffers(HANDLE hFile);

On Windows 7, the same effect can also be achieved with an OS-wide setting.

[1] Open the Control Panel
[2] Open Device Manager
[3] Under Disk drives, select the disk and open its properties
[4] On the Policies tab, uncheck "Enable write caching on the device"

Note that this method slows down not only the app but all operations.

[PYTHON] How to write files that you should be careful about in all languages