[PYTHON] What to watch out for when writing files, in any language

What I want to say

When writing important files, you have to allow for events such as an unexpected OS shutdown. If you do not, half-written or empty files can be left behind, and these can be fatal at system startup or for any system that consumes them.

The examples are in C, Java, Python, and JavaScript (Node.js), but the same precautions are needed in almost every language.

Background

A fatal problem occurred: software running in production would no longer start.

When I collected and analyzed the logs and config files, I found that the config file was completely corrupted.

The config file is read at startup, but it is also rewritten as needed. Following the code, I noticed that the file could be left half-written if the process was forcibly terminated in the middle of a write.

The device is operated by simply cutting the power when users are done with it, so a power cut must have coincided with a write by sheer bad luck.

First thing I tried

If you write directly to the config file, there is a window, however short, in which the file is half-finished (for example, only 1 character has been written out of 10). Instead, write everything to config.yml.tmp first and then rename it to config.yml. The file is then either updated or not updated, with nothing in between. In other words, the update is atomic!
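This first attempt can be sketched roughly as follows in Python. It is a minimal illustration rather than the production code: the helper name `write_via_tmp` is made up for this example, and `os.replace` stands in for the rename step. It protects against the process being killed mid-write, which is all this first attempt was aiming for.

```python
import os


def write_via_tmp(path, text):
    """Write text to path by way of a temporary file and a rename.

    Hypothetical helper for illustration only.
    """
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write(text)
        f.flush()  # hands Python's buffer to the OS, not to the disk
    # Within one file system the rename is atomic on POSIX: a reader sees
    # either the old file or the new one, never a partly written mix.
    os.replace(tmp_path, path)
```

A call such as `write_via_tmp("config.yml", "key: value\n")` leaves no `config.yml.tmp` behind once it returns.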

Result 1

I released this fix feeling very pleased with myself, but the failure to start happened again. This time the config file was 0 KB. If writing the tmp file fails, the rename never happens, so according to the program flow the file should never end up at 0 KB. The flush was explicit and the file was closed properly.

While flipping through the O'Reilly book Linux System Programming that I had at hand, one item caught my eye.

- fsync() and fdatasync()

Wait, what? Write the dirty buffers to disk...?

At this point I finally understood both my mistake and the solution. Or rather, I had learned this before, but it had not come to mind when I needed it.

Dirty buffer

Even when a program issues a write, the data is not written to disk immediately. It is first held in memory as a dirty buffer.

Writing to disk is extremely slow, so if every write went to disk right away, the program would crawl (slower by a factor of 100 or more). Buffering writes to avoid this is standard practice in modern file systems. The process hands the slow write over to the OS and moves on. When the OS actually writes to disk depends on the file system involved. It can take 10 seconds or more, which is more than enough time for a file to be corrupted.

If the OS goes down in this state because of an unexpected power failure, a broken or empty file is left behind even though the program flushed and closed the file. In other words, config.yml.tmp itself was incomplete on disk, so it was still incomplete after being renamed to config.yml.

Calling fsync() writes the dirty buffers to disk immediately and does not return until the write has finished, so the program only moves on once the data is on disk.

fsync() and fdatasync()

In Linux, a file consists of the following two kinds of data.

- Metadata, stored in the inode
- The data of the file itself

The inode holds information such as the file's last modification time, the kind of data you see when files are listed in a directory.

fsync() writes both. fdatasync() writes the data of the file itself, plus only the metadata needed to read that data back (such as a changed file size).

If the file is not updated very often, or performance is not a concern, use fsync().

Use fdatasync() if it is acceptable, in the worst case, for inode metadata such as the last modification time not to be updated, or if the file is updated frequently and performance is a concern.
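In Python, the choice between the two calls might look like the sketch below. The helper name `sync_to_disk` and its `data_only` flag are assumptions made for this example. `os.fdatasync` is not available on every platform (Windows, for instance, lacks it), so the sketch falls back to `os.fsync` there.

```python
import os


def sync_to_disk(f, data_only=False):
    """Flush a file object and force its contents to disk.

    Hypothetical helper: data_only=True requests fdatasync() where the
    platform provides it and falls back to fsync() elsewhere.
    """
    f.flush()  # Python buffer -> OS (the data is now a dirty buffer)
    fd = f.fileno()
    if data_only and hasattr(os, "fdatasync"):
        os.fdatasync(fd)  # file data, skipping metadata such as mtime
    else:
        os.fsync(fd)  # file data and inode metadata
```

A frequently appended log file might use `sync_to_disk(f, data_only=True)`, while a rarely written config file can simply use the default.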

Countermeasures

When creating an important file, force the write to disk so that the file survives even an unexpected shutdown. On Linux, a shutdown performed through the normal procedure causes no problem.

Here are some concrete code examples. Some error handling is omitted.

sample.c


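#include <stdio.h>
#include <unistd.h>

int main(void) {
    const char* tmp_file_path = "./fsync_test.txt";
    FILE* fp = fopen(tmp_file_path, "w");
    if (fp == NULL) {
        return 1;
    }
    fputs("fsync() test\n", fp);
    // Push the stdio buffer to the OS (the data is now a dirty buffer)
    fflush(fp);

    // This is the point!! Block until the dirty buffer is on disk
    int fsync_ret = fsync(fileno(fp));

    fclose(fp);
    return fsync_ret == 0 ? 0 : 1;
}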

FsyncTest.java


import java.io.File;
import java.io.FileOutputStream;
import java.nio.charset.StandardCharsets;

public class FsyncTest {
    public static void main(String[] args) throws Exception {
        File file = new File("./fsync.txt");
        try (FileOutputStream output = new FileOutputStream(file)) {
            output.write("fsync() test\n".getBytes(StandardCharsets.UTF_8));

            // This is the point!! FileDescriptor.sync() plays the role of fsync()
            output.getFD().sync();
        }
    }
}

sample.py


import os

with open('./fsync_test.txt', 'w') as f:
    f.write('fsync() test\n')
    # flush() alone only hands the data to the OS; it is still a dirty buffer
    f.flush()

    # This is the point!! Block until the dirty buffer is on disk
    os.fsync(f.fileno())

fsync.js


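const fs = require('fs');
const http = require('http');

const server = http.createServer((request, response) => {
    fs.open('./fsync_test.txt', 'w', (openErr, fd) => {
        if (openErr) {
            response.writeHead(500, {'Content-Type': 'text/plain'});
            response.end('Open failed\n');
            return;
        }
        fs.write(fd, 'fsync() test\n', (writeErr) => {
            let ok = false;
            try {
                if (writeErr) throw writeErr;
                // This is the point!! Block until the data is on disk
                fs.fsyncSync(fd);
                ok = true;
            } catch (e) {
                // Leave ok as false and report the failure below
            }
            response.writeHead(ok ? 200 : 500, {'Content-Type': 'text/plain'});
            response.end(ok ? 'Write success\n' : 'Write failed\n');
            fs.close(fd, () => {});
        });
    });
});

server.listen(9000);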
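For the config-file case specifically, the temporary file and rename from the first attempt can be combined with the fsync() call from the samples above. The following Python sketch shows one way to do that. The name `atomic_write` is hypothetical, and the final step of syncing the containing directory is an assumption that goes beyond what this article tested: on Linux the rename is recorded in the directory, so syncing it is a common extra precaution.

```python
import os


def atomic_write(path, text):
    """Replace path with text so a crash leaves the old or the new file.

    Hypothetical helper, a sketch rather than a hardened implementation.
    """
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write(text)
        f.flush()             # Python buffer -> OS
        os.fsync(f.fileno())  # OS dirty buffer -> disk
    os.replace(tmp_path, path)  # atomic rename within one file system

    # Assumption beyond the article: also sync the directory entry.
    # Only attempted on POSIX, where a directory can be opened like this.
    if os.name == "posix":
        dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

The ordering matters here: the temporary file is forced to disk before the rename, so the name config.yml never points at data that exists only in a dirty buffer.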

Result 2

The file is no longer corrupted.

Incidentally, data can stay in a dirty buffer for quite a long time. Depending on the file system, it may be around 30 seconds. Before applying the fix, on a Windows 7 machine I had at hand, I wrote a file, opened it in a text editor to confirm that the contents were there, and pulled the power plug 15 seconds later. After restarting, the file was corrupted. Trying the same thing in a CentOS 6 environment gave almost the same result.

After applying the fix, the file survived even when the power was cut immediately after writing.

Conclusion

Whatever the language, fsync() or an equivalent is essential when creating important files. However, it forces a very slow synchronous disk write, which can seriously hurt performance under some conditions. Calling it blindly everywhere just because it is safer is not a good idea.

Supplement

Related issues

Readers kindly provided links in the comments that are highly relevant to this article.

Firefox issue

https://www.atmarkit.co.jp/flinux/rensai/watch2009/watch05a.html

This is a problem caused by slow fsync(). The article explains that switching to fdatasync() is faster because the inode is not updated.

PostgreSQL issue

https://masahikosawada.github.io/2019/02/17/PostgreSQL-fsync-issue/

The problem is that once fsync() has failed, you cannot simply call fsync() again. PostgreSQL was apparently changed so that if fsync() fails, it crashes the database and recovers from the transaction log (WAL).

I wondered whether fsync() ever fails in the first place, but it apparently happens easily with SAN and NFS. In an ordinary system, you should keep hold of the data you passed to write() and start over from write().
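The "start over from write()" advice could be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `write_with_retry`, the `attempts` count, and the injectable `sync` parameter are all invented for this example (`sync` exists only so that the failure path can be exercised in a test). It is not how PostgreSQL handles the problem.

```python
import os


def write_with_retry(path, data, attempts=3, sync=os.fsync):
    """Write bytes to path and sync; on failure, redo the whole write.

    Hypothetical helper. The caller keeps `data` in memory, so every
    retry starts again from write() instead of only repeating fsync().
    `attempts` is expected to be at least 1.
    """
    last_error = None
    for _ in range(attempts):
        try:
            # Reopen and rewrite from scratch on every attempt.
            with open(path, "wb") as f:
                f.write(data)
                f.flush()
                sync(f.fileno())
            return
        except OSError as e:
            last_error = e
    raise last_error
```

When every attempt fails, the last error is re-raised so the caller can decide what to do, for example alert an operator rather than continue with data that may not be on disk.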

Support on Windows

fsync() is a POSIX function, available on Linux, and cannot be used as-is on Windows. On Windows, the same effect can be achieved with the following API.

BOOL FlushFileBuffers(HANDLE hFile);

On Windows 7, the same effect can also be achieved with an OS-wide setting.

[1] Open the Control Panel
[2] Open Device Manager
[3] Under Disk drives, select the disk and open its Properties
[4] On the Policies tab, uncheck "Enable write caching on the device"

Note that this setting slows down all operations, not just the application.
