[LINUX] I followed the implementation of the du command (second half)

Introduction

This is a continuation of the previous article (the first half). We will keep following the implementation of the du command.

Review

Here again is du_files(), which we looked at last time.

python


/* Recursively print the sizes of the directories (and, if selected, files)
   named in FILES, the last entry of which is NULL.
   BIT_FLAGS controls how fts works.
   Return true if successful.  */

static bool
du_files (char **files, int bit_flags)
{
  bool ok = true;

  if (*files)
    {
      FTS *fts = xfts_open (files, bit_flags, NULL); //(1)

      while (1)
        {
          FTSENT *ent;

          ent = fts_read (fts); //(2)
          if (ent == NULL)
            {
              if (errno != 0)
                {
                  error (0, errno, _("fts_read failed"));
                  ok = false;
                }
              break;
            }
          FTS_CROSS_CHECK (fts);
          ok &= process_file (fts, ent); //(3)
        }

      if (fts_close (fts) != 0)
        {
          error (0, errno, _("fts_close failed"));
          ok = false;
        }
    }

  return ok;
}

The processing performed here consists of the following three steps.

(1). Get the FTS structure from the file names with xfts_open().
(2). Get an FTSENT structure from the FTS structure with fts_read(). (We got this far last time.)
(3). Call process_file() and return to (2).

The flow is shown in the following figure (du.jpeg).
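To get a feel for steps (1) to (3), here is a minimal sketch that calls the standard fts(3) functions directly instead of going through the xfts_open() wrapper. The helper name count_regular_files is made up for this example, and the sketch assumes a C library that provides <fts.h> (glibc or a BSD libc). It is not how du itself is written, only the same open / read / close loop in a smaller form.

```c
#define _DEFAULT_SOURCE   /* glibc: make sure the fts(3) declarations are visible */
#include <errno.h>
#include <fts.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: count the regular files found under ROOT.
   Returns true if the traversal itself succeeded.  */
bool
count_regular_files (const char *root, size_t *count)
{
  char *paths[] = { (char *) root, NULL };   /* NULL-terminated, like FILES in du_files */
  bool ok = true;

  *count = 0;

  /* (1) Open the hierarchy.  FTS_NOCHDIR keeps the working directory unchanged. */
  FTS *fts = fts_open (paths, FTS_PHYSICAL | FTS_NOCHDIR, NULL);
  if (fts == NULL)
    return false;

  for (;;)
    {
      /* (2) Fetch the next entry.  NULL with errno == 0 means "no more entries". */
      errno = 0;
      FTSENT *ent = fts_read (fts);
      if (ent == NULL)
        {
          if (errno != 0)
            ok = false;
          break;
        }

      /* (3) Process the entry.  Here we only count regular files. */
      if (ent->fts_info == FTS_F)
        (*count)++;
    }

  if (fts_close (fts) != 0)
    ok = false;

  return ok;
}
```

Calling count_regular_files (".", &n) walks the current directory in the same order that du_files() would hand entries to process_file().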

This time, we will look at process_file () in (3).

process_file()

python


static bool
process_file (FTS *fts, FTSENT *ent)
{
  bool ok = true;
  struct duinfo dui;
  struct duinfo dui_to_print;
  size_t level;
  static size_t prev_level;
  static size_t n_alloc;
  static struct dulevel *dulvl;

  const char *file = ent->fts_path;
  const struct stat *sb = ent->fts_statp;

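  /* The rest of the function is covered in the sections below. */

First, the variables are declared. file holds the path of the entry being processed, and sb points to its stat information.

duinfo_set()

Next, duinfo_set() is called.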

python


duinfo_set (&dui,
            (apparent_size
             ? MAX (0, sb->st_size)
             : (uintmax_t) ST_NBLOCKS (*sb) * ST_NBLOCKSIZE),
            (time_type == time_mtime ? get_stat_mtime (sb)
             : time_type == time_atime ? get_stat_atime (sb)
             : get_stat_ctime (sb)));

python


static inline void
duinfo_set (struct duinfo *a, uintmax_t size, struct timespec tmax)
{
  a->size = size;
  a->inodes = 1;
  a->tmax = tmax;
}

duinfo_set() is a function that stores the file size and timestamp in its first argument (here dui) and sets the inode count to 1.

apparent_size is a flag that makes du report the actual file size instead of the disk usage. The disk usage of a file is, roughly, the file size rounded up to a multiple of the file system block size. For example, with a 4 KiB block size and an 8.5 KiB file, the disk usage is 12 KiB. The du command displays disk usage by default. When apparent_size is true, the actual file size (sb->st_size) is passed to duinfo_set(); when it is false, the block usage (ST_NBLOCKS (*sb) * ST_NBLOCKSIZE) is passed.
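The rounding in the example above can be checked with a few lines of C. This is only a sketch of the arithmetic: round_up_to_block is a hypothetical helper, not something in du, which takes the real block count from the stat information instead (so a sparse file, for instance, can occupy less than this calculation suggests).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round SIZE up to a whole number of blocks
   of BLOCK_SIZE bytes and return the result in bytes.  */
uintmax_t
round_up_to_block (uintmax_t size, uintmax_t block_size)
{
  assert (block_size > 0);

  /* Number of blocks needed: one extra block if there is a remainder. */
  uintmax_t blocks = size / block_size + (size % block_size != 0);

  return blocks * block_size;
}
```

With a block size of 4096 bytes, an 8704-byte (8.5 KiB) file gives round_up_to_block (8704, 4096) == 12288, which is the 12 KiB from the text.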

Add file size

The file size has now been set in dui by duinfo_set(). Let's look at the next step.

python


level = ent->fts_level;
dui_to_print = dui;

level is set to the depth of the entry in the file hierarchy, and dui is copied into dui_to_print.

The next step is to add the dui file size to the directory size. The processing is divided into the following three patterns.

  1. When the entry is at the same depth as the previous one
  2. When going down the file hierarchy and increasing in depth
  3. When you go up the file hierarchy and the depth decreases

1. When the entry is at the same depth as the previous one

python


prev_level = level;

if (! (opt_separate_dirs && IS_DIR_TYPE (ent->fts_info)))
  duinfo_add (&dulvl[level].ent, &dui);

duinfo_add (&tot_dui, &dui);

First, the file size in dui is added to dulvl[level].ent. dulvl[level] is a structure that holds directory information at depth level, so this accumulates the total size of the entries at the same level. opt_separate_dirs is the option that excludes the size of subdirectories, and IS_DIR_TYPE (ent->fts_info) is true when the entry being examined is a directory. In other words, when subdirectories are excluded, the size of a directory entry is not added here. Next, the file size is also added to tot_dui, which is the sum of all file sizes.
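The excerpts use duinfo_add() and duinfo_init() without showing them. As a rough mental model, they can be pictured like the sketch below. The names struct usage, usage_init and usage_add are made up for this example, the timestamp is simplified to a time_t that is assumed to be non-negative, and the overflow handling is my own assumption about what a careful implementation would do, so treat it as an illustration and not as the actual du source.

```c
#include <stdint.h>
#include <time.h>

/* Simplified stand-in for the article's struct duinfo (assumption). */
struct usage
{
  uintmax_t size;    /* accumulated size in bytes */
  uintmax_t inodes;  /* number of entries counted */
  time_t latest;     /* newest timestamp seen so far */
};

/* Reset an accumulator before a directory level is used. */
void
usage_init (struct usage *a)
{
  a->size = 0;
  a->inodes = 0;
  a->latest = 0;     /* assumes timestamps are not negative */
}

/* Add B into A: sizes and inode counts are summed,
   and the newer of the two timestamps is kept.  */
void
usage_add (struct usage *a, const struct usage *b)
{
  uintmax_t sum = a->size + b->size;

  /* Unsigned addition wraps around on overflow; clamp instead. */
  a->size = (sum >= a->size) ? sum : UINTMAX_MAX;
  a->inodes += b->inodes;
  if (b->latest > a->latest)
    a->latest = b->latest;
}
```

Seen this way, dulvl[level].ent and tot_dui are simply accumulators that each processed entry is added into.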

2. When going down the file hierarchy and increasing in depth

python


size_t i;

for (i = prev_level + 1; i <= level; i++)
    {
      duinfo_init (&dulvl[i].ent);
      duinfo_init (&dulvl[i].subdir);
    }

prev_level = level;

if (! (opt_separate_dirs && IS_DIR_TYPE (ent->fts_info)))
  duinfo_add (&dulvl[level].ent, &dui);

duinfo_add (&tot_dui, &dui);

When going down the file hierarchy, we are entering those levels for the first time, so dulvl[i] is initialized with duinfo_init() for each new level. The rest of the processing is the same as before.

3. When you go up the file hierarchy and the depth decreases

python


duinfo_add (&dui_to_print, &dulvl[prev_level].ent);
if (!opt_separate_dirs)
  duinfo_add (&dui_to_print, &dulvl[prev_level].subdir);
duinfo_add (&dulvl[level].subdir, &dulvl[prev_level].ent);
duinfo_add (&dulvl[level].subdir, &dulvl[prev_level].subdir);

prev_level = level;

if (! (opt_separate_dirs && IS_DIR_TYPE (ent->fts_info)))
  duinfo_add (&dulvl[level].ent, &dui);

duinfo_add (&tot_dui, &dui);

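When going up the file hierarchy, the entries at the previous level are the contents of the directory now being processed. So the totals accumulated for the previous level are added to dui_to_print and to dulvl[level].subdir. When opt_separate_dirs is set, only the entries directly inside the directory (dulvl[prev_level].ent) are added to dui_to_print, not those of its subdirectories.

Finally, the size is printed. This happens when the entry is a directory no deeper than max_depth, when opt_all is set and the entry is no deeper than max_depth, or when the entry is at level 0.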
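The three patterns are easier to follow when they are run on a small example. The sketch below replays them on an array of entries instead of a real file system. Everything in it (struct entry, struct level_sums, accumulate, the fixed MAX_LEVELS limit) is hypothetical and exists only for this illustration. It also assumes that the entries arrive in the order du processes them, that is, a directory comes after its contents, and that the depth decreases by only one level at a time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LEVELS 16   /* assumption: fixed depth limit for the sketch */

struct entry { size_t level; bool is_dir; uintmax_t size; };
struct level_sums { uintmax_t ent; uintmax_t subdir; };

/* Replay the three patterns over ENTRIES[0..N-1].  PRINTED[k] receives
   the size that would be shown for entry k.  Returns the grand total.  */
uintmax_t
accumulate (const struct entry *entries, size_t n,
            bool separate_dirs, uintmax_t *printed)
{
  struct level_sums lvl[MAX_LEVELS] = { { 0, 0 } };
  size_t prev_level = 0;
  uintmax_t total = 0;

  for (size_t k = 0; k < n; k++)
    {
      size_t level = entries[k].level;
      uintmax_t to_print = entries[k].size;
      assert (level < MAX_LEVELS);

      if (level > prev_level)
        {
          /* Pattern 2: going down, so clear the newly entered levels. */
          for (size_t i = prev_level + 1; i <= level; i++)
            lvl[i] = (struct level_sums) { 0, 0 };
        }
      else if (level < prev_level)
        {
          /* Pattern 3: going up, so fold the finished level into this one. */
          assert (level == prev_level - 1);
          to_print += lvl[prev_level].ent;
          if (!separate_dirs)
            to_print += lvl[prev_level].subdir;
          lvl[level].subdir += lvl[prev_level].ent + lvl[prev_level].subdir;
        }
      /* Pattern 1 (same level) needs no extra work. */

      prev_level = level;
      if (!(separate_dirs && entries[k].is_dir))
        lvl[level].ent += entries[k].size;
      total += entries[k].size;
      printed[k] = to_print;
    }

  return total;
}
```

Take a root directory (size 4) that contains a file of size 10 and a directory sub (size 4), where sub holds files of size 20 and 30. The entries arrive as the 10-byte file (level 1), the two files in sub (level 2), sub (level 1) and root (level 0). With separate_dirs false, the sketch gives 54 for sub and 68 for root. With separate_dirs true, sub is still 54 but root becomes 14, because only its own size and the file directly inside it are counted.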

python


if ((IS_DIR_TYPE (ent->fts_info) && level <= max_depth)
    || ((opt_all && level <= max_depth) || level == 0))
  print_size (&dui_to_print, file);
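The print condition can be pulled out into a small predicate to make the cases easier to try. should_print is a hypothetical name used only for this sketch, and the flags are passed as parameters here, whereas the excerpt above reads them as variables.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper mirroring the condition in the excerpt above:
   print directories within max_depth, any entry within max_depth
   when opt_all is set, and always the entries at level 0.  */
bool
should_print (bool is_dir, size_t level, size_t max_depth, bool opt_all)
{
  return (is_dir && level <= max_depth)
         || (opt_all && level <= max_depth)
         || level == 0;
}
```

For example, should_print (false, 1, SIZE_MAX, false) is false, so an ordinary file below the starting point is not listed unless opt_all is set, while should_print (false, 0, 0, false) is true because level 0 entries are always shown.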

Conclusion

This was the first time I had read the source code of a command. I recommend it to beginners as well, because the code is not long and the operating principle is simple. It is also fun to read while guessing how the command works.
