[LINUX] The find mtime option and its companions (the [cma](time|min) options)

This is a summary of a question I was recently asked at work and the answer I gave.

The conversation was about a cron job on Linux that gzips files older than a certain number of days, and the number of days needed to be changed. The following command was in use.

find ~~~ -mtime +2 -exec gzip {} \;

The plan was to change the "+2" part of -mtime. This article summarizes what I answered when asked what "+2" means, what I got wrong, and the surrounding details that I could not answer.

Operation check environment

Everything below that was actually run and verified was done on "Lubuntu 18.04.4".

Review of the basic functions of find

At its core, find is a command that scans the specified directory tree and prints a list of files to standard output.

The command mentioned at the beginning uses the "-exec" option terminated with ";". This means "run the command that follows -exec once for each file, with the file as an argument" instead of "print the list of files". The file name is substituted at the position of "{}". Personally, I often pipe the list to "xargs" instead of using "-exec".
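As a rough illustration of the two styles, the following sketch compresses files in a scratch directory first with "-exec" and then with "xargs". The directory and file names are made up for the example, and "-print0", "-0" and "-r" are assumed to be available (they are in the GNU tools used here).

```shell
# Hypothetical scratch directory; all names here are for illustration only.
workdir=$(mktemp -d)
touch "$workdir/a.log" "$workdir/b c.log"

# -exec ... \; runs gzip once per matching file.
find "$workdir" -type f -name '*.log' -exec gzip {} \;

# The same selection through xargs; -print0 / -0 keep names with spaces intact,
# and -r skips running gzip when nothing matches.
touch "$workdir/d.log" "$workdir/e f.log"
find "$workdir" -type f -name '*.log' -print0 | xargs -0 -r gzip

gz_count=$(find "$workdir" -type f -name '*.gz' | wc -l)
echo "compressed files: $gz_count"
rm -rf "$workdir"
```

With "xargs", gzip receives many file names per invocation, which usually means fewer processes than "-exec ... \;".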

The rest of this article focuses on the options that deal with the timestamps associated with a file. For details of other features, I recommend checking JM find.

Narrowing down on a date and time basis

The mtime option narrows down the output by a condition on each file's modification time.

The options that narrow down files by a timestamp attached to the file are "-atime", "-amin", "-ctime", "-cmin", "-mtime", and "-mmin".

The related option is "-daystart".

These are explained in detail below.

find also has options that select files "newer than a specified file" as a timestamp condition, but I will leave them out of this article.

?time and ?min

These options narrow down the output by a condition on a file's "last status change time", "last modification time", or "last access time".

The first letter of the option indicates the type of timestamp: "a" is the last access time, "c" is the last status change time, and "m" is the last modification time.

The remaining characters of the option indicate the unit: "time" counts in days (24-hour units) and "min" counts in minutes.

The option is followed by a number with or without a sign. "+n" means more than n, "-n" means less than n, and a bare "n" means exactly n.

For example, "-mtime +2" roughly means "last modified more than 2 days ago".

Let's dig into these elements in more detail.

Numerical specification (?time)

This section uses the last modification time, "mtime".

The cron job that started this discussion targeted files that were "older than a certain amount". As mentioned above, this is written as something like "-mtime +2". So which files are actually selected by such a condition? Let's experiment a little.

Create a few days' worth of files with different timestamps every 6 hours and see which files are eligible for find.

First, prepare the test environment.

linuser$ rm -f *.txt
linuser$ for day in $( seq -w $( date +%d --date '5 days ago' ) $( date +%d ) ); do for hour in $( seq -w 0 6 23 ); do touch -m -t 202003${day}${hour}00 $day-$hour.txt; done; done
linuser$ date; ls -l
Saturday, March 14, 2020 10:23:45 JST
total 0
-rw-rw-r-- 1 linuser linuser 0 March 9 00:00 09-00.txt
-rw-rw-r-- 1 linuser linuser 0 March 9 06:00 09-06.txt
-rw-rw-r-- 1 linuser linuser 0 March 9 12:00 09-12.txt
-rw-rw-r-- 1 linuser linuser 0 March 9 18:00 09-18.txt
-rw-rw-r-- 1 linuser linuser 0 March 10 00:00 10-00.txt
-rw-rw-r-- 1 linuser linuser 0 March 10 06:00 10-06.txt
-rw-rw-r-- 1 linuser linuser 0 March 10 12:00 10-12.txt
-rw-rw-r-- 1 linuser linuser 0 March 10 18:00 10-18.txt
-rw-rw-r-- 1 linuser linuser 0 March 11 00:00 11-00.txt
-rw-rw-r-- 1 linuser linuser 0 March 11 06:00 11-06.txt
-rw-rw-r-- 1 linuser linuser 0 March 11 12:00 11-12.txt
-rw-rw-r-- 1 linuser linuser 0 March 11 18:00 11-18.txt
-rw-rw-r-- 1 linuser linuser 0 March 12 00:00 12-00.txt
-rw-rw-r-- 1 linuser linuser 0 March 12 06:00 12-06.txt
-rw-rw-r-- 1 linuser linuser 0 March 12 12:00 12-12.txt
-rw-rw-r-- 1 linuser linuser 0 March 12 18:00 12-18.txt
-rw-rw-r-- 1 linuser linuser 0 March 13 00:00 13-00.txt
-rw-rw-r-- 1 linuser linuser 0 March 13 06:00 13-06.txt
-rw-rw-r-- 1 linuser linuser 0 March 13 12:00 13-12.txt
-rw-rw-r-- 1 linuser linuser 0 March 13 18:00 13-18.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 00:00 14-00.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 06:00 14-06.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 2020 14-12.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 2020 14-18.txt
linuser$

The last two lines show the year instead of the time because their modification times are in the future. Now, let's run find.

linuser$ find . -type f -mtime +2 | xargs -r ls -l
-rw-rw-r-- 1 linuser linuser 0 March 9 00:00 ./09-00.txt
-rw-rw-r-- 1 linuser linuser 0 March 9 06:00 ./09-06.txt
-rw-rw-r-- 1 linuser linuser 0 March 9 12:00 ./09-12.txt
-rw-rw-r-- 1 linuser linuser 0 March 9 18:00 ./09-18.txt
-rw-rw-r-- 1 linuser linuser 0 March 10 00:00 ./10-00.txt
-rw-rw-r-- 1 linuser linuser 0 March 10 06:00 ./10-06.txt
-rw-rw-r-- 1 linuser linuser 0 March 10 12:00 ./10-12.txt
-rw-rw-r-- 1 linuser linuser 0 March 10 18:00 ./10-18.txt
-rw-rw-r-- 1 linuser linuser 0 March 11 00:00 ./11-00.txt
-rw-rw-r-- 1 linuser linuser 0 March 11 06:00 ./11-06.txt
linuser$ find . -type f -mtime 2 | xargs -r ls -l
-rw-rw-r-- 1 linuser linuser 0 March 11 12:00 ./11-12.txt
-rw-rw-r-- 1 linuser linuser 0 March 11 18:00 ./11-18.txt
-rw-rw-r-- 1 linuser linuser 0 March 12 00:00 ./12-00.txt
-rw-rw-r-- 1 linuser linuser 0 March 12 06:00 ./12-06.txt
linuser$ find . -type f -mtime -2 | xargs -r ls -l
-rw-rw-r-- 1 linuser linuser 0 March 12 12:00 ./12-12.txt
-rw-rw-r-- 1 linuser linuser 0 March 12 18:00 ./12-18.txt
-rw-rw-r-- 1 linuser linuser 0 March 13 00:00 ./13-00.txt
-rw-rw-r-- 1 linuser linuser 0 March 13 06:00 ./13-06.txt
-rw-rw-r-- 1 linuser linuser 0 March 13 12:00 ./13-12.txt
-rw-rw-r-- 1 linuser linuser 0 March 13 18:00 ./13-18.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 00:00 ./14-00.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 06:00 ./14-06.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 2020 ./14-12.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 2020 ./14-18.txt
linuser$

This is the result.

With "-mtime 2", the files from 48 hours ago up to 72 hours ago are output. In other words, the file's age is counted in days going back from the current time, and any fraction of less than 24 hours is dropped.
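The rounding can be reproduced with plain shell arithmetic. The following is only a sketch of the comparison, not find's actual code; classify is a hypothetical helper, and the ages are computed by hand from the listing above, relative to 10:23:45 on March 14.

```shell
# classify is a hypothetical helper that mimics the comparison; it is not part of find.
classify() {
    age_seconds=$1
    days=$(( age_seconds / 86400 ))   # integer division drops the partial day
    if [ "$days" -gt 2 ]; then
        echo "+2"
    elif [ "$days" -eq 2 ]; then
        echo "2"
    else
        echo "-2"
    fi
}

# Ages relative to 10:23:45 on March 14, worked out from the file timestamps.
r1=$(classify $(( 76 * 3600 + 23 * 60 + 45 )))   # 11-06.txt: 76h23m45s old
r2=$(classify $(( 70 * 3600 + 23 * 60 + 45 )))   # 11-12.txt: 70h23m45s old
r3=$(classify $(( 46 * 3600 + 23 * 60 + 45 )))   # 12-12.txt: 46h23m45s old
echo "11-06.txt: $r1, 11-12.txt: $r2, 12-12.txt: $r3"
# prints: 11-06.txt: +2, 11-12.txt: 2, 12-12.txt: -2
```

The three files land in the same groups as in the find output above.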

It may be easier to understand if it is represented by a diagram like the one below.

[Figure: range matched by find -mtime, without -daystart]

What makes this hard to grasp is that the reference point is not midnight (00:00), even though the unit is days.

In fact, there is an option that moves the reference point to midnight: "-daystart". Let's try again with this option. Note that "-daystart" must be specified before "-mtime".

linuser$ find . -type f -daystart -mtime +2 | xargs -r ls -l
-rw-rw-r-- 1 linuser linuser 0 March 9 00:00 ./09-00.txt
-rw-rw-r-- 1 linuser linuser 0 March 9 06:00 ./09-06.txt
-rw-rw-r-- 1 linuser linuser 0 March 9 12:00 ./09-12.txt
-rw-rw-r-- 1 linuser linuser 0 March 9 18:00 ./09-18.txt
-rw-rw-r-- 1 linuser linuser 0 March 10 00:00 ./10-00.txt
-rw-rw-r-- 1 linuser linuser 0 March 10 06:00 ./10-06.txt
-rw-rw-r-- 1 linuser linuser 0 March 10 12:00 ./10-12.txt
-rw-rw-r-- 1 linuser linuser 0 March 10 18:00 ./10-18.txt
-rw-rw-r-- 1 linuser linuser 0 March 11 00:00 ./11-00.txt
-rw-rw-r-- 1 linuser linuser 0 March 11 06:00 ./11-06.txt
-rw-rw-r-- 1 linuser linuser 0 March 11 12:00 ./11-12.txt
-rw-rw-r-- 1 linuser linuser 0 March 11 18:00 ./11-18.txt
linuser$ find . -type f -daystart -mtime 2 | xargs -r ls -l
-rw-rw-r-- 1 linuser linuser 0 March 12 06:00 ./12-06.txt
-rw-rw-r-- 1 linuser linuser 0 March 12 12:00 ./12-12.txt
-rw-rw-r-- 1 linuser linuser 0 March 12 18:00 ./12-18.txt
-rw-rw-r-- 1 linuser linuser 0 March 13 00:00 ./13-00.txt
linuser$ find . -type f -daystart -mtime -2 | xargs -r ls -l
-rw-rw-r-- 1 linuser linuser 0 March 13 00:00 ./13-00.txt
-rw-rw-r-- 1 linuser linuser 0 March 13 06:00 ./13-06.txt
-rw-rw-r-- 1 linuser linuser 0 March 13 12:00 ./13-12.txt
-rw-rw-r-- 1 linuser linuser 0 March 13 18:00 ./13-18.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 00:00 ./14-00.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 06:00 ./14-06.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 2020 ./14-12.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 2020 ./14-18.txt
linuser$

I think this result is easier to understand: the number counts calendar days directly. This is also shown in the figure below.

[Figure: range matched by find -mtime, with -daystart]

That said, although the overall picture is easier to understand, one part of the behavior is still puzzling. Can you see where?

"00:00 on the 13th" is included in both the "-mtime 2" and "-mtime -2" results. On the other hand, "00:00 on the 12th" is not included in any result.

Intuitively, wouldn't you expect every date and time, boundaries included, to fall into exactly one of "-mtime +2", "-mtime 2", or "-mtime -2"? And since the manual says that the age is divided by 24 hours with the fraction dropped, I would expect each boundary to belong to one side or the other.

But in reality it doesn't seem to be the case.
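Away from the boundaries, the calendar-day behavior of "-daystart" can be sketched as follows. This assumes GNU find and GNU touch (for "-printf" and the "yesterday 12:00" date syntax); the scratch directory and file names are made up for the example.

```shell
# Hypothetical scratch directory with one file per calendar day, both at noon
# so that neither sits on a day boundary.
workdir=$(mktemp -d)
touch -d 'today 12:00' "$workdir/today.txt"
touch -d 'yesterday 12:00' "$workdir/yesterday.txt"

# With -daystart, "-mtime 0" selects today's files and "-mtime 1" yesterday's.
today_hits=$(find "$workdir" -type f -daystart -mtime 0 -printf '%f\n')
yesterday_hits=$(find "$workdir" -type f -daystart -mtime 1 -printf '%f\n')
echo "-daystart -mtime 0: $today_hits"
echo "-daystart -mtime 1: $yesterday_hits"
rm -rf "$workdir"
```

Files stamped exactly at 00:00 are deliberately avoided here, since those are the cases that behave unexpectedly above.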

Numerical specification (?min)

"mmin" is the same as "mtime", except that it counts in minutes rather than days.

I will check this only with "-daystart". With the current time as the reference point, which files match would change from minute to minute.

linuser$ rm *.txt
linuser$ for day in $( date +%d ); do for min in $( seq -w 53 59 ); do touch -m -t 202003${day}23${min} $day-23${min}.txt; done; done
linuser$ ls -l
total 0
-rw-rw-r-- 1 linuser linuser 0 March 14 2020 14-2353.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 2020 14-2354.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 2020 14-2355.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 2020 14-2356.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 2020 14-2357.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 2020 14-2358.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 2020 14-2359.txt
linuser$ find . -type f -daystart -mmin +5 | xargs -r ls -l
-rw-rw-r-- 1 linuser linuser 0 March 14 2020 ./14-2353.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 2020 ./14-2354.txt
linuser$ find . -type f -daystart -mmin 5 | xargs -r ls -l
-rw-rw-r-- 1 linuser linuser 0 March 14 2020 ./14-2356.txt
linuser$ find . -type f -daystart -mmin -5 | xargs -r ls -l
-rw-rw-r-- 1 linuser linuser 0 March 14 2020 ./14-2356.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 2020 ./14-2357.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 2020 ./14-2358.txt
-rw-rw-r-- 1 linuser linuser 0 March 14 2020 ./14-2359.txt
linuser$

This behaved the same way as "-mtime": 23:56 was included in both "-mmin 5" and "-mmin -5", and 23:55 was not included anywhere.

Date and time type

As described above, there are three types of timestamp that can be used with "?time" and "?min".

At work, I answered that "c" means the "file creation time". This is wrong. In the wording of "JM find" mentioned above, it is the "last status change time".

Roughly speaking, the "last access time" changes when the file is read, the "last modification time" changes when the file is written, and the "last status change time" changes when the file is written or its inode data is changed. Such changes include renaming the file and changing its owner, group, permissions, or link count.

Let's check. I create a file at 00:00, read it at 01:00, rename it at 02:00, write to it at 03:00, and check the timestamps at each step.

linuser# rm -f a; date 03140000; touch a; stat a; date 03140100; cat a > /dev/null; stat a; date 03140200; mv a b; mv b a; stat a; date 03140300; echo OK >> a; stat a
Saturday, March 14, 2020 00:00:00 JST
  File: a
  Size: 0               Blocks: 0          IO Block: 4096   regular empty file
Device: 801h/2049d      Inode: 18054       Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2020-03-14 00:00:00.007999999 +0900
Modify: 2020-03-14 00:00:00.007999999 +0900
Change: 2020-03-14 00:00:00.007999999 +0900
 Birth: -
Saturday, March 14, 2020 01:00:00 JST
  File: a
  Size: 0               Blocks: 0          IO Block: 4096   regular empty file
Device: 801h/2049d      Inode: 18054       Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2020-03-14 01:00:00.003999999 +0900
Modify: 2020-03-14 00:00:00.007999999 +0900
Change: 2020-03-14 00:00:00.007999999 +0900
 Birth: -
Saturday, March 14, 2020 02:00:00 JST
  File: a
  Size: 0               Blocks: 0          IO Block: 4096   regular empty file
Device: 801h/2049d      Inode: 18054       Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2020-03-14 01:00:00.003999999 +0900
Modify: 2020-03-14 00:00:00.007999999 +0900
Change: 2020-03-14 02:00:00.007999999 +0900
 Birth: -
Saturday, March 14, 2020 03:00:00 JST
  File: a
  Size: 3               Blocks: 8          IO Block: 4096   regular file
Device: 801h/2049d      Inode: 18054       Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2020-03-14 01:00:00.003999999 +0900
Modify: 2020-03-14 03:00:00.003999999 +0900
Change: 2020-03-14 03:00:00.003999999 +0900
 Birth: -
linuser#

In the output of the stat command, "Access / Modify / Change" correspond to "atime / mtime / ctime". The letters "a / m / c" appear to be the initials of "Access / Modify / Change".

As expected, the timestamps changed as follows: creating the file set all three, reading it changed only atime, renaming it changed only ctime, and writing to it changed mtime and ctime.

Incidentally, I also found that "atime" does not change just from running the ls or stat command. The only "access" reflected in the last access time is reading the file's contents.
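The difference between mtime and ctime can also be observed through find itself. The sketch below, using a made-up temporary file, sets an old mtime and then changes only the permissions; under the usual Linux behavior, "-cmin -1" should then match while "-mmin -1" should not.

```shell
# Hypothetical temporary file, used only for this illustration.
f=$(mktemp)
touch -m -t 202003140000 "$f"   # mtime goes back to 2020; ctime becomes "now"
chmod 600 "$f"                  # permission change: updates ctime, not mtime

by_ctime=$(find "$f" -cmin -1 | wc -l)   # status changed within the last minute?
by_mtime=$(find "$f" -mmin -1 | wc -l)   # contents modified within the last minute?
echo "matched by -cmin -1: $by_ctime, matched by -mmin -1: $by_mtime"
rm -f "$f"
```

This is one reason "c" cannot be read as "creation": a chmod long after the file was created is enough to make it look recent to "-cmin" and "-ctime".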

Finally

Writing this summary let me correct my understanding of ctime. I also found that I had misremembered how ?time and ?min are evaluated and which range of dates and times they select.

I realized that writing things up leads to a deeper understanding.

One more note: atime changes when the file is read. I wanted to go as far as showing that "it does not change with open() alone, but changes with read()". However, I could not think of an easy way to check this with just a shell script, so I gave up.
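One possible approach, offered only as an untested-in-this-article sketch: a file descriptor opened with "exec" stays open without reading anything, so atime can be sampled between the open and the first read. This assumes GNU stat ("-c %X") and a temporary file made up for the example. Whether the final read actually updates atime depends on mount options such as noatime or relatime, so only the "open alone" part is expected to hold everywhere.

```shell
# Hypothetical temporary file, used only for this illustration.
f=$(mktemp)
echo data > "$f"
touch -a -t 202003140000 "$f"     # push atime into the past (mtime is untouched)
atime_before=$(stat -c %X "$f")

exec 3< "$f"                      # open() only; nothing has been read yet
atime_open=$(stat -c %X "$f")

read -r line <&3                  # read() through the already open descriptor
atime_read=$(stat -c %X "$f")
exec 3<&-                         # close the descriptor

echo "before=$atime_before after_open=$atime_open after_read=$atime_read"
rm -f "$f"
```

If after_open equals before while after_read is newer, that matches the "open() alone does not change it, read() does" explanation.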
