Linux archiving and compression (tar and gzip)

Archive and compression

A single file that bundles multiple files and directories together is called an **archive**.

If the archive is also made smaller, less data has to be transferred or stored.

This size reduction is called **compression**.

Archive the file

Use the tar command on Linux to archive files.

↓ Creating a practice file

$ mkdir dir1
$ touch dir1/file-{1..5}.txt

↓ Confirmation

$ ls dir1
file-1.txt file-2.txt file-3.txt file-4.txt file-5.txt

Creating an archive file

↓ Format

c stands for create and is specified when creating a new archive file. f stands for file and is required to specify the name of the archive file to create.

tar cf <Archive file> <Paths of files to archive>

↓ Archive the dir1 directory

$ tar cf dir1.tar dir1
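As a minimal sketch of the same step, the following recreates the practice files in a temporary directory and adds the v (verbose) option so that tar prints each entry as it is archived. The temporary directory and the loop are assumptions made for the example so that it runs in any POSIX shell without touching existing files.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the practice files (a loop instead of {1..5}, which needs bash).
mkdir dir1
for i in 1 2 3 4 5; do
    touch "dir1/file-$i.txt"
done

# c = create, v = verbose (print each entry as it is added), f = archive file name.
tar cvf dir1.tar dir1

# tar leaves its input in place: both dir1 and dir1.tar are listed.
ls
```

Note that archiving does not delete the source directory, which is why the later example removes dir1 by hand before extracting.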

Check the contents of the archive file

↓ Format

t stands for list. f stands for file and is required to specify the name of the archive file to inspect.

tar tf <Archive file>

↓ Check the contents

$ tar tf dir1.tar
dir1/
dir1/file-1.txt
dir1/file-2.txt
dir1/file-3.txt
dir1/file-4.txt
dir1/file-5.txt
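The listing can also be made more detailed or narrowed down. The sketch below, which rebuilds the practice archive in a temporary directory (an assumption for the example), combines t with v to show permissions, owner, size and timestamp, and then names a single member to list only that entry.

```shell
# Rebuild the practice archive in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir dir1
for i in 1 2 3 4 5; do
    touch "dir1/file-$i.txt"
done
tar cf dir1.tar dir1

# t = list, v = verbose: also show permissions, owner, size and timestamp.
tar tvf dir1.tar

# Naming a member after the archive restricts the listing to that entry.
tar tf dir1.tar dir1/file-3.txt
```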

Extracting the archive

Use the x option to unpack the archive file and retrieve the original file or directory.

↓ Format

x stands for extract. As before, f specifies the archive file name.

tar xf <Archive file>

↓ Extract the archive

To see if the original files can be restored from the archive, delete the dir1 directory and then unpack the archive.

$ ls
dir1  dir1.tar

$ rm -rf dir1 ← Delete the original directory
$ tar xf dir1.tar ← Extract the archive
$ ls dir1 ← Confirmation
file-1.txt  file-2.txt  file-3.txt  file-4.txt  file-5.txt
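If deleting the original directory feels risky, the archive can be extracted somewhere else instead. The following is a sketch that uses the -C option to extract into a separate directory; the directory name restore is a hypothetical name chosen for the example.

```shell
# Rebuild the practice archive in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir dir1
for i in 1 2 3 4 5; do
    touch "dir1/file-$i.txt"
done
tar cf dir1.tar dir1

# -C makes tar change into the given directory before extracting.
mkdir restore
tar xf dir1.tar -C restore

# Compare the restored tree with the original one.
diff -r dir1 restore/dir1 && echo "restored copy matches"
```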

Compress the file

gzip command

gzip is a command for compressing and decompressing files. By convention, files compressed with gzip have the extension .gz.

Format

$ gzip <Compression source file>

For testing purposes, the following redirects the process list produced by the ps command to a file called ps.txt. The file size of ps.txt is about 9.6 kilobytes.

$ ps aux > ps.txt
$ ls -lh
-rw-rw-r--. 1 vagrant vagrant  9.6K May  7 19:07 ps.txt

Try compressing with the gzip command

$ gzip ps.txt
$ ls -lh
-rw-rw-r--. 1 vagrant vagrant 2.3K May  7 19:07 ps.txt.gz

As you can see, the file was compressed to about 2.3 kilobytes. Only the new compressed file, ps.txt.gz, remains; the original file ps.txt is **deleted**.

Try decompressing the compressed file.

You must specify the -d option to decompress the compressed file.

$ gzip -d ps.txt.gz
$ ls -lh
-rw-rw-r--. 1 vagrant vagrant 9.6K May  7 19:07 ps.txt

As with compression, only the decompressed file remains and the compressed file is deleted.
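When the original file should be kept, one option is to write the compressed data to standard output with -c and redirect it to a new file. The sketch below uses a generated file named sample.txt as a stand-in for ps.txt (the file name and its contents are assumptions for the example). Recent GNU gzip versions also provide a -k option for the same purpose, but -c is used here because it is more widely available.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Generate a repetitive text file as sample data (hypothetical stand-in for ps.txt).
i=0
while [ "$i" -lt 2000 ]; do
    echo "line $i of sample text"
    i=$((i + 1))
done > sample.txt

# -c writes the compressed data to standard output, so sample.txt is left in place.
gzip -c sample.txt > sample.txt.gz
ls -l sample.txt sample.txt.gz

# -dc decompresses to standard output; cmp confirms the round trip is lossless.
gzip -dc sample.txt.gz | cmp - sample.txt && echo "round trip OK"
```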

Other compression commands

Besides gzip, there are other commands for compressing files.

bzip2 command

Its compression ratio is generally higher than that of gzip, so **the amount of data can be reduced further**. However, compression and decompression take longer than with gzip, so it is often used when file size matters more than time.
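The size difference can be checked directly by compressing the same file with both tools. This is only a sketch: sample.txt and its contents are assumptions for the example, the resulting sizes depend heavily on the input data, and the bzip2 step is skipped when the command is not installed.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Generate a repetitive text file as sample data (hypothetical input).
i=0
while [ "$i" -lt 2000 ]; do
    echo "line $i of sample text"
    i=$((i + 1))
done > sample.txt

# Compress with gzip, keeping the original by writing to standard output.
gzip -c sample.txt > sample.txt.gz

# bzip2 may not be installed everywhere, so check before calling it.
if command -v bzip2 >/dev/null 2>&1; then
    bzip2 -c sample.txt > sample.txt.bz2
else
    echo "bzip2 is not installed; skipping the bzip2 step"
fi

# Compare sizes in bytes; results depend on the input data.
wc -c sample.txt sample.txt.*
```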

zip command

Unlike tar and gzip, zip performs archiving and compression at the same time, so **you can compress multiple files and directories into one file**. On some systems the zip command is not installed by default, in which case you need to install it before use.
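For comparison, tar and gzip can also be combined into a single step: tar accepts a z option that passes the archive through gzip. The following sketch rebuilds the practice directory in a temporary location (an assumption for the example) and produces a compressed archive with the conventional .tar.gz extension.

```shell
# Rebuild the practice directory in a throwaway location.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir dir1
for i in 1 2 3 4 5; do
    touch "dir1/file-$i.txt"
done

# z = filter the archive through gzip, so archiving and compression happen together.
tar czf dir1.tar.gz dir1

# The same z option is used when listing a gzip-compressed archive.
tar tzf dir1.tar.gz

# gzip -t tests the file's integrity, confirming it holds valid gzip data.
gzip -t dir1.tar.gz && echo "valid gzip data"
```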

reference

New Linux textbook
