[PYTHON] Pathlib provides a common interface for file path operations

pathlib, introduced in Python 3.4, gathers into one place the path operations that used to be scattered across several modules such as os, os.path, and glob. It is handy to have, but my impression is that you can get by without it.

Listing directories and files under a specified directory

To start, let's try something I do often: listing the directories and files under a specified directory.

Listing directories and files under a specified directory (pathlib)


>>> from pathlib import Path
>>> p = Path('.')
>>> a = list(p.glob('**/*'))
>>> a
[PosixPath('.gitkeep'), PosixPath('css'), PosixPath('css/style.css')]
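The session above can also be reproduced as a script. This is a minimal sketch, assuming a throwaway directory built with tempfile whose contents mirror the example; the helper name list_tree is hypothetical. It uses Path.rglob('*'), which behaves like glob('**/*'), and is_dir() to tell directories from files.

```python
import tempfile
from pathlib import Path


def list_tree(root):
    """Return sorted (relative path, kind) pairs for everything under root."""
    root = Path(root)
    entries = []
    for p in root.rglob('*'):  # same as root.glob('**/*')
        kind = 'dir' if p.is_dir() else 'file'
        entries.append((p.relative_to(root).as_posix(), kind))
    return sorted(entries)


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    # Recreate the layout from the example: .gitkeep, css/, css/style.css
    (base / 'css').mkdir()
    (base / 'css' / 'style.css').touch()
    (base / '.gitkeep').touch()
    listing = list_tree(base)
    for name, kind in listing:
        print(kind, name)
```

Note that the dotfile is included, just as in the interactive example.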

Writing (almost) the same thing with the existing modules looks like this.

Listing directories and files under a specified directory (existing modules)


# Use the os module. Unlike the pathlib version, the specified path itself is included in the result; if that bothers you, compare root with path and skip it.
# Nothing wrong with it, but it's long...
>>> import os
>>> def _walk(path):
...     for root, dirs, files in os.walk(path):
...         yield root
...         for f in files:
...             yield os.path.join(root, f)
...
>>> a = list(_walk('.'))
>>> a
['.', './.gitkeep', './css', './css/style.css']

# Use the glob module. Dotfiles are not matched, and without recursive=True (Python 3.5+) '**' behaves like '*', so only entries exactly one directory deep are returned.
>>> import glob
>>> glob.glob('**/*')
['css/style.css']
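For comparison, here is a small sketch of how the glob module behaves with and without recursive=True (available from Python 3.5). The temporary directory and the variable names shallow and deep are assumptions made for illustration; the file layout mirrors the example above.

```python
import glob
import os
import tempfile


def rel(path, start):
    """Relative path with forward slashes, for stable comparison."""
    return os.path.relpath(path, start).replace(os.sep, '/')


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, 'css'))
    for name in ('.gitkeep', os.path.join('css', 'style.css')):
        open(os.path.join(tmp, name), 'w').close()

    pattern = os.path.join(tmp, '**', '*')
    # Without recursive=True, '**' acts like '*': one level deep only.
    shallow = sorted(rel(p, tmp) for p in glob.glob(pattern))
    # With recursive=True, '**' spans zero or more directories.
    deep = sorted(rel(p, tmp) for p in glob.glob(pattern, recursive=True))

print(shallow)
print(deep)
```

Even with recursive=True the dotfile is still skipped, which is one place where the glob module and Path.glob differ.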

Joining paths

One distinctive feature is that you can join paths with the / operator. It works not only between two Path objects but also between a Path and a string.

Joining paths


>>> p = Path('test')
>>> p2 = Path('test2')
>>> p/p2 # Path/Path
PosixPath('test/test2')
>>> p/'test3' # Path/String
PosixPath('test/test3')
>>> 'test4'/p # String/Path
PosixPath('test4/test')

If you look at the CPython implementation, you can see that the methods corresponding to the / operator (__truediv__ and __rtruediv__) are implemented. (ref. https://docs.python.org/3/reference/datamodel.html#emulating-numeric-types)

Implementation of path joining


class PurePath(object):
..
    # Called for Path/Path and Path/String
    def __truediv__(self, key):
        return self._make_child((key,))

    # Called for String/Path
    def __rtruediv__(self, key):
        return self._from_parts([key] + self._parts)
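To see how the operator dispatch works in isolation, here is a toy sketch. The Segment class is hypothetical and is not pathlib's real code; it only shows that Python falls back to the right-hand operand's __rtruediv__ when the left-hand operand (a str here) does not support /.

```python
class Segment:
    """Toy stand-in for a path object (hypothetical, not pathlib's code)."""

    def __init__(self, *parts):
        self.parts = tuple(parts)

    def __truediv__(self, other):
        # Called for Segment / Segment and Segment / str.
        if isinstance(other, Segment):
            return Segment(*self.parts, *other.parts)
        if isinstance(other, str):
            return Segment(*self.parts, other)
        return NotImplemented

    def __rtruediv__(self, other):
        # Called for str / Segment, because str itself does not define "/".
        if isinstance(other, str):
            return Segment(other, *self.parts)
        return NotImplemented

    def __str__(self):
        return '/'.join(self.parts)


a = Segment('test')
print(a / Segment('test2'))  # test/test2
print(a / 'test3')           # test/test3
print('test4' / a)           # test4/test
```

Returning NotImplemented for unsupported operands lets Python raise the usual TypeError instead of failing somewhere inside the method.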

This design seems to have drawn quite a bit of criticism. Personally, I'm likely to reach for + by mistake.

An example of a mistake I'm likely to make


>>> p = Path('test')
>>> 'test2' + p
Traceback (most recent call last):
  File "<ipython-input-8-22275bd1c6c1>", line 1, in <module>
    'test2' + p
TypeError: Can't convert 'PosixPath' object to str implicitly
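A short sketch of the distinction, using PurePosixPath so the result does not depend on the operating system: + is only string concatenation and needs an explicit str(), while joining goes through / or the Path constructor. The variable names are made up for illustration, and the exact wording of the TypeError varies between Python versions.

```python
from pathlib import PurePosixPath

p = PurePosixPath('test')

try:
    'test2' + p
except TypeError as exc:
    error = exc  # str + path is not defined in any Python version

# Explicit conversion gives plain string concatenation, with no separator.
joined_text = 'test2' + str(p)

# Joining as paths inserts the separator.
joined_path = PurePosixPath('test2') / p

print(type(error).__name__)
print(joined_text)
print(joined_path)
```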

So rather than relying on the / operator, I think it's better to commit to handling everything as a Path.

If you're going to use it anyway, go all the way


>>> p = Path('test')
>>> p2 = Path('test2')
>>> p3 = Path('test3')
# A new Path can be built from any number of Paths
>>> Path(p, p2, p3)
PosixPath('test/test2/test3')
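As a rough comparison, the constructor, joinpath(), and the / operator all produce the same result. The sketch below uses PurePosixPath so it runs identically on any platform; the variable names are illustrative only. It also shows one thing to watch for: an absolute segment discards everything that came before it.

```python
from pathlib import PurePosixPath

p = PurePosixPath('test')
p2 = PurePosixPath('test2')

via_constructor = PurePosixPath(p, p2, 'test3')
via_joinpath = p.joinpath(p2, 'test3')
via_operator = p / p2 / 'test3'

# An absolute segment resets the path: 'test' is dropped here.
reset = PurePosixPath(p, '/etc', 'hosts')

print(via_constructor, via_joinpath, via_operator)
print(reset)
```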

Backport for older versions

As with the enum module I covered the day before, a third-party backport can be installed with pip. (Python 2.7 or later)



% pip search pathlib
pathlib                   - Object-oriented filesystem paths
