Introducing Python in Practice (PiP)

This is the Day 21 article for the Python Advent Calendar. I'm very sorry for being late m(_ _;)m

Today I would like to introduce "Python in Practice (PiP)", a book I am currently reading.

What is Python in Practice (PiP)?

Python in Practice (PiP) is a book written for Pythonistas who want to improve their Python programming skills. It was also selected for the 2014 "Jolt Awards: The Best Books". Reference: Which is the best IT book of the past year? "Jolt Awards: The Best Books" 2014 edition announced

This book is aimed at Python programmers who want to broaden and deepen their Python knowledge so that they can improve the quality, reliability, speed, maintainability, and usability of their Python programs. Quote: p.1, l.1

The book deals with the following four themes.

- Design patterns for elegant coding
- Improved processing speed using parallel processing and Cython
- High-level networking
- Graphics

Today I would like to introduce Chapter 5, "Extending Python", which focuses on improving processing speed. (Note: the content is abridged here m(_ _)m)

Extending Python

The "Extending Python" chapter summarizes several techniques for improving the performance of Python programs.

- Use PyPy
  - PyPy has a built-in JIT (just-in-time) compiler, so long-running programs finish much faster than they do under CPython. Note, however, that short-running programs may take longer because of the compilation overhead.
- Use C or C++ for time-critical processing
  - By writing C or C++ code in a form that can be called from a Python program, you can benefit from the raw speed of C and C++. The most straightforward way to use C or C++ code from Python is the Python C interface. To use an existing C or C++ library, it is common to use a tool such as SWIG or SIP (www.riverbankcomputing.com/software/sip), which provide wrappers for using C or C++ from Python. For C++, you can also use boost::python. See also CFFI (C Foreign Function Interface for Python) for more recent developments in this area.
- Compile Python code into C code using Cython
  - Cython is a Python-based language extended to handle static data types. Cython source code is translated into C/C++ code and compiled as a Python extension module. Cython is very useful when speed matters, because it can compile most Python code and the compiled code usually runs much faster.
  - See the official page for details (omitted here).
- Use ctypes to access C libraries
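
Before choosing any of these options, it helps to time the slow part first. The following is a minimal sketch that is not taken from the book: `sum_of_squares` and `measure` are hypothetical names, and the workload is just an arbitrary CPU-bound loop. Running the same script under CPython and under PyPy, with a small and a large `n`, is one way to observe the warm-up effect mentioned in the PyPy item.

```python
import platform
import timeit


def sum_of_squares(n):
    """A deliberately CPU-bound pure-Python loop (illustrative workload)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def measure(n, repeat=3):
    """Return the best wall-clock time in seconds for a single call."""
    timings = timeit.repeat(lambda: sum_of_squares(n), number=1, repeat=repeat)
    return min(timings)


if __name__ == "__main__":
    # Prints "CPython", "PyPy", etc., so results can be labelled per interpreter.
    print(platform.python_implementation())
    for n in (1000, 1000000):
        print(n, measure(n))
```

The absolute numbers depend entirely on the machine and interpreter version, so only the relative difference between runs is meaningful.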

From here on, we focus on accessing C libraries with ctypes and look at how to use it in detail.

Accessing C Libraries with ctypes

ctypes, one of Python's standard modules, provides access to stand-alone shared libraries written in C or C++ (.so on Linux, .dylib on OS X, and .DLL on Windows).
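
As a warm-up that needs no extra library, the same mechanism can be tried against the C standard library. This is only a sketch, assuming a POSIX-like system (Linux or OS X) where the C library can be located; `load_libc` is a hypothetical helper name, not something from the book.

```python
import ctypes
import ctypes.util


def load_libc():
    """Load the C standard library (assumes a POSIX-like system)."""
    name = ctypes.util.find_library("c")
    # If the lookup fails, CDLL(None) exposes the symbols already loaded
    # into the current process, which includes libc on POSIX systems.
    return ctypes.CDLL(name) if name else ctypes.CDLL(None)


libc = load_libc()

# size_t strlen(const char *s);
strlen = libc.strlen
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t

print(strlen(b"extraordinary"))  # 13
```

Declaring `argtypes` and `restype` lets ctypes check and convert the arguments instead of guessing.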

Let's see how to use the ctypes module in practice. As an example, we use the hyphen library, which inserts hyphens at the valid hyphenation points of a given word. (This library is used by OpenOffice.org and LibreOffice.) e.g. input: extraordinary, output: ex-traor-di-nary

Specifically, we use the following functions from the hyphen library. (Detailed explanations of each function are omitted.)

hyphen.h


// Create a HyphenDict pointer from a hyphenation dictionary file
HyphenDict *hnj_hyphen_load(const char *filename);

// Free the memory held by a HyphenDict
void hnj_hyphen_free(HyphenDict *hdict);

// Hyphenate word according to the given HyphenDict
int hnj_hyphen_hyphenate2(HyphenDict *hdict, const char *word, int word_size, char *hyphens, char *hyphenated_word, char ***rep, int **pos, int **cut);

Let's use this library in Python right away!

First, locate the hyphen shared library.

Hyphenate1.py


import ctypes
import ctypes.util  # find_library() lives in the ctypes.util submodule

class Error(Exception): 
    pass

_libraryName = ctypes.util.find_library("hyphen")
if _libraryName is None:
    raise Error("cannot find hyphenation library")

_LibHyphen = ctypes.CDLL(_libraryName)

This needs little explanation: ctypes.util.find_library() locates the shared library, and ctypes.CDLL() loads it. After loading the library, create Python wrappers for its functions. The usual approach is to assign each library function to a Python variable and then specify its argument types (argtypes) and return type (restype).

Example: hnj_hyphen_load

Hyphenate1.py


_load = _LibHyphen.hnj_hyphen_load
_load.argtypes = [ctypes.c_char_p]
_load.restype = ctypes.c_void_p

Example: hnj_hyphen_hyphenate2

Hyphenate1.py


_int_p = ctypes.POINTER(ctypes.c_int)
_char_p_p = ctypes.POINTER(ctypes.c_char_p)

_hyphenate = _LibHyphen.hnj_hyphen_hyphenate2
_hyphenate.argtypes = [
    ctypes.c_void_p,                # HyphenDict *hdict
    ctypes.c_char_p,                # const char *word
    ctypes.c_int,                   # int word_size
    ctypes.c_char_p,                # char *hyphens
    ctypes.c_char_p,                # char *hyphenated_word
    _char_p_p,                      # char ***rep
    _int_p,                         # int **pos
    _int_p                          # int **cut
]
_hyphenate.restype = ctypes.c_int
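
The pointer types `_int_p` and `_char_p_p` used in these declarations can be tried out on their own, without the hyphen library. The following is a small sketch; the variable names `value`, `ptr`, `null_ptr`, and `rep` are chosen only for illustration.

```python
import ctypes

# Same definitions as in Hyphenate1.py
_int_p = ctypes.POINTER(ctypes.c_int)
_char_p_p = ctypes.POINTER(ctypes.c_char_p)

value = ctypes.c_int(42)
ptr = _int_p(value)          # pointer to an existing c_int

print(ptr.contents.value)    # 42
ptr[0] = 7                   # write through the pointer
print(value.value)           # 7

null_ptr = _int_p()          # called without an argument: a NULL pointer
print(bool(null_ptr))        # False

rep = _char_p_p(ctypes.c_char_p(None))  # pointer to a NULL char *
print(rep.contents.value)    # None
```

Passing a pointer like this gives the C function a place to write its results, which is how the output parameters of hnj_hyphen_hyphenate2 are handled.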

Let's use these to create a Python wrapper function.

Hyphenate1.py


def hyphenate(word, filename, hyphen='-'):
    originalWord = word
    
    hdict = _get_hdict(filename)
    word = word.encode("utf-8")
    word_size = ctypes.c_int(len(word))
    hyphens = ctypes.create_string_buffer(word)
    hyphenated_word = ctypes.create_string_buffer(len(word) * 2)
    rep = _char_p_p(ctypes.c_char_p(None))
    pos = _int_p(ctypes.c_int(0))
    cut = _int_p(ctypes.c_int(0))

    if _hyphenate(hdict, word, word_size, hyphens, hyphenated_word, rep, pos, cut):
        raise Error("hyphenation failed for '{}'".format(originalWord))

    return hyphenated_word.value.decode("utf-8").replace("=", hyphen)

That's all there is to it. ctypes.create_string_buffer() creates a mutable C char array, either initialized from a bytes object or sized by a number of bytes. The word is encoded first because the hyphenation function expects UTF-8 encoded bytes rather than a Python string.

The _get_hdict() function can be written as follows. It simply loads the dictionary file and caches the result.
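
The two ways of calling `ctypes.create_string_buffer`, and the difference between character count and byte count, can be checked in isolation. This is a sketch only: the word "café" and the bytes written into `out` are made-up stand-ins for what a C function might produce, not real output of the hyphen library.

```python
import ctypes

word = "caf\u00e9"                   # "café": 4 characters
encoded = word.encode("utf-8")       # 5 bytes, because é takes 2 bytes in UTF-8
print(len(word), len(encoded))       # 4 5

# From bytes: a copy of the data plus a terminating NUL byte
from_bytes = ctypes.create_string_buffer(encoded)
print(ctypes.sizeof(from_bytes))     # 6

# From a size: a zero-filled buffer that a C function can write into
out = ctypes.create_string_buffer(len(encoded) * 2)
print(ctypes.sizeof(out))            # 10
print(out.value)                     # b''

# Stand-in for the C function filling the buffer (illustrative data)
out.value = b"ca=f\xc3\xa9"

# Same post-processing as the last line of hyphenate()
result = out.value.decode("utf-8").replace("=", "-")
```

Here `result` ends up as the string "ca-f\u00e9", which shows why the output has to be decoded from UTF-8 before the "=" markers are replaced.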

Hyphenate1.py


_hdictForFilename = {}

def _get_hdict(filename):
    if filename not in _hdictForFilename:
        hdict = _load(ctypes.create_string_buffer(filename.encode("utf-8")))
        if hdict is None:
            raise Error("failed to load '{}'".format(filename))
        _hdictForFilename[filename] = hdict
    hdict = _hdictForFilename.get(filename)
    if hdict is None:
        raise Error("failed to load '{}'".format(filename))
    return hdict

Now you are ready to call the C library from Python. Calling the function should produce the following output.

>>> hyphenate('extraordinary', '/path/to/dictfile')
u'ex-traor-dinary'

In this way, a C library can be used from Python with little effort, so consider delegating unavoidably heavy processing to a C library.

Summary

This time I introduced the part of PiP that covers extending Python with C. PiP is written in very simple English, so I recommend it even to readers who are not confident in English. The first chapter on design patterns in particular covers language-independent fundamentals, so I think much of it will be helpful to people who use other languages as well.

We are planning to hold a reading group for this book at PyLadies Tokyo early in the new year, so please get in touch if you are interested (a little promotion).
