Wrap (part of) the AtCoder Library in Cython for use in Python

Introduction

This article is from AtCoder Library, DSU And Fenwick Tree will be made available from Python using Cython. I haven't confirmed if there is anything other than DSU and Fenwick Tree that can be used in the same way, but for example, [Segtree](https://atcoder.github.io/ac-library/production/document_ja/segtree. I think it's difficult because html) uses non-typed templates and Cython doesn't support non-typed templates (probably).

The AtCoder Library is written in C ++, so it's faster than implementing the same thing in Python. However, as I will explain later, when compared with PyPy, the result was not fast.

This article rarely touches on Cython's grammar, so if you are interested in Cython in this article, please refer to other articles as well. (Cython works even if you write the Python code as it is.)

What is Cython

It is a language for creating Python extension modules using C / C ++, and you can call C / C ++ classes and functions. It's a language for doing things like this one. (really?)

Environment

Cython and AtCoder Library are required, so we will install them. If you are a Windows person, you may get stuck in compiling Cython, so we recommend using Linux or WSL if possible.

Install Cython

terminal


$ pip3 install cython

I will do it at. Let's see if Cython can be used.

hello.pyx


print("Hello, World!")

A program that outputs Hello, World !. Note that the extension is .pyx. Let's compile this and run it.

terminal


$ cythonize -i -3 hello.pyx
$ python3 -c "import hello"

I am compiling with the cythonize command on the first line. The -i option first converts it to C code and compiles it into a format that can be imported by Python (.so on Linux, .pyd on Windows). The -3 option is used when Python is 3 series.

If you execute the code that imports hello in Python and output Hello, World! As in the second line, the Cython environment has been successfully built. AtCoder Library Create a / opt / ac-library / folder. (It matches the judge environment of AtCoder.) Copy the atcoder folder in https://img.atcoder.jp/practice2/ac-library.zip directly under it.

Wrap AtCoder Library in Cython

DSU First, enable DSU. In order to use the AtCoder Library from Python, Cython requires the following two steps:

  1. Make AtCoder Library available from Cython
  2. Wrap it for use from Python

1. Make AtCoder Library available from Cython

# distutils: language=c++
# distutils: include_dirs=/opt/ac-library
from libcpp cimport bool
from libcpp.vector cimport vector
cdef extern from "<atcoder/dsu>" namespace "atcoder":
    cdef cppclass dsu:
        dsu(int n) #constructor
        int merge(int a, int b) #Edge(a, b)Add
        bool same(int a, int b) #Vertex a,Whether b is concatenated
        int leader(int a) #Representative element of the connected component to which vertex a belongs
        int size(int a) #The size of the connected component to which vertex a belongs
        vector[vector[int]] groups() #List of "List of vertex numbers of one connected component"

This is the code that made DSU available in Cython. The first line specifies that C ++ is used, and the second line specifies the location of the AtCoder Library. Lines 3 and 4 are imported to take advantage of C ++ bools and vectors. From the 5th line onward, the methods of the dsu class in the atcoder namespace of atcoder / dsu.hpp (I'm not sure about C ++, so I'm not sure about this, so I'll implement it internally and [DSU documentation](https Please read: //atcoder.github.io/ac-library/production/document_ja/dsu.html) …….), Which is used in Cython. Now you have a DSU in Cython

cdef dsu *u = new dsu(3)
u.merge(0, 1)
print(u.same(0, 1)) # True
print(u.same(0, 2)) # False

It is now available as.

2. Wrap it for use from Python

cdef class DSU:
    cdef dsu *_thisptr
    #constructor
    def __cinit__(self, int n):
        self._thisptr = new dsu(n)
    #Edge(a, b)Add
    cpdef int merge(self, int a, int b):
        return self._thisptr.merge(a, b)
    #Vertex a,Whether b is concatenated
    cpdef bool same(self, int a, int b):
        return self._thisptr.same(a, b)
    #Representative element of the connected component to which vertex a belongs
    cpdef int leader(self, int a):
        return self._thisptr.leader(a)
    #The size of the connected component to which vertex a belongs
    cpdef int size(self, int a):
        return self._thisptr.size(a)
    #List of "List of vertex numbers of one connected component"
    cpdef vector[vector[int]] groups(self):
        return self._thisptr.groups()

Functions and methods defined in cpdef can also be called from Python. By creating a DSU class and calling the method of the dsu class internally, it means that the use of DSU from Python is realized.

This completes the Cython code.

dsu.pyx


# distutils: language=c++
# distutils: include_dirs=/opt/ac-library
from libcpp cimport bool
from libcpp.vector cimport vector
cdef extern from "<atcoder/dsu>" namespace "atcoder":
    cdef cppclass dsu:
        dsu(int n)
        int merge(int a, int b)
        bool same(int a, int b)
        int leader(int a)
        int size(int a)
        vector[vector[int]] groups()
cdef class DSU:
    cdef dsu *_thisptr
    def __cinit__(self, int n):
        self._thisptr = new dsu(n)
    cpdef int merge(self, int a, int b):
        return self._thisptr.merge(a, b)
    cpdef bool same(self, int a, int b):
        return self._thisptr.same(a, b)
    cpdef int leader(self, int a):
        return self._thisptr.leader(a)
    cpdef int size(self, int a):
        return self._thisptr.size(a)
    cpdef vector[vector[int]] groups(self):
        return self._thisptr.groups()

Let's compile it so that it can be imported in Python.

terminal


$ cythonize -i -3 dsu.pyx

Run in python

Let's write the code to solve DSU Exercise in Python.

ALPC-A.py


from dsu import DSU
n, q = map(int, input().split())
dsu = DSU(n)
res = []
for i in range(q):
    t, u, v = map(int, input().split())
    if t == 0:
        dsu.merge(u, v)
    else:
        if dsu.same(u, v):
            res.append(1)
        else:
            res.append(0)
print('\n'.join(map(str, res)))

If you try to enter a sample exercise and get the correct output, you're good to go.

Submission

I'd like to actually submit it to DSU Exercises and confirm that it will be AC, but atCoder's judge side dsu. There is no pyx or a module that compiles it, so if you submit it as is, it will be RE. So you need to embed dsu.pyx in your submission code.

ALPC-A.py


cycode = r"""
# distutils: language=c++
# distutils: include_dirs=/opt/ac-library
from libcpp cimport bool
from libcpp.vector cimport vector
cdef extern from "<atcoder/dsu>" namespace "atcoder":
    cdef cppclass dsu:
        dsu(int n)
        int merge(int a, int b)
        bool same(int a, int b)
        int leader(int a)
        int size(int a)
        vector[vector[int]] groups()
cdef class DSU:
    cdef dsu *_thisptr
    def __cinit__(self, int n):
        self._thisptr = new dsu(n)
    cpdef int merge(self, int a, int b):
        return self._thisptr.merge(a, b)
    cpdef bool same(self, int a, int b):
        return self._thisptr.same(a, b)
    cpdef int leader(self, int a):
        return self._thisptr.leader(a)
    cpdef int size(self, int a):
        return self._thisptr.size(a)
    cpdef vector[vector[int]] groups(self):
        return self._thisptr.groups()
"""
import os, sys
if sys.argv[-1] == 'ONLINE_JUDGE':
    open('dsu.pyx', 'w').write(cycode)
    os.system('cythonize -i -3 dsu.pyx')
from dsu import DSU
n, q = map(int, input().split())
dsu = DSU(n)
res = []
for i in range(q):
    t, u, v = map(int, input().split())
    if t == 0:
        dsu.merge(u, v)
    else:
        if dsu.same(u, v):
            res.append(1)
        else:
            res.append(0)
print('\n'.join(map(str, res)))

AtCoder's Python is in the compile phase for Numba's AOT (read this article for more information). Is not included in the execution time), and ONLINE_JUDGE is given as a command line argument during the compile phase. Using this, if sys.argv [-1] =='ONLINE_JUDGE': on the 30th line determines whether it is the compile phase, and if it is the compile phase, it writes the file of dsu.pyx. , Compile with the cythonize command. I [submitted] this code (https://atcoder.jp/contests/practice2/submissions/17535199) and it became AC safely in 424 ms. For comparison, volunteers have DSU in the AtCoder Library Python port. When using python / blob / master / atcoder / dsu.py), when Submit in Python, 635 ms, [Submit in PyPy] ](Https://atcoder.jp/contests/practice2/submissions/17535461) It was 551 ms. ~~ Well, despite the hard work, it doesn't get that fast ~~ Fenwick Tree Similarly, make Fenwick Tree available in Python.

fenwick_tree.pyx


# distutils: language=c++
# distutils: include_dirs=/opt/ac-library
cdef extern from "<atcoder/fenwicktree>" namespace "atcoder":
    cdef cppclass fenwick_tree[T]:
        fenwick_tree(int n)
        void add(int p, T x)
        T sum(int l, int r)
ctypedef long long ll
cdef class FenwickTree:
    cdef fenwick_tree[ll] *_thisptr
    def __cinit__(self, int n):
        self._thisptr = new fenwick_tree[ll](n)
    cpdef void add(self, int p, ll x):
        self._thisptr.add(p, x)
    cpdef ll sum(self, int l, int r):
        return self._thisptr.sum(l, r)

It's basically the same as DSU, but Fenwick Tree uses templates. According to Document, int / uint (unsigned int) / ll (long long) / ull (unsigned long long) It seems that / modint can be specified as a type. Here, the type is ll. (Because I think int and uint can be replaced by ll) If you want a ull Fenwick Tree, rewrite it or create another class. ~~ modint gave up ... ~~

Try to solve Fenwick Tree Exercises.

ALPC-B.py


cycode = r"""
# distutils: language=c++
# distutils: include_dirs=/opt/ac-library
cdef extern from "<atcoder/fenwicktree>" namespace "atcoder":
    cdef cppclass fenwick_tree[T]:
        fenwick_tree(int n)
        void add(int p, T x)
        T sum(int l, int r)
ctypedef long long ll
cdef class FenwickTree:
    cdef fenwick_tree[ll] *_thisptr
    def __cinit__(self, int n):
        self._thisptr = new fenwick_tree[ll](n)
    cpdef void add(self, int p, ll x):
        self._thisptr.add(p, x)
    cpdef ll sum(self, int l, int r):
        return self._thisptr.sum(l, r)
"""
import os, sys
if sys.argv[-1] == 'ONLINE_JUDGE':
    open('fenwick_tree.pyx', 'w').write(cycode)
    os.system('cythonize -i -3 fenwick_tree.pyx')
from fenwick_tree import FenwickTree
n, q = map(int, input().split())
ft = FenwickTree(n)
for i, a in enumerate(map(int, input().split())):
    ft.add(i, a)
res = []
for _ in range(q):
    t, u, v = map(int, input().split())
    if t == 0:
        ft.add(u, v)
    else:
        res.append(ft.sum(u, v))
print('\n'.join(map(str, res)))

I [submitted] this code (https://atcoder.jp/contests/practice2/submissions/17535413) and it became AC in 1287 ms. For comparison, if you use the Python ported Fenwick Tree (https://github.com/not522/ac-library-python/blob/master/atcoder/fenwicktree.py), submit in Python (https) When you do //atcoder.jp/contests/practice2/submissions/17535420), it takes 4663 ms, and when you do Submit with PyPy, it takes 1001 ms. did. ~~ It's slower than PyPy! I'm sharp ... ~~

Comparison

DSU

Cython + Python Python PyPy
424 ms 635 ms 551 ms

Fenwick Tree

Cython + Python Python PyPy
1287 ms 4663 ms 1001 ms

bonus

In fact, if you write the part that was done in Python in Cython, it will be much faster than PyPy. The following is the code when Fenwick Tree Exercises is written entirely in Cython.

ALPC-B.pyx


# distutils: language=c++
# distutils: include_dirs=/opt/ac-library
from libc.stdio cimport scanf, printf
from libcpp.vector cimport vector
cdef extern from "<atcoder/fenwicktree>" namespace "atcoder":
    cdef cppclass fenwick_tree[T]:
        fenwick_tree(int n)
        void add(int p, T x)
        T sum(int l, int r)
ctypedef long long ll
cdef int n, q, i, t
cdef ll u, v, a
scanf('%d%d', &n, &q)
cdef fenwick_tree[ll] *ft = new fenwick_tree[ll](n)
for i in range(n):
    scanf('%lld', &a)
    ft.add(i, a)
for i in range(q):
    scanf('%d%lld%lld', &t, &u, &v)
    if t == 0:
        ft.add(u, v)
    else:
        printf('%lld\n', ft.sum(u, v))

This submission was 251 ms.

Cython Cython + Python Python PyPy
251 ms 1287 ms 4663 ms 1001 ms

Super fast. So why is Cython + Python slower than PyPy? I suspected that the cause was the slowness of Python's loop processing. As an experiment, let's submit code in Python and PyPy that just accepts the input for this problem and see the execution time.

ALPC-B-input.py


n, q = map(int, input().split())
for i, a in enumerate(map(int, input().split())):
    i, a
for _ in range(q):
    t, u, v = map(int, input().split())

The result is 1012 ms for Python and 687 ms for PyPy. did. Subtracting this result from the execution time of the AC code, it seems that the processing time of the Fenwick Tree takes about 275ms for Cython + Python and 314ms for PyPy. ~~ Subtle ~~

Summary

I think you should use PyPy.

Recommended Posts

Wrap (part of) the AtCoder Library in Cython for use in Python
How to use the C library in Python
Wrap C with Cython for use from Python
Wrap C ++ with Cython for use from Python
Use the LibreOffice app in Python (3) Add library
[Modint] Decoding the AtCoder Library ~ Implementation in Python ~
Let's use the open data of "Mamebus" in Python
[Internal_math version (2)] Decoding the AtCoder Library ~ Implementation in Python ~
Check the operation of Python for .NET in each environment
[python] How to use the library Matplotlib for drawing graphs
Google search for the last line of the file in Python
The story of participating in AtCoder
Use networkx, a library that handles graphs in python (Part 2: Tutorial)
[Introduction to Python] How to use the in operator in a for statement?
Check the behavior of destructor in Python
Implement part of the process in C ++
Summary of various for statements in Python
The result of installing python in Anaconda
What is "mahjong" in the Python library? ??
MongoDB for the first time in Python
The basics of running NoxPlayer in Python
Pandas of the beginner, by the beginner, for the beginner [Python]
AtCoder cheat sheet in python (for myself)
In search of the fastest FizzBuzz in Python
Use pathlib in Maya (Python 2.7) for upcoming Python 3.7
About Python code for simple moving average assuming the use of Numba
Code reading of Safe, a library for checking password strength in Python
Get the key for the second layer migration of JSON data in python
Output the number of CPU cores in Python
[Python] Sort the list of pathlib.Path in natural sort
Get the caller of a function in Python
Match the distribution of each group in Python
How to use Python Image Library in python3 series
View the result of geometry processing in Python
Use logger with Python for the time being
Make a copy of the list in Python
Summary of how to use MNIST in Python
Tips for hitting the ATND API in Python
Find the divisor of the value entered in python
[Python] Read the source code of Bottle Part 1
Image processing? The story of starting Python for
Getting rid of DICOM images in Python Part 2
Find the solution of the nth-order equation in python
Don't use readlines () in your Python for statement!
The story of reading HSPICE data in Python
[Note] About the role of underscore "_" in Python
About the behavior of Model.get_or_create () of peewee in Python
Solving the equation of motion in Python (odeint)
Output in the form of a python array
Code for checking the operation of Python Matplotlib
[Python + OpenCV] Whiten the transparent part of the image
Basic story of inheritance in Python (for beginners)
Use something other than a <br> string for the <br> dict key in Python
I want to use Python in the environment of pyenv + pipenv on Windows 10
Python script to get a list of input examples for the AtCoder contest
Touch the sample v20-python-samples of the OANDA v20 REST API wrapper library for Python
Daily AtCoder # 36 in Python
Use data class for data storage of Python 3.7 or higher
Daily AtCoder # 32 in Python
Daily AtCoder # 6 in Python
Daily AtCoder # 18 in Python