[PYTHON] Handling types in Cython was troublesome, so I summarized the points I was careful about

There are many explanations of how to use Cython, but I felt that few articles cover its types, so I will summarize them here. Cython is a language that lets you write interfaces between C/C++ and Python in a style almost identical to Python, and it can be expected to speed up Python code. However, mixing C and Python types leads to frequent type errors. In my experience, the hard part of Cython is **type control**.

Fusion of static typing and dynamic typing

Cython is a language that can be dynamically typed like Python and statically typed like C. This is a convenient feature that lets you enjoy the strengths of both dynamic and static typing in Python-like code, but there are many points to be aware of because C types are mixed in with Python types. When using Cython, keep in mind that **Cython treats a value as a Python type unless you explicitly type it. (If the underlying value is a C-level type, it is converted to a Python-level type.)** By the way, the C++ type vector<int> is written as vector[int] in Cython.

    from libcpp.vector cimport vector
    cdef vector[int] vec1  # C-level type

In Cython, variables can be declared as C-level types by using cdef. Note that a cdef declaration cannot be placed inside a control structure such as an `if`, `for`, or `while` block; it has to sit at the top level of the function body, and by convention it goes at the beginning of the function (after the docstring, if any). In other words, the type given to cdef cannot be changed by branching in an `if` statement as shown below.

# This does not compile: cdef is not allowed inside an if block
cdef func0(type):
    if type == "double":
        cdef double x = 10
        return x
    elif type == "int":
        cdef int y = 10
        return y

In Cython, the return type of a function can also be specified explicitly. The specification is optional when the return value is a Python type, but it is required when you want to return a C-level type as is. If it is omitted, the return value is converted to a Python-level type.

cdef func1(type):
    cdef vector[double] vec
    return vec  # converted to a Python list of floats

cdef vector[double] func2(type):  # specify the return type
    cdef vector[double] vec
    return vec  # returned as vector[double] as is

Cython can implicitly convert between Python-level and C-level types. In func1 above, for example, the return type is not declared as vector[double], so Cython automatically converts the vector to a list and each double to a float.

If the type you want to return is a user-defined type or a C library-specific type that Cython cannot handle, a compile error will occur because conversion to the Python level is not defined. In the following example, the type defined in C called MyClass is returned from within the cdef function. This MyClass type does not define how to convert it to a Python type, so an error will occur in the following example.

cdef func3():
    cdef MyClass x = MyClass()
    return x  # Error: MyClass cannot be converted to a Python type

func3()

Therefore, if you want to return a C-level type value, you need to specify the return type as follows.

cdef MyClass func3():
    cdef MyClass x = MyClass()
    return x  # OK: the return value is MyClass

func3()

The same applies to a vector whose elements are MyClass.

Type cast and overload

Cython is great not only because it can speed up Python but also because it can wrap C and C++ functions and types. I tried wrapping as well, and it was quite awkward when I wanted to wrap a function that is overloaded on the C++ side.

cdef extern from "rect.h":
    int area(int x,int y)
    double area(double x,double y)

For example, suppose you have functions like the ones above. You want to switch which overload is called depending on the type of x, so you write the following, and an error occurs.

# Error: Cython cannot determine which overload to call
def py_area(x, y):
    if type(x) == int:
        return area(x, y)
    elif type(x) == float:
        return area(x, y)

To write it like this, you need to explicitly cast every value passed as an argument to the appropriate type. In other words, type casts serve as on-the-spot explicit typing.

# OK
def py_area(x, y):
    if type(x) == int:
        return area(<int>x, <int>y)
    elif type(x) == float:
        return area(<double>x, <double>y)
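To see why the casts help, it may be useful to look at the C++ side. The sketch below assumes that rect.h defines the two overloads as simple products (the article only shows their declarations, so the bodies are an assumption), and `selected_overload` is a hypothetical helper added only to report which overload the compiler picks. C++ chooses an overload from the static types of the arguments at compile time, and a Python object carries no such type until it is cast.

```cpp
#include <string>
#include <type_traits>

// Assumed contents of rect.h: the bodies are illustrative only.
inline int area(int x, int y) { return x * y; }
inline double area(double x, double y) { return x * y; }

// Hypothetical helper: reports which overload is chosen for arguments
// of static types A and B, judging by the return type of the call.
template <typename A, typename B>
std::string selected_overload(A x, B y) {
    using Result = decltype(area(x, y));
    return std::is_same<Result, int>::value ? "int" : "double";
}
```

With two int arguments the int overload is selected, and with two double arguments the double overload is selected. A mixed call such as area(3, 4.0) is ambiguous in C++ and does not compile, which is one reason to cast both arguments in py_area.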

However, when a parameter is declared as a reference, you cannot pass an argument that is cast on the spot. In that case, declare variables of the right type with cdef in advance, assign the values to them, and pass those variables.

For example, if the functions are declared like this:

cdef extern from "rect.h":
    int area(int& x,int& y)
    double area(double& x,double& y)

The code written so far then needs to change to the following.

def py_area(x, y):
    cdef:
        int x1
        int y1
        double x2
        double y2
    if type(x) == int:
        x1 = x
        y1 = y
        return area(x1, y1)
    elif type(x) == float:
        x2 = x
        y2 = y
        return area(x2, y2)

This way, no error occurs even when the parameters are declared as references.
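The same restriction exists in C++ itself, which may make the reason clearer: a non-const reference parameter has to bind to a named variable, and the result of a cast is a temporary. The following sketch assumes by-reference versions of area with simple product bodies (an assumption; only the declarations appear above). `area_of_ints` and `area_of_doubles` are hypothetical wrappers that mirror what the cdef variables do in py_area.

```cpp
// Assumed by-reference versions of the functions in rect.h.
inline int area(int& x, int& y) { return x * y; }
inline double area(double& x, double& y) { return x * y; }

// Hypothetical wrapper: copies the inputs into named int variables first.
inline int area_of_ints(long x, long y) {
    // area(static_cast<int>(x), static_cast<int>(y)) would not compile:
    // each cast produces a temporary, and int& cannot bind to a temporary.
    int x1 = static_cast<int>(x);
    int y1 = static_cast<int>(y);
    return area(x1, y1);
}

// Hypothetical wrapper: same idea for the double overload.
inline double area_of_doubles(float x, float y) {
    double x2 = x;
    double y2 = y;
    return area(x2, y2);
}
```

Because x1, y1, x2, and y2 are named variables with a fixed type, the reference parameters can bind to them and the overload is determined by their declared types.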

Fused type

Cython has a feature called **Fused type**. It is roughly Cython's counterpart to C++ templates, and it can be used when a return value or argument may take one of several types.

# List the candidate types
ctypedef fused my_type:
    hoge
    foo
    bar

By enumerating the candidate types as above, my_type is treated as any one of hoge, foo, and bar. Using this, type conversion of a multidimensional list between C and Python can be implemented as follows. PyClass is the Python-side class corresponding to MyClass.

ctypedef fused T:
    MyClass
    vector[MyClass]

cdef vector_to_list(T x):
    if T == vector[MyClass]:
        # Cast x itself (note the parentheses) before calling size() or indexing
        return [vector_to_list((<vector[MyClass]>x)[i])
                for i in range((<vector[MyClass]>x).size())]
    else:
        return PyClass(x)

A function with a fused type is analyzed for every type it may take. In the example above, Cython considers that x may be MyClass as well as vector[MyClass]. If you write it as below, without pinning the type down with a cast, an error occurs because MyClass has no size().

ctypedef fused T:
    MyClass
    vector[MyClass]

cdef vector_to_list(T x):
    if T == vector[MyClass]:
        return [vector_to_list(x[i]) for i in range(x.size())]  # (1)
    else:
        return PyClass(x)

In this example, x on line (1) is not cast with <vector[MyClass]>, so Cython considers that the type T may be MyClass, and I get an error saying that size() is not defined for MyClass.
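For comparison, here is a rough C++ analogue of vector_to_list. Cython generates one specialization of a fused-type function per listed type. In C++ the closest plain equivalent is one overload per type, so size() only appears in code where the argument is known to be a vector. MyClass, its `id` field, and the string output are all stand-ins invented for this sketch, since the article does not show MyClass or PyClass.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the C++ class wrapped in the article.
struct MyClass {
    int id;
};

// Version for a single element (the else branch of vector_to_list).
inline std::string to_text(const MyClass& x) {
    return "MyClass(" + std::to_string(x.id) + ")";
}

// Version for vector<MyClass>: converts each element with the overload above.
inline std::string to_text(const std::vector<MyClass>& xs) {
    std::string out = "[";
    for (std::size_t i = 0; i < xs.size(); ++i) {
        if (i > 0) out += ", ";
        out += to_text(xs[i]);
    }
    return out + "]";
}
```

Each overload is type-checked only against its own parameter type, which is the property the explicit cast restores inside the single fused-type function.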

Summary

Since Cython is a language that combines C and Python, it can take advantage of the strengths of both, but I found it a lot of trouble to handle. For the time being, if you are struggling with types in Cython, make good use of static typing with cdef and of casts.

References

- Kurt W. Smith (translation supervised by Hideki Nakata, translated by Takahiro Nagao), "Cython: Speeding up Python by fusing with C", 2015
