Boost.NumPy Tutorial for Extending Python in C++ (Practice)

This is the Day 10 article of the C++ Advent Calendar.

As of Boost 1.63, Boost.NumPy has been merged into Boost.Python. As a result, parts of the description below may no longer be accurate. I plan to write a revised article.

Motivation

Python is really useful. For me, an (apprentice) academic whose work centers on running simulations and analyzing the results, the IPython Notebook is indispensable: it lets you process, analyze, and visualize data interactively and seamlessly. Unfortunately, C++ alone does not offer equivalent functionality (probably). Cling, from CERN's ROOT project, can apparently execute C++ interactively, but I have never used it because there is so little information about it. Someone please make an IC++ Notebook that allows literate programming in C++ (seriously).

However, the simulations themselves run for days to weeks, so speed is essential, and they cannot be written in a slow language like Python. So the simulation gets written in C++ (Fortran? What's that, is it tasty?).

On the other hand, seeing projects such as scikit-learn and PyMC provide Python front ends, I feel that implementing the core in another language and using it from Python will become increasingly common. So, among the ways to write functions in C++ that can be called from Python, this article shows how to use Boost.NumPy.

Why Boost.NumPy?

I once tried to use Boost.Python on its own and gave up because I could not figure out its converters. Boost.NumPy, by contrast, was very easy to use (this is important).

To briefly mention the other options: Boost.NumPy, introduced here, simply provides a C++ wrapper around Python's numpy.ndarray. Since C++ already has (countless) linear algebra libraries such as Eigen, another option is to convert numpy.ndarray directly into Eigen's array types. More simply, you could use Boost.Python alone to convert a C++ vector or similar container to a Python list and then convert that to numpy.ndarray. There are many other options as well, such as SWIG and Cython.

Purpose

I covered installation and compilation in the previous article, so this time I will cover how to actually use the library. Usually you only want to expose the interface of your C++ code to Python, so I will focus on the easiest way to do that and avoid the difficult parts.

I will not explain Boost.Python itself, so please look it up yourself. Here are some sites that explain Boost.Python:

- Boost.Python (official documentation)
- Boost.Python (Japanese translation of the official documentation)

Tutorial for Boost.NumPy

The following is based on the Boost.NumPy tutorial. To keep the notation simple, the namespaces are abbreviated as follows.

namespace p = boost::python;
namespace np = boost::numpy;

How to use np::ndarray (one-dimensional case)

The introduction got long, so let's start with code that actually works:

mymod1.cpp


#include "boost/numpy.hpp"
#include <stdexcept>
#include <algorithm>

namespace p = boost::python;
namespace np = boost::numpy;

/* Double every element of a in place */
void mult_two(np::ndarray a) {
  int nd = a.get_nd();
  if (nd != 1)
    throw std::runtime_error("a must be 1-dimensional");
  size_t N = a.shape(0);
  if (a.get_dtype() != np::dtype::get_builtin<double>())
    throw std::runtime_error("a must be float64 array");
  double *p = reinterpret_cast<double *>(a.get_data());
  std::transform(p, p + N, p, [](double x) { return 2 * x; });
}

BOOST_PYTHON_MODULE(mymod1) {
  Py_Initialize();
  np::initialize();
  p::def("mult_two", mult_two);
}

This compiles as is. It is a simple function that just doubles a one-dimensional array in place, but it contains the important elements.

- Use get_nd() to get the number of dimensions of the array
- Use shape(n) to get the size of the array along axis n
- The element type of the array is determined dynamically, and you can get it with get_dtype()
- The data can be accessed through a raw pointer
- A C++ std::runtime_error is converted to RuntimeError on the Python side

Memory is managed by np::ndarray, so you do not need to worry about it. Unfortunately, because the element type is determined dynamically, you cannot overload the function on it; you have to check it with an if statement at run time. It is nice that exceptions are converted for you.
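To illustrate the run-time type check, here is a minimal sketch of dispatching from a dynamic type tag to a statically typed implementation. It deliberately does not use Boost.NumPy: DType, ArrayView, mult_two_impl, and mult_two_any are hypothetical names invented for this example. With a real np::ndarray, the tag comparison would correspond to the a.get_dtype() check shown in mymod1.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for a dynamically typed array (not Boost.NumPy API).
enum class DType { Float64, Int64 };

struct ArrayView {
  DType dtype;       // element type, known only at run time
  void *data;        // raw pointer to the first element
  std::size_t size;  // number of elements
};

// Statically typed implementation shared by every supported element type.
template <typename T>
void mult_two_impl(T *first, std::size_t n) {
  assert(first != nullptr || n == 0);
  for (std::size_t i = 0; i < n; ++i) first[i] *= 2;
}

// Run-time dispatch: inspect the tag, then call the matching instantiation.
void mult_two_any(ArrayView a) {
  switch (a.dtype) {
    case DType::Float64:
      mult_two_impl(static_cast<double *>(a.data), a.size);
      return;
    case DType::Int64:
      mult_two_impl(static_cast<std::int64_t *>(a.data), a.size);
      return;
  }
  throw std::invalid_argument("unsupported element type");
}
```

The template keeps the numerical code in one place, and only the small switch has to know which element types are supported.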

Let's call mymod1 from Python.

mymod1.py


#!/usr/bin/env python
# -*- coding: utf-8 -*-

import mymod1
import numpy as np

if __name__ == '__main__':
    a = np.array([1,2,3], dtype=np.float64)
    mymod1.mult_two(a)
    print(a)

    b = np.array([1,2,3], dtype=np.int64)
    mymod1.mult_two(b) # raise error
    print(b)

The call with b raises an error because, as described above, its element type is int64 (long long) rather than double:

[ 2.  4.  6.]
Traceback (most recent call last):
  File "/path/of/mymod1.py", line 13, in <module>
    mymod1.mult_two(b)
RuntimeError: a must be float64 array

When you use integers in NumPy, the default type is int64 (long long in C++) unless you specify otherwise, at least on typical 64-bit Linux and macOS environments. Note that it is not int.

There are some gripes, such as not being able to overload, but we were able to implement in C++ a function that is easy to use from Python.

How to use np::ndarray (multidimensional case)

The basics are the same in the multidimensional case, but you need to pay attention to the memory layout. The documentation for numpy.ndarray.strides explains it well, but here is a brief explanation.

Sometimes you want to manage a two-dimensional array in a single contiguous memory area.

double *a_as1 = new double[N*M];
double **a_as2 = new double*[N];
for(int i=0;i<N;++i){
  a_as2[i] = &a_as1[i*M];
}

Then a_as2[i][j] refers to the same memory as a_as1[i * M + j]. ndarray.strides describes how to perform this kind of i * M + j index calculation. In this case, advancing by 1 in the j direction advances 8 bytes in memory (a double is 8 bytes). Advancing by 1 in the i direction, on the other hand, advances 8 * M bytes in memory, because (i + 1) * M + j = i * M + j + M. These values, 8 and 8 * M, are called strides. The same idea carries over to higher dimensions.
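The stride arithmetic above can be sketched in plain C++ without any Python dependency. The helper names row_major_strides and at are made up for this example and are not part of Boost.NumPy. The sketch assumes a C-contiguous (row-major) layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Byte strides of a C-contiguous (row-major) array with the given shape.
// The last axis moves by one element; each earlier axis moves by the
// total size of all the axes after it.
std::vector<std::ptrdiff_t> row_major_strides(
    const std::vector<std::ptrdiff_t> &shape, std::ptrdiff_t itemsize) {
  std::vector<std::ptrdiff_t> strides(shape.size());
  std::ptrdiff_t step = itemsize;
  for (std::size_t k = shape.size(); k-- > 0;) {
    strides[k] = step;
    step *= shape[k];
  }
  return strides;
}

// Read element (i, j) of a 2-D array of doubles through its byte strides.
double at(const char *data, const std::vector<std::ptrdiff_t> &strides,
          std::ptrdiff_t i, std::ptrdiff_t j) {
  assert(strides.size() == 2);
  return *reinterpret_cast<const double *>(data + i * strides[0] +
                                           j * strides[1]);
}
```

For shape {N, M} and an item size of 8 bytes this yields the strides {8 * M, 8}, matching the values derived above.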

If you pay attention to this, the rest is easy.

#include "boost/numpy.hpp"
#include <iostream>
#include <stdexcept>
#include <algorithm>

namespace p = boost::python;
namespace np = boost::numpy;

void print(np::ndarray a) {
  int nd = a.get_nd();
  if (nd != 2)
    throw std::runtime_error("a must be two-dimensional");
  if (a.get_dtype() != np::dtype::get_builtin<double>())
    throw std::runtime_error("a must be float64 array");

  auto shape = a.get_shape();
  auto strides = a.get_strides();

  std::cout << "i j val\n";
  for (int i = 0; i < shape[0]; ++i) {
    for (int j = 0; j < shape[1]; ++j) {
      std::cout << i << " " << j << " "
                << *reinterpret_cast<double *>(a.get_data() + i * strides[0] +
                                               j * strides[1]) << std::endl;
    }
  }
}

BOOST_PYTHON_MODULE(mymod2) {
  Py_Initialize();
  np::initialize();
  p::def("print", print);
}

In terms of memory access, it would be faster to inspect the strides and iterate so that memory is visited in ascending address order, with the smallest-stride axis in the innermost loop, but I omitted that because it is tedious. I leave it to you.
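As a rough sketch of that stride-aware iteration, the following plain C++ code sums a two-dimensional array of doubles while putting the axis with the smaller stride in the inner loop. The names inner_axis and strided_sum are hypothetical, and the raw pointer, shape, and strides are assumed to come from something like a.get_data(), a.get_shape(), and a.get_strides().

```cpp
#include <cassert>
#include <cstddef>

// Index (0 or 1) of the axis with the smaller absolute byte stride.
int inner_axis(const std::ptrdiff_t strides[2]) {
  auto mag = [](std::ptrdiff_t s) { return s < 0 ? -s : s; };
  return mag(strides[0]) < mag(strides[1]) ? 0 : 1;
}

// Sum all elements of a 2-D array of doubles, iterating the
// smallest-stride axis in the inner loop so that consecutive reads
// are close together in memory.
double strided_sum(const char *data, const std::ptrdiff_t shape[2],
                   const std::ptrdiff_t strides[2]) {
  assert(data != nullptr);
  const int inner = inner_axis(strides);
  const int outer = 1 - inner;
  double total = 0.0;
  for (std::ptrdiff_t o = 0; o < shape[outer]; ++o) {
    const char *line = data + o * strides[outer];
    for (std::ptrdiff_t k = 0; k < shape[inner]; ++k)
      total += *reinterpret_cast<const double *>(line + k * strides[inner]);
  }
  return total;
}
```

The same function handles both a C-ordered array and its transpose, because the loop order is chosen from the strides rather than fixed in the code.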

Finally

Python, or rather NumPy, is really convenient. A big part of that is that SciPy provides well-documented interfaces to the standard numerical libraries.

Now that we have covered how to access the data in np::ndarray, you should be able to use algorithms implemented in C++ from Python. Next, I would like to look into publishing to PyPI and contributing to scikits.
