Introduction

There are times when you want to do complicated processing by turning a for statement on a multidimensional array in Python (data is often represented by a multidimensional array in the atmosphere and ocean field), but in that case the execution time is often long. You may be in trouble. There are various speed-up methods to solve this, but one solution is to write a subroutine in Fortran, which is often used in fluid calculation, and call it from Python. There are two main ways to call a Fortran subroutine from Python: using numpy.ctypeslib and using f2py. The access order of the multidimensional array is different between the two, but I could not find the page summarized in Japanese, so I wrote an article.

The execution environment is as follows.

macOS 10.15.7 Catalina
Python 3.7.7 (conda 4.9.0)
NumPy 1.19.2
f2py version 2
gfortran (gcc) 10.2.0 (Homebrew)

Differences in handling multidimensional arrays between Python and Fortran

It's a basic story, so if you know it, you can skip this section.

Multidimensional arrays are handled in programming languages, but they are arranged side by side in the one-dimensional direction in memory. At that time, how it is stored depends on the language. The code below writes the value 0 to the elements of the array in the order they were placed in memory.

`loop.py`


import numpy as np

nk = 2
nj = 3
ni = 4
array = np.empty((nk, nj, ni), dtype=np.float32, order="C")
for k in range(nk):
    for j in range(nj):
        for i in range(ni):
            array[k, j, i] = 0

`loop.f90`


integer(4), parameter :: nk = 2, nj = 3, ni = 4
real(4), dimension(ni, nj, nk) :: array

do k = 1, nk
    do j = 1, nj
        do i = 1, ni
            array(i, j, k) = 0
        end do
    end do
end do

The difference is that in Python (to be exact, in the case of order =" C " in the numpy array), the order is [k, j, i], and in Fortran, the order is (i, j, k). (Sometimes called row-first and column-first, respectively). Whether to access in the order of storage in memory or to access in a discrete manner affects the execution time, especially when the size of the array is large.

When using numpy.ctypeslib

As an example, let's write a code that passes a 2D array of numpy to a subroutine to Fortran and returns a double of it (the same is true for 3D and above). The details of the code are not explained here, so if you want to know, for example, [Fortran, cooperation with C language](https://kyotogeopython.zawawahoge.com/html/%E5%BF%9C%E7%94%A8 % E7% B7% A8 / Fortran,% 20C% E8% A8% 80% E8% AA% 9E% 20% E3% 81% A8% E3% 81% AE% E9% 80% A3% E6% 90% BA. See other articles such as html).

`doubling1.py`


from ctypes import POINTER, c_int32, c_void_p
import numpy as np

libDoubling = np.ctypeslib.load_library("doubling1.so", ".")
libDoubling.doubling.argtypes = [
    np.ctypeslib.ndpointer(dtype=np.float32),
    np.ctypeslib.ndpointer(dtype=np.float32),
    POINTER(c_int32),
    POINTER(c_int32),
]
libDoubling.doubling.restype = c_void_p

nj = 2
ni = 3

vin = np.empty((nj, ni), dtype=np.float32, order="C")
vin[0, 0] = 1
vin[0, 1] = 2
vin[0, 2] = 3
vin[1, 0] = 4
vin[1, 1] = 5
vin[1, 2] = 6

vout = np.empty((nj, ni), dtype=np.float32, order="C")

_nj = POINTER((c_int32))(c_int32(nj))
_ni = POINTER((c_int32))(c_int32(ni))

libDoubling.doubling(vin, vout, _ni, _nj)

print(vin)
# [[1. 2. 3.]
#  [4. 5. 6.]]
print(vout)
# [[ 2.  4.  6.]
#  [ 8. 10. 12.]]

`doubling1.f90`


! gfortran -Wall -cpp -fPIC -shared doubling1.f90 -o doubling1.so
subroutine doubling(vin, vout, ni, nj) bind(C)
    use, intrinsic :: ISO_C_BINDING
    implicit none
    integer(c_int32_t), intent(in) :: ni, nj
    real(c_float), dimension(ni, nj), intent(in) :: vin
    real(c_float), dimension(ni, nj), intent(out) :: vout
    integer(c_int32_t) :: i, j

    do j = 1, nj
        do i = 1, ni
            vout(i, j) = vin(i, j) * 2
        end do
    end do
end subroutine doubling

A pointer is passed to the argument of the Fortran subroutine (shared library). In Python and Fortran, the indexes are the opposite of [j, i] and (i, j), but if you think about the order in which the data is stored in memory, you should be able to understand it straightforwardly. ..

When using f2py

Next, use f2py and write the same code as before. I won't go into details about this either, but I think you can refer to articles such as Handling numpy arrays with f2py.

`doubling2.f90`


! f2py -c --fcompiler=gfortran -m doubling2 doubling2.f90
subroutine doubling2(vin, vout, ni, nj)
    implicit none
    integer(4), intent(in) :: ni, nj
    real(4), dimension(nj, ni), intent(in) :: vin
    real(4), dimension(nj, ni), intent(out) :: vout
    integer(4) :: i, j

    do j = 1, nj
        do i = 1, ni
            vout(j, i) = vin(j, i) * 2
        end do
    end do
end subroutine doubling2

`doubling2.py`


import numpy as np
import doubling2

nj = 2
ni = 3
vin = np.empty((nj, ni), dtype=np.float32, order="C")
vin[0, 0] = 1
vin[0, 1] = 2
vin[0, 2] = 3
vin[1, 0] = 4
vin[1, 1] = 5
vin[1, 2] = 6

vout = np.empty((nj, ni), dtype=np.float32, order="C")

vout = doubling2.doubling2(vin, ni, nj)
print(vin)
# [[1. 2. 3.]
#  [4. 5. 6.]]
print(vout)
# [[ 2.  4.  6.]
#  [ 8. 10. 12.]]

When using f2py, Python's [j, i] can be changed to (j, i) in Fortran. The advantage is that the code looks the same, but the do statement in doubling2.f90 has the disadvantage that the array access order does not correspond to the data storage order in memory.

How can I solve this? In other words, I want to access the Fortran code in the same order as doubling1.f90, in the order of data storage in memory.

`doubling3.f90`


! f2py -c --fcompiler=gfortran -m doubling3 doubling3.f90
subroutine doubling3(vin, vout, ni, nj)
    implicit none
    integer(4), intent(in) :: ni, nj
    real(4), dimension(ni, nj), intent(in) :: vin
    real(4), dimension(ni, nj), intent(out) :: vout
    integer(4) :: i, j

    do j = 1, nj
        do i = 1, ni
            vout(i, j) = vin(i, j) * 2
        end do
    end do
end subroutine doubling3

You might think that you can reverse the storage order in memory by setting the order argument of the numpy array to"F"instead of the default"C". However, this does not solve this problem. That is, even if you change order =" C " to order =" F " in doubling2.py, the corresponding Fortran code is still doubling2.f90, and using doubling3.f90 will result in an error. is.

The solution is to transpose the array using the T attribute, as described in f2py — prevent array reordering --Stack Overflow. That is.

`doubling3.py`


import numpy as np
import doubling3

nj = 2
ni = 3
vin = np.empty((nj, ni), dtype=np.float32, order="C")
vin[0, 0] = 1
vin[0, 1] = 2
vin[0, 2] = 3
vin[1, 0] = 4
vin[1, 1] = 5
vin[1, 2] = 6

_vout = np.empty((nj, ni), dtype=np.float32, order="C")

_vout = doubling3.doubling3(vin.T, ni, nj)
vout = _vout.T
print(vin)
# [[1. 2. 3.]
#  [4. 5. 6.]]
print(vout)
# [[ 2.  4.  6.]
#  [ 8. 10. 12.]]

For example, this method is suitable for applications such as "reading GPV using the netCDF4 library, processing it efficiently with Fortran subroutines, and then drawing with matplotlib + cartopy". It is troublesome to transpose to avoid "Osetsusuke", but even if it is included, the amount of description is smaller than when using numpy.ctypeslib, and I think that f2py is easier to use.

Access order of multidimensional arrays when calling Fortran subroutines from Python (ctypeslib and f2py)

Introduction

Differences in handling multidimensional arrays between Python and Fortran

loop.py

loop.f90

When using numpy.ctypeslib

doubling1.py

doubling1.f90

When using f2py

doubling2.f90

doubling2.py

doubling3.f90

doubling3.py