SWIG is a tool for wrapping programs written in C/C++ so that they can be used from multiple other languages. Supported languages include scripting languages such as Javascript, Perl, PHP, Python, Tcl and Ruby, and non-scripting languages such as C#, D, Go, Java, Lua, OCaml, Octave, Scilab and R.
Code written in C/C++ is widely known to be fast; here we measure the actual speed difference against Python under a few conditions.
Hello World
Execution speed of a function that prints "hello world!" 1000 times
python_time:2.356291e-03[sec]
swig_time__:1.398325e-03[sec]
SWIG is a little under twice as fast (about 1.7 times), which is a smaller gap than is generally claimed. I am not certain, but the time spent writing strings to the console probably dominates in both versions.
python
import time

# helloWorld is the SWIG-wrapped C function shown below;
# import it from your generated SWIG module.

def hello_world():
    for i in range(1000):
        print("hello world!")

def test1():
    # python
    start = time.time()
    hello_world()
    python_time = time.time() - start
    # swig
    start = time.time()
    helloWorld()
    swig_time = time.time() - start
    print("python_time:{:e}".format(python_time) + "[sec]")
    print("swig_time__:{:e}".format(swig_time) + "[sec]")
c
#include <stdio.h>

/* Print "hello world!" 1000 times. */
void helloWorld()
{
    for (int i = 0; i < 1000; i++)
    {
        printf("hello world!\n");
    }
}
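To check the guess above that console output dominates the measurement, one option is to time the same Python loop once against the real console and once with output captured in memory. This is a minimal sketch, not part of the original benchmark: the helpers `time_call` and `time_with_stdout_captured` are hypothetical names introduced here. Note that `contextlib.redirect_stdout` only redirects Python's `sys.stdout`, so it does not capture the `printf` output of the C function.

```python
import contextlib
import io
import time


def hello_world():
    for i in range(1000):
        print("hello world!")


def time_call(func):
    """Return the wall-clock seconds taken by one call to func."""
    start = time.perf_counter()
    func()
    return time.perf_counter() - start


def time_with_stdout_captured(func):
    """Time func while its Python-level output goes to an in-memory buffer."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        elapsed = time_call(func)
    return elapsed, buffer.getvalue()


if __name__ == "__main__":
    console_time = time_call(hello_world)
    memory_time, _ = time_with_stdout_captured(hello_world)
    print("console:{:e}[sec]".format(console_time))
    print("memory_:{:e}[sec]".format(memory_time))
```

If the in-memory run is much faster than the console run, most of the measured time is I/O rather than the loop itself.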
Execution speed of a function that counts up to 1000 and returns the result
python_time:6.842613e-05[sec]
swig_time__:5.483627e-06[sec]
This time SWIG is more than 10 times faster (about 12 times). That such a gap appears from merely incrementing a counter gives a glimpse of why Python is said to be slow.
python
import time

# countUp is the SWIG-wrapped C function shown below.

def count_up():
    res = 0
    for i in range(1000):
        res += 1
    return res

def test2():
    # python
    start = time.time()
    res = count_up()
    print(res)
    python_time = time.time() - start
    # swig
    start = time.time()
    res = countUp()
    print(res)
    swig_time = time.time() - start
    print("python_time:{:e}".format(python_time) + "[sec]")
    print("swig_time__:{:e}".format(swig_time) + "[sec]")
c
int countUp()
{
    int res = 0;
    for (int i = 0; i < 1000; i++)
    {
        res += 1;
    }
    return res;
}
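A single `time.time()` measurement of a call that takes only microseconds is noisy, and the measurement above also includes the `print(res)` call. As a hedged alternative, the sketch below uses the standard `timeit` module to run the function many times and report the best average per call. The helper name `best_time_per_call` and its default `number` and `repeat` values are assumptions chosen for illustration; the same helper could be pointed at the SWIG-wrapped `countUp` as well.

```python
import timeit


def count_up():
    res = 0
    for i in range(1000):
        res += 1
    return res


def best_time_per_call(func, number=1000, repeat=5):
    """Return the best average seconds per call over several timing runs."""
    # timeit.repeat returns one total time per run; each run calls func `number` times.
    totals = timeit.repeat(func, number=number, repeat=repeat)
    return min(totals) / number


if __name__ == "__main__":
    print("python_time:{:e}[sec]".format(best_time_per_call(count_up)))
```

Taking the minimum over several runs reduces the influence of other processes on the machine.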
Execution speed of the function that converts a grayscale image to an RGB image
python_time:1.032352e-04[sec]
swig_time__:1.156330e-04[sec]
The results are almost the same. OpenCV itself is written in C/C++, so this is to be expected. However, the SWIG timing above includes allocating the output array with np.zeros(). The cv2.cvtColor call as written here returns a newly allocated array on every call, whereas the SWIG function writes into an array supplied by the caller, so the output memory can be reused. With the allocation moved outside the timed region, the results are as follows.
python_time:1.101494e-04[sec]
swig_time__:7.319450e-05[sec]
Under this condition, SWIG is about 30% faster. If the conversion runs many times on images of the same size, writing it directly in C with SWIG seems to have an advantage.
python
import time

import cv2
import numpy as np

# imgGray2RGB is the SWIG-wrapped C function shown below.

def test3():
    img_size = (256, 256)
    org_img = np.random.randint(0, 256, img_size, dtype=np.uint8)
    # python
    start = time.time()
    res_py = cv2.cvtColor(org_img, cv2.COLOR_GRAY2RGB)
    python_time = time.time() - start
    # swig
    # To exclude the allocation from the timing, allocate here instead
    # and remove the np.zeros call inside the timed region:
    # res_swig = np.zeros((*img_size, 3), dtype=np.uint8)
    start = time.time()
    res_swig = np.zeros((*img_size, 3), dtype=np.uint8)
    imgGray2RGB(org_img, res_swig)
    swig_time = time.time() - start
    print("array_equal: {}".format(np.array_equal(res_py, res_swig)))
    print("python_time:{:e}".format(python_time) + "[sec]")
    print("swig_time__:{:e}".format(swig_time) + "[sec]")
c
void imgGray2RGB(unsigned char *inArr, int inDim1, int inDim2,
                 unsigned char *inplaceArr, int inplaceDim1, int inplaceDim2, int inplaceDim3)
{
    int height = inplaceDim1;
    int width = inplaceDim2;
    int channel = inplaceDim3;
    int h, w;
    int in_point, out_point;
    for (h = 0; h < height; h++)
    {
        for (w = 0; w < width; w++)
        {
            /* Copy one gray pixel into the three interleaved channels. */
            in_point = h * width + w;
            out_point = channel * (h * width + w);
            inplaceArr[out_point] = inArr[in_point];
            inplaceArr[out_point + 1] = inArr[in_point];
            inplaceArr[out_point + 2] = inArr[in_point];
        }
    }
}
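The allocate-once, reuse-many pattern discussed above can be illustrated without SWIG. The sketch below is a pure NumPy stand-in for `imgGray2RGB`, not the author's implementation: the function name `gray_to_rgb_into` is hypothetical, and it only assumes the same calling convention (the caller supplies the output buffer).

```python
import numpy as np


def gray_to_rgb_into(gray, out):
    """Copy a (H, W) grayscale image into all three channels of a (H, W, 3) buffer."""
    # Broadcasting writes into the existing buffer; the output is not reallocated.
    out[...] = gray[:, :, np.newaxis]
    return out


if __name__ == "__main__":
    img_size = (256, 256)
    out = np.zeros((*img_size, 3), dtype=np.uint8)  # allocated once
    for _ in range(3):
        frame = np.random.randint(0, 256, img_size, dtype=np.uint8)
        gray_to_rgb_into(frame, out)  # the same buffer is reused on every call
```

The trade-off of this design is that each call overwrites the previous result, so the caller must copy the buffer if earlier outputs are still needed.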
Execution speed with image normalization in addition to grayscale to RGB conversion
python_time:1.460791e-03[sec]
swig_time__:3.521442e-04[sec]
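OpenCV is also implemented in C/C++, yet the SWIG version is about four times faster than the OpenCV plus NumPy version. This is because the C function completes all the processing in one raster scan, whereas the Python version makes a separate pass over the image for the color conversion, the type conversion, the division by 255, the mean subtraction, and the division by the standard deviation, creating a temporary array at each step. The OpenCV Python package is split into function-level units, so if you want to gain speed by fusing steps like this, you need to write the processing in C/C++.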
python
import time

import cv2
import numpy as np

# imgNormalize is the SWIG-wrapped C function shown below.

def test4():
    img_size = (256, 256)
    mean = [0.485, 0.456, 0.406]
    std = [0.229, 0.224, 0.225]
    mean_np = np.array(mean, dtype=np.float32)
    std_np = np.array(std, dtype=np.float32)
    org_img = np.random.randint(0, 256, img_size, dtype=np.uint8)
    # python
    start = time.time()
    res_py = cv2.cvtColor(org_img, cv2.COLOR_GRAY2RGB).astype(np.float32)
    res_py = ((res_py / 255) - mean_np) / std_np
    python_time = time.time() - start
    # swig
    start = time.time()
    res_swig = np.zeros((*img_size, 3), dtype=np.float32)
    imgNormalize(org_img, res_swig, *mean, *std)
    swig_time = time.time() - start
    print("array_equal: {}".format(np.array_equal(res_py, res_swig)))
    print("python_time:{:e}".format(python_time) + "[sec]")
    print("swig_time__:{:e}".format(swig_time) + "[sec]")
c
void imgNormalize(unsigned char *inArr, int inDim1, int inDim2,
                  float *inplaceArr, int inplaceDim1, int inplaceDim2, int inplaceDim3,
                  float meanR, float meanG, float meanB,
                  float stdR, float stdG, float stdB)
{
    int height = inplaceDim1;
    int width = inplaceDim2;
    int channel = inplaceDim3;
    int h, w;
    int val;
    int inPoint, outPoint;
    for (h = 0; h < height; h++)
    {
        for (w = 0; w < width; w++)
        {
            /* Convert and normalize each pixel in a single pass. */
            inPoint = h * width + w;
            outPoint = channel * (w + width * h);
            val = inArr[inPoint];
            inplaceArr[outPoint] = ((float)val / 255 - meanR) / stdR;
            inplaceArr[outPoint + 1] = ((float)val / 255 - meanG) / stdG;
            inplaceArr[outPoint + 2] = ((float)val / 255 - meanB) / stdB;
        }
    }
}
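The point about reducing the number of passes can also be explored on the Python side. Because the input is uint8, each pixel takes one of only 256 values, so the normalized RGB triple for every possible value can be precomputed and then applied with a single indexing operation. This is a sketch of an alternative technique (a lookup table), not what the article benchmarked; the function names `build_normalize_lut` and `normalize_with_lut` are hypothetical, and whether it is actually faster would need to be measured.

```python
import numpy as np


def build_normalize_lut(mean, std):
    """Precompute the normalized RGB triple for every possible uint8 gray level."""
    mean_np = np.array(mean, dtype=np.float32)
    std_np = np.array(std, dtype=np.float32)
    levels = np.arange(256, dtype=np.float32)[:, np.newaxis]  # shape (256, 1)
    return ((levels / 255) - mean_np) / std_np  # shape (256, 3), float32


def normalize_with_lut(gray, lut):
    """Map a (H, W) uint8 image to a (H, W, 3) float32 image in one indexing step."""
    return lut[gray]


if __name__ == "__main__":
    mean = [0.485, 0.456, 0.406]
    std = [0.229, 0.224, 0.225]
    lut = build_normalize_lut(mean, std)  # built once, reused for every image
    org_img = np.random.randint(0, 256, (256, 256), dtype=np.uint8)
    res = normalize_with_lut(org_img, lut)
    print(res.shape, res.dtype)
```

The table costs 256 x 3 floats and is independent of the image size, so it can be built once and shared across all images that use the same mean and standard deviation.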
The next post covers implementation. It will introduce basic SWIG usage along with a few slightly more advanced topics, such as passing NumPy arrays directly as arguments and accessing them through pointers on the C/C++ side.