Let's handle transparent PNG images with OpenCV and do something like the sprites of old hobby computers. If you want to make games with sprites in earnest, you should use pygame; I just want to study OpenCV.
Here I use Irasutoya's "Characters of a squadron striking a fixed pose (group)" (reduced version). The background is transparent.
(Image: sentai.png)
To load an RGBA image, specify `flags=cv2.IMREAD_UNCHANGED` in `cv2.imread()`. Actually, you don't have to remember such an incantation: just pass -1 as the second argument. If the argument is 1 or omitted, the image is loaded as a 3-channel RGB image.
```python
import cv2

filename = "sentai.png"
img4 = cv2.imread(filename, -1)
img3 = cv2.imread(filename)
print(img4.shape)   # result: (248, 300, 4)
print(img3.shape)   # result: (248, 300, 3)
```
When this image is displayed with `cv2.imshow()`, the background becomes black. When you draw on a transparent canvas in drawing software, the transparent parts carry no color information, so they are treated as black: all RGB components are zero, and (0, 0, 0) = black.
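This is easy to confirm without the actual PNG; a tiny synthetic BGRA array (an assumption standing in for sentai.png) behaves the same way:

```python
import numpy as np

# A 2x2 BGRA image: one opaque red pixel, the rest fully transparent.
img4 = np.zeros((2, 2, 4), dtype=np.uint8)
img4[0, 0] = (0, 0, 255, 255)   # (B, G, R, A) = opaque red

# Dropping the alpha channel leaves what cv2.imshow() effectively displays.
img3 = img4[:, :, :3]
print(img3[1, 1])   # [0 0 0] -> the transparent pixel shows up as black
```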
As I have written many times, OpenCV images are stored as numpy.ndarray, so slicing lets you not only crop but also extract each color channel.
```python
import cv2

filename = "sentai.png"
img4 = cv2.imread(filename, -1)
b = img4[:, :, 0]
g = img4[:, :, 1]
r = img4[:, :, 2]
a = img4[:, :, 3]
cv2.imshow("img4", img4)
cv2.imshow("b", b)
cv2.imshow("g", g)
cv2.imshow("r", r)
cv2.imshow("a", a)
cv2.waitKey(0)
cv2.destroyAllWindows()
```
(Images: the original image, the A channel, and the B, G, and R channels)
Each channel has ndim = 2, that is, shape (height, width), so when displayed as an image it appears grayscale. For example, the red ranger (Akaranger) is (B, G, R) = (21, 11, 213), so his red brightness is quite high while his blue and green are low. The yellow ranger (Kirenger), on the other hand, is (B, G, R) = (0, 171, 247): his red is even brighter than the red ranger's, and his green is high as well. As for the alpha value, 0 is transparent and 255 is opaque; it is better to think of it as opacity rather than transparency.
Now, to display each color channel in its own color, create an RGB image with the other color components set to 0.
Method 1

```python
import cv2
import numpy as np

filename = "sentai.png"
img3 = cv2.imread(filename)
b = img3[:, :, 0]
g = img3[:, :, 1]
r = img3[:, :, 2]
z = np.full(img3.shape[:2], 0, np.uint8)  # an all-zero channel
imgB = cv2.merge((b, z, z))
imgG = cv2.merge((z, g, z))
imgR = cv2.merge((z, z, r))
cv2.imshow("imgB", imgB)
cv2.imshow("imgG", imgG)
cv2.imshow("imgR", imgR)
cv2.waitKey(0)
cv2.destroyAllWindows()
```
There is also a way to zero out the unneeded colors directly in a copy of the original RGB image.
Method 2

```python
import cv2

filename = "sentai.png"
img3 = cv2.imread(filename)
imgB = img3.copy()
imgB[:, :, (1, 2)] = 0  # zero channels 1 (G) and 2 (R) of the 3-channel BGR image
imgG = img3.copy()
imgG[:, :, (0, 2)] = 0  # zero channels 0 (B) and 2 (R)
imgR = img3.copy()
imgR[:, :, (0, 1)] = 0  # zero channels 0 (B) and 1 (G)
cv2.imshow("imgB", imgB)
cv2.imshow("imgG", imgG)
cv2.imshow("imgR", imgR)
cv2.waitKey(0)
cv2.destroyAllWindows()
```
(Images: the B channel in blue, the G channel in green, and the R channel in red)
Now for the main event. The background image is "Illustration of the Universe (Background Material)". It is a JPEG image and has no alpha channel. The image overlaid on the background is "Illustration of an Astronaut" (reduced version).
(Images: space.jpg and uchuhikoushi.png)
Create an RGB image and a mask image from the RGBA image. The "A channel" shown earlier is a single channel, so it cannot be combined with the background image directly. Instead, create a 3-channel RGB mask image, the same way we made the B channel blue and the R channel red.
```python
import cv2

filename = "uchuhikoushi.png"
img4 = cv2.imread(filename, -1)
img3 = img4[:, :, :3]   # the first three channels of RGBA, i.e. RGB
mask1 = img4[:, :, 3]   # channel 3 of RGBA (counting from 0), i.e. A
mask3 = cv2.merge((mask1, mask1, mask1))   # a 3-channel mask image
```
(Images: the original RGBA image, the RGB image, and the mask image)
Also, cut out a region of the same size as the foreground from the background.
If your foreground image declares "this color is not part of the main picture, it is only used as the background," you can make that color transparent with `numpy.where()`. It works like a chroma key. Written with only the minimum elements, it looks like this.
```python
import numpy as np

# back and front must have the same shape
transparence = (0, 0, 0)
result = np.where(front == transparence, back, front)
```
(Images: back, front, and result)
It's easy, but with this Irasutoya image, for example, you must confirm in advance that (0, 0, 0) is not used anywhere in the astronaut's black hair. If you fail, you end up with something like a transparent Gachapin.
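One way to perform that check (a sketch using a synthetic array in place of uchuhikoushi.png) is to count pixels that are both opaque and pure black:

```python
import numpy as np

# Synthetic BGRA foreground: one pixel is opaque pure black -- the danger case.
front4 = np.zeros((2, 2, 4), dtype=np.uint8)
front4[0, 0] = (0, 0, 0, 255)

front3 = front4[:, :, :3]
alpha = front4[:, :, 3]

# Opaque pixels that are exactly (0,0,0) would be wrongly keyed out by np.where.
is_black = np.all(front3 == 0, axis=2)
is_opaque = alpha == 255
n_bad = np.count_nonzero(is_black & is_opaque)
print(n_bad)   # 1 -> this "black hair" pixel would turn transparent
```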
If you look at other sites, the answer comes up immediately, but let's do some trial and error for study. These identities of logical operations:

- x AND 1 = x
- x AND 0 = 0
- x OR 1 = 1
- x OR 0 = x

hold for arbitrary values, not just the Boolean values 0 and 1. Luminance is expressed in 8 bits, so written loosely:

- x (arbitrary color) AND 255 (white) = x (arbitrary color)
- x (arbitrary color) AND 0 (black) = 0 (black)
- x (arbitrary color) OR 255 (white) = 255 (white)
- x (arbitrary color) OR 0 (black) = x (arbitrary color)

The table below was made by hand, so forgive me if there are any mistakes.
No | back | Calculation | mask | → | tmp
---|---|---|---|---|---
1 | | OR | | → |
2 | | AND | | → |
3 | | OR | | → |
4 | | AND | | → |
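The 8-bit identities can be sanity-checked in a few lines; note that NumPy's `&` and `|` do the same bitwise work as `cv2.bitwise_and()` / `cv2.bitwise_or()` on uint8 arrays:

```python
import numpy as np

x = np.array([123], dtype=np.uint8)      # an arbitrary 8-bit luminance
white = np.array([255], dtype=np.uint8)
black = np.array([0], dtype=np.uint8)

print(x & white)   # [123] : x AND 255 = x
print(x & black)   # [0]   : x AND 0   = 0
print(x | white)   # [255] : x OR 255  = 255
print(x | black)   # [123] : x OR 0    = x
```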
At this stage, No. 1 and No. 4 look promising. Let's composite the foreground image with each of them.
No | tmp | Calculation | front | → | result | Evaluation
---|---|---|---|---|---|---
1-1 | | OR | | → | | ×
1-2 | | AND | | → | | ×
1-3 | | OR | | → | | ×
1-4 | | AND | | → | | ○
4-1 | | OR | | → | | ○
4-2 | | AND | | → | | ×
4-3 | | OR | | → | | ×
4-4 | | AND | | → | | ×
So the correct answers were 1-4 and 4-1. The "foreground image with a black background" and the "mask image with a black background and white foreground" prepared in advance could not be combined by themselves: 1-4 also required a "foreground image with a white background," and 4-1 a "mask image with a white background and black foreground." Life never goes as smoothly as planned.
To call itself a sprite, it must be drawable outside the bounds of the background image. I was going to write an explanation, but I already covered it in the previous article ["Making the function for drawing Japanese fonts with OpenCV general-purpose"](https://qiita.com/mo256man/items/b6e17b5a66d1ea13b5e3#%E7%94%BB%E5%83%8F%E5%A4%96%E3%81%B8%E3%81%AE%E6%8F%8F%E5%86%99%E3%81%B8%E3%81%AE%E5%AF%BE%E5%BF%9C), so I omit it here.
Even though I'm studying OpenCV, I can't help but reach for PIL here. PIL's `Image.paste()` takes the following arguments.
- *im*: the image to paste.
- *box*: the upper-left coordinates, given as (x, y). Thankfully, they can be outside the bounds of the original image. The default is None, meaning the upper left. There is also a way to specify a 4-element tuple, but I omit it here.
- *mask*: the mask image. The default is None. Not only black-and-white and grayscale images but also RGBA images can be specified; in the RGBA case, the alpha value is kindly treated as the mask.
To composite a transparent image, just call `Image.paste(im, box, im)` with the same image as the first and third arguments. That's PIL for you; it makes me want to ask what all that OpenCV effort so far was for.
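A minimal, self-contained sketch of this (synthetic images rather than files) showing the alpha acting as the mask:

```python
from PIL import Image

# A 4x4 blue background and a 2x2 red RGBA sprite with one transparent pixel.
back = Image.new("RGB", (4, 4), (0, 0, 255))
front = Image.new("RGBA", (2, 2), (255, 0, 0, 255))
front.putpixel((0, 0), (0, 0, 0, 0))   # fully transparent corner

# Passing the sprite itself as the third argument makes its alpha the mask.
back.paste(front, (1, 1), front)

print(back.getpixel((1, 1)))   # (0, 0, 255): transparent pixel kept the background
print(back.getpixel((2, 1)))   # (255, 0, 0): opaque pixel was pasted
```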
So far, three methods have been shown. Also, from a comment on the previous article, I learned to convert only the necessary region to PIL instead of the entire image. So let's measure execution speed with the following four home-made functions.

- putSprite_npwhere: set a transparent color and composite with np.where. Handles drawing outside the background image.
- putSprite_mask: set a mask image and composite. Handles drawing outside the background image.
- putSprite_pil_all: composite with PIL, converting the entire background image. No special handling is needed outside the background image.
- putSprite_pil_roi: composite with PIL, converting only the region needed. Handles the outside of the background image when setting the ROI.
The test looks like this. (I cut frames from the animated GIF to reduce file size.) The source is long, so it's at the bottom.
The average over 10 runs of each:
putSprite_npwhere : 0.265903830528259 sec
putSprite_mask : 0.213901996612548 sec
putSprite_pil_all : 0.973412466049193 sec
putSprite_pil_roi : 0.344804096221923 sec
Setting aside that the program isn't optimized, my machine is slow, and Python is slow to begin with, PIL's slowness stands out. It also turned out that even with PIL, speed improves significantly if you convert only the minimum ROI. I'm really grateful for the comments on the previous article.
Masking and np.where() are faster than PIL. np.where() is fast enough, but slightly slower than masking because it makes per-pixel decisions internally. Mask processing requires preparing a mask image, but it is probably the lightest in terms of processing, since it only overwrites pixels.
Masking and np.where() cannot be used for translucent images. Mask processing works because the identities hold for 0 and 1 (0 and 255); with translucency the calculation involves values that are neither, so x (background color) AND a (half-way mask) = tmp (a strange color), and tmp (strange color) OR y (foreground color) = z (an unexpected color). Naturally. I tried various things, thinking np.where() might somehow manage translucency, but at least I could not make it work.
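For reference, translucency needs arithmetic blending rather than bitwise operations: result = front × α + back × (1 − α), with α scaled to 0–1. A minimal sketch:

```python
import numpy as np

# A 50%-translucent white pixel over a black background should come out mid-gray.
back = np.zeros((1, 1, 3), dtype=np.uint8)
front = np.full((1, 1, 3), 255, dtype=np.uint8)
alpha = np.full((1, 1, 1), 128, dtype=np.uint8) / 255.0   # scale alpha to 0.0-1.0

result = (front * alpha + back * (1.0 - alpha)).astype(np.uint8)
print(result[0, 0])   # [128 128 128]
```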
Next time, I would like to tackle translucency and support for rotation.
I only learned about `eval()`, which evaluates a string as a Python expression, just before posting this article. Reference: Python - Call a function dynamically from a string
```python
import cv2
import numpy as np
from PIL import Image
import time
import math

# Composite with numpy.where.
# An RGBA image is passed in for consistency with the other functions,
# but only the RGB channels are used, not the alpha value.
# The transparent color is hard-coded as (0,0,0), which is not ideal.
def putSprite_npwhere(back, front4, pos):
    x, y = pos
    fh, fw = front4.shape[:2]
    bh, bw = back.shape[:2]
    x1, y1 = max(x, 0), max(y, 0)
    x2, y2 = min(x+fw, bw), min(y+fh, bh)
    if not ((-fw < x < bw) and (-fh < y < bh)):
        return back
    front3 = front4[:, :, :3]
    front_roi = front3[y1-y:y2-y, x1-x:x2-x]
    roi = back[y1:y2, x1:x2]
    tmp = np.where(front_roi == (0, 0, 0), roi, front_roi)
    back[y1:y2, x1:x2] = tmp
    return back

# Composite with a mask.
# The mask image is created from the RGBA image inside the function every time.
# Making the mask in advance would be faster,
# but this is easier to use. I think.
def putSprite_mask(back, front4, pos):
    x, y = pos
    fh, fw = front4.shape[:2]
    bh, bw = back.shape[:2]
    x1, y1 = max(x, 0), max(y, 0)
    x2, y2 = min(x+fw, bw), min(y+fh, bh)
    if not ((-fw < x < bw) and (-fh < y < bh)):
        return back
    front3 = front4[:, :, :3]
    mask1 = front4[:, :, 3]
    mask3 = 255 - cv2.merge((mask1, mask1, mask1))
    mask_roi = mask3[y1-y:y2-y, x1-x:x2-x]
    front_roi = front3[y1-y:y2-y, x1-x:x2-x]
    roi = back[y1:y2, x1:x2]
    tmp = cv2.bitwise_and(roi, mask_roi)
    tmp = cv2.bitwise_or(tmp, front_roi)
    back[y1:y2, x1:x2] = tmp
    return back

# Composite with PIL, converting the entire background image.
def putSprite_pil_all(back, front4, pos):
    back_pil = Image.fromarray(back)
    front_pil = Image.fromarray(front4)
    back_pil.paste(front_pil, pos, front_pil)
    return np.array(back_pil, dtype=np.uint8)

# Composite with PIL, converting only the necessary region of the background.
def putSprite_pil_roi(back, front4, pos):
    x, y = pos
    fh, fw = front4.shape[:2]
    bh, bw = back.shape[:2]
    x1, y1 = max(x, 0), max(y, 0)
    x2, y2 = min(x+fw, bw), min(y+fh, bh)
    if not ((-fw < x < bw) and (-fh < y < bh)):
        return back
    back_roi_pil = Image.fromarray(back[y1:y2, x1:x2])
    front_pil = Image.fromarray(front4[y1-y:y2-y, x1-x:x2-x])
    back_roi_pil.paste(front_pil, (0, 0), front_pil)
    back_roi = np.array(back_roi_pil, dtype=np.uint8)
    back[y1:y2, x1:x2] = back_roi
    return back

def main(func):
    filename_back = "space.jpg"
    filename_front = "uchuhikoushi.png"
    img_back = cv2.imread(filename_back)
    img_front = cv2.imread(filename_front, -1)
    bh, bw = img_back.shape[:2]
    xc, yc = bw*0.5, bh*0.5
    rx, ry = bw*0.3, bh*1.2
    cv2.putText(img_back, func, (20, bh-20), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255))

    ### Time measurement starts here
    start_time = time.time()
    for angle in range(-180, 180):
        back = img_back.copy()
        x = int(xc + rx * math.cos(math.radians(angle)))
        y = int(yc + ry * math.sin(math.radians(angle)))
        img = eval(func)(back, img_front, (x, y))
        # Enable these as needed.
        #cv2.imshow(func, img)
        #cv2.waitKey(1)
    elapsed_time = time.time() - start_time
    ### Measurement ends here

    print(f"{func} : {elapsed_time} sec")
    cv2.destroyAllWindows()

if __name__ == "__main__":
    funcs = ["putSprite_npwhere",
             "putSprite_mask",
             "putSprite_pil_all",
             "putSprite_pil_roi"]
    for func in funcs:
        for i in range(10):
            main(func)
```