Wouldn't it be fun if you could easily **"recognize things"** with AI using a webcam?
You can do that **easily** by using a published, pretrained model.
Let's do it now!
Capture images from the webcam on the PC, let the AI **recognize in real time** what appears in them, and display up to the top 3 results on the screen. This time we will use a **trained model**, so there is no time-consuming AI training and you can start playing **right away**.
It uses a library called `OpenCV` to capture image data from the webcam and an AI library called `keras` to identify the image data. Install the required library packages.
What you need | Remarks |
---|---|
Laptop with a webcam, etc. | A webcam connected to the PC via USB is also OK |
Development language | Python 3.7 (the version used is 3.7.7) |
Main required libraries | 【OpenCV】 Library for processing images and videos (the version used is 4.3.0) 【keras】 Neural network library for Python (the version used is 2.3.1) |
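If you want to confirm that the libraries in the table are actually installed before running anything, a small check like the following may help. This is only a sketch: the helper name `installed_version` is made up for this article, and it simply reports the `__version__` attribute if the module provides one.

```python
import importlib


def installed_version(module_name):
    """Return the module's version string, "unknown", or None if not installed."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package is not installed in this environment
        return None
    # Not every module exposes __version__, so fall back to "unknown"
    return getattr(module, "__version__", "unknown")
```

Calling `installed_version("cv2")` or `installed_version("keras")` should return something like the versions in the table, or `None` if the package still needs to be installed.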
DenseNet121
"I'm only doing this because it was supposed to be easy, and now you suddenly throw `DenseNet121` at me? I have no idea what that is! I quit!"
It's okay. Please calm down. It just means that this time we will use a **trained model** called `DenseNet121`. You can do it without knowing the details!
- Why did you choose the `DenseNet121` model?
Various image classification models such as `VGG16` and `ResNet50` can be used easily from the `keras` library. I chose `DenseNet121` this time because **its model size is relatively small at 33MB and it has a good recognition rate**. According to the Keras Documentation, when you recognize things with `DenseNet121`, the correct answer is within the top 5 about 92% of the time (about 75% for the top 1 only).
Source: Keras Documentation https://keras.io/ja/applications/#documentation-for-individual-models
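For reference, "top 1" and "top 5" accuracy only differ in how many of the highest-ranked guesses are allowed to contain the correct answer. The sketch below shows the idea in plain Python. The function name `top_k_accuracy` and the toy data are invented for illustration and have nothing to do with the official benchmark figures.

```python
def top_k_accuracy(ranked_labels, true_labels, k):
    """Fraction of samples whose true label is among the first k ranked guesses."""
    if not true_labels:
        raise ValueError("true_labels must not be empty")
    hits = 0
    for ranked, truth in zip(ranked_labels, true_labels):
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Toy data (made up): each inner list is a model's guesses, best guess first
toy_ranked = [
    ["toy_poodle", "miniature_poodle", "Dandie_Dinmont"],
    ["hen", "cock", "ostrich"],
    ["goldfish", "tench", "stingray"],
    ["agaric", "bolete", "earthstar"],
]
toy_truth = ["toy_poodle", "cock", "stingray", "stinkhorn"]

top1 = top_k_accuracy(toy_ranked, toy_truth, k=1)
top3 = top_k_accuracy(toy_ranked, toy_truth, k=3)
```

With this toy data only the first sample is right at top 1, while three of the four samples are right within the top 3, which is why the top-k figure is always at least as high as the top-1 figure.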
After installing the package of the library to be used (keras, opencv, etc.), copy the following AI program.
main.py
# -------------------------------------------------------------------------------------
# Display the camera image on the screen.
# Judge the image with DenseNet121.
# [+] key: change camera device
# [s] key: save image
# [ESC] or [q] key: exit
# -------------------------------------------------------------------------------------
from keras.applications.densenet import DenseNet121
from keras.applications.densenet import preprocess_input, decode_predictions
from keras.preprocessing import image
import numpy as np
import cv2
import datetime
# -------------------------------------------------------------------------------------
# capture_device
# -------------------------------------------------------------------------------------
def capture_device(capture, dev):
    while True:
        # Capture an image from the camera device
        ret, frame = capture.read()
        if not ret:
            # No frame could be read: behave as if [+] was pressed so the next device is tried
            k = ord('+')
            return k

        # DenseNet121 image judgment
        resize_frame = cv2.resize(frame, (300, 224))  # 640x480 (4:3) -> 300x224 (approx. 4:3)
        trim_x, trim_y = int((300 - 224) / 2), 0  # Trim the center to 224x224 for judgment
        trim_h, trim_w = 224, 224
        trim_frame = resize_frame[trim_y:(trim_y + trim_h), trim_x:(trim_x + trim_w)]
        # OpenCV delivers frames in BGR order, while the Keras ImageNet preprocessing assumes RGB
        rgb_frame = cv2.cvtColor(trim_frame, cv2.COLOR_BGR2RGB)
        x = image.img_to_array(rgb_frame)
        x = np.expand_dims(x, axis=0)
        x = preprocess_input(x)
        preds = model.predict(x)  # Image AI judgment

        # Usage
        disp_frame = frame
        txt1 = "model is DenseNet121"
        txt2 = "camera device No.(" + str(dev) + ")"
        txt3 = "[+] : Change Device"
        txt4 = "[s] : Image Capture"
        txt5 = "[ESC] or [q] : Exit"
        cv2.putText(disp_frame, txt1, (10, 30), cv2.FONT_HERSHEY_PLAIN, 1, (255, 255, 255), 1, cv2.LINE_AA)
        cv2.putText(disp_frame, txt2, (10, 60), cv2.FONT_HERSHEY_PLAIN, 1, (255, 255, 255), 1, cv2.LINE_AA)
        cv2.putText(disp_frame, txt3, (10, 90), cv2.FONT_HERSHEY_PLAIN, 1, (255, 255, 255), 1, cv2.LINE_AA)
        cv2.putText(disp_frame, txt4, (10, 120), cv2.FONT_HERSHEY_PLAIN, 1, (255, 255, 255), 1, cv2.LINE_AA)
        cv2.putText(disp_frame, txt5, (10, 150), cv2.FONT_HERSHEY_PLAIN, 1, (255, 255, 255), 1, cv2.LINE_AA)

        # Image judgment text output (decode once and reuse the top 3 results)
        top3 = decode_predictions(preds, top=3)[0]
        output1 = 'No.1:{0}:{1}%'.format(top3[0][1], int(top3[0][2] * 100))
        output2 = 'No.2:{0}:{1}%'.format(top3[1][1], int(top3[1][2] * 100))
        output3 = 'No.3:{0}:{1}%'.format(top3[2][1], int(top3[2][2] * 100))
        cv2.putText(disp_frame, output1, (10, 300), cv2.FONT_HERSHEY_PLAIN, 1, (255, 255, 255), 1, cv2.LINE_AA)
        cv2.putText(disp_frame, output2, (10, 330), cv2.FONT_HERSHEY_PLAIN, 1, (255, 255, 255), 1, cv2.LINE_AA)
        cv2.putText(disp_frame, output3, (10, 360), cv2.FONT_HERSHEY_PLAIN, 1, (255, 255, 255), 1, cv2.LINE_AA)

        # Camera screen output
        cv2.imshow('camera', disp_frame)

        # Wait 1 msec and get the key
        k = cv2.waitKey(1) & 0xFF

        # Keep displaying until [ESC] or [q] is pressed
        if (k == ord('q')) or (k == 27):
            return k

        # [+]: change device
        if k == ord('+'):
            txt = "Change Device. Please wait... "
            XX = int(disp_frame.shape[1] / 4)
            YY = int(disp_frame.shape[0] / 2)
            cv2.putText(disp_frame, txt, (XX, YY), cv2.FONT_HERSHEY_PLAIN, 1, (255, 255, 255), 1, cv2.LINE_AA)
            cv2.imshow('camera', disp_frame)
            cv2.waitKey(1)  # Give the window a chance to redraw the message
            return k
        # [s]: save the image displayed on the screen
        elif k == ord('s'):
            cv2.imwrite('camera_dsp{}.{}'.format(datetime.datetime.now().strftime('%Y%m%d_%H%M%S_%f'), "png"), disp_frame)
            # cv2.imwrite('camera_rsz{}.{}'.format(datetime.datetime.now().strftime('%Y%m%d_%H%M%S_%f'), "png"), resize_frame)
            # cv2.imwrite('camera_trm{}.{}'.format(datetime.datetime.now().strftime('%Y%m%d_%H%M%S_%f'), "png"), trim_frame)
            # cv2.imwrite('camera_raw{}.{}'.format(datetime.datetime.now().strftime('%Y%m%d_%H%M%S_%f'), "png"), frame)
# -------------------------------------------------------------------------------------
# camera
# -------------------------------------------------------------------------------------
def camera(dev):
    while True:
        capture = cv2.VideoCapture(dev)
        ret = capture_device(capture, dev)
        # Release the current device before exiting or switching to the next one
        capture.release()
        if (ret == ord('q')) or (ret == 27):
            cv2.destroyAllWindows()
            break
        if ret == ord('+'):
            # Cycle through device numbers 0 to 8
            dev += 1
            if dev == 9:
                dev = 0
# -------------------------------------------------------------------------------------
# main
# -------------------------------------------------------------------------------------
# DenseNet121
# https://keras.io/ja/applications/#densenet
#
# Running DenseNet121 automatically downloads two files:
# (1) the DenseNet121 model and (2) the class classification file.
# The first startup is therefore slow, because the roughly 33MB DenseNet121 model
# and the class classification file have to be downloaded.
# From the second startup onward the download is skipped, so startup is faster.
#
# The downloaded files are stored in the following directory:
# C:/Users/xxxx/.keras/models/
#
# (1) DenseNet121 model: densenet121_weights_tf_dim_ordering_tf_kernels.h5
# (2) Class classification file (all 1000 categories): imagenet_class_index.json

# Image classification model
model = DenseNet121(weights='imagenet')

# Camera activation
camera(dev=0)
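The resize-and-trim step inside `capture_device` can also be looked at on its own. The program hard-codes 300x224 for a 640x480 camera. The sketch below instead derives the numbers from the frame size, so it is a variation on the idea rather than the exact code above. For a 640x480 frame it gives a width of 299 instead of 300 because it rounds the exact 4:3 value. The function name `center_crop_plan` is made up for this article.

```python
def center_crop_plan(src_w, src_h, size=224):
    """Return (new_w, new_h, x0, y0) for a resize followed by a centered square crop.

    The shorter side is scaled to `size`, then a size x size square is cut
    out of the middle of the longer side.
    """
    short = min(src_w, src_h)
    new_w = round(src_w * size / short)
    new_h = round(src_h * size / short)
    x0 = (new_w - size) // 2  # left edge of the crop
    y0 = (new_h - size) // 2  # top edge of the crop
    return new_w, new_h, x0, y0


plan = center_crop_plan(640, 480)
```

With OpenCV, the returned values would be used roughly as `cv2.resize(frame, (new_w, new_h))` followed by slicing `[y0:y0 + size, x0:x0 + size]`, which matches the order of operations in the program.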
When the `DenseNet121` library function is executed, both the `DenseNet121 model` and the `class classification file` are downloaded automatically. The first startup is therefore slow, because the roughly 33MB model and the class classification file have to be downloaded. From the second startup onward the download is skipped and startup is faster.
File storage location
C:/Users/xxxx/.keras/models/
- **DenseNet121 model** (densenet121_weights_tf_dim_ordering_tf_kernels.h5)
- **Class classification file** (imagenet_class_index.json)
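If you are curious whether the next startup will trigger a download, you can check whether these two files already exist. This is a rough sketch, assuming the default cache location under the home directory (`.keras/models`) and the lowercase file name that appears in the download URL. The helper name `missing_cache_files` is invented here.

```python
import os

# File names as they appear in the download URLs
MODEL_FILES = (
    "densenet121_weights_tf_dim_ordering_tf_kernels.h5",
    "imagenet_class_index.json",
)


def missing_cache_files(cache_dir=None):
    """Return the names of the expected files that are not in cache_dir yet."""
    if cache_dir is None:
        # Assumed default location, e.g. C:/Users/xxxx/.keras/models/ on Windows
        cache_dir = os.path.join(os.path.expanduser("~"), ".keras", "models")
    return [
        name for name in MODEL_FILES
        if not os.path.isfile(os.path.join(cache_dir, name))
    ]
```

An empty list means both files are already cached, so the download should be skipped on the next run.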
If a window opens, the webcam image is shown, and the top 3 results recognized by the AI are displayed on the screen, it worked. Incidentally, when I showed it our dog (a toy poodle), the AI judged it to be a toy poodle with a probability of 76%, as shown below, so the AI's recognition is correct.
** [Example of results] AI recognition rate TOP3 **
- toy_poodle : 76%
- miniature_poodle : 20%
- Dandie_Dinmont : 1%
Note: the recognition rate fluctuates in real time.
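The three lines drawn on the screen come from the list that `decode_predictions` returns, where each entry is a tuple of class ID, class name, and score. The formatting can be tried without a camera, as in the sketch below. It reuses the example percentages from above as sample input, the class IDs are placeholders rather than real ImageNet IDs, and it uses `round()` where the program uses `int()`, so values may differ by 1% in some cases.

```python
def format_top(decoded, top=3):
    """Turn decode_predictions-style output into 'No.N:name:P%' strings."""
    lines = []
    # decoded[0] holds the results for the first (and only) image in the batch
    for rank, (_class_id, name, score) in enumerate(decoded[0][:top], start=1):
        lines.append("No.{0}:{1}:{2}%".format(rank, name, round(score * 100)))
    return lines


# Sample input shaped like decode_predictions output (class IDs are placeholders)
sample = [[
    ("id-placeholder-1", "toy_poodle", 0.76),
    ("id-placeholder-2", "miniature_poodle", 0.20),
    ("id-placeholder-3", "Dandie_Dinmont", 0.01),
]]

lines = format_top(sample)
```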
The download log below is displayed only for the first time.
C:\Users\xxxx\anaconda3\envs\python37\python.exe C:/Users/xxxx/PycharmProjects/OpenCV/sample09.py
Using TensorFlow backend.
2020-08-12 10:38:59.579123: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
Downloading data from https://github.com/keras-team/keras-applications/releases/download/densenet/densenet121_weights_tf_dim_ordering_tf_kernels.h5
8192/33188688 [..............................] - ETA: 12:39
16384/33188688 [..............................] - ETA: 8:26
40960/33188688 [..............................] - ETA: 5:03
106496/33188688 [..............................] - ETA: 2:59
245760/33188688 [..............................] - ETA: 1:42
~ Omitted ~
32743424/33188688 [============================>.] - ETA: 0s
32776192/33188688 [============================>.] - ETA: 0s
32956416/33188688 [============================>.] - ETA: 0s
33005568/33188688 [============================>.] - ETA: 0s
33193984/33188688 [==============================] - 32s 1us/step
Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/imagenet_class_index.json
8192/35363 [=====>........................] - ETA: 0s
40960/35363 [==================================] - 0s 0us/step
[ WARN:0] global C:\projects\opencv-python\opencv\modules\videoio\src\cap_msmf.cpp (436) `anonymous-namespace'::SourceReaderCB::~SourceReaderCB terminating async callback
Process finished with exit code 0
If you look at the `imagenet_class_index.json` file, you can see that image data is classified into **1000 classes**, numbered 0 to 999, as shown below. This is also a **restriction**: even if you show it something that is not listed here, it will be classified as one of these classes. If you want to classify things that are not here, create a `new model`, or look into `transfer learning`, `fine tuning`, and so on.
imagenet_class_index.json
{
"0": ["n01440764", "tench"],
"1": ["n01443537", "goldfish"],
"2": ["n01484850", "great_white_shark"],
"3": ["n01491361", "tiger_shark"],
"4": ["n01494475", "hammerhead"],
"5": ["n01496331", "electric_ray"],
"6": ["n01498041", "stingray"],
"7": ["n01514668", "cock"],
"8": ["n01514859", "hen"],
"9": ["n01518878", "ostrich"],
~ Omitted ~
"990": ["n12768682", "buckeye"],
"991": ["n12985857", "coral_fungus"],
"992": ["n12998815", "agaric"],
"993": ["n13037406", "gyromitra"],
"994": ["n13040303", "stinkhorn"],
"995": ["n13044778", "earthstar"],
"996": ["n13052670", "hen-of-the-woods"],
"997": ["n13054560", "bolete"],
"998": ["n13133613", "ear"],
"999": ["n15075141", "toilet_tissue"]
}
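The class file is plain JSON, so it is easy to read from Python. The sketch below parses a few entries copied from the listing above and looks up class names by number. It also shows one possible way to soften the restriction: treating low scores as "unknown". The 0.5 threshold and the function names are assumptions for illustration and are not part of keras.

```python
import json

# A few entries copied from imagenet_class_index.json
SAMPLE_JSON = """
{
  "0": ["n01440764", "tench"],
  "1": ["n01443537", "goldfish"],
  "999": ["n15075141", "toilet_tissue"]
}
"""


def load_class_names(json_text):
    """Map class number (int) to class name."""
    raw = json.loads(json_text)
    return {int(index): name for index, (_class_id, name) in raw.items()}


def label_or_unknown(name, score, threshold=0.5):
    """Report 'unknown' when the score is below the (assumed) threshold."""
    return name if score >= threshold else "unknown"


names = load_class_names(SAMPLE_JSON)
```

A threshold like this does not teach the model new classes. It only hides weak guesses, so for objects outside the 1000 classes the approaches mentioned above are still needed.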
Thank you for your hard work!