(Python: OpenCV) I tried to output a value indicating the distance between regions while binarizing the video in real time.


I work in a non-IT department in the manufacturing industry, but I have started to keep AI and IoT in mind in my work. My job is close to production engineering and to the shop floor, and one of the challenges in introducing systems there is that the cost does not balance against the profit it brings to the factory. There are things I want to do, but when I get an estimate the introduction cost is high (often mostly labor costs...) and I end up giving up. So I decided to build the following myself, partly as a way to improve my skills.

  1. Create a program for image processing (image ver.)
  2. **This time: Create a program for image processing (video ver.)**
  3. Build environment + store program on Raspberry Pi
  4. Check if the captured image can be processed in real time
  5. Search for good processing methods that will lead to factory improvement
  6. Improve KPI values and achieve results

The aim is to process video recorded at the manufacturing site in real time (binarization, calculating and displaying a distance, etc.) and display the result. Even simple processed video like this can, surprisingly, make manufacturing easier for on-site workers.

That was a long introduction, but the outline of this article is as follows.

The previous article is here.

Binarize the image. Furthermore, the shortest distance between two regions can be calculated (Ver1.1). https://qiita.com/Fumio-eisan/items/10c54af7a925b403f59f

Simply display the video

First, simply display the video. The video processed this time is the clip of about 7 seconds shown in the capture below.



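# Simply display the video
import cv2
import sys

file_path = 'sample_.mov'
delay = 1
window_name = 'frame'

cap = cv2.VideoCapture(file_path)
text = 'text.wmv'  # text drawn by cv2.putText() in the subtitle section

# Stop if the video could not be opened
if not cap.isOpened():
    sys.exit()

while True:
    ret, frame = cap.read()
    # if not ret:  # Use these two lines instead of the else branch to play the video only once.
    #     break
    if ret:
        frame = cv2.resize(frame, dsize=(600, 400))
        cv2.imshow(window_name, frame)
        if cv2.waitKey(delay) & 0xFF == ord('q'):
            break  # q key: stop playback
    else:
        # End of the video: rewind to the first frame and keep looping
        cap.set(cv2.CAP_PROP_POS_FRAMES, 0)

cap.release()
cv2.destroyWindow(window_name)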


Set file_path to the path of the video you want to display. This program plays the video in an infinite loop; press the q key to stop playback.

The video is displayed through the following steps.

  1. Load the video with cv2.VideoCapture()
  2. Play it inside a while loop
  3. Display each frame in turn (30 frames per second for a 30 fps video)
  4. (Depending on the process) loop forever, or stop once the video has played through

I will come back to this later, but because step 3 runs for every frame, you need to be careful if you want a caption (a displayed value, etc.) to change only once per second or once every few seconds.
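
Because every frame passes through the loop, the wait given to cv2.waitKey() and the frame rate together decide how fast the video plays and how many frames make up one second. The following is a minimal sketch, assuming the frame rate is already known (for example read with cap.get(cv2.CAP_PROP_FPS)); the helper names wait_ms_for_fps and frames_for_interval are made up for illustration and are not part of the original program.

```python
def wait_ms_for_fps(fps: float) -> int:
    """Milliseconds to wait per frame so playback runs near the given frame rate."""
    if fps <= 0:
        return 1  # unknown frame rate: fall back to the shortest wait
    # waitKey(0) would block until a key is pressed, so never return 0
    return max(1, round(1000 / fps))


def frames_for_interval(fps: float, seconds: float) -> int:
    """Number of frames that make up the given number of seconds."""
    return max(1, round(fps * seconds))


print(wait_ms_for_fps(30))         # wait per frame for a 30 fps video
print(frames_for_interval(30, 1))  # frames in one second of a 30 fps video
```

This ignores the time spent processing each frame, so real playback will run somewhat slower than the nominal frame rate.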

I want to add subtitles


cv2.putText(frame, text,(100, 30), cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), thickness=2)

This call draws the subtitle on the frame. The arguments are as follows.

  1. The video frame
  2. The text to display
  3. The x, y position
  4. The font
  5. The character size
  6. The color in BGR order
  7. The character thickness

I want to binarize

Next, binarize the video and display it in black and white. As in the image version, define the lower and upper limits of the color to extract and pass them to cv2.inRange().


# Binarization process
import cv2
import sys
import numpy as np

delay = 1
window_name = 'frame'
file_path = 'sample_.mov'

cap = cv2.VideoCapture(file_path)

bgrLower = np.array([0, 100, 100])    # Lower limit of colors to extract (BGR)
bgrUpper = np.array([250, 250, 250])  # Upper limit of colors to extract (BGR)

# Stop if the video could not be opened
if not cap.isOpened():
    sys.exit()

while True:
    ret, frame = cap.read()
    if not ret:
        break  # end of the video
    frame = cv2.resize(frame, dsize=(600, 400))

    # White where the pixel is within the limits, black elsewhere
    img_mask = cv2.inRange(frame, bgrLower, bgrUpper)
    contours, hierarchy = cv2.findContours(img_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # sorted() also works when findContours returns a tuple instead of a list
    contours = sorted(contours, key=cv2.contourArea, reverse=True)
    # cv2.line arguments: image, start point, end point, color, line thickness
    #img_mask = cv2.line(img_mask, (250,300), (350,300), (120,120,120), 10)

    cv2.imshow(window_name, img_mask)

    if cv2.waitKey(delay) & 0xFF == ord('q'):
        break  # q key: stop playback

cap.release()
cv2.destroyWindow(window_name)



I was able to successfully binarize.

I want to output the distance between the enclosed areas

Now for the main topic. Define the gap between the enclosed areas as shown below, find the distance across it, and output that value as needed.


Find the distance between two points


    contours, hierarchy = cv2.findContours(img_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)  # Extract the boundaries
    contours = sorted(contours, key=cv2.contourArea, reverse=True)  # Sort the boundaries by area, largest first

cv2.findContours() traces the boundaries of the white areas and returns them in contours. The contours are then sorted by area in descending order, so contours[0] and contours[1] are the two largest regions.

Get the coordinates of two regions


    # Position of the maximum along axis 0, converted to a multi-dimensional index
    x1=np.unravel_index(np.argmax(contours[0],axis=0), contours[0].shape)
    x2=np.unravel_index(np.argmax(contours[1],axis=0), contours[1].shape)
    # Connect the two points with a gray line of thickness 3 (cast to plain ints for cv2.line)
    img_mask = cv2.line(img_mask, tuple(map(int, x1[0][0])), tuple(map(int, x2[0][0])), (120,120,120), 3)

This gets one point for each region. np.argmax() with axis=0 scans the points of contours[0] and contours[1] and returns, separately for x and for y, the position of the point where that value is largest. np.unravel_index() then converts these positions into a multi-dimensional index for the array shape, and the results are stored in x1 and x2. Note that these are positions within the contour array, not the coordinate values themselves; the pixel coordinates are obtained by indexing the contour with them.

Then pass these points to cv2.line() to draw a line connecting them.
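
As a simpler alternative to the unravel_index approach (not the code used in this article), the following sketch picks an actual boundary point from each region and measures the distance between them. The helper name rightmost_point and the two hand-built contours are assumptions for illustration; real contours would come from cv2.findContours().

```python
import numpy as np


def rightmost_point(contour: np.ndarray) -> tuple:
    """Return the (x, y) contour point with the largest x as plain Python ints."""
    pts = contour.reshape(-1, 2)      # (N, 1, 2) -> (N, 2)
    x, y = pts[np.argmax(pts[:, 0])]  # row whose x value is largest
    return int(x), int(y)


# Hand-built stand-ins for contours[0] and contours[1], each of shape (N, 1, 2)
contour_a = np.array([[[10, 10]], [[10, 40]], [[50, 40]], [[50, 10]]], dtype=np.int32)
contour_b = np.array([[[70, 20]], [[70, 30]], [[90, 30]], [[90, 20]]], dtype=np.int32)

p1 = rightmost_point(contour_a)
p2 = rightmost_point(contour_b)
distance = float(np.hypot(p2[0] - p1[0], p2[1] - p1[1]))

print(p1, p2, distance)
```

Because p1 and p2 are tuples of plain ints, they can be passed directly as the start and end points of cv2.line().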

(Supplement) Understand the numerical values stored in contours


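The structure is fairly involved: contours holds one NumPy array per region, and each array has the shape (number of points, 1, 2), where the last axis holds the x and y coordinates of one boundary point.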
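
The layout is easier to see on a small array. This sketch uses a hand-built array with the same layout as one element of contours (an assumption for the example; it is not output from the video) and shows what np.argmax() with axis=0 returns for it.

```python
import numpy as np

# Hand-built array laid out like one element of contours:
# N points, each wrapped in an extra axis, each holding (x, y)
contour = np.array([[[3, 2]], [[3, 5]], [[7, 5]], [[7, 2]]], dtype=np.int32)

print(contour.shape)  # (number of points, 1, 2)
print(contour[0])     # first point, still wrapped in the extra axis
print(contour[0][0])  # first point as a plain pair of x and y

# argmax along axis 0 gives, for x and for y separately, the position in the
# array of the point with the largest value, not the coordinate itself
idx = np.argmax(contour, axis=0)
print(idx)            # shape (1, 2): position of max x, position of max y

# Indexing the contour with those positions recovers the actual points
print(contour[idx[0][0]][0], contour[idx[0][1]][0])
```

When several points share the maximum value, argmax returns the first of them.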

I want to display the distance value on the screen

Now display the calculated value on the screen. It can be drawn with cv2.putText(), but if the value is recalculated and redrawn on every frame, it flickers and is hard to read. As a countermeasure, update the value only once every fixed number of frames. Below, an if statement updates the value every 30 frames (roughly once per second for this video).


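    # idx is the frame counter, incremented once per loop iteration (needs import math)
    if idx % 30 == 0:
        text =str(math.floor(np.linalg.norm(x1[0][0]-x2[0][0])))  # distance, rounded down
    cv2.putText(img_mask, text,(300, 100), cv2.FONT_HERSHEY_SIMPLEX, 1.0, (255, 255, 255), thickness=3)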
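
The update-every-N-frames idea can be checked without a video. The following is a minimal sketch; the class name ThrottledText and the simulated distance values are assumptions for illustration and do not appear in the original program.

```python
import math


class ThrottledText:
    """Keeps a caption and refreshes it only once every `every` frames."""

    def __init__(self, every: int = 30, initial: str = ""):
        self.every = every
        self.text = initial
        self.idx = 0  # frame counter

    def update(self, value: float) -> str:
        """Call once per frame; returns the caption to draw on this frame."""
        if self.idx % self.every == 0:
            self.text = str(math.floor(value))  # refresh only on every N-th frame
        self.idx += 1
        return self.text


caption = ThrottledText(every=30)
# Simulated per-frame distances: the raw value changes on every frame
shown = [caption.update(100 + frame_no * 0.5) for frame_no in range(61)]
print(shown[0], shown[29], shown[30], shown[60])
```

The caption is still drawn on every frame; only the value behind it is refreshed at the slower rate, which is what removes the flicker.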

Actual result

Here is the result of processing and outputting in this way.


Ideally, each of the two areas would have come out cleanly as either white or black, but I could not tune the thresholds well enough. Even so, a line was drawn between the areas of the adapter, which is the white part (it is a single object, but here it is split into two areas), and a numerical value indicating the distance could also be output.

At the end

This time I made a program that plays a video while processing it. The key point is that heavy processing is not practical because everything runs frame by frame. Originally, I wanted to find the shortest distance between the regions and draw a line along it, but I could not get that far. Also, many of the articles summarizing OpenCV are written for C++, so searching for information was hard.
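
For reference, one possible way to get the shortest distance between two regions is a brute-force comparison of every pair of boundary points. This is only a sketch of that idea, not something tested on the video in this article; the function name shortest_distance and the hand-built contours are assumptions.

```python
import numpy as np


def shortest_distance(contour_a: np.ndarray, contour_b: np.ndarray):
    """Brute-force closest pair of points between two contours.

    Returns (distance, point_on_a, point_on_b) with points as (x, y) int tuples.
    """
    a = contour_a.reshape(-1, 2).astype(np.float64)
    b = contour_b.reshape(-1, 2).astype(np.float64)
    # Pairwise differences via broadcasting: shape (len(a), len(b), 2)
    diff = a[:, None, :] - b[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Row and column of the smallest entry in the distance table
    i, j = np.unravel_index(np.argmin(dist), dist.shape)
    pa = (int(a[i, 0]), int(a[i, 1]))
    pb = (int(b[j, 0]), int(b[j, 1]))
    return float(dist[i, j]), pa, pb


# Hand-built stand-ins for the two largest contours, each of shape (N, 1, 2)
contour_a = np.array([[[10, 10]], [[10, 40]], [[50, 40]], [[50, 10]]], dtype=np.int32)
contour_b = np.array([[[70, 38]], [[70, 60]], [[90, 60]], [[90, 38]]], dtype=np.int32)

d, pa, pb = shortest_distance(contour_a, contour_b)
print(d, pa, pb)
```

With cv2.CHAIN_APPROX_SIMPLE only the corner points of straight segments are kept, so the result can be larger than the true gap between the edges; cv2.CHAIN_APPROX_NONE keeps every boundary pixel at the cost of a larger distance table. Since the table grows with the product of the two point counts, running this only once every few frames is one way to keep the per-frame load down.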

The program is stored below. https://github.com/Fumio-eisan/movie_20200406
