[PYTHON] The main tasks of image processing (computer vision) and the architectures used for them

Purpose of this post

This post is meant as a guide for choosing an implementation approach when solving image processing problems.

Problem-solving flow

Item | Contents
Task definition | Decide which task the problem to be solved will be treated as
Architecture decision | Choose the main architecture based on the defined task
Evaluation metric decision | Choose an evaluation metric appropriate for the problem

Key tasks of image processing

When the problem you want to solve is an image recognition problem, decide which task it corresponds to based on your requirements.
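
As a rough illustration of this step, the sketch below maps a few requirement questions onto the four tasks covered in this post. The function name `define_task`, its three boolean parameters, and the order of the checks are assumptions made for this example, not rules from the original post; real projects usually need a more careful discussion of requirements.

```python
from enum import Enum


class Task(Enum):
    IMAGE_CLASSIFICATION = "image classification"
    OBJECT_DETECTION = "object detection"
    SEMANTIC_SEGMENTATION = "semantic segmentation"
    ANOMALY_DETECTION = "anomaly detection"


def define_task(needs_location: bool,
                needs_pixel_labels: bool,
                flag_deviation_from_normal: bool) -> Task:
    """Hypothetical helper: pick a task from simple yes/no requirements.

    The rules and their priority are illustrative assumptions only.
    """
    if flag_deviation_from_normal:
        # Goal is to flag inputs that deviate from "normal" data.
        return Task.ANOMALY_DETECTION
    if needs_pixel_labels:
        # A class label is required for every pixel.
        return Task.SEMANTIC_SEGMENTATION
    if needs_location:
        # Object positions (for example bounding boxes) are required.
        return Task.OBJECT_DETECTION
    # Only one label per image is required.
    return Task.IMAGE_CLASSIFICATION


if __name__ == "__main__":
    # "Where are the objects in this image?"
    print(define_task(needs_location=True,
                      needs_pixel_labels=False,
                      flag_deviation_from_normal=False).value)
```

Writing the requirements down as explicit questions like this makes it easier to notice when a problem could be treated as more than one task.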

Well-known architectures for each task

Image classification

Object detection

Semantic segmentation

Anomaly detection

Reference: https://www.youtube.com/watch?v=vFpZrxaq5xU

Evaluation metrics for each task

Semantic segmentation
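
The text here does not state which metrics are meant for semantic segmentation. Two that are commonly used for this task are pixel accuracy and mean IoU (intersection over union), so the following is a minimal sketch of both under that assumption. It uses NumPy; the function names `pixel_accuracy` and `mean_iou` are chosen for this example, and the masks are assumed to be integer class IDs of the same shape.

```python
import numpy as np


def pixel_accuracy(y_true, y_pred):
    """Fraction of pixels whose predicted class matches the label."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))


def mean_iou(y_true, y_pred, num_classes):
    """Mean IoU over the classes present in the label or the prediction."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    ious = []
    for c in range(num_classes):
        true_c = y_true == c
        pred_c = y_pred == c
        union = np.logical_or(true_c, pred_c).sum()
        if union == 0:
            # Class appears in neither mask, so it is left out of the mean.
            continue
        intersection = np.logical_and(true_c, pred_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious)) if ious else float("nan")


if __name__ == "__main__":
    # Tiny 2x3 label and prediction masks with three classes (0, 1, 2).
    label = [[0, 0, 1],
             [1, 1, 2]]
    pred = [[0, 1, 1],
            [1, 1, 2]]
    print(pixel_accuracy(label, pred))
    print(mean_iou(label, pred, num_classes=3))
```

Pixel accuracy can look high when one class (often the background) dominates the image, which is one reason mean IoU is often reported alongside it.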
