Recently, more and more articles on the problem of collecting data have been appearing on the web. I want to search, investigate, and learn from them.
This is a collection of links to articles that may be helpful.
- How to increase the number of machine learning dataset images
- Deep learning to determine whether you have big breasts from a face photo (does it work or not)
- Training data set 2 that can be used for extracting feature points of face images
- Training data set that can be used for extracting feature points of face images (updated from time to time)
  - This kind of manual feature-point annotation matters for the initial training of facial feature point extraction.
  - However, once methods exist that derive facial feature points with sufficient accuracy, those methods should be used instead.
  - Currently, feature-point (landmark) extraction libraries such as dlib perform well, so in most cases such a library can replace the manual work (see the sketch after this list).
- Nekoto Image Processing part 1: Material Collection
- How to put OpenCV on a Raspberry Pi and easily collect images of face detection results with Python
- Ryan Mitchell (translated by Toshiaki Kurokawa, technical supervision by Takeshi Shimada), "Web Scraping with Python"
- Toby Segaran (translated by Hitoshi Toyama and Masao Kamozawa), "Programming Collective Intelligence"
- Interface, July 2016 issue: ["From building the most difficult training database to recognition tests on Raspberry Pi 1, 2, 3: learning and recognizing the target fish "Nabeka""](http://www.kumikomi.net/interface/contents/201607.php)
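As a concrete illustration of replacing manual landmark annotation with dlib, here is a minimal sketch. It assumes the pre-trained 68-point model file `shape_predictor_68_face_landmarks.dat` has been downloaded from dlib.net; the image path is only an example.

```python
# Minimal sketch: extract 68 facial landmarks with dlib instead of annotating by hand.
# Assumes shape_predictor_68_face_landmarks.dat was downloaded from dlib.net.
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

img = cv2.imread("face.jpg")                 # example path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

for rect in detector(gray, 1):               # upsample once to catch smaller faces
    shape = predictor(gray, rect)            # 68 landmark points
    points = [(shape.part(i).x, shape.part(i).y) for i in range(shape.num_parts)]
    print(points[:5])                        # in practice, write these to an annotation file
```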
From what I hear, data for machine learning is often created manually. When the purpose is clear and the investment in the work is expected to pay off, a large number of people are hired, and the input data is continuously added and refined by hand.
In the field of pedestrian detection, images of roads and streets that contain no pedestrians are very important. For an in-vehicle camera, the data should have the angle of view seen from the vehicle. To train pedestrian detection with Boosting, you need a large number of images that contain no people. With cascade-type classifiers, the later the stage, the higher the proportion of confusing images becomes, and if an image containing a person slips into the negatives at that point, the detector's performance drops significantly. Also with cascade-type classifiers, the later the stage, the more the trained result depends on the training data set (both positive and negative images). (Addendum: perhaps few people use Boosting these days, but the importance of negative samples remains the same.)
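As a rough sketch of keeping accidental positives out of the negative pool, the following filters candidate negative images with OpenCV's default HOG people detector before using them; the directory name and parameters are assumptions, and borderline images should still be checked by eye.

```python
# Sketch: drop candidate negative images in which a stock people detector still finds a person.
import glob
import cv2

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

clean_negatives = []
for path in glob.glob("negatives_raw/*.jpg"):    # assumed directory of candidate negatives
    img = cv2.imread(path)
    if img is None:
        continue
    rects, weights = hog.detectMultiScale(img, winStride=(8, 8))
    if len(rects) == 0:                          # keep only images with no detected person
        clean_negatives.append(path)

print(f"{len(clean_negatives)} images kept as negatives")
```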
For example, when trying to build a dog face detector, it is not certain that collecting as many dog faces as existing detectors can find will improve the new detector's performance. A Shiba Inu and a Bulldog have very differently shaped faces, and it is doubtful that a detector trained only on Shiba Inu faces will detect Bulldog faces. Just because one face can be detected does not mean another can. It is therefore risky to try to improve a detector using only images that the existing detector can already find. You should also make it possible to use images the existing detector cannot find, for example by using the tracking results of subsequent frames in scenes where the dog's face was detected. (I would like to know how this situation looks with deep learning.) It is claimed that deep learning can identify a person from a profile view by comparing it against a database of frontal faces.
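One hedged sketch of that tracking idea: after the existing detector fires on a frame, a tracker follows the face through the following frames and saves crops the detector itself misses as candidate hard positives. The `detect()` function is only a placeholder for an existing dog-face detector, and the CSRT tracker needs an OpenCV build that includes it (opencv-contrib-python).

```python
# Sketch: harvest frames the detector misses by tracking from frames it can detect.
import cv2

def detect(frame):
    """Placeholder for an existing dog-face detector; returns (x, y, w, h) or None."""
    return None

cap = cv2.VideoCapture("dog_video.mp4")          # assumed input video
tracker = None
frame_id = 0
while True:
    grabbed, frame = cap.read()
    if not grabbed:
        break
    box = detect(frame)
    if box is not None:                          # detector works here: (re)start the tracker
        tracker = cv2.TrackerCSRT_create()       # may live under cv2.legacy in some versions
        tracker.init(frame, box)
    elif tracker is not None:                    # detector failed, but the tracker still follows
        tracked, (x, y, w, h) = tracker.update(frame)
        if tracked:
            crop = frame[int(y):int(y + h), int(x):int(x + w)]
            cv2.imwrite(f"hard_positive_{frame_id:06d}.jpg", crop)
    frame_id += 1
cap.release()
```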
You can also use YOLO to detect many kinds of objects in your videos. Even with some false detections, high detection speed is convenient when the final selection is done manually.
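A sketch of that pre-selection step, running YOLO through OpenCV's DNN module; it assumes `yolov3.cfg`, `yolov3.weights`, and `coco.names` have been downloaded separately, and it simply logs loose-threshold candidates for later manual review.

```python
# Sketch: log YOLO detections from a video as candidates for manual selection.
import cv2

net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")   # assumed local files
model = cv2.dnn_DetectionModel(net)
model.setInputParams(size=(416, 416), scale=1 / 255.0, swapRB=True)

with open("coco.names") as f:
    names = [line.strip() for line in f]

cap = cv2.VideoCapture("input_video.mp4")        # assumed input video
frame_id = 0
while True:
    grabbed, frame = cap.read()
    if not grabbed:
        break
    class_ids, scores, boxes = model.detect(frame, confThreshold=0.3, nmsThreshold=0.4)
    for cid, score, box in zip(class_ids, scores, boxes):
        print(frame_id, names[int(cid)], float(score), box)        # candidates for manual review
    frame_id += 1
cap.release()
```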
The HOG + SVM detector in dlib can become an object detector with very little positive data. It is surprising how different this is from the Haar cascade detector.
Machine learning with dlib to detect objects
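For reference, dlib's Python API keeps this HOG + SVM training quite short. A minimal sketch, assuming an imglab-style annotation file named `training.xml` with a handful of positive boxes:

```python
# Sketch: train and run dlib's HOG + SVM object detector from a small annotation file.
import dlib

options = dlib.simple_object_detector_training_options()
options.add_left_right_image_flips = True   # doubles the effective positives
options.C = 5                               # SVM regularization; tune on held-out data
options.num_threads = 4
options.be_verbose = True

dlib.train_simple_object_detector("training.xml", "detector.svm", options)

# Use the trained detector on a new image.
detector = dlib.simple_object_detector("detector.svm")
img = dlib.load_rgb_image("test.jpg")       # example path
print(detector(img))                        # list of candidate rectangles
```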
When collecting training data for hardware development, it is also possible to collect the data using a software version of the detector.
Reference: Importance of machine learning datasets
CIFAR-10 and CIFAR-100 are labeled subsets of the 80 Million Tiny Images dataset, each consisting of 60,000 32x32 color images. See: [Python] How to read CIFAR-10, CIFAR-100 data
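A minimal sketch of reading the Python version of CIFAR-10 (the `cifar-10-batches-py` archive), following the unpickling recipe from the dataset page; the local path is an assumption.

```python
# Sketch: load one CIFAR-10 batch into (N, 32, 32, 3) images and a label vector.
import pickle
import numpy as np

def load_batch(path):
    with open(path, "rb") as f:
        batch = pickle.load(f, encoding="bytes")
    # Each row is 3072 bytes: 1024 R, then 1024 G, then 1024 B values of a 32x32 image.
    images = batch[b"data"].reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)
    labels = np.array(batch[b"labels"])
    return images, labels

images, labels = load_batch("cifar-10-batches-py/data_batch_1")   # assumed local path
print(images.shape, labels.shape)   # (10000, 32, 32, 3) (10000,)
```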
There are various trained models in the Model Zoo. If you build a detector with one of them, you can sample images and automatically generate annotations within the range of what the model has learned.
- Use an existing detector from the model zoo to automatically generate annotation files for images.
- Set the detector's threshold loosely and collect images that fall slightly outside the original training range.
- From the collected data, pick out what the detector you want to build should detect, and annotate it according to your own rules.
- Run the training program using both the automatically generated annotations and the data whose range you extended with your own rules.
- Repeat image sampling and automatic annotation using the newly trained results.
- Train again.

By collecting data in this way, you can build up training data suited to the detector you want to make.
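A hedged sketch of one iteration of this loop, using a COCO-pretrained Faster R-CNN from torchvision's model zoo with a deliberately loose threshold; the CSV output format and directory names are assumptions, and the generated boxes would be reviewed and corrected by hand before retraining.

```python
# Sketch: auto-generate loose annotations for sampled images with a pretrained detector.
import csv
import glob
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()

LOOSE_THRESHOLD = 0.3   # intentionally low, to pull in borderline samples

with open("auto_annotations.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["image", "label", "score", "x1", "y1", "x2", "y2"])
    for path in glob.glob("sampled_images/*.jpg"):            # assumed sampled images
        img = to_tensor(Image.open(path).convert("RGB"))
        with torch.no_grad():
            pred = model([img])[0]
        for box, label, score in zip(pred["boxes"], pred["labels"], pred["scores"]):
            if score >= LOOSE_THRESHOLD:
                writer.writerow([path, int(label), float(score),
                                 *[round(float(v), 1) for v in box]])
```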
Caltech Pedestrian Detection Benchmark https://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/
Robust Multi-Person Tracking from Mobile Platforms https://data.vision.ee.ethz.ch/cvl/aess/dataset/
Daimler Pedestrian Segmentation Benchmark Dataset http://www.gavrila.net/Datasets/Daimler_Pedestrian_Benchmark_D/daimler_pedestrian_benchmark_d.html
This is a database of pedestrians with segmentation, available only for non-commercial purposes. It is useful for training and evaluating pedestrian detection.
FDDB: Face Detection Data Set and Benchmark
https://github.com/StephenMilborrow/muct#the-muct-face-database
As a negative dataset http://cocodataset.org/#home
Link collection: Computer Vision Datasets
Yet Another Computer Vision Index To Datasets (YACVID)
60 Facial Recognition Databases
Most papers state where the data used in their implementation came from, so as you read through them you will reach the data.
In the fields of face detection and human detection, there are open source implementations with reasonable accuracy, so there is no reason not to use them to create a training dataset and a detector for your own purpose. If you expand the training data toward a ratio of data close to your purpose, there is a good chance you will get closer to a detector that covers it.
SlideShare SSII2018TS: Large-scale Deep Learning
Concept of each stage of collecting data for machine learning
It is not a good idea to use the ratio of training data as it appears
How machine learning datasets are lost
How a sloppy person manages experimental data