[PYTHON] Cat breed identification with deep learning

Russian_Blue_212.jpg

[('Russian_Blue', 0.58100362140429651),
 ('British_Shorthair', 0.22552991563514049),
 ('Abyssinian', 0.057159848358045016),
 ('Bombay', 0.043851502320485049),
 ('Egyptian_Mau', 0.030686072815385441)]

In a previous post I detected cat faces with OpenCV (Cat detection with OpenCV); this time I will use deep learning to identify cat breeds.

If you are interested, the technical details are written on the blog.

Here, a technique called a **Deep Convolutional Neural Network (DCNN)**, which has been successful in general object recognition, is applied to identifying cat breeds. This class of problem, in which the target domain is narrowed down (in this case, to cat breeds) before classification, is called **Fine-Grained Visual Categorization (FGVC)**. It is hard to achieve high accuracy because the classes are visually very similar to each other.

Implementation

There are several DCNN implementations; here I use a library called Caffe (note that the library itself is open source under the BSD 2-Clause license, but the pretrained ImageNet model is restricted to non-commercial use). The output of an intermediate (hidden) layer of the DCNN is extracted as a 4096-dimensional feature vector, and an appropriate classifier is trained on these features to make predictions. For the classifier, I think the easiest option is to use the implementations in scikit-learn.
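To make the flow concrete, here is a minimal sketch of the feature-extraction step. It is not the exact code from the repository: the prototxt/caffemodel/mean-file paths and the `fc7` layer name are assumptions based on a standard bvlc_reference_caffenet setup, using Caffe's legacy Python interface.

```python
import numpy as np
import caffe

# Minimal sketch (assumed file names): load a pretrained reference model
# and use it purely as a feature extractor.
net = caffe.Classifier('deploy.prototxt', 'bvlc_reference_caffenet.caffemodel',
                       mean=np.load('ilsvrc_2012_mean.npy').mean(1).mean(1),
                       channel_swap=(2, 1, 0), raw_scale=255, image_dims=(256, 256))

def extract_features(image_paths):
    """Return an (N, 4096) array of fc7 activations, one row per image."""
    features = []
    for path in image_paths:
        net.predict([caffe.io.load_image(path)], oversample=False)
        features.append(net.blobs['fc7'].data[0].copy())  # 4096-dim hidden-layer output
    return np.array(features)
```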

The source code is posted on GitHub, so please refer to it if you are interested. It implements the processing described in this post. (It is a roughly written command-line tool, not a library.)

:octocat: cat-fancier/classifier at master · wellflat/cat-fancier

Verification

Let's benchmark with a dataset of animal images published by the University of Oxford (the Oxford-IIIT Pet Dataset).

cat_classes.jpg

With only 12 classes, this is a light task. This time, 1,800 images are used for training and 600 for validation. That works out to 150 training images per class, which sounds small, but with about 12 classes it is enough to reach reasonable accuracy. Since the dataset is small, training finishes in a few tens of minutes even when running a grid search on a cheap VPS. Only the classification result for the SVM with an RBF kernel is listed here.
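For reference, the grid search itself can be written with scikit-learn in a few lines. This is only a sketch, assuming the 4096-dimensional features and breed labels have already been extracted into arrays `X` and `y` (hypothetical names); the search ranges are illustrative, and the output below is the actual result.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC
from sklearn.metrics import classification_report

# X: (N, 4096) DCNN features, y: breed labels -- assumed to be prepared already.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=600, stratify=y, random_state=0)

# Search C and gamma on a log scale for the RBF kernel (illustrative ranges).
param_grid = {'C': np.logspace(-3, 4, 10), 'gamma': np.logspace(-8, -1, 10)}
grid = GridSearchCV(SVC(kernel='rbf'), param_grid, cv=5, n_jobs=-1)
grid.fit(X_train, y_train)

print(grid.best_estimator_)
print(classification_report(y_test, grid.best_estimator_.predict(X_test)))
```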

## SVM RBF Kernel
SVC(C=7.7426368268112693, cache_size=200, class_weight=None, coef0=0.0,
  degree=3, gamma=7.7426368268112782e-05, kernel='rbf', max_iter=-1,
  probability=False, random_state=None, shrinking=True, tol=0.001,
  verbose=False)
 
                   precision    recall  f1-score   support
 
       Abyssinian       0.84      0.91      0.88        47
           Bengal       0.84      0.83      0.84        46
           Birman       0.72      0.79      0.75        52
           Bombay       0.98      0.98      0.98        46
British_Shorthair       0.82      0.75      0.78        53
     Egyptian_Mau       0.87      0.87      0.87        61
       Maine_Coon       0.87      0.89      0.88        45
          Persian       0.85      0.91      0.88        45
          Ragdoll       0.76      0.76      0.76        41
     Russian_Blue       0.84      0.82      0.83        57
          Siamese       0.81      0.69      0.75        55
           Sphynx       0.94      0.96      0.95        52
 
      avg / total       0.85      0.84      0.84       600

svm_confusion_matrix_rbf.png / roc.png (confusion matrix and ROC curve for SVM-RBF)

With SVM-RBF, the accuracy was 84.5%. Accuracy is lower for some long-haired breeds such as the Ragdoll, but I think this level is acceptable given only 1,800 training images. The blog also posts results for other classifiers, but for large-scale data I think a linear SVM or logistic regression is more realistic because of prediction speed.
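Incidentally, the SVC above was trained with probability=False, so it cannot directly produce ranked probability lists like the one at the top of this post. One option is a classifier that exposes predict_proba, such as logistic regression; a minimal sketch under the same assumptions as above (X_train, y_train and the feature vector are hypothetical names):

```python
from sklearn.linear_model import LogisticRegression

# Minimal sketch: a classifier with predict_proba, used to produce ranked
# (breed, probability) pairs for a single 4096-dim DCNN feature vector.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

def top_k_breeds(feature, k=5):
    """Return the k most likely (breed, probability) pairs for one feature vector."""
    probs = clf.predict_proba([feature])[0]
    ranked = sorted(zip(clf.classes_, probs), key=lambda p: p[1], reverse=True)
    return ranked[:k]
```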

It should be noted that the neural network automatically finds (learns) features that are effective for recognition, without relying on hand-crafted features. This time the DCNN was used purely as a feature extractor, but you may be able to classify with even higher accuracy by using a model built with a technique called fine-tuning, in which the parameters of a model trained on large-scale labeled data such as ImageNet are used as initial values and the whole network is then retrained on other training data. I tried various things on hand, but for this task the gain in accuracy was not worth the time (and memory usage) required to build the model. The fine-tuning procedure itself should pose no difficulty if you follow the tutorial on the official Caffe website.
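The fine-tuning itself is driven by solver and network prototxt files prepared as in the official tutorial; on the Python side it boils down to roughly the following sketch (the file names are placeholders, not the tutorial's exact files):

```python
import caffe

# Rough sketch of fine-tuning with Caffe's Python interface.
# 'solver.prototxt' points at a network whose last layer has been replaced
# with a 12-class output; the file names here are placeholders.
caffe.set_mode_gpu()

solver = caffe.SGDSolver('solver.prototxt')
# Start from ImageNet-pretrained weights, then retrain on the cat images.
solver.net.copy_from('bvlc_reference_caffenet.caffemodel')
solver.solve()
```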

Deep CNNs have become a familiar name in well-known competitions such as ILSVRC. Going forward, I think the number of cases where deep learning is used at the product level, in web services and applications, will steadily increase. Once practical methods are established, the money will go into how to collect the data.

Abyssinian_178.jpg

[('Abyssinian', 0.621), ('Bengal', 0.144), ('Sphynx', 0.087)]

Abyssinian 62.1%, Bengal 14.4%, Sphynx 8.7%.
