[PYTHON] Machine Learning with Caffe -1- Classifying images using the reference model

Note: this article is a work in progress.

Introduction

Goal

Classify the Caltech 101 images using Caffe's reference model (a model from a well-known paper whose parameters have already been tuned). The goal is to produce the following output as a concrete, visible result.

(Image: the output you get once the implementation completes successfully.)

Rough procedure

STEP1. Download the image dataset to classify
STEP2. Extract features from the dataset images
STEP3. Train a linear SVM to classify the extracted features
STEP4. Classify based on the features with the trained SVM

1. Download the image dataset to classify

Go to the following page and download the dataset. http://www.vision.caltech.edu/Image_Datasets/Caltech101/#Download

2. Extract features from dataset images

2-1. Download reference model

$ scripts/download_model_binary.py models/bvlc_reference_caffenet

2-2. Implementation of code to extract image features

Extract feature data from the images using the reference model. Just as a color is represented by three numbers (RGB), the features of one image in this model are represented by 4096 numbers. In 2-2 we create a script that takes JPEG data as input, performs the necessary processing, and outputs 4096 numbers.

Create the following file in the Caffe root directory.

feature_extraction.py


#! /usr/bin/env python
# -*- coding: utf-8 -*-
import sys, os, os.path, numpy as np, caffe

# path to git-cloned caffe dir
CAFFE_DIR  = os.getenv('CAFFE_ROOT')

MEAN_FILE  = os.path.join(CAFFE_DIR, 'python/caffe/imagenet/ilsvrc_2012_mean.npy')
MODEL_FILE = os.path.join(CAFFE_DIR, 'models/bvlc_reference_caffenet/deploy.prototxt')
PRETRAINED = os.path.join(CAFFE_DIR, 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel')

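# fully connected layer whose 4096 outputs are used as the feature vector
LAYER = 'fc7'
# net.predict() oversamples each image into 10 crops; index 4 is the center crop
INDEX = 4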

class FeatureExtraction:

    def __init__(self):
        net = caffe.Classifier(MODEL_FILE, PRETRAINED)
        caffe.set_mode_cpu()
        net.transformer.set_mean('data', np.load(MEAN_FILE))
        net.transformer.set_raw_scale('data', 255)
        net.transformer.set_channel_swap('data', (2,1,0))
        self.net = net

    def extract_features(self):
        imageDirPath = sys.argv[1]
        previousLabelName = ''
        labelIntValue = 0
        for root, dirs, files in os.walk(imageDirPath):
            for filename in files:
                if filename == '.DS_Store': 
                    continue
                fullPath  = os.path.join(root, filename)
                dirname   = os.path.dirname(fullPath)
                labelName = dirname.split("/")[-1]
                if labelName != previousLabelName:
                    labelIntValue += 1
                    previousLabelName = labelName
                image = caffe.io.load_image(fullPath)
                feat = self.extract_features_from_image(image)
                self.print_feature_with_libsvm_format(labelIntValue, feat)

    def build_test_data(self, imagePaths):
        for fullPath in imagePaths:
            image = caffe.io.load_image(fullPath)
            feat = self.extract_features_from_image(image)
            self.print_feature_with_libsvm_format(-1, feat)

    def extract_features_from_image(self, image):
        self.net.predict([image])
        feat = self.net.blobs[LAYER].data[INDEX].flatten().tolist()
        return feat 

    def print_feature_with_libsvm_format(self, labelIntValue, feat):
        formatted_feat_array = [str(index+1)+':'+str(f_i) for index, f_i in enumerate(feat)]
        # single-argument print(...) behaves the same under Python 2 and Python 3
        print(str(labelIntValue) + " " + " ".join(formatted_feat_array))
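
The loop in extract_features numbers the labels in whatever order os.walk happens to visit the directories, which makes it hard to map a predicted number back to a category name. One possible alternative, sketched below under the assumption that each category has its own subdirectory, is an explicit name-to-number mapping. build_label_map and label_for_path are hypothetical helpers for illustration; they are not part of Caffe or of the script above.

```python
import os


def build_label_map(category_names):
    """Map each category name to a 1-based integer label, in sorted order."""
    return {name: i + 1 for i, name in enumerate(sorted(category_names))}


def label_for_path(image_path, label_map):
    """Look up the label of an image from the name of its parent directory."""
    parent = os.path.basename(os.path.dirname(image_path))
    return label_map[parent]


# Example category names; in practice they could come from os.listdir(image_dir).
label_map = build_label_map(["airplanes", "accordion", "Faces"])
print(label_map)
print(label_for_path("images/accordion/image_0001.jpg", label_map))
```

Because the mapping depends only on the sorted directory names, the same category gets the same number on every run.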

2-3. Export the feature data of all images downloaded in STEP1 to one file

Prepare a script that runs the extraction above.

exec.py


#! /usr/bin/env python
# -*- coding: utf-8 -*-
import sys
from feature_extraction import FeatureExtraction 
FeatureExtraction().extract_features()

Execute the following to create feature data (feature.txt).

$ python exec.py path/to/images_dir > feature.txt

On my machine, the first line of feature.txt is

(10, 3, 227, 227)

This is not feature data. It looks like a stray print from elsewhere in the processing (it appears to be the shape of the 10-crop input batch), so delete it.
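
Instead of deleting the stray line by hand, the output could be filtered so that only lines that look like libsvm records are kept. The following is a minimal sketch of that idea; keep_feature_lines and the regular expression are assumptions made for illustration and are not part of libsvm or Caffe.

```python
import re

# A record is an integer label followed by one or more index:value pairs.
LIBSVM_LINE = re.compile(r"^-?\d+( \d+:\S+)+\s*$")


def keep_feature_lines(lines):
    """Return only the lines that look like libsvm records."""
    return [line for line in lines if LIBSVM_LINE.match(line)]


sample = [
    "(10, 3, 227, 227)",
    "1 1:0.01 2:0.99 3:0.11",
    "-1 1:0.5 2:0.25",
]
cleaned = keep_feature_lines(sample)
print(cleaned)
```

For a real run, the same function could be applied to the lines of feature.txt before passing the file to svm-scale.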

Supplement: About the format of the output feature data

In STEP3 the SVM is trained with libsvm. To be handled by libsvm, the feature data must be written out in the following format.

...
4 1:0.89 2:0.19 3:0.10 ...  4096:0.77 
1 1:0.01 2:0.99 3:0.11 ...  4096:0.97 
...

Each line represents one sample, in the form

(label number) 1:value of the first feature 2:value of the second feature ...

feature.txt contains one such line per image.
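
The line format described above can be sketched in a few lines of Python. to_libsvm_line and parse_libsvm_line are hypothetical helper names chosen for illustration, and the parser assumes dense features with consecutive indices, as produced by the extraction script.

```python
def to_libsvm_line(label, features):
    """Format one sample as '<label> 1:<v1> 2:<v2> ...' (indices start at 1)."""
    pairs = ["%d:%r" % (i + 1, float(v)) for i, v in enumerate(features)]
    return "%d %s" % (label, " ".join(pairs))


def parse_libsvm_line(line):
    """Inverse of to_libsvm_line: return (label, dense feature list)."""
    fields = line.split()
    label = int(fields[0])
    features = [float(pair.split(":", 1)[1]) for pair in fields[1:]]
    return label, features


line = to_libsvm_line(4, [0.89, 0.19, 0.10])
print(line)
print(parse_libsvm_line(line))
```

A real line from feature.txt has 4096 index:value pairs instead of three.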

3. Train a linear SVM to classify the extracted features

The well-known libsvm package is used for the SVM. A gentle explanation of libsvm and SVMs is available here.

3-1. Installation of libsvm

$ brew install libsvm

3-2. Learning

Train the SVM with the following commands.

$ svm-scale -s scale.txt feature.txt > feature.scaled.txt
$ svm-train -c 0.03 feature.scaled.txt caltech101.model

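svm-scale is libsvm's scaling command and svm-train is its training command. Note that svm-train uses the RBF kernel unless told otherwise; pass -t 0 to get a linear kernel. The meaning of each file is as follows.

scale.txt: the scaling parameters saved by svm-scale (-s), reused later for the test data
feature.txt: the raw feature data extracted in STEP2
feature.scaled.txt: the scaled feature data used for training
caltech101.model: the trained SVM model written by svm-train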
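
As a rough illustration of the scaling step, the sketch below maps each feature linearly to [-1, 1], which is svm-scale's default target range, and reuses the ranges learned from the training data for new data (the role that scale.txt plays with -s and -r). fit_ranges and scale_rows are hypothetical names, and emitting 0.0 for constant features is a simplification of this sketch, not svm-scale's behavior.

```python
def fit_ranges(rows):
    """Record the per-feature (min, max) over the training rows."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def scale_rows(rows, ranges, lower=-1.0, upper=1.0):
    """Linearly map each feature from its training range to [lower, upper]."""
    scaled = []
    for row in rows:
        out = []
        for value, (lo, hi) in zip(row, ranges):
            if hi == lo:
                # Constant feature: this sketch emits 0.0 (a simplification).
                out.append(0.0)
            else:
                out.append(lower + (upper - lower) * (value - lo) / (hi - lo))
        scaled.append(out)
    return scaled


train = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]
ranges = fit_ranges(train)                        # plays the role of scale.txt
train_scaled = scale_rows(train, ranges)
test_scaled = scale_rows([[2.5, 40.0]], ranges)   # reuse the training ranges
print(train_scaled)
print(test_scaled)
```

Test values outside the training range land outside [-1, 1], which is why the test data must be scaled with the saved training ranges rather than with ranges of its own.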

4. Category classification by trained SVM

4-1. Trial run on the same data used for training

$ cp feature.txt feature_test.txt
$ svm-scale -r scale.txt feature_test.txt > feature_test.scaled.txt
$ svm-predict feature_test.scaled.txt caltech101.model result.txt

... the accuracy is poor! Currently debugging ...
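
When debugging a poor result, it can help to look at accuracy per class rather than only the overall figure. The sketch below compares the true labels (the first field of each line in feature_test.txt) with the predictions in result.txt, which svm-predict writes one label per line. The function names are hypothetical and the data is a toy example, not real output.

```python
from collections import defaultdict


def read_true_labels(feature_lines):
    """The true label is the first field of each libsvm-format line."""
    return [int(line.split()[0]) for line in feature_lines if line.strip()]


def read_predictions(result_lines):
    """Parse one predicted label per non-empty line."""
    return [int(float(line)) for line in result_lines if line.strip()]


def per_class_accuracy(true_labels, predicted_labels):
    """Return {label: fraction of that label's samples predicted correctly}."""
    totals, hits = defaultdict(int), defaultdict(int)
    for truth, guess in zip(true_labels, predicted_labels):
        totals[truth] += 1
        hits[truth] += int(truth == guess)
    return {label: hits[label] / float(totals[label]) for label in totals}


# Toy data standing in for feature_test.txt and result.txt.
features = ["1 1:0.1 2:0.9", "2 1:0.8 2:0.2", "2 1:0.7 2:0.1", "3 1:0.5 2:0.5"]
results = ["1", "2", "3", "3"]
report = per_class_accuracy(read_true_labels(features), read_predictions(results))
print(report)
```

With real files, open("feature_test.txt") and open("result.txt") can be passed in place of the two lists.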

Next step candidates

Hopefully all three will be done in August...

Referenced URL list

Overall flow

libsvm

FAQ

Q1. What is python/caffe/imagenet/ilsvrc_2012_mean.npy?

A. It is the mean image, which is subtracted from each input image before it is fed to the network. See below.

http://qiita.com/uchihashi_k/items/8333f80529bb3498e32f

Q2. Isn't an SVM a binary classifier?

A. Multi-class classification is also possible. libsvm counts the number of classes in the training data you give it and builds a multi-class classifier when needed ... though in practice it was not quite that simple.
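
For background, libsvm handles multi-class problems with the one-against-one approach: it trains one binary SVM per pair of classes and lets them vote. The sketch below shows only the voting idea. one_vs_one_predict and toy_winner are hypothetical, and toy_winner is a stand-in for real trained binary classifiers.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(classes, pairwise_winner):
    """Vote over every pair of classes and return the class with most wins."""
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise_winner(a, b)] += 1
    # max() returns the first class with the highest vote count.
    return max(classes, key=lambda c: votes[c])


def toy_winner(a, b):
    """Toy stand-in for a trained binary SVM: the larger label always wins."""
    return max(a, b)


print(one_vs_one_predict([1, 2, 3], toy_winner))    # prints 3
print(len(list(combinations(range(101), 2))))       # prints 5050
```

The second line shows how quickly the number of pairwise classifiers grows: 101 classes already require 5050 binary SVMs.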
