Machine learning is used in all kinds of services these days, and among its methods, Deep Learning is attracting attention for its high performance. In this article I will describe how to actually run Deep Learning on image files using pylearn2, a machine learning library from LISA-Lab.
There are plenty of articles that go as far as running the pylearn2 tutorials, but almost none that explain how to train on data you created yourself. This article is therefore aimed at people who understand what Deep Learning and pylearn2 are, but do not know how to apply them to their own datasets. I will skip the installation, basic usage, and tutorial parts of pylearn2.
There are of course various ways to do this; what follows is just one example.
The overall flow is: convert the images to a CSV file, convert the CSV to a pkl file, pre-train the AutoEncoder layers and fine-tune the whole network from yaml configs, and finally test the trained model on your own test data. I will explain each step in detail.
This time the working directory is created at the following location under the pylearn2 directory. Unless stated otherwise, all files handled in this article are placed directly under it.
pylearn2/pylearn2/scripts/tutorials/sample
I will also borrow the script for reading and testing CSV datasets that is published with "pylearn2 in practice"; you can download it from the GitHub page linked at the bottom of this page.
After downloading, put the included "adult_dataset.py" into the following directory.
pylearn2/pylearn2/datasets
First, convert the image files to a single CSV file: all of the training images are written out as one CSV. As in the example below, each image becomes one line; the first field is the class the image belongs to, and the remaining fields are its pixel values, separated by commas ",". The separator does not have to be a comma (you can change it later), but here I will explain everything using commas.
train.csv
class,Pixel data 1,Pixel data 2,Pixel data 3,...
class,Pixel data 1,Pixel data 2,Pixel data 3,...
class,Pixel data 1,Pixel data 2,Pixel data 3,...
...
How you generate the file does not matter; I simply used OpenCV. An example is shown below. For simplicity, it assumes 200 images named with serial numbers, where the first 100 images are class 0 and the latter 100 are class 1. This example is a two-class classification, but you can add more classes by using further integer labels; adapt this part as needed.
main.cpp
#include <cstdio>
#include <opencv2/opencv.hpp>

int main() {
    FILE *fp = fopen("train.csv", "w");
    const int numFiles = 200;
    const int numFirstClass = 100;
    for (int i = 0; i < numFiles; i++) {
        // Load each serially numbered grayscale image (naming assumed here as 0.png, 1.png, ...).
        char fileName[32];
        sprintf(fileName, "%d.png", i);
        IplImage *input = cvLoadImage(fileName, CV_LOAD_IMAGE_GRAYSCALE);
        // First field: the class label (first 100 images are class 0, the rest class 1).
        if (i < numFirstClass) fprintf(fp, "0");
        else fprintf(fp, "1");
        // Remaining fields: the pixel values, comma-separated.
        for (int y = 0; y < input->height; y++) {
            for (int x = 0; x < input->width; x++) {
                uchar pixelValue = (uchar)input->imageData[y * input->widthStep + x];
                fprintf(fp, ",%d", (int)pixelValue);
            }
        }
        fprintf(fp, "\n");
        cvReleaseImage(&input);
    }
    fclose(fp);
    return 0;
}
A file like this will be created (numerical values are examples). There should be 200 lines in total. Put the created file in the working directory.
0,13,15,18,41,11,...
0,19,40,50,31,23,...
...
...
1,135,244,210,15,150,...
1,45,167,84,210,100,...
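If you would rather do the conversion in Python, a minimal sketch using OpenCV's Python bindings could look like the following. It assumes the same serial-numbered grayscale images (0.png to 199.png) as the C++ example; adjust the names to your files.
python
import cv2

num_files = 200
num_first_class = 100

with open('train.csv', 'w') as fp:
    for i in range(num_files):
        label = 0 if i < num_first_class else 1
        # read each image as grayscale and flatten it into one CSV row
        img = cv2.imread('%d.png' % i, cv2.IMREAD_GRAYSCALE)
        fp.write(','.join([str(label)] + [str(int(v)) for v in img.flatten()]) + '\n')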
Convert the created CSV to a pkl file so that it can be handled easily from Python. The source looks like this.
python
from pylearn2.datasets.adult_dataset import AdultDataset
import pickle
print 'convert: train.csv -> train.pkl'
pyln_data = AdultDataset('train.csv',one_hot=True)
pickle.dump(pyln_data, open('train.pkl', 'w'))
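As a quick sanity check, you can load the pkl back and inspect the array shapes. This assumes AdultDataset exposes the usual pylearn2 X and y attributes, which the test script later in this article also relies on.
python
import pickle

data = pickle.load(open('train.pkl'))
print data.X.shape  # should be (200, number_of_pixels)
print data.y.shape  # one-hot labels, here (200, 2)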
Now that the data is ready, let's actually train. The network built this time is a three-layer network: two AutoEncoder layers plus a Softmax Regression layer as the classifier.
The yaml for each layer is shown below. The freely adjustable numbers are rough values, so change each parameter as necessary. The important ones are:
- nvis: the number of input units. In the first layer this must equal the number of pixels in the image (a quick way to compute this is shown below), and in the second layer it must equal the number of hidden units (nhid) of the first layer.
- n_classes: the number of output classes. Specify how many classes you want to classify; in this example, 2.
If these are not set correctly the network cannot learn in the first place, so if things do not work, check whether you made a mistake here or back at the CSV creation stage.
The pre-training results for the first and second layers are output to DAE_l1.pkl and DAE_l2.pkl.
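Since the first layer's nvis has to match the pixel count exactly, it is safer to compute it from an actual image than to guess. A small sketch, assuming the serial-numbered images from before:
python
import cv2

img = cv2.imread('0.png', cv2.IMREAD_GRAYSCALE)
print img.shape[0] * img.shape[1]  # use this value for nvis in dae_l1.yaml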
dae_l1.yaml
!obj:pylearn2.train.Train {
    dataset: &train !pkl: "train.pkl",
    model: !obj:pylearn2.models.autoencoder.DenoisingAutoencoder {
        nvis : 200,
        nhid : 100,
        irange : 0.05,
        corruptor: !obj:pylearn2.corruption.BinomialCorruptor {
            corruption_level: .1,
        },
        act_enc: "tanh",
        act_dec: null, # Linear activation on the decoder side.
    },
    algorithm: !obj:pylearn2.training_algorithms.sgd.SGD {
        learning_rate : 1e-3,
        batch_size : 5,
        monitoring_batches : 1,
        monitoring_dataset : *train,
        cost : !obj:pylearn2.costs.autoencoder.MeanSquaredReconstructionError {},
        termination_criterion : !obj:pylearn2.termination_criteria.EpochCounter {
            max_epochs: 10,
        },
    },
    save_path: "DAE_l1.pkl",
    save_freq: 1
}
dae_l2.yaml
!obj:pylearn2.train.Train {
    dataset: &train !obj:pylearn2.datasets.transformer_dataset.TransformerDataset {
        raw: !pkl: "train.pkl",
        transformer: !pkl: "DAE_l1.pkl"
    },
    model: !obj:pylearn2.models.autoencoder.DenoisingAutoencoder {
        nvis : 100,
        nhid : 20,
        irange : 0.05,
        corruptor: !obj:pylearn2.corruption.BinomialCorruptor {
            corruption_level: .2,
        },
        act_enc: "tanh",
        act_dec: null, # Linear activation on the decoder side.
    },
    algorithm: !obj:pylearn2.training_algorithms.sgd.SGD {
        learning_rate : 1e-3,
        batch_size : 5,
        monitoring_batches : 1,
        monitoring_dataset : *train,
        cost : !obj:pylearn2.costs.autoencoder.MeanSquaredReconstructionError {},
        termination_criterion : !obj:pylearn2.termination_criteria.EpochCounter {
            max_epochs: 10,
        },
    },
    save_path: "DAE_l2.pkl",
    save_freq: 1
}
dae_mlp.yaml
!obj:pylearn2.train.Train {
    dataset: &train !pkl: "train.pkl",
    model: !obj:pylearn2.models.mlp.MLP {
        batch_size: 5,
        layers: [
            !obj:pylearn2.models.mlp.PretrainedLayer {
                layer_name: 'h1',
                layer_content: !pkl: "DAE_l1.pkl"
            },
            !obj:pylearn2.models.mlp.PretrainedLayer {
                layer_name: 'h2',
                layer_content: !pkl: "DAE_l2.pkl"
            },
            !obj:pylearn2.models.mlp.Softmax {
                layer_name: 'y',
                n_classes: 2,
                irange: 0.05
            }
        ],
        nvis: 200
    },
    algorithm: !obj:pylearn2.training_algorithms.sgd.SGD {
        learning_rate: .05,
        learning_rule: !obj:pylearn2.training_algorithms.learning_rule.Momentum {
            init_momentum: .5,
        },
        monitoring_dataset: {
            'valid' : *train,
        },
        cost: !obj:pylearn2.costs.mlp.Default {},
        termination_criterion: !obj:pylearn2.termination_criteria.And {
            criteria: [
                !obj:pylearn2.termination_criteria.MonitorBased {
                    channel_name: "valid_y_misclass",
                    prop_decrease: 0.,
                    N: 100
                },
                !obj:pylearn2.termination_criteria.EpochCounter {
                    max_epochs: 50
                }
            ]
        },
        update_callbacks: !obj:pylearn2.training_algorithms.sgd.ExponentialDecay {
            decay_factor: 1.00004,
            min_lr: .000001
        }
    },
    extensions: [
        !obj:pylearn2.training_algorithms.learning_rule.MomentumAdjustor {
            start: 1,
            saturate: 250,
            final_momentum: .7
        }
    ],
    save_path: "mlp.pkl",
    save_freq: 1
}
When the yaml files are ready, run a script like the following to start training.
train.py
from pylearn2.config import yaml_parse
import os

def train_step(config_file):
    assert(os.path.exists(config_file))
    _yaml = open(config_file).read()
    _train = yaml_parse.load(_yaml)
    _train.main_loop()
    return _train

l1_train = train_step('dae_l1.yaml')
l2_train = train_step('dae_l2.yaml')
_train = train_step('dae_mlp.yaml')
When executed, a file called "mlp.pkl" is generated. This is the trained model, so next we will use it to run a classification test.
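Incidentally, the saved pkl also carries the training monitor, so you can peek at the last recorded misclassification rate on the monitoring set. A sketch; the channel name comes from the valid_y_misclass channel defined in dae_mlp.yaml:
python
import pickle

model = pickle.load(open('mlp.pkl'))
channel = model.monitor.channels['valid_y_misclass']
print channel.val_record[-1]  # misclassification rate at the last recorded epoch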
Test data is created in exactly the same way as the training data: convert the images to a CSV file and then to a pkl. Here the result is "test.pkl".
python
from pylearn2.datasets.adult_dataset import AdultDataset
import pickle
print 'convert: test.csv -> test.pkl'
pyln_data = AdultDataset('test.csv', one_hot=True)
pickle.dump(pyln_data, open('test.pkl', 'w'))
Running a script like the following outputs how many of the test samples were classified as the correct class.
test.py
import numpy as np
import pickle
import theano

# function for classifying an input vector
def classify(inp, model, input_size):
    inp = np.asarray(inp)
    inp.shape = (1, input_size)
    return np.argmax(model.fprop(theano.shared(inp, name='inputs')).eval())

# function for calculating and printing the model's accuracy on a given dataset
def score(dataset, model, input_size):
    nr_correct = 0
    for features, label in zip(dataset.X, dataset.y):
        if classify(features, model, input_size) == np.argmax(label):
            nr_correct += 1
    print '{}/{} correct'.format(nr_correct, len(dataset.X))
    return nr_correct, len(dataset.X)

model = pickle.load(open('mlp.pkl'))
test_data = pickle.load(open('test.pkl'))
score(test_data, model, 200)
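As a usage example, the same classify function can also label a single new image. A sketch, assuming a hypothetical file new_image.png that is grayscale and has the same pixel count (200) as the training data:
python
import cv2
import theano

img = cv2.imread('new_image.png', cv2.IMREAD_GRAYSCALE)
# cast to theano's float type to match the model's weights
inp = img.flatten().astype(theano.config.floatX)
print classify(inp, model, 200)  # prints the predicted class index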
With the above, you should be able to go all the way from training to testing on your own data. I noticed along the way that pylearn2 itself also ships with a reader script for CSV datasets (pylearn2/pylearn2/datasets/csv_dataset.py), so you could use that instead.
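For reference, a rough sketch of using that bundled reader instead of adult_dataset.py. The keyword arguments here are an assumption and vary between pylearn2 versions, so check the signature in csv_dataset.py before relying on this:
python
from pylearn2.datasets.csv_dataset import CSVDataset

# argument names are an assumption; verify against your copy of csv_dataset.py
data = CSVDataset(path='train.csv', expect_headers=False)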
References:
- Implementation Deep Learning
- pylearn2 dev Documentation
- pylearn2 in practice