[PYTHON] 4. Circle parameters with neural network!

Introduction

This is the 4th in the series.

So far, I have created an NN (2nd article) that outputs the mean and standard deviation of given random numerical data, and an NN (3rd article) that outputs the three parameters used to generate normal-distribution waveform data from that waveform data.

This time, I will create a convolutional neural network (CNN) that takes two-dimensional image data of a drawn circle and outputs the four parameters used to draw it: the x-coordinate and y-coordinate of the circle's center, the radius, and the line thickness.

The image data was created in Objective-C.

Creation of image data

First comes the creation of the image data. Draw a circle in a 50-pixel x 50-pixel image, extract the pixel data, and write it to a file together with the four parameters used to draw the circle. Each record consists of 2504 comma-separated values: the first 2500 are pixel values between 0 and 1, and the remaining 4 are the x-coordinate of the circle's center, the y-coordinate, the radius, and the line thickness. Records are separated by a line break (\n).

I used the NSImage class in Objective-C to create the data.

4-001.c


//Image size is 50 pixels x 50 pixels
//Random numbers determine the center coordinates, radius, and line thickness.
//Radius is 5 to 25
//The center coordinates are chosen so that the circle fits inside the image.
//The line thickness is 0.2 to 5
#import <Cocoa/Cocoa.h>
uint8_t *pixelDataFromImage(NSImage *image);//Prototype declaration of the function used in main()

int main(int argc, const char * argv[]) {
    srand((unsigned int)time(NULL));//Random number initialization
    //Open the destination file
    const char *fileName = "imageLearningData.txt";//Note: fopen() does not expand "~", so use a plain or absolute path
    FILE *fp = fopen(fileName, "w");

    //Create 50,000 images
    for(int mmm = 0; mmm < 50000;mmm++){
        double radius = (double)rand()/RAND_MAX*20 + 5;
        double x = (double)rand()/RAND_MAX*(50 - radius * 2) + radius;
        double y = (double)rand()/RAND_MAX*(50 - radius * 2) + radius;
        double lineWidth = (double)rand()/RAND_MAX*4.8 + 0.2;
        [NSBezierPath setDefaultLineWidth:lineWidth];
    
        NSImage *image = [[NSImage alloc] initWithSize:NSMakeSize(50, 50)];    
        NSBezierPath *bezierPath = [NSBezierPath bezierPath];
        //Image drawing
        [image lockFocus];
        [bezierPath appendBezierPathWithOvalInRect:NSMakeRect(x - radius, y - radius, radius * 2, radius * 2)];
        [bezierPath stroke];
        [image unlockFocus];
    
        uint8_t *pixels = pixelDataFromImage(image);
    
        //Output pixel data
        NSSize size = [image size];
        uint32_t width = (uint32_t) size.width;
        uint32_t height = (uint32_t) size.height;
        int components = 4;
        for(int iii = 0; iii < height ;iii ++){
            for(int kkk = 0; kkk < width ; kkk++){
                double value = 0;
                value += pixels[( width * iii + kkk )*4     ]/255.0;
                value += pixels[( width * iii + kkk )*4 + 1 ]/255.0;
                value += pixels[( width * iii + kkk )*4 + 2 ]/255.0;
                //To reduce the output file size, write "1" instead of "1.000000" for white pixels.
                value /= 3;
                if(value == 1){
                    fprintf(fp,"%d,",1);
                }else{
                    fprintf(fp,"%f,",value);
                }
            }
        }
    
        fprintf(fp,"%f,%f,%f,%f\n",x,y,radius,lineWidth);
        free(pixels);
    }
    fclose(fp);
}

The function pixelDataFromImage used in the code above is based on code by @shimacpyon, which I modified as follows. Thank you, @shimacpyon.

4-002.c


uint8_t *pixelDataFromImage(NSImage *image){
    /*Create an instance of NSBitmapImageRep*/
    NSBitmapImageRep *bitmapRep = [NSBitmapImageRep imageRepWithData:[image TIFFRepresentation]];
    /*If saving as JPEG, remove the alpha channel*/
    [bitmapRep setAlpha:NO];
    /*Get quality for storage*/
    float quality = 1.0;
    /*Create a property*/
    NSDictionary* properties = [NSDictionary dictionaryWithObject:[NSNumber numberWithFloat:quality] forKey:NSImageCompressionFactor];
    /*Create JPEG data*/
    NSData *data = [bitmapRep representationUsingType:NSJPEGFileType properties:properties];

    //Create NSImage again from NSData
    NSImage *newImage = [[NSImage alloc] initWithData:data];

    if (newImage != nil) {
        NSSize size = [newImage size];
        uint32_t width = (uint32_t) size.width, height = (uint32_t) size.height, components = 4;
        uint8_t *pixels = (uint8_t *) malloc(size.width * size.height * components);//0 to 255
        if (pixels) {
            CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
            CGContextRef bitmapContext = CGBitmapContextCreate(pixels, width, height, 8, components * width, colorSpace, kCGImageAlphaPremultipliedLast);
            NSRect rect = NSMakeRect(0, 0, width, height);
            NSGraphicsContext *graphicsContext = [NSGraphicsContext currentContext];
            CGImageRef cgImage = [newImage CGImageForProposedRect:&rect context:graphicsContext hints:nil];
            CGContextDrawImage(bitmapContext, NSRectToCGRect(rect), cgImage);
            CGContextRelease(bitmapContext);
            CGColorSpaceRelease(colorSpace);
            return pixels;
        }
    }
    return nil;
}

I could not find a way to extract the pixel data directly from an NSImage, so I obtained JPEG data (NSData) from the NSImage, created a new NSImage from that data, and extracted the pixel data from it. There may be a smarter way, but I will move on.
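
As a quick sanity check of the generated file, one record can be parsed in Python (a minimal sketch; the relative file name and the 50 x 50 reshape simply follow the format described above):


import numpy as np

#Read the first record: 2500 pixel values followed by the 4 circle parameters
with open('imageLearningData.txt') as f:
    values = f.readline().strip().split(',')

pixels = np.array(values[:2500], dtype=float).reshape(50, 50)  #pixel values between 0 and 1
x, y, radius, lineWidth = map(float, values[2500:])
print(pixels.shape, x, y, radius, lineWidth)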

Division of training data

Split the created data (2504 values x 50,000 rows) into four parts:

  1. Training input data (2500 values x 40,000 rows),
  2. Training target data (4 values x 40,000 rows),
  3. Evaluation input data (2500 values x 10,000 rows),
  4. Evaluation target data (4 values x 10,000 rows).

4-003.py


import numpy as np
d = np.loadtxt('./imageLearningData.txt', delimiter=',')
#:-4 means all but the last 4 columns; -4: means the last 4 columns
d_training_x = d[:40000,:-4]
d_training_y = d[:40000,-4:]
d_test_x = d[40000:,:-4]
d_test_y = d[40000:,-4:]

#Change the shape of the data
d_training_x = d_training_x.reshape(40000,50,50,1)
d_test_x = d_test_x.reshape(10000,50,50,1)
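
As a visual check that the reshape is correct, one training image can be displayed (a short sketch; it simply shows the first image in the training set):


import matplotlib.pyplot as plt

plt.imshow(d_training_x[0, :, :, 0], cmap='gray')  #first 50 x 50 training image
plt.title("first training image")
plt.show()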

CNN design

Next, design the CNN. A convolutional neural network is used because the input is a two-dimensional image. The architecture was chosen largely by intuition, taking the following points into consideration.

  1. Place several convolution layers.
  2. Use max pooling layers to reduce the amount of data.
  3. Stack several fully connected layers, gradually reducing the number of outputs to 4.
  4. Use the mean squared error as the loss function.
  5. The input shape of the first layer is (50, 50, 1).
  6. The number of outputs in the last layer is 4.

4-004.py


import keras
from keras.models import Sequential
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPool2D
from keras.optimizers import Adam
from keras.layers.core import Dense, Activation, Dropout, Flatten

#Model definition
model = Sequential()
model.add(Conv2D(32,5,input_shape=(50,50,1)))
model.add(Activation('tanh'))
model.add(Conv2D(32,3))
model.add(Activation('relu'))
model.add(MaxPool2D(pool_size=(2,2)))
model.add(Conv2D(64,3))
model.add(Activation('relu'))
model.add(MaxPool2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(1024))
model.add(Activation('relu'))
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dense(4, activation='linear'))

adam = Adam(lr=1e-4)

model.compile(optimizer=adam, loss='mean_squared_error', metrics=["accuracy"])
model.summary()

Start learning

The number of parameters is 6,722,916. It looks like training will take some time... Let's start.

4-005.py


batch_size = 128  #128 records are processed per batch
epochs = 20

history = model.fit(d_training_x, d_training_y,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(d_test_x, d_test_y))
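
Since training takes a while, the trained model can be saved and reloaded later (a short sketch using the standard Keras save/load API; the file name circle_cnn.h5 is just an example):


model.save('circle_cnn.h5')  #save the architecture and weights

#Later, restore the model without re-training
from keras.models import load_model
model = load_model('circle_cnn.h5')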

It took 107 seconds per epoch. Let's graph the progress of training: loss is the loss value computed from the training data, and val_loss is the loss value computed from the evaluation data.

4-006.py


#Drawing a graph
import matplotlib.pyplot as plt
plt.plot(history.history['loss'],label="loss")
plt.plot(history.history['val_loss'],label="val_loss")
plt.legend() #Show legend
plt.title("Can CNN learn to predict 4 parameters used to draw a circle?")
plt.xlabel("epoch")
plt.ylabel("Loss")
plt.show()

Figure_4-1.png

It looks like the network has learned well.

Evaluating the CNN

How accurate are the predictions? Let's feed the first 200 rows of the evaluation data into the trained CNN.

4-007.py


inp = d_test_x[:200,:]
out = d_test_y[:200,:]
pred = model.predict(inp, batch_size=1)

#Make a graph.
plt.title("Can NN deduce circle parameters?")

plt.scatter(out[:,0], pred[:,0],label = "x",marker='.', s=20,alpha=0.7)
plt.scatter(out[:,1], pred[:,1],label = "y",marker='.', s=20,color="green",alpha=0.7)
plt.scatter(out[:,2], pred[:,2],label = "r",marker='.', s=20,color="red",alpha=0.7)
plt.scatter(out[:,3], pred[:,3],label = "line width",marker='.', s=20,color="black",alpha=0.7)

plt.legend(fontsize=14) #Show legend
plt.xlabel("expected value")
plt.ylabel("prediction")
#The x = y line is commented out because it makes the plot hard to read
#x = np.arange(-1, 41, 0.01)  
#y = x
#plt.plot(x, y,color="black")
plt.show()

Figure_4-3.png

The horizontal axis shows the parameter values used when creating the circle data, and the vertical axis shows the values the CNN output from the image data.

If the points lie along the $x = y$ line running from the lower left to the upper right, the prediction is successful.

It's not perfect, but the network seems to have learned quite well. Perhaps it would improve a little with changes to the network configuration.
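
To quantify the error, one could compute the mean absolute error for each parameter over the 200 evaluation samples (a minimal sketch reusing the out and pred arrays from 4-007.py; no particular values are claimed):


import numpy as np

mae = np.mean(np.abs(pred - out), axis=0)  #one value per parameter
for name, err in zip(["x", "y", "r", "line width"], mae):
    print("mean absolute error for", name, "=", err)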

Summary

It is now possible to output, from an image, where a circle is and how large it is, in the form of the parameters used to draw it.

This is the end of the 4th article in the series!

Series 1st: Preparation / Series 2nd: Mean and Standard Deviation / Series 3rd: Normal Distribution / Series 4th: Circle
