Create an AWS GPU instance to train StyleNet

Linux GPU instance on AWS

AWS GPU instances are HVM-based; HVM stands for Hardware Virtual Machine. In other words, as I understand it, the instance is not attached directly to a physical GPU but to a virtualized one. (Please point it out if I'm mistaken.)

Create a g2.2xlarge instance

Create the instance in the cheaper US East (N. Virginia) region: **$0.65 per hour** (as of September 23, 2015). That is far more expensive than an ordinary micro instance, so don't forget to terminate it when you're done using it!

Create the instance with the following steps:

1. Search for "nvidia" on the AWS Marketplace.
2. Since we are using Linux, select "Amazon Linux AMI with NVIDIA GRID GPU Driver".
3. Choose g2.2xlarge as the (GPU) instance type.
4. Work through the detailed configuration screens (the defaults are fine here).
5. Review the settings and launch the instance.
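
If you prefer to script the launch instead of clicking through the console, the same instance can be created with boto3. This is only a sketch and not part of the original walkthrough: the AMI ID and key pair name below are placeholders, so substitute the current "Amazon Linux AMI with NVIDIA GRID GPU Driver" AMI ID from the Marketplace and your own key pair.

# Sketch: launch a g2.2xlarge from a script instead of the console (assumes boto3 is installed).
import boto3

ec2 = boto3.resource('ec2', region_name='us-east-1')  # US East (N. Virginia), as above
instances = ec2.create_instances(
    ImageId='ami-xxxxxxxx',     # placeholder: the NVIDIA GRID GPU Driver AMI from the Marketplace
    InstanceType='g2.2xlarge',  # the GPU instance type used in this article
    KeyName='my-key-pair',      # placeholder: your own key pair
    MinCount=1,
    MaxCount=1,
)
print(instances[0].id)          # note the instance ID

The returned instance object also has a terminate() method, which ties in with the warning above about not leaving the instance running.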

After launching, you can check that a GPU is attached and the driver is installed with the following command.

nvidia-smi -q | head

(The full output is much longer, but the first 10 lines are enough to see the driver version and the number of attached GPUs.)
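
If you happen to want the same check from Python rather than the shell (purely optional), the snippet below simply wraps the nvidia-smi command; nothing here is specific to this AMI.

# Optional: run the same check from Python by wrapping nvidia-smi.
import subprocess

out = subprocess.check_output(['nvidia-smi', '-q'])
print('\n'.join(out.decode('utf-8').splitlines()[:10]))  # equivalent to `nvidia-smi -q | head`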

Install the Python packages for Chainer

The AMI created above comes with Python and pip preinstalled, so just install the packages that Chainer needs. (Prepend sudo where necessary.)

pip install numpy
pip install chainer

On the AMI used here, the PATH settings needed for Chainer to use CUDA are as follows.

export PATH=/opt/nvidia/cuda/bin:$PATH
export LD_LIBRARY_PATH=/opt/nvidia/cuda/lib64:$LD_LIBRARY_PATH
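
As a quick sanity check that these paths actually point at a CUDA install (the /opt/nvidia/cuda location is specific to this AMI; adjust it if yours differs):

# Quick check that the CUDA toolkit and runtime are where the exports above expect.
import glob
import os

print(os.path.isdir('/opt/nvidia/cuda/bin'))            # toolkit binaries (nvcc etc.)
print(glob.glob('/opt/nvidia/cuda/lib64/libcudart*'))   # CUDA runtime libraries
print(os.environ.get('LD_LIBRARY_PATH'))                # should contain /opt/nvidia/cuda/lib64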

Check from Chainer

If the following runs without errors, you're good to go.

import chainer
from chainer import cuda

cuda.check_cuda_available()
cuda.get_device(0).use()
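
To go one step beyond checking availability, a small round trip through GPU memory confirms that computation actually runs on the device. This is a sketch assuming the Chainer 1.x API that was current at the time (cuda.to_gpu / cuda.to_cpu):

# Sketch: move an array to the GPU, compute on it, and bring the result back.
import numpy as np
from chainer import cuda

cuda.check_cuda_available()
cuda.get_device(0).use()

x = np.random.rand(1000, 1000).astype(np.float32)
x_gpu = cuda.to_gpu(x)   # copy to GPU memory
y_gpu = x_gpu * 2        # trivial computation on the device
y = cuda.to_cpu(y_gpu)   # copy the result back
print(y.shape)           # (1000, 1000) if the round trip worked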

Run StyleNet

Now let's run the much-talked-about StyleNet on the GPU instance created above.

Install Git and clone chainer-gogh, then download the NIN Caffe model and rename the downloaded file so that it ends in .caffemodel.

sudo yum install git
git clone https://github.com/mattya/chainer-gogh
wget https://www.dropbox.com/s/0cidxafrb2wuwxw/nin_imagenet.caffemodel?dl=1
mv nin_imagenet.caffemodel?dl=1 nin_imagenet.caffemodel

Then run the generation script from the repository and it just works. Since I wanted to see the computation time, I modified the script to print a line to the console for each generation; the last number on each line is the elapsed time in seconds since the script started. The other four numbers are, in order, the generation number, the index of the layer being reported, the difference from the content image, and the difference from the style image. See the commentary article and the code for details.

CPU version

load model... nin_imagenet.caffemodel
('image resized to: ', (1, 3, 435, 435))
('image resized to: ', (1, 3, 435, 428))
0 0 0.0 131.536987305 100.341277122
0 1 0.0 16.9276828766 101.161045074
0 2 5.29387950897 0.132858499885 101.5359869
0 3 0.530761241913 0.00741795729846 102.120722055
1 0 0.0 123.52784729 178.474854946
1 1 0.0 15.2022619247 179.293597937
1 2 5.03846788406 0.123586334288 179.670742035
1 3 0.470058709383 0.00525630777702 180.251745939
2 0 0.0 110.198478699 255.28085494
2 1 0.0 12.6540279388 256.074426889
2 2 4.90201044083 0.113212890923 256.448594093
2 3 0.431303560734 0.00383871118538 257.029608965
3 0 0.0 92.4355773926 332.629327059
3 1 0.0 10.1973600388 333.420332909
3 2 4.86908721924 0.108661472797 333.790095091
3 3 0.409945964813 0.00298984581605 334.37262392
4 0 0.0 72.380569458 410.049565077
4 1 0.0 9.32985496521 410.848016024
4 2 4.92668151855 0.117410235107 411.218291044
4 3 0.398817956448 0.00246199313551 411.802879095
5 0 0.0 53.6025543213 486.892725945
5 1 0.0 11.4278764725 487.710140944
5 2 5.04112100601 0.141176745296 488.083660126
5 3 0.39475902915 0.00214517721906 488.664105892

One generation takes roughly 80 seconds.

GPU version

load model... nin_imagenet.caffemodel
('image resized to: ', (1, 3, 435, 435))
('image resized to: ', (1, 3, 435, 428))
0 0 0.0 131.557006836 22.7398509979
0 1 0.0 16.9326477051 22.7434000969
0 2 5.2936425209 0.13280403614 22.7467870712
0 3 0.529067575932 0.00738066714257 22.7503979206
1 0 0.0 123.602386475 41.3092479706
1 1 0.0 15.211555481 41.3119618893
1 2 5.04172039032 0.123632088304 41.3146359921
1 3 0.469296246767 0.00526729598641 41.3175621033
2 0 0.0 110.336387634 41.394990921
2 1 0.0 12.6709651947 41.3977220058
2 2 4.90853118896 0.113566093147 41.4003748894
2 3 0.43169811368 0.00389048201032 41.4033150673
3 0 0.0 92.6167144775 41.4806849957
3 1 0.0 10.2161283493 41.4833440781
3 2 4.87660598755 0.109313540161 41.4860169888
3 3 0.41028663516 0.00304215308279 41.4892690182
4 0 0.0 72.5410461426 41.566724062
4 1 0.0 9.3198633194 41.5694000721
4 2 4.93378400803 0.118150554597 41.5720379353
4 3 0.399007946253 0.0025086470414 41.5749340057
5 0 0.0 53.6465911865 41.6522700787
5 1 0.0 11.3575658798 41.6549448967
5 2 5.0474896431 0.141661688685 41.6576039791
5 3 0.394712120295 0.00219127023593 41.6604459286

**One generation takes only about 0.1 seconds. The GPU is blazingly fast.**
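
For reference, the per-generation figures quoted above come straight from the last column (elapsed seconds): diff it between consecutive generations. Below is a throwaway helper, not part of chainer-gogh, run here on a few of the CPU lines above:

# Estimate seconds per generation from log lines in the format shown above:
# generation, layer index, content difference, style difference, elapsed seconds.
def seconds_per_generation(log_lines):
    last_elapsed = {}
    for line in log_lines:
        parts = line.split()
        if len(parts) != 5:
            continue
        gen = int(parts[0])
        last_elapsed[gen] = float(parts[4])  # keep the final (layer 3) time for each generation
    gens = sorted(last_elapsed)
    deltas = [last_elapsed[b] - last_elapsed[a] for a, b in zip(gens, gens[1:])]
    return sum(deltas) / len(deltas)

cpu_lines = [
    "0 3 0.530761241913 0.00741795729846 102.120722055",
    "1 3 0.470058709383 0.00525630777702 180.251745939",
    "2 3 0.431303560734 0.00383871118538 257.029608965",
]
print(seconds_per_generation(cpu_lines))  # roughly 77 s per generation on the CPU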

Finally

This time I only trained StyleNet for about 10 generations, since the goal was to check the procedure for creating an AWS instance and to compare CPU and GPU speed. But considering that a full run trains for 5,000 generations:

CPU: 80 × 5,000 = 400,000 s ≈ 4.6 days
GPU: 0.1 × 5,000 = 500 s ≈ 8 minutes

The difference is enormous. It's easy to see why people say you simply can't do this kind of work without a GPU. I'm going to study GPGPU more.

References

http://qiita.com/pyr_revs/items/e1545e6f464b712517ed
http://yamakatu.github.io/blog/2014/07/05/gpgpu/
http://www.gdep.jp/page/view/248
