[PYTHON] Try Tensorflow with a GPU instance on AWS

EC2 setup

Launch an Ubuntu instance using the p2.xlarge GPU instance type. (At the time of writing, p2 instances were not available in the Tokyo region, so a different region is used here.)

Set suitable names for the security group and key pair (here, sg_01 and kp_01). Using an existing key pair also works.

After downloading the key pair, move it to ~/.ssh and restrict its permissions.

$ mv ~/Downloads/kp_01.pem ~/.ssh/.
$ chmod 600 ~/.ssh/kp_01.pem
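
The chmod step matters because ssh refuses to use a private key that other users can read. As a rough illustration of what "600" means, the sketch below checks that no group or other permission bits are set. The helper name key_mode_is_private is hypothetical, and the check runs on a throwaway temporary file rather than a real key.

```python
import os
import stat
import tempfile


def key_mode_is_private(mode: int) -> bool:
    """Return True when neither group nor others have any access bits set."""
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


# Demonstrate on a throwaway file, not on a real key.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    demo_path = tmp.name

os.chmod(demo_path, 0o644)  # readable by group and others
loose = key_mode_is_private(os.stat(demo_path).st_mode)

os.chmod(demo_path, 0o600)  # owner read/write only, as in the chmod above
strict = key_mode_is_private(os.stat(demo_path).st_mode)

os.remove(demo_path)
print(loose, strict)
```

To check the actual key, pass os.stat(os.path.expanduser("~/.ssh/kp_01.pem")).st_mode to the same helper.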

After the instance is created, check the Public DNS in the management console and log in with SSH.

$ ssh -i ~/.ssh/kp_01.pem ubuntu@<Public DNS>

The remaining steps are performed on the EC2 instance. First, update the packages.

$ sudo apt-get update
$ sudo apt-get upgrade


Install CUDA 8.0

URL: https://developer.nvidia.com/cuda-downloads
Installation guide: http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html

Pre-installation checks

Check if you have a GPU that supports CUDA

$ lspci | grep -i nvidia
00:1e.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)

Check if the OS is compatible with CUDA

$ uname -m && cat /etc/*release
VERSION="16.04.2 LTS (Xenial Xerus)"
PRETTY_NAME="Ubuntu 16.04.2 LTS"
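
The CUDA package installed later is built for Ubuntu 16.04 (its file name contains "ubuntu1604"), so this check is worth automating if you set up instances repeatedly. The following is a minimal sketch that parses release information of the kind shown above. The function name parse_os_release is hypothetical, and the sample text is just the two lines from the output above.

```python
def parse_os_release(text: str) -> dict:
    """Parse KEY=value lines (os-release style) into a dict, stripping quotes."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"')
    return info


# Sample taken from the output shown above.
sample = '''VERSION="16.04.2 LTS (Xenial Xerus)"
PRETTY_NAME="Ubuntu 16.04.2 LTS"
'''

info = parse_os_release(sample)
is_xenial = info["PRETTY_NAME"].startswith("Ubuntu 16.04")
print(info["PRETTY_NAME"], is_xenial)
```

On the instance itself, the text could be read from /etc/os-release instead of the sample string.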

Installation of gcc (+ development tools)

$ sudo apt-get install build-essential

Install the kernel headers that match the running kernel

$ sudo apt-get install linux-headers-$(uname -r)


Under "Select Target Platform" at https://developer.nvidia.com/cuda-downloads, choose the options that match this environment to display the download link and installation procedure. Download the file from the linked URL with wget and install it. (Here, cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb is installed.)

$ wget https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb
$ sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64-deb
$ sudo apt-get update
$ sudo apt-get install cuda

Setting environment variables

Add the following to ~/.bash_profile.


export CUDA_HOME="/usr/local/cuda-8.0"
export PATH="${CUDA_HOME}/bin${PATH:+:${PATH}}"

Restart the login shell (or log in again) to apply the settings.

$ exec $SHELL -l
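
The ${PATH:+:${PATH}} form used in the export lines can look cryptic. It expands to ":" followed by the current PATH only when PATH is set and non-empty, so no stray colon is left behind. A stray colon creates an empty PATH entry, which shells treat as the current directory. The sketch below mimics that expansion in Python. The function name prepend_to_path is hypothetical and exists only for illustration.

```python
def prepend_to_path(new_dir: str, current_path: str) -> str:
    """Mimic "${new_dir}${PATH:+:${PATH}}": append ":PATH" only if PATH is non-empty."""
    if current_path:
        return new_dir + ":" + current_path
    return new_dir


# With an existing PATH, the new directory is placed in front of it.
print(prepend_to_path("/usr/local/cuda-8.0/bin", "/usr/bin:/bin"))

# With an empty PATH, no trailing colon (empty entry) is produced.
print(prepend_to_path("/usr/local/cuda-8.0/bin", ""))
```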

Operation check

Build the sample programs to verify the installation. (This step is optional.)

$ cuda-install-samples-8.0.61.sh test
$ cd test/NVIDIA_CUDA-8.0_Samples
$ sed -i "s/nvidia-367/nvidia-375/g" `grep "nvidia-367" -r ./ -l`
$ make

Installation of cuDNN 5.1

URL: https://developer.nvidia.com/cudnn

Downloading requires membership in the NVIDIA Developer Program. Because authentication is required, download the file to your local PC and upload it to EC2 via SCP. (Here, cudnn-8.0-linux-x64-v5.1.tgz is used.)

SCP from local

$ scp -i ~/.ssh/kp_01.pem ~/Downloads/cudnn-8.0-linux-x64-v5.1.tgz ubuntu@<Public DNS>:~/.

Install on EC2 (file extraction and placement only)

$ tar zxvf cudnn-8.0-linux-x64-v5.1.tgz
$ sudo cp cuda/include/* ${CUDA_HOME}/include/.
$ sudo cp cuda/lib64/* ${CUDA_HOME}/lib64/.
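
Since this installation is only a file copy, nothing reports which cuDNN version ended up in place. One way to confirm it is to read the version macros from the copied header. The sketch below assumes the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as plain #define lines. The function name cudnn_version_from_header and the sample excerpt are hypothetical.

```python
import re


def cudnn_version_from_header(header_text: str) -> str:
    """Extract "MAJOR.MINOR.PATCHLEVEL" from the #define lines of a cudnn.h-style header."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"^\s*#define\s+" + name + r"\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            raise ValueError("missing " + name)
        parts.append(match.group(1))
    return ".".join(parts)


# Hypothetical excerpt; on the instance, read ${CUDA_HOME}/include/cudnn.h instead.
sample_header = """
#define CUDNN_MAJOR      5
#define CUDNN_MINOR      1
#define CUDNN_PATCHLEVEL 10
"""

print(cudnn_version_from_header(sample_header))
```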

Install NVIDIA CUDA Profile Tools Interface (libcupti-dev)

You can install it with apt-get.

$ sudo apt-get install libcupti-dev

In my case, however, the command failed with the error "*** is not a symbolic link", which I resolved as follows. (Reference: http://stackoverflow.com/questions/43016255/libegl-so-1-is-not-a-symbolic-link)

$ sudo mv /usr/lib/nvidia-375/libEGL.so.1 /usr/lib/nvidia-375/libEGL.so.1.org
$ sudo ln -s /usr/lib/nvidia-375/libEGL.so.375.39 /usr/lib/nvidia-375/libEGL.so.1

$ sudo mv /usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn.so.5 /usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn.so.5.org
$ sudo ln -s /usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn.so.5.1.10 /usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn.so.5

$ sudo mv /usr/lib32/nvidia-375/libEGL.so.1 /usr/lib32/nvidia-375/libEGL.so.1.org
$ sudo ln -s /usr/lib32/nvidia-375/libEGL.so.375.39 /usr/lib32/nvidia-375/libEGL.so.1
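
The "is not a symbolic link" message is printed by ldconfig, which expects names such as libEGL.so.1 to be symlinks to the fully versioned library file. Copying libraries with a plain cp follows symlinks and leaves regular files in their place, which is one way to end up in this state. To find such entries before running the mv and ln -s pairs above, something like the following sketch may help. The function name non_symlink_sonames is hypothetical, and the demonstration uses stand-in files in a scratch directory, not real libraries.

```python
import os
import tempfile


def non_symlink_sonames(directory: str, names: list) -> list:
    """Return the entries of `names` that exist in `directory` as regular files, not symlinks."""
    offenders = []
    for name in names:
        path = os.path.join(directory, name)
        if os.path.exists(path) and not os.path.islink(path):
            offenders.append(name)
    return offenders


# Demonstrate in a scratch directory with empty stand-in files.
with tempfile.TemporaryDirectory() as scratch:
    real = os.path.join(scratch, "libdemo.so.1.2.3")
    open(real, "w").close()
    # A plain file where a link is expected, and a proper symlink.
    open(os.path.join(scratch, "libdemo.so.1"), "w").close()
    os.symlink(real, os.path.join(scratch, "libdemo.so.2"))
    result = non_symlink_sonames(scratch, ["libdemo.so.1", "libdemo.so.2"])

print(result)
```

On the instance, the directory would be one of those in the commands above (for example /usr/lib/nvidia-375) and the names would be the files reported in the error.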

GPU settings

Referring to http://docs.aws.amazon.com/ja_jp/AWSEC2/latest/UserGuide/accelerated-computing-instances.html, apply the settings described under "Optimize GPU settings (P2 instance only)".

$ sudo nvidia-smi -pm 1
$ sudo nvidia-smi --auto-boost-default=0
$ sudo nvidia-smi -ac 2505,875

Python environment

Set up a pyenv + miniconda environment, following the referenced article ("There is actually a problem with anaconda alone").

pyenv

URL: https://github.com/pyenv/pyenv#installation

Clone the repository with git and add the following to ~/.bash_profile.

$ git clone https://github.com/pyenv/pyenv.git ~/.pyenv


export PYENV_ROOT="${HOME}/.pyenv"
export PATH="${PYENV_ROOT}/bin${PATH:+:${PATH}}"
eval "$(pyenv init -)"

miniconda

Install the latest miniconda (here, miniconda3-4.3.11) with pyenv.

$ pyenv install -l | grep miniconda

$ pyenv install miniconda3-4.3.11

Then add the following to ~/.bash_profile.


export CONDA_HOME="${PYENV_ROOT}/versions/miniconda3-4.3.11"
export PATH="${CONDA_HOME}/bin${PATH:+:${PATH}}"

TensorFlow

Following the "Install with Anaconda" procedure, create an Anaconda environment with conda and install TensorFlow.

$ conda create -n tensorflow python=3.5 anaconda
$ source activate tensorflow
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.1.0-cp35-cp35m-linux_x86_64.whl
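
After the install, it is worth confirming that TensorFlow actually sees the GPU before moving on. The following is a minimal sketch rather than an official check. It uses device_lib.list_local_devices() from tensorflow.python.client and returns None when TensorFlow cannot be imported, so it can be run in any environment. The function name list_gpu_device_names is hypothetical.

```python
def list_gpu_device_names():
    """Return names of GPU devices TensorFlow can see, or None if TensorFlow is not importable."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [d.name for d in device_lib.list_local_devices() if d.device_type == "GPU"]


gpus = list_gpu_device_names()
if gpus is None:
    print("TensorFlow is not installed in this environment")
elif gpus:
    print("GPU devices:", gpus)
else:
    print("TensorFlow is installed but sees no GPU")
```

If the list comes back empty on the p2 instance, the CUDA and cuDNN steps above are the first things to recheck.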

Jupyter notebook

URL: http://jupyter-notebook.readthedocs.io/en/latest/public_server.html

Configure Jupyter notebook so that a server started on EC2 can be reached from your local PC.

Creating a server certificate and key file

(tensorflow)$ mkdir certificate
(tensorflow)$ cd certificate
(tensorflow)$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout mykey.key -out mycert.pem

Password hash value creation

(tensorflow)$ python
>>> from notebook.auth import passwd
>>> passwd()
Enter password: 
Verify password:
>>> exit()
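
passwd() does not store the password itself. It returns a salted hash string that is safe to place in a configuration file. The sketch below illustrates the idea under the assumption of an "algorithm:salt:digest" layout with sha1, which notebook releases of this period produced; newer releases may default to a different algorithm. The names make_hash and check_hash are hypothetical and are not part of the notebook package.

```python
import hashlib
import secrets


def make_hash(passphrase: str, salt: str = "") -> str:
    """Build an "algorithm:salt:digest" string using sha1 over passphrase + salt."""
    salt = salt or secrets.token_hex(6)  # 12 hex characters
    digest = hashlib.sha1(passphrase.encode("utf-8") + salt.encode("ascii")).hexdigest()
    return ":".join(("sha1", salt, digest))


def check_hash(stored: str, passphrase: str) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    algorithm, salt, digest = stored.split(":", 2)
    h = hashlib.new(algorithm)
    h.update(passphrase.encode("utf-8") + salt.encode("ascii"))
    return secrets.compare_digest(h.hexdigest(), digest)


stored = make_hash("example-password")  # hypothetical password for illustration
print(check_hash(stored, "example-password"), check_hash(stored, "wrong"))
```

Because the salt is random, hashing the same password twice gives different strings, yet both verify correctly.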

Jupyter configuration file creation

Output the template of the configuration file

(tensorflow)$ jupyter notebook --generate-config

Add the following settings to the generated ~/.jupyter/jupyter_notebook_config.py.


c.NotebookApp.certfile = '/home/ubuntu/certificate/mycert.pem'
c.NotebookApp.keyfile = '/home/ubuntu/certificate/mykey.key'
c.NotebookApp.ip = '*'
c.NotebookApp.port = 9999
c.NotebookApp.open_browser = False
c.NotebookApp.password = '<hash value returned by passwd()>'

Start Jupyter notebook

(tensorflow)$ jupyter notebook

Open https://<Public DNS>:9999 in a browser on your local PC and the password prompt is displayed. Enter the password you chose when creating the password hash to log in.
