[PYTHON] I got stuck running TensorFlow on a GPU with NVIDIA driver 440 + CUDA 10.2

This is the story of how hard it was to get TensorFlow running on a GPU. The conclusion first: CUDA 10.2 alone does not work, and CUDA 10.1 has to be installed as well. (* For tensorflow 2.2.0)

Advance preparation

Install the latest driver and the latest CUDA (10.2) from the NVIDIA page.

$ nvidia-smi  
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82       Driver Version: 440.82       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 207...  Off  | 00000000:01:00.0  On |                  N/A |

+-------------------------------+----------------------+----------------------+
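As far as I understand, the "CUDA Version" in this header is the highest CUDA version the driver supports, not a statement about which toolkit libraries are installed, which matters for what follows. The sketch below pulls the two version numbers out of the header line. The function name `parse_smi_header` is my own, and the regular expression assumes the header layout shown above.

```python
import re

# Matches the header line printed by nvidia-smi, as shown in the output above.
HEADER_RE = re.compile(
    r"Driver Version:\s*(?P<driver>[\d.]+)\s+CUDA Version:\s*(?P<cuda>[\d.]+)"
)


def parse_smi_header(text):
    """Return (driver_version, cuda_version) from nvidia-smi output, or None."""
    match = HEADER_RE.search(text)
    if match is None:
        return None
    return match.group("driver"), match.group("cuda")


sample = "| NVIDIA-SMI 440.82       Driver Version: 440.82       CUDA Version: 10.2     |"
print(parse_smi_header(sample))  # ('440.82', '10.2')
```

In practice the text would come from running `nvidia-smi` and capturing its standard output.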

Preparations for using tensorflow gpu

Using https://www.tensorflow.org/install/gpu?hl=ja as a reference, install the packages that match CUDA 10.2. These were the latest versions as of May 10, 2020.

$ sudo apt-get install --no-install-recommends \
        libcudnn7=7.6.5.32-1+cuda10.2  \
        libcudnn7-dev=7.6.5.32-1+cuda10.2

Install tensorflow

$ python -m pip install -U pip # Keep pip up to date
$ pip install tensorflow
$ pip install tf-nightly # These two are the recommended packages
$ pip install tensorflow-gpu # Installed as well, just in case
$ pip install tensorflow-addons 
# Check the versions
$ pip list |grep tensor
tensorboard             2.2.1
tensorboard-plugin-wit  1.6.0.post3
tensorflow              2.2.0
tensorflow-addons       0.9.1
tensorflow-estimator    2.2.0
tensorflow-gpu          2.2.0

Everything appears to have installed without problems.
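The same version check can be done from Python instead of `pip list | grep`. This is only a sketch: `installed_versions` is a helper name I made up, and the fallback to `pkg_resources` assumes setuptools is available on older interpreters such as the Python 3.6.9 used here.

```python
try:
    # Standard library on Python 3.8 and later.
    from importlib.metadata import PackageNotFoundError, version
except ImportError:
    # Fallback for older interpreters; assumes setuptools is installed.
    import pkg_resources

    PackageNotFoundError = pkg_resources.DistributionNotFound

    def version(name):
        return pkg_resources.get_distribution(name).version


def installed_versions(names):
    """Map each package name to its installed version, or None if absent."""
    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


packages = ["tensorflow", "tensorflow-gpu", "tensorflow-addons"]
for name, ver in installed_versions(packages).items():
    print(name, ver or "not installed")
```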

Check if GPU works with tensorflow

$ python
Python 3.6.9 (default, Apr 18 2020, 01:56:04) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.test.is_gpu_available()
WARNING:tensorflow:From <stdin>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2020-05-12 22:03:50.049513: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-05-12 22:03:50.095310: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 3000000000 Hz
2020-05-12 22:03:50.097049: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fc6ec000b20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-05-12 22:03:50.097116: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-05-12 22:03:50.109698: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-05-12 22:03:50.217541: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-12 22:03:50.217838: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4703970 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-05-12 22:03:50.217852: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce RTX 2070 SUPER, Compute Capability 7.5
2020-05-12 22:03:50.218622: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-12 22:03:50.218835: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce RTX 2070 SUPER computeCapability: 7.5
coreClock: 1.77GHz coreCount: 40 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 417.29GiB/s
2020-05-12 22:03:50.218998: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:
2020-05-12 22:03:50.244848: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-05-12 22:03:50.263385: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-05-12 22:03:50.267797: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-05-12 22:03:50.304564: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-05-12 22:03:50.311052: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-05-12 22:03:50.378673: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-05-12 22:03:50.378768: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1598] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-05-12 22:03:50.378827: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-12 22:03:50.378855: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0 
2020-05-12 22:03:50.378904: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0:   N 
False
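The one line that matters is easy to miss in a log this long. The sketch below filters a captured log for the `Could not load dynamic library` warnings and lists the library names. `missing_libraries` is a hypothetical helper, and the pattern assumes the message format seen in this particular log.

```python
import re

# Pattern based on the dso_loader warning format seen in the log above.
MISSING_RE = re.compile(r"Could not load dynamic library '([^']+)'")


def missing_libraries(log_text):
    """Return the library names reported as not loadable, in order, without duplicates."""
    seen = []
    for name in MISSING_RE.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


log = """\
2020-05-12 22:03:50.218998: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:
2020-05-12 22:03:50.244848: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
"""
print(missing_libraries(log))  # ['libcudart.so.10.1']
```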

It failed. The log reports Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file, so it seems that CUDA 10.1 needs to be installed as well.
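Whether a given shared library can be opened can also be checked without starting TensorFlow. The sketch below asks the dynamic loader directly through `ctypes`. I am assuming this is close to what TensorFlow does internally (the `dlerror` in the log suggests `dlopen`), and `can_load` is a helper name of my own.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


# Library names taken from the log above; 10.2 is added for comparison.
for name in ["libcudart.so.10.1", "libcudart.so.10.2", "libcudnn.so.7"]:
    print(name, "OK" if can_load(name) else "not found")
```

The search uses the same loader paths as any other program, so `LD_LIBRARY_PATH` affects the result.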

Retrying with CUDA 10.1

$ sudo apt-get install --no-install-recommends cuda-10-1
(installation output omitted)
$ python
Python 3.6.9 (default, Apr 18 2020, 01:56:04) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.test.is_gpu_available()
(log output omitted)
True

Notably, it worked without having to downgrade the libcudnn7 package.
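The deprecation warning in the first log recommends `tf.config.list_physical_devices('GPU')` over `tf.test.is_gpu_available()`. A minimal sketch of the same check with the newer call follows. The wrapper `list_gpus` is my own, and returning an empty list when TensorFlow cannot be imported is a choice made here so the snippet runs anywhere.

```python
def list_gpus():
    """Return the names of GPUs visible to TensorFlow, or [] if TensorFlow is not importable."""
    try:
        import tensorflow as tf
    except ImportError:
        return []
    # Each entry is a PhysicalDevice; an empty list means no usable GPU was registered.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


print(list_gpus())
```

An empty list corresponds to the `False` from the first attempt, and a non-empty list to the `True` after CUDA 10.1 was installed.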
