- I want to create a data analysis environment with **Docker**.
- I want to use **Python**, **R**, or **Julia** with **Jupyter**.
- I want standard libraries and packages for each language included from the start.

This article is intended for:

- People who want to build a basic data science environment with Docker
- People who are not very familiar with Docker or Jupyter
- People who want an easy way to set up a data analysis environment

If that describes you, the goal of this article is that you can build a data analysis environment simply by entering the commands as shown.
So far, startup has been confirmed on macOS High Sierra and Ubuntu 18.04.
None of the steps below work unless Docker itself is installed. If it is not, first install Docker for your OS from the official Docker website.
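
Before continuing, you may want to confirm that the `docker` command is on your PATH. The following is a minimal sketch in Python, assuming that finding the executable is a good enough signal (it does not check whether the Docker daemon is running). `docker_available` is a hypothetical helper name, not part of any library.

```python
import shutil


def docker_available() -> bool:
    """Return True if a `docker` executable can be found on PATH.

    Assumption: this only locates the command; it does not verify
    that the Docker daemon is actually running.
    """
    return shutil.which("docker") is not None


if __name__ == "__main__":
    if docker_available():
        print("docker command found")
    else:
        print("docker command not found; install Docker first")
```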
Docker works with Docker images, each of which packages an application together with its environment. Choose the image that matches the environment you want to build.
This time I want to use Jupyter Notebook, so I will use a Docker image published on the official Jupyter GitHub.
The Jupyter project publishes several kinds of Docker images. This time I want an environment that includes all of the languages, so I use the one called `datascience-notebook`. For details on this Docker image, see the official Jupyter website.
The other Docker images published by the Jupyter project, and their features, are easiest to understand from the chart below.
Now that you have decided which image to use, first download it to your local machine with `docker pull`.
$ docker pull jupyter/datascience-notebook
After `docker pull`, running `docker run jupyter/datascience-notebook` does start the container, but it leaves the following problems:

- No password is set.
- Data is not persisted.

If you do not persist your data, you will run into trouble later. A password, on the other hand, is often unnecessary when you work only on your local machine, so you can skip that part.
Setting a password takes a few steps, so I will cover it first. If you do not need a password, skip this section.
First, enter a `bash` shell inside the Docker container with the following command.
$ docker run -it --rm jupyter/datascience-notebook /bin/bash
Once you are safely inside the container's `bash` shell, the prompt changes to the following (the part after `@` is an alphanumeric string that differs for each container).
jovyan@<alphanumeric string>:~$
Once the prompt has changed, use Python inside the Docker container to obtain the hash string for the password you want to use. The following command starts a prompt that hashes the password into a string.
(Inside docker container) $ python -c 'from notebook.auth import passwd;print(passwd())'
When you enter the command, you will be prompted for the password you want to use, so enter it twice.
(Inside docker container) $ python -c 'from notebook.auth import passwd;print(passwd())'
Enter password:
Verify password:
sha1:YOUR_PASSWORD_HASH_VALUE
Make a note of the `sha1:YOUR_PASSWORD_HASH_VALUE` string printed after you enter the password twice (the actual value differs per environment), because you will use it later.
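
To give a feel for what such a hash string contains, here is a minimal sketch of salted password hashing using only the Python standard library. The `algorithm:salt:digest` layout is an assumption about the general shape of the string, and `make_hash` and `check_hash` are hypothetical names. This is for illustration only; generate the real value with `notebook.auth.passwd` as shown above.

```python
import hashlib
import secrets


def make_hash(password: str, algorithm: str = "sha1") -> str:
    """Return an 'algorithm:salt:digest' string (illustrative layout only)."""
    salt = secrets.token_hex(6)  # 12 random hex characters
    digest = hashlib.new(algorithm, (password + salt).encode("utf-8")).hexdigest()
    return f"{algorithm}:{salt}:{digest}"


def check_hash(hashed: str, password: str) -> bool:
    """Recompute the digest with the stored salt and compare the results."""
    algorithm, salt, digest = hashed.split(":", 2)
    candidate = hashlib.new(algorithm, (password + salt).encode("utf-8")).hexdigest()
    return secrets.compare_digest(candidate, digest)
```

Because the salt is random, hashing the same password twice gives different strings, which is why the value you obtain differs from anyone else's.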
Once you have the hash string, you no longer need to work in this container, so exit back to your local environment. Because the container was started with the `--rm` option, it is deleted automatically when it stops.
(Inside docker container) $ exit
Now that you have the hash string of the password you want to use, it is time to launch the Docker container for analysis. You can set the password and persist files by passing additional options when starting the container.
As mentioned earlier, depending on your environment you may not need a password. Be aware, however, that if you do not persist your files, everything you did in the container will be lost.
Start the Docker container of the data analysis environment with the following command.
$ docker run \
--user root \
-e GRANT_SUDO=yes \
-e NB_UID=$UID \
-e NB_GID=$GID \
-e TZ=Asia/Tokyo \
-p 8888:8888 \
--name notebook \
-v ~/path/to/directory/:/home/jovyan/work \
jupyter/datascience-notebook \
start-notebook.sh \
--NotebookApp.password='sha1:YOUR_PASSWORD_HASH_VALUE'
To explain the options:
The option `-v ~/path/to/directory/:/home/jovyan/work` synchronizes the `work` directory that you see in Jupyter Notebook, and everything below it, with a local directory.
The `~/path/to/directory/` part sets the local directory through which files are exchanged with the Docker container, so replace it with a directory of your choice, such as your own working directory.
The Jupyter Notebook password is set by `--NotebookApp.password='sha1:YOUR_PASSWORD_HASH_VALUE'`.
Replace `YOUR_PASSWORD_HASH_VALUE` with the hash string you generated earlier.
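
If you start the container often, assembling the command in a script can make it harder to mistype the mount path or the hash. The following is a minimal sketch, not the only way to do it. `build_run_command` is a hypothetical helper, the option values simply mirror the `docker run` command above, and it assumes a Unix-like OS where `os.getuid()` and `os.getgid()` are available.

```python
import os
import shlex
from pathlib import Path
from typing import List


def build_run_command(host_dir: str, password_hash: str, port: int = 8888) -> List[str]:
    """Build the `docker run` argument list used in this article.

    host_dir: local directory to mount at /home/jovyan/work
    password_hash: the hash string generated earlier
    """
    host_path = Path(host_dir).expanduser()  # expand a leading "~"
    return [
        "docker", "run",
        "--user", "root",
        "-e", "GRANT_SUDO=yes",
        "-e", f"NB_UID={os.getuid()}",
        "-e", f"NB_GID={os.getgid()}",
        "-e", "TZ=Asia/Tokyo",
        "-p", f"{port}:8888",
        "--name", "notebook",
        "-v", f"{host_path}:/home/jovyan/work",
        "jupyter/datascience-notebook",
        "start-notebook.sh",
        f"--NotebookApp.password={password_hash}",
    ]


if __name__ == "__main__":
    cmd = build_run_command("~/path/to/directory/", "sha1:YOUR_PASSWORD_HASH_VALUE")
    # Print the command for review; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting issues around the hash string.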
If the command above runs without errors, Jupyter should have started successfully.
Open it in a browser to check.
In most cases, it is available at http://localhost:8888.
If you built the environment on a server, replace localhost with the IP address of that server.
On a server, access may fail because the port on the server itself is closed.
In that case, open the port that Jupyter Notebook uses.
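
If you are unsure whether the port is reachable, a quick TCP connection test can help narrow down the problem. The following is a minimal sketch using the Python standard library; `port_is_open` is a hypothetical helper, and a successful connection only shows that something is listening on the port, not that it is Jupyter.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `port_is_open("localhost", 8888)` checks the default port used in this article; on a server, pass the server's IP address instead of `localhost`.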
The image below shows the page after a successful start.
If you see a page like this, log in with the password you set earlier.
![ss_2017-06-30_17.11.51.png](https://qiita-image-store.s3.amazonaws.com/0/43351/c99a80ea-9f4e-5c71-5cc8-3dafa9ae187c.png)
After that, enjoy the Jupyter environment in your favorite language!
Let's enjoy data science!