[Python] Data Science 100 Knock (Structured Data Processing): Environment Setup (Windows 10)

Introduction

The Japan DataScientist Society has published **"Data Science 100 Knock (Structured Data Processing)"** [on GitHub](https://github.com/The-Japan-DataScientist-Society/100knocks-preprocess), a free learning environment for hands-on practice in processing structured data. This article walks through the setup procedure in detail so that even beginners can build the environment themselves. The execution environment to be built is shown in the figure below.

[Figure: dss_structure.png]

Prerequisites (Windows10)

  1. Docker Desktop for Windows
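  2. Git for Windows

After installing Git, run the following so that line endings are not converted to CRLF on checkout:

> git config --global core.autocrlf input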
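
The original article does not say why this setting is needed. A likely reason is that files checked out with Windows-style CRLF line endings can misbehave once they are read inside the Linux containers. The sketch below is not part of the repository: `has_crlf` and `find_crlf_files` are hypothetical helpers, and the list of file suffixes is an assumption. It lists files in a cloned directory that still contain CRLF.

```python
from pathlib import Path


def has_crlf(data: bytes) -> bool:
    """Return True if the bytes contain a Windows-style CRLF line ending."""
    return b"\r\n" in data


def find_crlf_files(root, suffixes=(".sh", ".sql", ".yml")):
    """List files under root with one of the given suffixes that contain CRLF.

    The suffix list is an illustrative assumption, not taken from the repository.
    """
    hits = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes and has_crlf(path.read_bytes()):
            hits.append(str(path))
    return sorted(hits)
```

Calling `find_crlf_files("dss/100knocks-preprocess")` after cloning should return an empty list if the setting took effect.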

Environment

Create a directory for the learning environment (dss in this example) and clone the 100 Knock repository into it. Then move into the cloned 100knocks-preprocess directory and build and start the containers with the docker-compose command. (This takes about 10 minutes.)

> mkdir dss
> cd dss
> git clone https://github.com/The-Japan-DataScientist-Society/100knocks-preprocess.git
> cd 100knocks-preprocess
> docker-compose up -d --build
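
If you would rather script these steps, the following is a minimal Python sketch of the same sequence. The function names `setup_commands` and `run_setup` and the `dry_run` flag are made up for illustration; only the commands themselves come from the steps above. By default it only prints what it would run.

```python
import os
import subprocess

# Repository URL taken from the clone command above.
REPO_URL = "https://github.com/The-Japan-DataScientist-Society/100knocks-preprocess.git"


def setup_commands(base_dir="dss"):
    """Return (command, working directory) pairs mirroring the manual steps."""
    repo_dir = os.path.join(base_dir, "100knocks-preprocess")
    return [
        (["git", "clone", REPO_URL], base_dir),
        (["docker-compose", "up", "-d", "--build"], repo_dir),
    ]


def run_setup(base_dir="dss", dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    if not dry_run:
        os.makedirs(base_dir, exist_ok=True)  # equivalent of `mkdir dss`
    for cmd, cwd in setup_commands(base_dir):
        print(f"[{cwd}] {' '.join(cmd)}")
        if not dry_run:
            # check=True raises CalledProcessError if a step fails.
            subprocess.run(cmd, cwd=cwd, check=True)
```

Run `run_setup()` first to review the commands, then `run_setup(dry_run=False)` to execute them.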

List the running containers. If both **"dss-notebook"** and **"dss-postgres"** appear in the output, the environment was built successfully.

> docker ps

CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                    NAMES
b35f99d4148a        dss-notebook        "tini -g -- start-no…"   23 seconds ago      Up 22 seconds       0.0.0.0:8888->8888/tcp   dss-notebook
3cb559c7f66d        dss-postgres        "docker-entrypoint.s…"   27 seconds ago      Up 26 seconds       0.0.0.0:5432->5432/tcp   dss-postgres
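
The same check can be automated. The sketch below assumes the container names shown in the output above and uses `docker ps --format "{{.Names}}"` to print one name per line. `missing_containers` and `check_running` are hypothetical helper names.

```python
import subprocess

# Container names as they appear in the NAMES column above.
EXPECTED = {"dss-notebook", "dss-postgres"}


def missing_containers(names_output, expected=EXPECTED):
    """Return the expected container names absent from the `docker ps` name list."""
    running = {line.strip() for line in names_output.splitlines() if line.strip()}
    return set(expected) - running


def check_running():
    """Query Docker for running container names and report which are missing."""
    result = subprocess.run(
        ["docker", "ps", "--format", "{{.Names}}"],
        capture_output=True, text=True, check=True,
    )
    return missing_containers(result.stdout)
```

An empty set from `check_running()` means both containers are up.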

How to use

Open the following URL in a browser to access the Jupyter environment you just built.

http://localhost:8888
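
If the page does not load, it can help to check whether anything answers at that address at all. This is a rough sketch using only the standard library. `jupyter_is_up` is a hypothetical name, and treating any HTTP response (even an error status) as "up" is a simplifying assumption.

```python
import urllib.error
import urllib.request


def jupyter_is_up(url="http://localhost:8888", timeout=3.0):
    """Return True if any HTTP response comes back from url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or name resolution failure.
        return False
```

A `False` result right after `docker-compose up` may simply mean the notebook server is still starting.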

The work directory contains an .ipynb file with the structured data processing exercises. **The first cell already contains the code that imports the required libraries and loads the data.** Write the processing for each exercise in the blank cell beneath it and run it to work through the material.

[Figure: dss_jupyter_work.png]

The answers are in the .ipynb file in the work/answer directory, so you can check your own processing against them as you go.

[Figure: dss_jupyter_answer.png]

Stopping / starting the learning environment

You can stop the built environment with the following command.

> docker-compose stop

From the second time onward, start the environment with the following command.

> docker-compose start

Supplementary information

If the environment responds slowly

In Docker Desktop for Windows, open Settings > Resources and increase the Memory value. 4.00 GB or more is recommended.

[Figure: docker_settings_resources.png]

If port 8888 is in use

If port 8888 on localhost is already used by another development environment (LAMP, etc.), change the host-side port of the notebook service in docker-compose.yml as follows.

docker-compose.yml


  notebook:
    ports:
      - "888:8888"

With the setting above, Jupyter is available at the following URL.

http://localhost:888
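
Before editing docker-compose.yml, you may want to confirm that port 8888 is really taken. The sketch below tries to connect to the port and reports whether something is listening. `port_in_use` is a hypothetical helper, and checking only 127.0.0.1 is an assumption.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds, i.e. a listener exists.
        return s.connect_ex((host, port)) == 0
```

For example, `port_in_use(8888)` returning `True` while the learning environment is stopped suggests another application holds the port.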

Summary

This article described how to set up the Data Science 100 Knock (Structured Data Processing) environment on Windows 10. If you have any questions or concerns about the procedure, please leave a comment.

Reference link

Data Science 100 Knock Guide
