Python environment construction 2016 for those who aim to be data scientists

About python environment construction

If you google "python environment construction", you will get about 200,000 hits, but most of the content is fairly old. The title says data scientists, but I recommend Anaconda to people other than data scientists as well.

- 2.x or 3.x? 3.x has many libraries that don't work, so 2.x is recommended > There are libraries that don't work with 3.x.
- pip with easy_install, setuptools, but there is also wheel ... > That's old.
- virtualenv is required > That's not true.
- On Windows, 64-bit has many problems, so 32-bit is recommended > That's old.
- On Windows, download and install from the unofficial binaries > Thank you for all your help, but I haven't used them recently.

**The definitive 2016 edition of Python environment construction for each OS**

The important point is that you don't have to install **official Python**.

Challenges of building a python environment

- It's better than before, but you still need to use the 2 series and the 3 series selectively, depending on the purpose and the libraries used.
  - A version management system is required: pyenv, pythonz, etc.
- There are cases where you want separate development environments, for example because some libraries cannot coexist, or because you want to try the dev version of a new library.
  - An environment management system is required: virtualenv, pyvenv (Python 3.4 or later), etc.
- The package management system makes beginners cry.
  - The standard is pip (which installs from PyPI, the Python Package Index).
  - pip is not pre-installed up to Python 3.3, so you install it from the package management system called setuptools (currently distribute) ...
  - The package management system needs package management of its own. Wasn't it supposed to be "batteries included"?
  - I don't understand it anymore. The page linked here goes into detail.
- Linux and Mac come with Python pre-installed, but in most cases it is Python 2.x.
  - When you try to upgrade the system Python, you definitely get stuck.
  - In the end you install multiple versions of Python with apt-get, yum, or brew, but the environment variables conflict.
  - I installed ipython with pip on python3 but can't use it > It was installed into another Python orz ...
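The last complaint in the list above (a package installed with pip ending up in a different Python) can be checked from inside the interpreter itself. The following is only a minimal sketch using the standard library, not part of the original procedure; the helper name `describe_interpreter` is hypothetical.

```python
import importlib.util
import sys


def describe_interpreter(module_name):
    """Report which Python is running and where it would import a module from."""
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,  # path of the running interpreter
        "version": "%d.%d" % sys.version_info[:2],
        # None means this interpreter cannot see the module at all
        "module_location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    # "json" is used here only because it always exists; try "IPython" instead
    for key, value in describe_interpreter("json").items():
        print(key, "=", value)
```

If `module_location` comes back as `None` for a package you believe you installed, it probably went into another Python. Running pip as `python -m pip install <package>` with the interpreter you actually intend to use ties the install to that interpreter.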

With the definitive environment construction method, you don't have to worry about any of these.

What is Anaconda?

Anaconda

- One of the Python distributions; it installs the major libraries all in one (numpy, scipy, pandas, ipython, jupyter, scikit-learn, etc.).
  - There is also Miniconda, which has a minimal configuration.
- Both Python 2.x and 3.x are supported (Python 3.x since summer 2014).
- Linux, Mac, and Windows versions are available.
  - Both 32-bit and 64-bit are available.
- There are similar distributions such as Enthought and Python(x,y), but only Anaconda supports all platforms and both 2/3.

What is conda?

- You can use a package management system called conda: **instead of pip**
  - Currently, there are over 400 supported packages.
  - Since pip compiles on the client side, it may fail depending on the environment.
  - Since conda can be used together with pip, libraries not included in conda can be installed with pip.
- conda can also do version management: **instead of pyenv**
  - For example, if you install anaconda3-xx.xx, everything is installed based on Python 3.5.
  - `conda create -n py2 python=2.7`
  - Typing the above command builds a virtual environment for Python 2.7.
  - Use `source activate py2` to enter the Python 2.7 virtual environment.
- conda can also do virtual environment management: **instead of virtualenv / venv**
  - You can install packages with conda or pip inside a virtual environment.
  - Since a virtual environment created by conda can also absorb different versions of Python, it can be called **a superset of virtualenv**.
  - In fact, when you try to use virtualenv with Anaconda, you get a warning telling you to create the environment with conda. (A virtualenv can still be created.)
- conda is the best.
  - It solves three of the items raised as challenges of environment construction. (Because there is a fourth, on Linux / Mac it should be combined with pyenv.)

Environment construction for each OS

For Windows

Windows does not come with Python pre-installed. In addition, pyenv doesn't support Windows at all, so you also have to think about version management. But don't worry: installing Anaconda alone is enough.

Environment construction method

- Download the installer for Windows from the download site.
  - I think 3.5 is fine, but if you only use 2.7, download the 2.7 installer.
  - You don't have to worry too much, because you can create a 2.7 environment later with conda.
- Run the exe and follow the instructions to install. (You will be asked along the way whether to add it to the path.)

The unofficial binaries don't resolve dependencies, so they are painful when what you want has complex dependencies...

For Linux

The system comes pre-installed with python.

- In most cases it is 2.x, so you will want 3.x to coexist with it.
  - You get stuck when you touch the system Python.
  - Even if you install it separately with `apt-get install python3` etc., you have to adjust the environment variables yourself.
- Anaconda on its own actually causes a problem.
  - It installs curl, sqlite3, openssl, etc. along with it, and these take over from the system versions.
- The recommendation is to install it via pyenv.
  - The current pyenv supports Anaconda.

Environment construction method

  1. Install pyenv.
$ git clone https://github.com/yyuu/pyenv.git ~/.pyenv
$ echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
$ echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
$ echo 'eval "$(pyenv init -)"' >> ~/.bashrc
$ source ~/.bashrc
  2. Install Anaconda. Either the 3 series or the 2 series works, but 3 is not inconvenient in most cases. Both can coexist under pyenv, but one is enough because you can switch 2/3 with conda.
$ pyenv install -l | grep ana
# Check the latest version: anaconda3-2.5.0 (the 2 series is anaconda2-2.5.0)
# If you prefer miniconda, install miniconda instead.
$ pyenv install anaconda3-2.5.0
$ pyenv rehash
$ pyenv global anaconda3-2.5.0
#Set anaconda as the main python.
$ echo 'export PATH="$PYENV_ROOT/versions/anaconda3-2.5.0/bin/:$PATH"' >> ~/.bashrc
$ source ~/.bashrc
# The activate of pyenv and the activate of anaconda conflict, so put anaconda explicitly on the PATH.
$ conda update conda
#Update conda itself just in case.

(Corrected on 3/24) The line
$ echo 'export PATH="$PYENV_ROOT/versions/anaconda3-2.5.0/bin/:$PATH"' >> ~/.bashrc
has a side effect: Anaconda is always used, and neither pyenv global nor pyenv local is respected. If you also use Pythons other than Anaconda, or mainly want to work with pyenv local, leave it off the PATH and call conda's activate with its full path instead:
source ~/.pyenv/versions/anaconda3-2.5.0/bin/activate <environment name>
If you don't switch environments often and just use Anaconda, putting it on the PATH is easier.
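Why the extra PATH entry overrides pyenv can be reproduced without touching a real installation. The sketch below assumes that pyenv selects versions through a shims directory on the PATH and that the shell picks the first matching executable; the directory names and the helpers `make_fake_python` and `resolve_python` are hypothetical stand-ins, and the fake executables assume a POSIX system.

```python
import os
import shutil
import stat
import tempfile


def make_fake_python(directory):
    """Create a dummy executable file named 'python' in directory (POSIX only)."""
    path = os.path.join(directory, "python")
    with open(path, "w") as f:
        f.write("#!/bin/sh\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def resolve_python(search_dirs):
    """Return the 'python' found first for the given PATH order."""
    return shutil.which("python", path=os.pathsep.join(search_dirs))


with tempfile.TemporaryDirectory() as root:
    shims = os.path.join(root, "shims")            # stands in for pyenv's shims directory
    anaconda_bin = os.path.join(root, "anaconda")  # stands in for anaconda3-2.5.0/bin
    os.makedirs(shims)
    os.makedirs(anaconda_bin)
    make_fake_python(shims)
    make_fake_python(anaconda_bin)

    # Only the shims on the PATH: the shim is found, so pyenv can choose the version
    print(resolve_python([shims]))
    # anaconda prepended: its python is found first and the shim is never consulted
    print(resolve_python([anaconda_bin, shims]))
```

Whatever sits first on the PATH wins, which is why pyenv global and pyenv local stop having any effect once the Anaconda bin directory is prepended.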

(4/8) I wrote a separate article about the activate conflict that occurs when pyenv and Anaconda coexist.

For Mac

**I don't have a Mac, so this is hearsay. Please bear with me.**

Given the following story, it seems better to go through pyenv, as on Linux:
The story that the Homebrew environment was blown away when Anaconda was added

Environment construction method

I don't have one, so please refer to another person's article. I think the procedure is the same as on Linux, except that pyenv is installed via Homebrew instead of from GitHub.
Note from installing Homebrew to building an Anaconda environment for Python with pyenv

I haven't verified it myself, but as on Linux, the activate of Anaconda and the activate of pyenv will conflict. I think it's easier to put Anaconda on the PATH.

$ echo 'export PATH="$PYENV_ROOT/versions/anaconda3-2.5.0/bin/:$PATH"' >> ~/.bashrc
$ source ~/.bashrc

For Chrome OS

You can buy a MacBook Air-like one for about one-third the price (^^

See here: Create ubuntu environment with crouton. It is possible to install it, but it is a little troublesome.

Basic usage of conda

Virtual environment

- Building a virtual environment

**conda create -n <environment name> python=<version>**

conda create -n py2 python=2.7 numpy scipy pandas jupyter
# You can also install everything at once by specifying anaconda.
conda create -n anaconda2 python=2.7 anaconda

- Check virtual environments

conda env list
# The same list is also shown by this command.
conda info -e

- Entering and exiting the virtual environment

# Enter the virtual environment
source activate py2
# On Windows: activate py2
# Leave the virtual environment
source deactivate
# On Windows: deactivate

- Delete a virtual environment

conda remove -n py2 --all

Package management

- Package installation & uninstallation

conda install numpy scipy # Same as pip: multiple packages can be given, separated by spaces
conda install numpy=1.10.4 # A version can be specified
conda install -n py2 numpy scipy # The environment name can also be specified with -n

conda update numpy # Update

pip install numpy # pip also works; use it for packages that conda doesn't have
source activate py2;pip install numpy # To pip install into a virtual environment, activate it first

conda uninstall -n py2 numpy # Uninstall

- Check the packages

conda list
# List the packages currently installed

conda list -n py2
# A virtual environment can also be selected with -n

conda list --export > env.txt
conda create -n py2_copy --file env.txt
# You can export the list and reuse it.
# Packages installed with pip are not exported, so output them separately with pip freeze.
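An exported env.txt is plain text, so it is easy to inspect before reusing it. The following is a rough sketch that assumes each non-comment line of the export has the form name=version=build; the function name `parse_conda_export` is hypothetical, and the sample lines (including the build strings) are made up for illustration.

```python
def parse_conda_export(text):
    """Map package name -> version from `conda list --export` style text.

    Assumes each non-comment line looks like name=version=build.
    """
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        parts = line.split("=")
        name = parts[0]
        version = parts[1] if len(parts) > 1 else None
        packages[name] = version
    return packages


# Illustrative sample, not real output
sample = """\
# platform: linux-64
numpy=1.10.4=py27_0
scipy=0.17.0=np110py27_0
"""
print(parse_conda_export(sample))
```

Parsing two exports this way and comparing the resulting dictionaries is one way to see how two environments differ. Packages installed with pip will not show up in either, as noted above.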

Anaconda Cloud

Someone may have posted a package on anaconda cloud (anaconda.org) that is not provided by anaconda.

anaconda search -t conda ggplot
# Various results come up
# ...
#bokeh/ggplot              |    0.6.8 | conda           | linux-64, win-32, win-64, linux-32, osx-64
#                                          : ggplot for python
#...
anaconda show bokeh/ggplot # Specify <USER/PACKAGE> to see the details
#Using Anaconda Cloud api site https://api.anaconda.org
#Name:    ggplot
#Summary: ggplot for python
#Access:  public
#Package Types:  conda
#Versions:
#   + 0.6.5
#   + 0.6.8
conda install -c bokeh ggplot # Install the package by specifying the USER name with -c

The output is hard to read and it seems you can't search by OS, so it may be faster to search on https://anaconda.org/.

Bonus

- I think some people will say that data science means R!
- You can also install R with conda.

conda create -n r -c r r-irkernel

With this, you can even create an environment for using R in the now popular Jupyter.

- You can use R normally, or you can use it via RStudio.
- Setting up R for Jupyter by hand depends on the versions of R and rzmq, so doing it this way is easier.

Great, Anaconda.

Postscript, or a superfluous addition

This strays a little from the definitive edition. (Added 3/24)

pyenv

It can't be used on Windows, but on Mac / Linux there are some benefits to putting pyenv in between.

- It's easy to rebuild the environment.
  - When things get hopeless, you can run pyenv uninstall and pretend it never happened.
  - Alternatively, when a new version of Anaconda comes out, you can start using the new one.
- You can use implementations other than CPython, such as pypy, jython, and stackless.
- pyenv local can be used.

pyenv local

Of these, pyenv local is very useful.

For example, with the following setup, anaconda2 starts instead of anaconda3 only when you are inside the py2 directory.

mkdir py2
cd py2
pyenv local anaconda2-2.5.0

In fact, **you can also set an Anaconda virtual environment with pyenv local.**

conda create -n py2 python=2.7
pyenv local anaconda3-2.5.0/envs/py2
python
#Python 2.7.11 |Continuum Analytics, Inc.| (default, Dec  6 2015, 18:08:32)
#[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2

However, as described in [Linux version environment construction](http://qiita.com/y__sama/items/5b62d31cb7e6ed50f02c#%E7%92%B0%E5%A2%83%E6%A7%8B%E7%AF%89%E6%96%B9%E6%B3%95-1), if `$ echo 'export PATH="$PYENV_ROOT/versions/anaconda3-2.5.0/bin/:$PATH"' >> ~/.bashrc` has been set, pyenv local is ignored and anaconda3-2.5.0 is used.

Whether you set up an environment per project directory and work with pyenv local, or explicitly run `source activate <virtual environment name>` whenever you need to switch, depends on your preference and how you work, so please choose whichever you like.
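For reference, pyenv local works by writing a `.python-version` file into the directory, and pyenv looks for that file in the current directory and then in its parents. The sketch below is a simplified approximation of that lookup, written for illustration only: it ignores the `PYENV_VERSION` environment variable and the global fallback, and `find_local_version` is a hypothetical name, not part of pyenv.

```python
import os
import tempfile


def find_local_version(start_dir):
    """Walk up from start_dir and return the first .python-version content, or None."""
    current = os.path.abspath(start_dir)
    while True:
        candidate = os.path.join(current, ".python-version")
        if os.path.isfile(candidate):
            with open(candidate) as f:
                return f.read().strip()
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return None
        current = parent


with tempfile.TemporaryDirectory() as root:
    project = os.path.join(root, "py2")
    nested = os.path.join(project, "src")
    os.makedirs(nested)
    # Rough equivalent of running `pyenv local anaconda2-2.5.0` inside py2
    with open(os.path.join(project, ".python-version"), "w") as f:
        f.write("anaconda2-2.5.0\n")
    # The setting is also picked up from subdirectories of the project
    print(find_local_version(nested))
```

This is why the per-project approach suits people who keep one directory per project: the setting travels with the directory, and subdirectories inherit it.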

pyenv-virtualenv

What is very confusing is the pyenv plugin called pyenv-virtualenv. It is a different thing from virtualenv, Python's virtual environment management system: it lets you switch between virtual environments created with virtualenv or venv (Python 3.4 or later). Moreover, it even supports environments created by conda. As a result, it solves the conflict between pyenv local and source activate.

Installation is easy: clone it from GitHub into the plugins folder under the pyenv installation folder.

git clone https://github.com/yyuu/pyenv-virtualenv.git ~/.pyenv/plugins/pyenv-virtualenv
source ~/.bashrc

Simple usage of pyenv-virtualenv

#Check the environment
pyenv virtualenvs
#>>>anaconda3-2.5.0 (created from /home/vagrant/.pyenv/versions/anaconda3-2.5.0)
#>>>anaconda3-2.5.0/envs/py2 (created from /home/vagrant/.pyenv/versions/anaconda3-2.5.0/envs/py2)

#Activate the environment
pyenv activate anaconda3-2.5.0/envs/py2
python
#>>>Python 2.7.11 |Continuum Analytics, Inc.| (default, Dec  6 2015, 18:08:32)

#Deactivate the environment
pyenv deactivate

You can build environments with conda and switch between them with pyenv activate and pyenv local. For those who often switch virtual environments on Mac / Linux, installing pyenv-virtualenv may be more productive.
