[PYTHON] [Introduction to Data Scientists] Basics of scientific calculation, data processing, and how to use the graph drawing library ♬ Environment construction

Continuing from last night's [Introduction to Data Scientists] Python Basics ♬ Functions and Classes, this post starts Chapter 2, "Basics of scientific calculation, data processing, and how to use the graph drawing library". Environment construction is not covered in the book, but I will describe it here. [Caution] I am reading ["Data Scientist Training Course at the University of Tokyo"](https://www.amazon.co.jp/%E6%9D%B1%E4%BA%AC%E5%A4%A7%E5%AD%A6%E3%81%AE%E3%83%87%E3%83%BC%E3%82%BF%E3%82%B5%E3%82%A4%E3%82%A8%E3%83%B3%E3%83%86%E3%82%A3%E3%82%B9%E3%83%88%E8%82%B2%E6%88%90%E8%AC%9B%E5%BA%A7-Python%E3%81%A7%E6%89%8B%E3%82%92%E5%8B%95%E3%81%8B%E3%81%97%E3%81%A6%E5%AD%A6%E3%81%B6%E3%83%87%E2%80%95%E3%82%BF%E5%88%86%E6%9E%90-%E5%A1%9A%E6%9C%AC%E9%82%A6%E5%B0%8A/dp/4839965250/ref=tmm_pap_swatch_0?_encoding=UTF8&qid=&sr=) and summarizing the parts that I had doubts about or found useful. The outline therefore follows the book, but please read this on the understanding that the content itself is my own and not taken from the book.

Chapter 2 Basics of scientific calculation, data processing, and how to use the graph drawing library

Chapter 2-1 Library used for data analysis

Learn how to use NumPy, SciPy, Pandas, and Matplotlib.

Environment

I will build the environment on a Raspberry Pi 4 (Raspi4). [Reference] [Introduction to RasPi4] Environment construction; OpenCV / Tensorflow, Japanese input ♪

First, I set up these libraries in the following environment. The goal is to install python3, numpy, scipy, pandas, matplotlib, and Jupyter Notebook. The environment right after burning the OS image to the SD card is shown below: Python 3.7.3 and numpy 1.16.2 were already installed, while scipy, pandas, and matplotlib were missing.

Environment construction confirmation
$ python3
Python 3.7.3 (default, Dec 20 2019, 18:57:59) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> print(numpy.__version__)
1.16.2
>>> import scipy
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'scipy'
>>> import pandas
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'pandas'
>>> import matplotlib
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'matplotlib'
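
The interactive check above can also be scripted. The following is only a minimal sketch, not part of the book: the function name installed_versions and the LIBRARIES list are names I made up for illustration, and it assumes each library exposes a __version__ attribute (true for the four checked here, but not for every module).

```python
import importlib

# Module names checked in this article (hypothetical list; edit as needed).
LIBRARIES = ["numpy", "scipy", "pandas", "matplotlib"]


def installed_versions(names):
    """Map each module name to its version string, or None if the import fails."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # ModuleNotFoundError is a subclass of ImportError.
            versions[name] = None
        else:
            # Not every module defines __version__, so fall back to "unknown".
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(LIBRARIES).items():
        print(name, version if version is not None else "NOT INSTALLED")
```

Run before the installation, this should report scipy, pandas, and matplotlib as not installed; run again afterwards, it should print their versions instead.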

Library installation

It seems there are no dependencies to worry about here other than the one between numpy and scipy, but for now I installed the packages in the following order. With the apt / apt-get commands below, everything installed in a very short time (about 10 minutes in total).

$ sudo apt install jupyter-notebook

$ sudo apt-get update
$ sudo apt-get upgrade

$ sudo apt-get install python3-matplotlib

$ sudo apt-get install python3-scipy

$ sudo apt-get install python3-pandas

Check the library

The libraries appear to have been installed, as shown below.

$ python3
Python 3.7.3 (default, Jul 25 2020, 13:03:44) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import scipy
>>> print(scipy.__version__)
1.1.0
>>> import pandas
>>> print(pandas.__version__)
0.23.3
>>> import matplotlib
>>> print(matplotlib.__version__)
3.0.2

Chapter 2-1-1 Loading the library

There are two typical syntaxes for loading a library from a program.

(1) import <module name> as <alias>
Example: import numpy as np
Functions defined in the numpy module can then be called in the form np.function_name.

(2) from <module name> import <attribute>
Example: from numpy import random
Likewise, this makes random from numpy available, and the functions defined in it can be called in the form random.function_name.
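
As a small illustration of the two forms (my own sketch, not an example from the book; it assumes numpy is installed, and the variable names are arbitrary):

```python
import numpy as np            # form (1): the module is used through the alias np
from numpy import random      # form (2): only the random submodule is imported

# Form (1): functions are called as np.<function>.
a = np.arange(5)              # the integers 0, 1, 2, 3, 4
total = np.sum(a)             # 0 + 1 + 2 + 3 + 4

# Form (2): functions inside numpy.random are called as random.<function>.
random.seed(0)                # fix the seed so the sample is reproducible
sample = random.rand(3)       # three uniform random numbers in [0, 1)

# Both forms refer to the same module object.
same_module = random is np.random

print(total, sample, same_module)
```

Note that form (2) binds the name random to numpy.random, so it shadows the standard library's random module if that is imported under the same name in the same file.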

Chapter 2-1-2 Magic Command

This section explains the magic commands %precision and %matplotlib used in Jupyter Notebook, but I omit it because I do not use them here. Running %quickref displays a list of the available magic commands.

Chapter 2-1-3 Importing libraries used in this chapter

something.py


import numpy as np
import numpy.random as random
import scipy as sp
import pandas as pd
from pandas import Series, DataFrame

import matplotlib.pyplot as plt
import matplotlib as mpl
import seaborn as sns

#%matplotlib inline
#%precision 3

When I try to run the above as python something.py, I get the following error:

ModuleNotFoundError: No module named 'seaborn'

Install seaborn. I was able to install it in 5 seconds.

$ pip3 install seaborn
...
Successfully installed seaborn-0.10.1

Now the error is gone.

Summary

・ Installed the necessary libraries on the Raspi4 from scratch.
・ Installing with apt-get took about 10 minutes in total.
・ I had a hard time with the WiFi settings and the Japanese localization settings.
・ Will it finally start to look like the real thing tomorrow?
