[PYTHON] How to use numpy

This article is for people who heard that python is good for numerical calculation and machine learning and started using it, but found it a little difficult and feel that C++ is much faster anyway. It is also for people who have heard the name numpy but have never used it, and for people who tried numpy but think it is basically just an advanced version of the math package. For those people, I'll show how to use numpy properly!

numpy installation

If you installed python using anaconda, you probably already have numpy. With other installation methods, numpy may or may not be included, so let's first check. Start the python console by typing python in the terminal (command prompt on Windows), then type the following in the python console:

> import numpy

If this produces no error, numpy is installed. If you get an error like No module named numpy, it is not installed and you need to install it. To do so, run the following in the terminal (not the python console):

$ pip install numpy
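As a small sketch, the same check can also be done from a script, which additionally shows which version was found. The variable names `spec` and `installed` are only illustrative.

```python
import importlib.util

# find_spec returns None when the module cannot be found
spec = importlib.util.find_spec("numpy")
installed = spec is not None

if installed:
    import numpy
    print("numpy version:", numpy.__version__)
else:
    print("numpy is not installed; run: pip install numpy")
```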

How to make an array

From here on, we will look at actual programming with numpy. Throughout, numpy is assumed to be imported as follows:

> import numpy as np

This imports the module numpy under the name np.

How to make a one-dimensional array

The basics of numpy start with creating an array. An array with the contents 1, 2, 3 can be created as follows:

> arr = np.asarray([1,2,3])
> arr
array([1, 2, 3])

You can also specify the type of the array with dtype. Frequently used types include np.int32, np.float32, and np.float64. For example, to create an array of type np.int32:

> arr = np.asarray([1,2,3], dtype=np.int32)
> arr
array([1, 2, 3], dtype=int32)

To change the type of an array that already exists, use astype:

> i_arr = np.asarray([1,2,3], dtype=np.int32)
> f_arr = i_arr.astype(np.float32)
> f_arr
array([ 1.,  2.,  3.], dtype=float32)

astype returns a new array, so the original array `i_arr` does not change.

> i_arr
array([1, 2, 3], dtype=int32)
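One thing worth trying yourself is the conversion in the other direction. The following is a minimal sketch with made-up values: converting floats to an integer type drops the fractional part, and the source array is again left untouched.

```python
import numpy as np

f_arr = np.asarray([1.7, -2.3, 3.0])   # dtype is float64 by default
i_arr = f_arr.astype(np.int32)         # new array; the fractional part is dropped

print(f_arr.dtype)   # float64
print(i_arr)         # values 1, -2, 3
print(f_arr)         # the original values are unchanged
```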

How to make a multidimensional array

To make a multidimensional array, pass a nested list:

> arr = np.asarray([[1,2,3], [4,5,6]])
> arr
array([[1, 2, 3],
       [4, 5, 6]])

You can specify and change the type just as in the one-dimensional case. The shape attribute holds the shape of the array.

> arr.shape
(2, 3)

This is a tuple. Incidentally, the shape of a one-dimensional array looks like this:

> arr = np.asarray([1,2,3])
> arr.shape
(3,)

This is a tuple with only one element.
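Besides shape, a few related attributes and methods are handy when checking an array. This is a short sketch using the same 2x3 array as above; the names `reshaped` and `flat` are just for illustration.

```python
import numpy as np

arr = np.asarray([[1, 2, 3], [4, 5, 6]])

print(arr.ndim)    # number of dimensions: 2
print(arr.size)    # total number of elements: 6

# reshape returns an array with the same elements arranged in a new shape
reshaped = arr.reshape(3, 2)
print(reshaped.shape)   # (3, 2)

# -1 lets numpy infer that dimension from the total size
flat = arr.reshape(-1)
print(flat.shape)       # (6,)
```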

How to make a special array

You can easily create special arrays with numpy.

> # Array with all elements 0
> np.zeros((2, 3))
array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])
> # Array with all elements 1
> np.ones((2, 3))
array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])
> # Array with elements randomly initialized in the range [0, 1)
> np.random.rand(2, 3)
array([[ 0.24025569,  0.48947483,  0.61541917],
       [ 0.01197138,  0.6885749 ,  0.48316059]])
> # Array with elements drawn from the standard normal distribution
> np.random.randn(2, 3)
array([[ 0.23397941, -1.58230063, -0.46831152],
       [ 1.01000451, -0.21079169,  0.80247674]])

There are many other functions that generate arrays. If you need a particular kind of array, a quick web search will usually turn up a function for it.
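To give a few examples of such functions, here is a sketch with four generators that I find come up often. The choice of these four is mine, not an exhaustive list.

```python
import numpy as np

# Integers from 0 up to (not including) 5
r = np.arange(5)

# 5 evenly spaced values from 0 to 1, both ends included
lin = np.linspace(0.0, 1.0, 5)

# 2x3 array filled with the value 7
f = np.full((2, 3), 7)

# 3x3 identity matrix
e = np.eye(3)

print(r)
print(lin)
print(f)
print(e)
```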

Array calculation

Basic calculation

The power of numpy is that calculations on arrays are very easy to write. Take the following array:

> a = np.asarray([[1,2,3],[4,5,6]])

Multiplying it by a scalar multiplies each element by that constant:

> 3 * a
array([[ 3,  6,  9],
       [12, 15, 18]])

Adding a scalar likewise adds it to each element:

> 3 + a
array([[4, 5, 6],
       [7, 8, 9]])

Arrays can also be combined with each other:

> b = np.asarray([[2,3,4],[5,6,7]])
> a + b
array([[ 3,  5,  7],
       [ 9, 11, 13]])
> a * b
array([[ 2,  6, 12],
       [20, 30, 42]])

When two arrays have the same shape, the operation is applied to the elements at the same position, and an array of that shape is returned. In some cases, arrays of different shapes can also be combined.

> v = np.asarray([2,1,3])
> a * v
array([[ 2,  2,  9],
       [ 8,  5, 18]])
> a + v
array([[3, 3, 6],
       [6, 6, 9]])

When a two-dimensional array is combined with a one-dimensional array whose length equals the number of columns of the two-dimensional array, the operation is applied between each row of the two-dimensional array and the one-dimensional array. The result therefore has the same shape as the two-dimensional array. This mechanism is called broadcasting.
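The row-wise case above is not the only one that works. As a sketch (the arrays `col` and `w` are made up for illustration), a column of shape (2, 1) is stretched across the columns, while a vector whose length does not match the number of columns raises an error.

```python
import numpy as np

a = np.asarray([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)

# A (2, 1) column is stretched across the 3 columns of a
col = np.asarray([[10], [20]])
print(a + col)

# A length-2 vector does not match the last axis (length 3),
# so numpy raises ValueError instead of guessing
w = np.asarray([1, 2])
try:
    a + w
    failed = False
except ValueError:
    failed = True
print(failed)
```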

A two-dimensional array can also be treated as a matrix. Take the following two arrays:

> M = np.asarray([[1,2,3], [2,3,4]])
> N = np.asarray([[1,2],[3,4], [5,6]])

Their matrix product is computed with dot:

> M.dot(N)
array([[22, 28],
       [31, 40]])

Here we are multiplying a $2 \times 3$ matrix by a $3 \times 2$ matrix, so a $2 \times 2$ matrix is returned.
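It is easy to mix up the matrix product with the elementwise `*` shown earlier, so here is a short sketch contrasting the two. It assumes Python 3.5 or later, where the `@` operator is available as another way to write the matrix product.

```python
import numpy as np

M = np.asarray([[1, 2, 3], [2, 3, 4]])      # shape (2, 3)
N = np.asarray([[1, 2], [3, 4], [5, 6]])    # shape (3, 2)

# Matrix product; for 2D arrays this gives the same result as M.dot(N)
P = M @ N
print(P.shape)     # (2, 2)

# The transpose of N has shape (2, 3), so elementwise * works with M
E = M * N.T
print(E.shape)     # (2, 3)
```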

Function call

With numpy you can pass arrays to various functions. The function is then applied to each element. For example:

> a = np.asarray([[1,2], [3,1]])
> np.log(a)
array([[ 0.        ,  0.69314718],
       [ 1.09861229,  0.        ]])

The original array does not change. Many other functions are available, such as the trigonometric functions, `exp`, and `sqrt`.
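As a quick sketch of a couple of these other functions (the input values are chosen by me so that the results are easy to check by hand):

```python
import numpy as np

a = np.asarray([[1.0, 4.0], [9.0, 16.0]])

roots = np.sqrt(a)           # square root of each element
back = np.exp(np.log(a))     # exp undoes log, up to rounding error

print(roots)
print(np.allclose(back, a))
print(a)                     # the input array is left unchanged
```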

Take statistics

numpy is also good at computing statistics of an array. First, generate 100 random numbers.

> arr = np.random.rand(100)

The mean of the array is

> np.mean(arr)
0.52133315138159586

The maximum and minimum values are

> np.max(arr)
0.98159897843423383
> np.min(arr)
0.031486992721019846

The standard deviation is

> np.std(arr)
0.2918171894076691

and the sum is

> np.sum(arr)
52.133315138159588

For 2D arrays, you can also specify along which axis the statistic is taken. For example, axis=0 sums down each column and axis=1 sums across each row:

> arr = np.asarray([[1,2,3], [2,3,4]])
> np.sum(arr, axis=0)
array([3, 5, 7])
> np.sum(arr, axis=1)
array([6, 9])
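The axis argument works the same way for the other statistics. The sketch below also uses the keepdims option, which keeps the reduced axis with length 1 so that the result can be combined with the original array; the variable names are only illustrative.

```python
import numpy as np

arr = np.asarray([[1, 2, 3], [2, 3, 4]])

col_mean = np.mean(arr, axis=0)   # one value per column
row_max = np.max(arr, axis=1)     # one value per row

# keepdims=True keeps the reduced axis with length 1, giving shape (2, 1)
row_sum = np.sum(arr, axis=1, keepdims=True)

# Subtracting the (2, 1) array of row means centers each row around zero
centered = arr - np.mean(arr, axis=1, keepdims=True)

print(col_mean)
print(row_max)
print(row_sum.shape)
print(centered)
```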

Actually use

Putting the above together, let's use numpy to write code that computes the average Euclidean distance from the origin of 100 points in 3D space.

First, suppose the data array has shape (100, 3), with the first column holding the $x$ coordinates, the second column the $y$ coordinates, and the third column the $z$ coordinates. Here it is generated as follows:

> data = np.random.randn(100, 3)

The Euclidean distance from the origin is

$$ d(x,y,z) = \sqrt{x^2+y^2+z^2} $$

so first, square each element.

> squared = data**2

Then sum along each row.

> squared_sum = np.sum(squared, axis=1)

At this point, squared_sum is a one-dimensional array of length 100. Taking its square root gives the Euclidean distance of each point.

> dist = np.sqrt(squared_sum)

Taking the average of these distances gives

> np.mean(dist)
1.5423905808984208

(Since the data is randomly generated, your result will be slightly different.)

If you wrote this code without numpy, you would use a for loop over the 100 points, plus another for loop over the coordinates of each point as the number of dimensions grows. Not only does that complicate the code, it also slows down execution. The basic idea of numpy is to compute on a large array all at once. As a result, numpy can perform this kind of computation much faster than plain python loops.
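To make the comparison concrete, here is a sketch of what the loop version might look like next to the numpy version. The loop is my own reconstruction of "without numpy" for illustration; both should give the same mean up to rounding error.

```python
import math
import numpy as np

data = np.random.randn(100, 3)

# Plain python: one loop over the points, one loop over the coordinates
total = 0.0
for point in data.tolist():
    sq = 0.0
    for coord in point:
        sq += coord ** 2
    total += math.sqrt(sq)
loop_mean = total / len(data)

# numpy: the same computation on the whole array at once
numpy_mean = np.mean(np.sqrt(np.sum(data ** 2, axis=1)))

print(abs(loop_mean - numpy_mean) < 1e-9)
```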

By the way, this time I calculated step by step for practice, but numpy has a function called np.linalg.norm that computes the Euclidean distance easily.
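As a sketch of how that might look for the same kind of data, passing axis=1 makes np.linalg.norm compute the norm of each row, which should match the step-by-step result:

```python
import numpy as np

data = np.random.randn(100, 3)

# Step-by-step version from this article
dist_manual = np.sqrt(np.sum(data ** 2, axis=1))

# axis=1 computes the norm of each row (each 3D point)
dist_norm = np.linalg.norm(data, axis=1)

print(dist_norm.shape)                       # (100,)
print(np.allclose(dist_manual, dist_norm))
```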

Summary

That's all for the basic usage of numpy, but numpy has many more features. One example is the np.where function, which finds the indices of elements that meet a condition.
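As a minimal sketch of np.where with made-up values: called with only a condition it returns the matching indices, and called with three arguments it chooses between two values element by element.

```python
import numpy as np

arr = np.asarray([5, 12, 7, 20, 3])

# With only a condition, np.where returns a tuple of index arrays
# (one per dimension), so [0] takes the indices for this 1D array
idx = np.where(arr > 6)[0]

# With three arguments, it picks from the second argument where the
# condition is true and from the third where it is false
capped = np.where(arr > 10, 10, arr)

print(idx)
print(capped)
```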

This is not limited to numpy: I think the quickest way to improve is to write python yourself and look things up as you go, so keep at it even if you get confused at first!
