[PYTHON] JDLA E Qualification Prep: Applied Mathematics

I am taking the first JDLA E Qualification exam of 2020. This is a study memo, since applied mathematics is expected to appear on the exam. I plan to summarize ①, ②, and ③ below.

① Inverse matrix calculation ② Singular value decomposition ③ Probability

① Inverse matrix calculation

This section covers the following. ・Inverse of a 2×2 matrix ・Inverse of a 3×3 matrix ・Calculation using NumPy's linalg

Creating an arbitrary matrix

The matrix used in this example was created with the following code.

matrix.py


import numpy as np
n = 3 #Matrix size
A = np.random.randint(0, 10, (n, n)) #n × n matrix of random integers 0-9 (the upper bound of randint is exclusive)

Inverse of a 2×2 matrix

A =
\begin{pmatrix}
a & b \\
c & d
\end{pmatrix}
A^{-1} = \frac{1}{det(A)}
\begin{pmatrix}
d & -b \\
-c & a
\end{pmatrix}
det(A) = ad-bc

◆ Example

A =
\begin{pmatrix}
4 & 7 \\
6 & 9
\end{pmatrix}
A^{-1}=
\frac{1}{4 \cdot 9 - 7 \cdot 6}
\begin{pmatrix}
9 & -7 \\
-6 & 4
\end{pmatrix}
=
\begin{pmatrix}
-3/2 & 7/6 \\
1 & -2/3
\end{pmatrix}

◆ Check with NumPy's linalg. linalg is NumPy's linear algebra module; see [Linear algebra (numpy.linalg)](http://docs.scipy.org/doc/numpy/reference/routines.linalg.html) for details.

two.py


import numpy as np
A = np.array([[4,7],[6,9]])
#Inverse matrix calculation
invA = np.linalg.inv(A)
print(invA)
#[[-1.5         1.16666667]
# [1.         -0.66666667]]
print(np.matmul(A,invA))
#[[ 1.00000000e+00 -3.33066907e-16]
# [ 0.00000000e+00  1.00000000e+00]]
#The product is the identity matrix, as expected

Inverse of a 3×3 matrix

A =
\begin{pmatrix}
a_{11} & a_{12} & a_{13}\\
a_{21} & a_{22} & a_{23}\\
a_{31} & a_{32} & a_{33}
\end{pmatrix}
A^{-1} = \frac{1}{det(A)}
\begin{pmatrix}
Δ_{11} & Δ_{12} & Δ_{13}\\
Δ_{21} & Δ_{22} & Δ_{23}\\
Δ_{31} & Δ_{32} & Δ_{33}
\end{pmatrix}^{T}
det(A) = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} -a_{11}a_{23}a_{32}-a_{12}a_{21}a_{33}
Δ_{ij}: the cofactor, i.e. the determinant of the 2×2 matrix obtained by deleting the i-th row and j-th column of A, multiplied by (-1)^{i+j}

◆ Image of the determinant: this pattern is commonly called the Rule of Sarrus. 三次行列式.jpg

◆ Cofactor image 余因子.jpg

◆ Example

A =
\begin{pmatrix}
7 & 7 & 5\\
8 & 5 & 6\\
2 & 0 & 7
\end{pmatrix}
det(A)=
7 \cdot 5 \cdot 7 + 7 \cdot 6 \cdot 2 + 5 \cdot 8 \cdot 0 - 5 \cdot 5 \cdot 2 - 7 \cdot 6 \cdot 0 - 7 \cdot 8 \cdot 7 = -113
Δ_{11} = (-1)^{1+1}
\begin{vmatrix}
5 & 6 \\
0 & 7
\end{vmatrix}
=5 \cdot 7 - 6 \cdot 0 = 35
Δ_{12} = (-1)^{1+2}
\begin{vmatrix}
8 & 6 \\
2 & 7
\end{vmatrix}
=-(8 \cdot 7 - 6 \cdot 2) = -44\\
\cdots (the remaining cofactors are computed in the same way)
A^{-1} =
\frac{-1}{113}
\begin{pmatrix}
35 & -44 & -10\\
-49 & 39 & 14\\
17 & -2 & -21
\end{pmatrix}^{T}
=\frac{1}{113}
\begin{pmatrix}
-35 & 49 & -17\\
44 & -39 & 2\\
10 & -14 & 21
\end{pmatrix}

◆ Check with numpy's linalg

three.py


import numpy as np
A = np.array([[7,7,5],[8,5,6],[2,0,7]])
#Inverse matrix calculation
invA = np.linalg.inv(A)
print(invA)
#[[-0.30973451  0.43362832 -0.15044248]
# [ 0.38938053 -0.34513274  0.01769912]
# [ 0.08849558 -0.12389381  0.18584071]]
#Hard to tell at a glance whether this is correct because the entries are decimals
print(np.matmul(A,invA))
#[[ 1.00000000e+00 -1.80411242e-16  2.77555756e-17]
# [ 1.94289029e-16  1.00000000e+00 -5.55111512e-17]
# [ 4.16333634e-17  1.38777878e-17  1.00000000e+00]]
#The product is the identity matrix, as expected

Since the inverse matrix is printed as decimals, it is hard to tell whether the answer is correct, so let's convert it to fractions. First, I tried Fraction from Python's standard-library fractions module.

three.py continued


from fractions import Fraction
print(Fraction(invA[0,0]))
# -2789840477132165/9007199254740992

It does not come back as the original fraction because of floating-point rounding error. I searched but could not find conversion code from decimal to fraction, so I wrote it myself as a study exercise. It took considerable effort to implement, so if anyone knows a better way, please let me know.

Below are the processing flow and the source. (1) Build a fraction table in advance. (2) Find the table element whose difference from each inverse-matrix element is close to 0. (3) Convert the index of that table element back into a fraction.

three.py continued


#Prepare the fraction table
#e.g. with table_size = 1000:
# [[1/1,2/1,...,1000/1],[1/2,2/2,...,1000/2],...,[1/1000,2/1000,...,1000/1000]]
table_size = 3000
l_frac = np.array([[x/y for x in range(1,table_size+1)] for y in range(1,table_size+1)])
#Tolerance for the error introduced when a fraction is converted to floating point
e = 1.0e-10
#ndarray to hold the answers; fractions are stored later as strings like "1/5", so cast to str
ans_frac = np.random.rand(len(A)**2).astype("str")

counter = 0

#Compare each element of the inverse matrix against the fraction table
for i in range(len(A)):
    for j in range(len(A)):
        if invA[i,j] != 0:
            #Get the indices of the table entries that satisfy the condition
            # idx = (row indices of matches, column indices of matches)
            idx = np.where(abs(l_frac-abs(invA[i,j])) < e)
            #Branch on sign
            if invA[i,j] < 0:
                #There can be multiple matches, so take the first index (idx[...][0]) and store it in ans_frac
                ans_frac[counter] = str(-(idx[1][0]+1)) + "/" + str(idx[0][0]+1)
                counter += 1
            else:
                ans_frac[counter] = str(idx[1][0]+1) + "/" + str(idx[0][0]+1)
                counter += 1
        else:
            #The inverse-matrix element is 0
            ans_frac[counter] = "0"
            counter += 1
#Reshape to the same shape as the original matrix
ans_frac = ans_frac.reshape([len(A),len(A)])
print(ans_frac)
#[['-35/113' '49/113' '-17/113']
# ['44/113' '-39/113' '2/113']
# ['10/113' '-14/113' '21/113']]

If a fraction falls outside the table, np.where finds no match and the indexing raises an error, so if the matrix elements have more digits you need to prepare a larger table.
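As an aside, the standard library can also do this conversion directly: Fraction.limit_denominator() returns the closest fraction whose denominator does not exceed a given bound. A minimal sketch (the bound 10000 and the file name are arbitrary choices):

frac_alternative.py


from fractions import Fraction
import numpy as np

A = np.array([[7,7,5],[8,5,6],[2,0,7]])
invA = np.linalg.inv(A)
#limit_denominator() snaps each float to the nearest fraction with a bounded denominator
ans = [[str(Fraction(x).limit_denominator(10000)) for x in row] for row in invA]
print(ans)
#[['-35/113', '49/113', '-17/113'], ['44/113', '-39/113', '2/113'], ['10/113', '-14/113', '21/113']]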

② Singular value decomposition

This section covers the following. ・Calculating a singular value decomposition by hand ・Calculation using NumPy's linalg

The following article was helpful for the concept and uses of singular values: [SVD (singular value decomposition) explanation](https://qiita.com/sakami/items/d01fa353b4e1f48623a8)

As introduced in the article above, the topic is covered in detail in ["Deep Learning from Scratch ②: Natural Language Processing" by Koki Saitoh](https://www.amazon.co.jp/%E3%82%BC%E3%83%AD%E3%81%8B%E3%82%89%E4%BD%9C%E3%82%8BDeep-Learning-%E2%80%95%E8%87%AA%E7%84%B6%E8%A8%80%E8%AA%9E%E5%87%A6%E7%90%86%E7%B7%A8-%E6%96%8E%E8%97%A4-%E5%BA%B7%E6%AF%85/dp/4873118360).

I will leave the details to the links above and practice the definition and hand calculation of singular value decomposition.

Definition: for an m × n matrix A, there exist an m × m orthogonal matrix U, an n × n orthogonal matrix V, and an m × n diagonal matrix S satisfying the following relationship.
A = USV^{T}
U and V are the matrices whose columns are the eigenvectors of $AA^{T}$ and $A^{T}A$ respectively; these columns are called the left singular vectors and right singular vectors. The diagonal entries of S are the square roots of the eigenvalues and are called the singular values.

◆ Example

A = 
\begin{pmatrix}
1 & 1 & 0\\
0 & 0 & 1
\end{pmatrix}

◆ Matrix U on the left. First, find the left singular vectors to obtain the matrix U.

AA^{T} = 
\begin{pmatrix}
2 & 0\\
0 & 1
\end{pmatrix}

Setting the eigenvalue as λ:

det(AA^{T}-λE) =
 \begin{vmatrix}
2-λ & 0\\
0 & 1-λ
\end{vmatrix} = 0\\
∴λ = 1,2

When λ = 2, setting the eigenvector as $(u_{x1}, u_{y1})$:

 \begin{pmatrix}
2-λ & 0\\
0 & 1-λ
\end{pmatrix} 
 \begin{pmatrix}
u_{x1}\\
u_{y1}
\end{pmatrix} =
 \begin{pmatrix}
0 & 0\\
0 & -1
\end{pmatrix} 
 \begin{pmatrix}
u_{x1}\\
u_{y1}
\end{pmatrix}
= 0\\
\\

\left\{
\begin{array}{ll}
0 \cdot u_{x1} + 0 \cdot u_{y1} = 0 \\
0 \cdot u_{x1} - 1 \cdot u_{y1} = 0
\end{array}
\right.

The first equation holds trivially, and the second requires

∴u_{y1} = 0

Therefore, the eigenvector when λ = 2 is as follows when the vector length is normalized to 1.

 \begin{pmatrix}
u_{x1}\\
u_{y1}
\end{pmatrix}=
\begin{pmatrix}
1\\
0
\end{pmatrix} \cdots (1)

Similarly, when λ = 1, setting the eigenvector as $(u_{x2}, u_{y2})$:

 \begin{pmatrix}
2-λ & 0\\
0 & 1-λ
\end{pmatrix} 
 \begin{pmatrix}
u_{x2}\\
u_{y2}
\end{pmatrix} =
 \begin{pmatrix}
1 & 0\\
0 & 0
\end{pmatrix} 
 \begin{pmatrix}
u_{x2}\\
u_{y2}
\end{pmatrix}
= 0\\
\\

\left\{
\begin{array}{ll}
1 \cdot u_{x2} + 0 \cdot u_{y2} = 0 \\
0 \cdot u_{x2} + 0 \cdot u_{y2} = 0
\end{array}
\right.\\
∴u_{x2} = 0

The normalized eigenvector is below.

 \begin{pmatrix}
u_{x2}\\
u_{y2}
\end{pmatrix}=
\begin{pmatrix}
0\\
1
\end{pmatrix} \cdots (2)

U can be obtained from (1) and (2).

U = 
 \begin{pmatrix}
u_{x1} & u_{x2}\\
u_{y1} & u_{y2}
\end{pmatrix}=
\begin{pmatrix}
1 & 0\\
0 & 1
\end{pmatrix}

◆ Matrix V on the right. Next, find the right singular vectors to obtain the matrix V.

A^{T}A = 
\begin{pmatrix}
1 & 1 & 0\\
1 & 1 & 0\\
0 & 0 & 1
\end{pmatrix}

Setting the eigenvalue as λ:

det(A^{T}A-λE) =
 \begin{vmatrix}
1-λ & 1 & 0\\
1 & 1-λ & 0\\
0 & 0 & 1-λ
\end{vmatrix} = 0\\
(1-λ)(λ-2)λ = 0\\
∴λ = 0,1,2

I omit the intermediate calculation, but the determinant can be computed by the Rule of Sarrus. When λ = 2, setting the eigenvector as $(v_{x1}, v_{y1}, v_{z1})$:

\begin{pmatrix}
1-λ & 1 & 0\\
1 & 1-λ & 0\\
0 & 0 & 1-λ
\end{pmatrix} 
\begin{pmatrix}
v_{x1}\\
v_{y1}\\
v_{z1}
\end{pmatrix} =
\begin{pmatrix}
-1 & 1 & 0\\
1 & -1 & 0\\
0 & 0 & -1
\end{pmatrix} 
\begin{pmatrix}
v_{x1}\\
v_{y1}\\
v_{z1}
\end{pmatrix}
= 0\\
\\

\left\{
\begin{array}{ll}
-1 \cdot v_{x1} + 1 \cdot v_{y1} + 0 \cdot v_{z1} = 0 \\
1 \cdot v_{x1} - 1 \cdot v_{y1} + 0 \cdot v_{z1} = 0 \\
0 \cdot v_{x1} + 0 \cdot v_{y1} - 1 \cdot v_{z1} = 0
\end{array}
\right.

From these equations,

∴v_{x1} = v_{y1},\quad v_{z1} = 0

Therefore, the eigenvector is

\begin{pmatrix}
v_{x1}\\
v_{y1}\\
v_{z1}
\end{pmatrix}=
\begin{pmatrix}
1/\sqrt{2}\\
1/\sqrt{2}\\
0
\end{pmatrix} \cdots (3)

The same calculation can be performed when λ = 1,0. Only the answer is given here.

When λ = 1:
\begin{pmatrix}
v_{x2}\\
v_{y2}\\
v_{z2}
\end{pmatrix}=
\begin{pmatrix}
0\\
0\\
1
\end{pmatrix} \cdots (4)\\
When λ = 0:
\begin{pmatrix}
v_{x3}\\
v_{y3}\\
v_{z3}
\end{pmatrix}=
\begin{pmatrix}
1/\sqrt{2}\\
-1/\sqrt{2}\\
0
\end{pmatrix} \cdots (5)

V can be obtained from (3), (4), and (5).

V = 
\begin{pmatrix}
v_{x1} & v_{x2} & v_{x3}\\
v_{y1} & v_{y2} & v_{y3}\\
v_{z1} & v_{z2} & v_{z3}
\end{pmatrix}=
\begin{pmatrix}
1/\sqrt{2} & 0 & 1/\sqrt{2}\\
1/\sqrt{2} & 0 & -1/\sqrt{2}\\
0 & 1 & 0
\end{pmatrix}

◆ Singular value matrix S. Arrange the square roots of the eigenvalues on the diagonal in descending order.

S = 
\begin{pmatrix}
\sqrt{2} & 0 & 0\\
0 & 1 & 0
\end{pmatrix}

◆ Check. Again, use NumPy's linalg module to confirm the answers. The function np.linalg.svd() returns U, the singular values, and $V^{T}$.

singular_value_decomposition.py



import numpy as np

A = np.array([[1,1,0],[0,0,1]])

#svd returns V already transposed (V^T)
U,S,V_T = np.linalg.svd(A, full_matrices = True)
S = np.diag(S) #the singular values come back as a 1-D array like [σ1,σ2], so diagonalize
S = np.insert(S,2,[0,0],axis = 1) #pad with a zero column so S has the m × n shape (2 × 3)
print("U=\n",U)
print("S=\n",S)
print("V=\n",V_T.T) #transpose back to display the original V
print("A=\n",np.dot(np.dot(U,S),V_T))
# U=
# [[1. 0.]
# [0. 1.]]
# S=
# [[1.41421356 0.         0.        ]
# [0.         1.         0.        ]]
# V=
# [[ 0.70710678  0.          0.70710678]
# [ 0.70710678  0.         -0.70710678]
# [ 0.          1.          0.        ]]
# A=
# [[1. 1. 0.]
# [0. 0. 1.]]
#Match the original matrix

For the basics of other linear algebra topics, the following article was also helpful: [Basics of Machine Learning 2 Linear Algebra Memo](https://qiita.com/jun40vn/items/15627062d5f1e1dcec2e)

③ Probability

This section covers the following. ・Conditional probability ・Bayes' theorem ・Bernoulli distribution ・Binomial distribution ・Multinoulli (categorical) distribution ・Multinomial distribution ・Normal distribution

Regarding probability distributions, the following site is very easy to understand and I used it as a reference: [Summary of complicated discrete distributions](http://machine-learning.hatenablog.com/entry/2016/03/26/211106)

Conditional probability

Consider the case where two people, A and B, each draw one ticket from a lottery in which 3 of the 10 tickets are winners. Each ticket is put back after it is drawn.

Writing the probability of drawing a winner as $P(A)$ and the probability of drawing a loser as $P(\overline{A})$, this is summarized in the Venn diagram and table below.

弁図.jpg

条件付き確率1.jpg

At this time, writing the conditional probability that B draws a winner given that A has drawn one as $P(B \mid A)$, we have

P(B \mid A) =\frac{P(A∩B)}{P(A)} \cdots (6)

This is illustrated in the figure below.

条件付き確率3.jpg

• The joint probability $P(A∩B)$ that A and B both draw winners is the image in the figure below; a numeric check of equation (6) follows. 条件付き確率4.jpg
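Because each ticket is returned after it is drawn, A's and B's draws are independent, which makes equation (6) easy to check numerically. A minimal sketch using the numbers above (3 winners out of 10; the file name is just a label):

conditional.py


#Checking equation (6) for the lottery example (drawing with replacement)
P_A = 3 / 10           #P(A): probability that A draws a winner
P_B = 3 / 10           #P(B): probability that B draws a winner
P_AB = P_A * P_B       #P(A∩B) = P(A)P(B) because the draws are independent
print(P_AB / P_A)      #P(B|A) = 0.3, equal to P(B), as expected under independence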

Bayes' theorem

The following relationship also holds by the conditional probability equation (6).

P(A \mid B) =\frac{P(A∩B)}{P(B)} \cdots (7)

条件付き確率5.jpg

Bayes' theorem is the formula obtained by eliminating $P(A∩B)$ from (6) and (7).

P(B \mid A) =\frac{P(A \mid B) P(B)}{P(A)} \cdots (8)

The usage will be updated separately.

Bernoulli distribution

Consider an event with only two outcomes. The typical example is a single coin toss: heads or tails, tried only once. Writing the probability of heads as $μ$, the probabilities of heads and tails are summarized below.

\left\{
\begin{array}{ll}
P(\text{heads}) = μ = 0.5 \\
P(\text{tails}) = 1-μ = 0.5
\end{array}
\right.

If heads is the event $X = 1$ and tails is the event $X = 0$, the probability distribution of the coin toss $Bern(X \mid μ)$ is

Bern(X \mid μ) = 
μ^{X}(1-μ)^{1-X} \quad (X=0,1)

Plotted as a graph, it looks like this.

bern.png

bern.py


#Drawing the Bernoulli distribution
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

mu = 0.5
x = np.array([0,1])
y = (mu**x)*(1-mu)**(1-x)
plt.scatter(x,y)

Also, the expected value is $E(X) = μ$ and the variance is $V(X) = μ(1-μ)$.
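These can be checked by simulation. A minimal sketch (the sample size is an arbitrary choice; a Bernoulli trial is a binomial with N = 1):

bern_check.py


import numpy as np

mu = 0.5
samples = np.random.binomial(1, mu, size=100000) #100,000 Bernoulli trials
print(samples.mean()) #close to E(X) = mu = 0.5
print(samples.var())  #close to V(X) = mu*(1-mu) = 0.25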

Binomial distribution

The probability distribution when the coin toss is repeated multiple times. The binomial distribution $Bin(m \mid μ, N)$, the probability that heads appears m times when the coin is tossed N times, is

Bin(m \mid\ μ,N)=\frac{N!}{m!(N-m)!}μ^{m}(1-μ)^{N-m}


Also, the expected value is $E(X) = Nμ$ and the variance is $V(X) = Nμ(1-μ)$.
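For intuition, the pmf can be plotted directly from the formula. A minimal sketch (N = 10 and μ = 0.5 are arbitrary choices; math.comb requires Python 3.8+):

bin.py


import math
import numpy as np
import matplotlib.pyplot as plt

N, mu = 10, 0.5
m = np.arange(N + 1)
#pmf from the formula: N!/(m!(N-m)!) * mu^m * (1-mu)^(N-m)
pmf = [math.comb(N, k) * mu**k * (1 - mu)**(N - k) for k in m]
plt.scatter(m, pmf)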

Multinoulli (categorical) distribution

An extension of the Bernoulli distribution. Consider an event with n different outcomes; the typical example is a single roll of a die. Writing the probability of outcome k as $μ_{k}$ and encoding the result as a one-hot vector $X$ (so $X_{k} = 1$ for the observed outcome and 0 otherwise), the multinoulli distribution $Cat(X \mid \textbf{μ})$ is as follows.

Cat(X\mid\textbf{μ})=\prod_{k=1}^{n}\mu_k^{X_{k}}
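A single categorical draw can be sampled as a one-hot vector with np.random.multinomial. A minimal sketch for a fair die (the file name is just a label):

cat.py


import numpy as np

mu = np.full(6, 1/6)             #fair die: each face has probability 1/6
X = np.random.multinomial(1, mu) #one draw, returned one-hot, e.g. [0 0 1 0 0 0]
p = np.prod(mu ** X)             #Cat(X|mu) = prod_k mu_k^{X_k}
print(X, p)                      #p = 1/6 regardless of which face came up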

Multinomial distribution

An extension of the multinoulli distribution. This is the probability distribution when an event with n different outcomes is repeated N times. The multinomial distribution $Multi(\textbf{m} \mid \textbf{μ}, N)$ is as follows.

Multi(\textbf{m}\mid\textbf{μ},N)= \frac{N!}{m_{1}! \cdots m_{n}!}\prod_{k=1}^{n}\mu_k^{m_{k}}
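NumPy can sample the counts directly. A minimal sketch (N = 60 rolls of a fair die; the printed counts are one example draw):

multi.py


import numpy as np

mu = np.full(6, 1/6)                    #fair die
counts = np.random.multinomial(60, mu)  #counts m_k of each face over N = 60 rolls
print(counts, counts.sum())             #e.g. [ 9 11 10  8 12 10] 60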

Normal distribution

The most commonly used distribution.

f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp{\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}}

$\mu$ is the mean and $\sigma^{2}$ is the variance. Substituting $Z = (x-\mu)/\sigma$ gives the standard normal distribution.

f(Z) = \frac{1}{\sqrt{2\pi}}\exp{\left\{-\frac{Z^2}{2}\right\}}
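The pdf can be drawn straight from the formula. A minimal sketch (mean 0 and variance 1, so this is also the standard normal):

normal.py


import numpy as np
import matplotlib.pyplot as plt

mu, sigma = 0, 1
x = np.linspace(-4, 4, 200)
#pdf from the formula: exp(-(x-mu)^2 / (2 sigma^2)) / sqrt(2 pi sigma^2)
f = np.exp(-(x - mu)**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)
plt.plot(x, f)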

Bonus

Norm

Definition of the $L_{p}$ norm:

{\| {\bf x} \|_p = ( |x_1|^p + |x_2|^p + \cdots + |x_n|^p )^{1/p}}

It appears in the context of regularization. The $L_{2}$ norm is also known as the Euclidean distance, and the $L_{1}$ norm as the Manhattan distance. It is added to the loss function as weight decay to prevent overfitting; a quick computation is sketched below. Related posts: [Machine learning] What is the LP norm? [DL] What is weight decay?
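Both norms are available through np.linalg.norm. A minimal sketch:

norm.py


import numpy as np

x = np.array([3.0, -4.0])
print(np.linalg.norm(x, ord=1)) #L1 (Manhattan): |3| + |-4| = 7.0
print(np.linalg.norm(x, ord=2)) #L2 (Euclidean): sqrt(3^2 + 4^2) = 5.0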
