The goal is to understand what a perceptron is.
A perceptron takes multiple signals as inputs and outputs a single signal. The output is binary: 0 or 1.
$x_1$ and $x_2$ are input signals, $y$ is the output signal, and $w_1$ and $w_2$ are weights. ○ is called a node or neuron. Each input signal is multiplied by its own weight and sent to the neuron, which sums the weighted signals ($x_1 w_1 + x_2 w_2$). If this sum exceeds a certain threshold (limit value), the neuron outputs 1; if it is at or below the threshold, it outputs 0.
A simplified model in which the input and output are directly connected is called a simple perceptron, while a model with a layer (intermediate layer, hidden layer) between the input and output is called a multi-layer perceptron.
```math
y = \left\{
\begin{array}{ll}
0 & (x_1 w_1 + x_2 w_2 \leq \phi) \\
1 & (x_1 w_1 + x_2 w_2 \gt \phi)
\end{array}
\right.
```
$x$: input signal, $w$: weight, $\phi$: threshold
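As a minimal sketch of this threshold form (the concrete values $w_1 = w_2 = 0.5$ and $\phi = 0.7$ are hypothetical choices of mine that happen to realize AND):

```python
def perceptron_threshold(x1, x2):
    w1, w2 = 0.5, 0.5  # hypothetical weights
    phi = 0.7          # hypothetical threshold
    total = x1 * w1 + x2 * w2
    return 1 if total > phi else 0

print(perceptron_threshold(1, 1))  # 1.0 > 0.7, so it fires: prints 1
print(perceptron_threshold(1, 0))  # 0.5 <= 0.7, so it does not fire: prints 0
```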
Since the threshold is awkward to handle, negate it and move it to the left-hand side. The moved term is the bias $b$ (that is, $b = -\phi$). The bias controls how easily the neuron fires: a larger bias makes firing easier (the sum of the input signals exceeds 0 more readily).
```math
y = \left\{
\begin{array}{ll}
0 & (b + x_1 w_1 + x_2 w_2 \leq 0) \\
1 & (b + x_1 w_1 + x_2 w_2 \gt 0)
\end{array}
\right.
```
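In this bias form, every gate below is the same computation with different parameters. A minimal sketch of that shared structure (the helper name `perceptron` is my own, not from the original):

```python
import numpy as np

# Generic bias-form perceptron: fires (returns 1) when b + x1*w1 + x2*w2 > 0
def perceptron(x1, x2, w1, w2, b):
    tmp = b + np.sum(np.array([x1, x2]) * np.array([w1, w2]))
    return 1 if tmp > 0 else 0
```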
Adjust the parameters (bias and weights) so that the firing condition matches each gate.
```python
import numpy as np

# AND operation
def AND(x1, x2):
    x1_x2 = np.array([x1, x2])
    b = -0.8
    w1_w2 = np.array([0.5, 0.5])
    tmp = b + np.sum(x1_x2 * w1_w2)
    if tmp <= 0:
        return 0
    elif tmp > 0:
        return 1
```
Adjust the parameters so that they satisfy the conditions, substituting appropriate values for ($b$, $w_1$, $w_2$). The code above uses (-0.8, 0.5, 0.5), but (-0.3, 0.2, 0.2), for example, reproduces AND just as well.
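A quick way to confirm is to run the whole truth table (this check snippet is mine, not from the original); swapping in $b = -0.3$ and weights of 0.2 produces the same output:

```python
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, AND(x1, x2))
# 0 0 0
# 0 1 0
# 1 0 0
# 1 1 1
```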
For the NAND operation, invert the signs of the bias and weights used for AND (see the full listing below). For the OR operation, only the bias changes:
```python
import numpy as np

# OR operation
def OR(x1, x2):
    x1_x2 = np.array([x1, x2])
    b = -0.3  # changed from AND
    w1_w2 = np.array([0.5, 0.5])
    tmp = b + np.sum(x1_x2 * w1_w2)
    if tmp <= 0:
        return 0
    elif tmp > 0:
        return 1
```
Only the bias $b$ has changed from AND: with $b = -0.3$, a single active input (weighted sum 0.5) is already enough to push the total above 0 and fire.
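With the weights fixed at 0.5, the valid range of $b$ for OR can be worked out directly (this derivation is my own, not in the original):

```math
b \leq 0 \;\;(\text{so that OR}(0,0) = 0), \qquad b + 0.5 \gt 0 \;\Rightarrow\; b \gt -0.5 \;\;(\text{so that OR}(1,0) = 1)
```

Any $b$ in $(-0.5, 0]$ works; $-0.3$ is one such choice. The full listing below collects NAND, OR, AND, and XOR.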
```python
import numpy as np

# NAND operation
def NAND(x1, x2):
    x1_x2 = np.array([x1, x2])
    b = 0.8
    w1_w2 = np.array([-0.5, -0.5])
    tmp = b + np.sum(x1_x2 * w1_w2)
    if tmp <= 0:
        return 0
    elif tmp > 0:
        return 1

# OR operation
def OR(x1, x2):
    x1_x2 = np.array([x1, x2])
    b = -0.3
    w1_w2 = np.array([0.5, 0.5])
    tmp = b + np.sum(x1_x2 * w1_w2)
    if tmp <= 0:
        return 0
    elif tmp > 0:
        return 1

# AND operation
def AND(x1, x2):
    x1_x2 = np.array([x1, x2])
    b = -0.8
    w1_w2 = np.array([0.5, 0.5])
    tmp = b + np.sum(x1_x2 * w1_w2)
    if tmp <= 0:
        return 0
    elif tmp > 0:
        return 1

# XOR operation: feed the NAND and OR outputs into AND
def XOR(x1, x2):
    nand_ = NAND(x1, x2)
    or_ = OR(x1, x2)
    xor_ = AND(nand_, or_)
    return xor_
```
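Running the truth table confirms XOR (this check snippet is mine, not from the original):

```python
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, XOR(x1, x2))
# 0 0 0
# 0 1 1
# 1 0 1
# 1 1 0
```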
Since the XOR operation cannot be expressed by a model in which the input and output are directly connected (a simple perceptron), a layer is inserted in between, which makes it a multi-layer perceptron. Here, the NAND and OR outputs are fed into AND. If a function cannot be expressed by a simple perceptron, it means it is not linearly separable. Try drawing a graph to check.
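Here is a minimal plotting sketch for that check (assuming matplotlib is installed; the figure is my addition, not from the original). No single straight line can separate the circles from the crosses:

```python
import matplotlib.pyplot as plt

# XOR outputs: 0 at (0,0) and (1,1); 1 at (0,1) and (1,0)
plt.scatter([0, 1], [0, 1], marker="o", label="y = 0")
plt.scatter([0, 1], [1, 0], marker="x", label="y = 1")
plt.xlabel("x1")
plt.ylabel("x2")
plt.legend()
plt.show()
```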
Something like that, I suppose...