Neural networks use nonlinear functions as activation functions. Here I explain why linear functions are not used instead.
A linear function is a function whose output is a constant multiple of its input, for example $h(x) = cx$; its graph is a straight line.
A nonlinear function is any function that is not linear; its graph is a curved or bent line, like the sigmoid or ReLU.
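To make the distinction concrete, here is a minimal sketch in Python (NumPy assumed; the function names `linear`, `sigmoid`, and `relu` are my own illustrative choices):

```python
import numpy as np

def linear(x, c=2.0):
    """Linear function: the output is a constant multiple of the input."""
    return c * x

def sigmoid(x):
    """Nonlinear function: its graph is a smooth S-shaped curve."""
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    """Nonlinear function: its graph bends at x = 0."""
    return np.maximum(0.0, x)

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
print(linear(x))   # [-4. -2.  0.  2.  4.]  -> points on a straight line
print(sigmoid(x))  # values along an S-shaped curve
print(relu(x))     # [0. 0. 0. 1. 2.]       -> bent at the origin
```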
In a neural network, you need to use a nonlinear function as the activation function. If you use a linear function, the output is just a constant multiple of the input (a straight line), which makes deepening the layers meaningless.
Consider one example: a three-layer network that uses the linear function $h(x) = ax$ as its activation function.
The output is $y(x) = h(h(h(x))) = a^3 x$, which can be written as the single multiplication $y(x) = kx$ with $k = a^3$. In other words, the same computation can be expressed by a network with no hidden layers, so there is no point in making it multi-layered.
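This collapse can be checked numerically. Below is a minimal sketch (the value of $a$ and the function name `h` are my own choices for illustration) that composes $h(x) = ax$ three times and compares the result with a single multiplication by $k = a^3$:

```python
import numpy as np

a = 0.5

def h(x):
    """Linear activation: h(x) = a * x."""
    return a * x

x = np.linspace(-3.0, 3.0, 7)

# Three "layers" of the linear activation...
y_deep = h(h(h(x)))

# ...are exactly one multiplication by k = a**3.
k = a ** 3
y_shallow = k * x

print(np.allclose(y_deep, y_shallow))  # True: the hidden layers added nothing
```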
That is why neural networks use nonlinear activation functions rather than linear ones.
This article is also recommended: "Decompose 'complexity' into many 'simples': forward propagation is a repetition of 'linear function' and 'simple nonlinearity'".