[PYTHON] DeepRunning ~ Level4.4.2 ~

Level4. Machine learning course (theory and practice)

What is a deep learning course that makes you field-ready in 3 months

4-4. Principal component analysis

4-4-7. Hands-on

● Create a logistic regression model using breast cancer examination data.
● Use principal components to compress the data into a two-dimensional space (32 dimensions ⇒ 2 dimensions).

主成分分析0001.png

主成分分析0002.png

主成分分析0003.png

The unnecessary columns were dropped as intended.

主成分分析0004.png

Create the target variable and the explanatory variables, then split the data into training data and validation data.
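The split described above can be sketched roughly as follows. This is only a minimal sketch under assumptions: scikit-learn's bundled breast cancer dataset stands in for the course CSV (it has 30 features, not the 32 dimensions mentioned in this post), and the `test_size` and `random_state` values are my own choices, not the ones used in the lecture.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Assumption: the bundled scikit-learn dataset replaces the course CSV.
# X holds the explanatory variables, y the target (0 = malignant, 1 = benign).
X, y = load_breast_cancer(return_X_y=True)

# Hold out 20% for validation; stratify keeps the class ratio similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

print("training data:  ", X_train.shape)
print("validation data:", X_test.shape)
```

Fixing `random_state` makes the split reproducible, which helps when comparing models later.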

主成分分析0005.png

With all 32 dimensions used as is, the accuracy is 97.2%.
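A baseline like this one could look roughly like the sketch below. It assumes scikit-learn's bundled dataset (30 features) instead of the course CSV, adds standardization before the logistic regression, and uses my own split settings, so the score it prints will not necessarily match the 97.2% above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: bundled dataset and split settings are stand-ins for the lecture's.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0, stratify=y
)

# Standardize the features, then fit logistic regression on all of them.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"validation accuracy with all features: {accuracy:.3f}")
```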

主成分分析0006.png

Run principal component analysis and check the contribution ratio of each axis. The first principal component accounts for 43% or more of the variance, the second for about 20%, and the third for about 10%. The first and second principal components together should therefore retain about 65%.
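The contribution ratios can be read from `explained_variance_ratio_` after fitting `PCA`. The sketch below assumes scikit-learn's bundled breast cancer dataset (30 features) and standardized inputs, so the exact figures may differ from the ones in the lecture notebook.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: bundled dataset instead of the course CSV.
X, _ = load_breast_cancer(return_X_y=True)

# PCA is sensitive to feature scale, so standardize first.
X_scaled = StandardScaler().fit_transform(X)

pca = PCA(n_components=3)
pca.fit(X_scaled)

ratios = pca.explained_variance_ratio_   # contribution ratio per component
cumulative = ratios.cumsum()             # running total of retained variance

for i, (r, c) in enumerate(zip(ratios, cumulative), start=1):
    print(f"PC{i}: ratio = {r:.3f}, cumulative = {c:.3f}")
```

The cumulative value for the second row is the share of variance kept when only the first two principal components are used.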

⇒ Try to visualize it.

主成分分析0007.png 主成分分析0008.png

As shown in the lecture, the boundary between the classes is ambiguous in 2D.
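One way to reproduce a 2D view like this is sketched below. It assumes scikit-learn's bundled dataset rather than the course CSV, writes the scatter plot to a hypothetical file name `pca_2d.png`, and also fits a logistic regression on the two components so the effect of the compression on accuracy can be checked.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; the figure is written to a file
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumption: bundled dataset and split settings are stand-ins for the lecture's.
data = load_breast_cancer()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0, stratify=y
)

# Fit the scaler and PCA on the training data only, then project both parts.
scaler = StandardScaler().fit(X_train)
pca = PCA(n_components=2).fit(scaler.transform(X_train))
X_train_2d = pca.transform(scaler.transform(X_train))
X_test_2d = pca.transform(scaler.transform(X_test))

# Logistic regression on the two principal components only.
clf = LogisticRegression().fit(X_train_2d, y_train)
acc_2d = clf.score(X_test_2d, y_test)
print(f"validation accuracy with 2 components: {acc_2d:.3f}")

# Scatter plot of the training data in the 2D principal component space.
fig, ax = plt.subplots(figsize=(6, 5))
for label, name in enumerate(data.target_names):
    mask = y_train == label
    ax.scatter(X_train_2d[mask, 0], X_train_2d[mask, 1], s=12, alpha=0.6, label=name)
ax.set_xlabel("First principal component")
ax.set_ylabel("Second principal component")
ax.legend()
fig.savefig("pca_2d.png", dpi=120)  # hypothetical output file name
```

Where the two classes overlap in the plot is where a linear boundary in 2D cannot separate them cleanly.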

4-4-8. Consideration

If you pack in all the data and use many variables, accuracy goes up. Keeping the computational cost under control while maintaining accuracy, however, takes experience and steady verification. I would like to try changing the adopted explanatory variables a little at a time and learn from the failures along the way. Principal component analysis is simple to run, but I felt it is a powerful analysis method.
