As a memo to myself, this post summarizes "supervised learning" and "unsupervised learning": an outline of each method, the classes to use, example applications, keywords, and the sites that were helpful for learning.

Very roughly: a prediction model is built by training on feature data together with the corresponding answer (label) data. Prediction problems are divided into classification problems and regression problems.

Linear regression finds, among all candidate straight lines, the parameters that give the smallest loss function (error function) value.

- Class to use: `sklearn.linear_model.LinearRegression`

- Example: Relationship between the number of visitors and sales, etc.
- Keywords: simple regression, multiple regression, polynomial regression, non-linear regression
- Reference site: [Linear regression with scikit-learn (simple regression analysis / multiple regression analysis)](https://pythondatascience.plavox.info/scikit-learn/%E7%B7%9A%E5%BD%A2%E5%9B%9E%E5%B8%B0)
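As a rough illustration of fitting a straight line by least squares, here is a minimal sketch. The visitor and sales numbers are made up for this example (generated from an assumed line with slope 3 and intercept 50 plus noise), so only the API usage should be taken from it.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: sales = 3 * visitors + 50, plus Gaussian noise.
rng = np.random.default_rng(0)
visitors = rng.uniform(10, 100, size=50).reshape(-1, 1)  # shape (n_samples, 1)
sales = 3.0 * visitors.ravel() + 50.0 + rng.normal(0, 5, size=50)

# LinearRegression minimizes the sum of squared errors.
model = LinearRegression()
model.fit(visitors, sales)

slope = model.coef_[0]
intercept = model.intercept_
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

The fitted slope and intercept should land close to the values used to generate the data. Passing several feature columns instead of one turns the same code into multiple regression.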

Logistic regression is a binary classification algorithm and is applied to classification problems.

- Class to use: `sklearn.linear_model.LogisticRegression`

- Example: Relationship between sales visits / satisfaction and sales, etc.
- Keywords: sigmoid function, cross-entropy error function
- Reference site: Classification of iris by logistic regression with scikit-learn
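A minimal sketch of binary classification with logistic regression might look like the following. Restricting the iris dataset to its first two classes is my own choice for the example, to keep the problem binary.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Keep only classes 0 and 1 so that the problem is binary (an assumption for this sketch).
mask = y < 2
X_bin, y_bin = X[mask], y[mask]

clf = LogisticRegression(max_iter=1000)
clf.fit(X_bin, y_bin)

# predict_proba returns one probability per class, obtained through the sigmoid function.
proba = clf.predict_proba(X_bin[:3])
train_accuracy = clf.score(X_bin, y_bin)
print(proba)
print(train_accuracy)
```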

The support vector machine (SVM) is an algorithm that learns a decision boundary (straight line) that lies as far as possible from the data, and it can be used for both classification and regression.

- Class to use: `sklearn.svm.SVC`

- Example: Text classification, digit recognition, etc.
- Keywords: hard margin, soft margin
- Reference site: What is a support vector machine (SVM)? ~ From basics to Python implementation ~
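To get a feel for the soft margin, the sketch below fits a linear `SVC` twice with different values of the `C` parameter. The two-cluster data and the specific `C` values are assumptions chosen for illustration. A small `C` tolerates more margin violations (softer margin), while a large `C` approaches hard-margin behavior.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two synthetic clusters (hypothetical data for this sketch).
X, y = make_blobs(
    n_samples=100, centers=[[-2, -2], [2, 2]], cluster_std=1.0, random_state=0
)

soft = SVC(kernel="linear", C=0.01).fit(X, y)         # soft margin
hard_like = SVC(kernel="linear", C=1000.0).fit(X, y)  # close to a hard margin

# A softer margin is wider, so more points usually end up as support vectors.
n_sv_soft = soft.support_vectors_.shape[0]
n_sv_hard = hard_like.support_vectors_.shape[0]
print(n_sv_soft, n_sv_hard)
print(soft.score(X, y), hard_like.score(X, y))
```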

The kernel method uses a kernel function to map data from the original space into a space where it can be separated by a hyperplane, and then separates the data set there.

- Class to use: `sklearn.svm.SVC`

- Example: Product identification from color information, etc.
- Keywords: Kernel functions (sigmoid kernel, polynomial kernel, RBF [radial basis function] kernel)
- Reference site: [Python] Implementing support vector machines using various kernel functions [iris dataset]
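The effect of the kernel is easiest to see on data that a straight line cannot separate. The following sketch compares the kernels on concentric circles. The dataset and the default hyperparameters are my own choices for the example.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Concentric circles: not separable by a straight line in the original space.
X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Fit one classifier per kernel and record its test accuracy.
scores = {}
for kernel in ["linear", "poly", "rbf", "sigmoid"]:
    clf = SVC(kernel=kernel).fit(X_train, y_train)
    scores[kernel] = clf.score(X_test, y_test)

print(scores)
```

On this kind of data the RBF kernel should clearly outperform the linear kernel, although which kernel works best depends on the dataset.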

Naive Bayes calculates the probability that the data belongs to each label, under the assumption that the features are independent of one another.

- Class to use: `sklearn.naive_bayes.MultinomialNB` (others include `GaussianNB` and `BernoulliNB`)

- Example: Junk mail detection, etc.
- Keyword: Smoothing
- Reference site: Naive Bayes classifier with scikit-learn
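A junk mail filter can be sketched in a few lines. The six tiny messages and their labels below are invented for this example. `alpha=1.0` is Laplace smoothing, which keeps words that never appeared in a class from forcing the probability to zero.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy corpus: 1 = junk mail, 0 = normal mail.
texts = [
    "win free prize now",
    "free money win cash",
    "meeting schedule for monday",
    "project report attached",
    "claim your free prize",
    "lunch meeting on friday",
]
labels = [1, 1, 0, 0, 1, 0]

# Turn each message into a vector of word counts.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

clf = MultinomialNB(alpha=1.0)  # alpha is the smoothing parameter
clf.fit(X, labels)

new_mail = ["free prize money", "monday project meeting"]
pred = clf.predict(vectorizer.transform(new_mail))
print(pred)
```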

Random forest collects the outputs of multiple diverse decision trees and produces the classification result by majority vote.

- Class to use: `sklearn.ensemble.RandomForestClassifier`

- Example: Classification by behavior history and attributes
- Keywords: Gini impurity (Gini coefficient), bootstrap method
- Reference site: [Introduction] Decision tree analysis for beginners by beginners
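The sketch below trains a random forest on the iris dataset. The dataset and the hyperparameter values are assumptions for illustration. `criterion="gini"` and `bootstrap=True` correspond to the two keywords above.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Each tree is trained on a bootstrap sample and splits nodes by Gini impurity.
forest = RandomForestClassifier(
    n_estimators=100, criterion="gini", bootstrap=True, random_state=0
)
forest.fit(X_train, y_train)

test_accuracy = forest.score(X_test, y_test)
importances = forest.feature_importances_  # one value per feature, summing to 1
print(test_accuracy)
print(importances)
```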

A neural network learns a complex decision boundary by placing intermediate (hidden) layers between the input and the output.

- Class to use: `sklearn.neural_network.MLPClassifier`

- Example: Image recognition, voice recognition
- Keywords: simple perceptron, activation function, early stopping
- Reference site: Let's make a neural network by yourself
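As a small example of a network with one hidden layer, the following sketch classifies the handwritten digits bundled with scikit-learn. The layer size, the scaling by 16, and the other settings are my own assumptions. `early_stopping=True` holds out part of the training data and stops training when the validation score stops improving.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X = X / 16.0  # pixel values range from 0 to 16; scale them to [0, 1]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

mlp = MLPClassifier(
    hidden_layer_sizes=(64,),   # one hidden layer with 64 units
    activation="relu",          # activation function of the hidden layer
    early_stopping=True,        # stop when the validation score stops improving
    validation_fraction=0.1,
    n_iter_no_change=10,
    max_iter=300,
    random_state=0,
)
mlp.fit(X_train, y_train)

test_accuracy = mlp.score(X_test, y_test)
print(mlp.n_iter_, test_accuracy)
```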

The k-nearest neighbor method classifies input data by majority vote among the k nearest training samples.

- Class to use: `sklearn.neighbors.KNeighborsClassifier`

- Reference site: Machine learning ~ K-nearest neighbor method ~
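The majority vote is easy to follow on a one-dimensional toy example. The six training points below are invented for this sketch.

```python
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical 1-D data: small values belong to class 0, large values to class 1.
X_train = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
y_train = [0, 0, 0, 1, 1, 1]

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# The 3 nearest neighbors of 1.5 are all class 0; those of 10.5 are all class 1.
pred = knn.predict([[1.5], [10.5]])
print(pred)
```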

- **a. For classification problems**
- a-1. Confusion matrix

Function to use: `sklearn.metrics.confusion_matrix`

- a-2. Accuracy

Function to use: `sklearn.metrics.accuracy_score`

- a-3. Precision

Function to use: `sklearn.metrics.precision_score`

- a-4. Recall

Function to use: `sklearn.metrics.recall_score`

- a-5. F1 score

Function to use: `sklearn.metrics.f1_score`

- a-6. ROC-AUC

Function to use: `sklearn.metrics.roc_curve`

Reference sites: Generate a confusion matrix with scikit-learn and calculate precision, recall, F1 score, etc.; Calculate the ROC curve and its AUC with scikit-learn
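The classification metrics above can be tried on a handful of hand-written labels. The labels and scores below are made up for this sketch. `sklearn.metrics.auc` is used in addition to the functions listed above to turn the ROC curve into a single AUC value.

```python
from sklearn.metrics import (
    accuracy_score,
    auc,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
    roc_curve,
)

# Hypothetical ground truth, hard predictions, and predicted scores.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
y_score = [0.9, 0.2, 0.4, 0.8, 0.3, 0.7, 0.6, 0.1]

cm = confusion_matrix(y_true, y_pred)        # rows: true class, columns: predicted class
accuracy = accuracy_score(y_true, y_pred)    # (TP + TN) / all samples
precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall

# The ROC curve is computed from scores, not from hard predictions.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
roc_auc = auc(fpr, tpr)

print(cm)
print(accuracy, precision, recall, f1, roc_auc)
```

For binary labels, `confusion_matrix` returns `[[TN, FP], [FN, TP]]`.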

- **b. For regression problems**
- b-1. Mean squared error

Function to use: `sklearn.metrics.mean_squared_error`

- b-2. Mean absolute error

Function to use: `sklearn.metrics.mean_absolute_error`

- b-3. Coefficient of determination

Function to use: `sklearn.metrics.r2_score`

Reference site: [Evaluate the results of the regression model with scikit-learn](https://pythondatascience.plavox.info/scikit-learn/%E5%9B%9E%E5%B8%B0%E3%83%A2%E3%83%87%E3%83%AB%E3%81%AE%E8%A9%95%E4%BE%A1)
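The regression metrics work the same way. Here is a minimal sketch with four invented target values and predictions.

```python
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Hypothetical true values and predictions.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

mse = mean_squared_error(y_true, y_pred)   # mean of squared differences
mae = mean_absolute_error(y_true, y_pred)  # mean of absolute differences
r2 = r2_score(y_true, y_pred)              # 1 - (residual SS / total SS)

print(mse, mae, r2)
```

Taking the square root of the mean squared error gives the RMSE, which is in the same unit as the target.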

- **a. Hyperparameters**
- a-1. Grid search

Class to use: `sklearn.model_selection.GridSearchCV`

- a-2. Random search

Class to use: `sklearn.model_selection.RandomizedSearchCV`

Reference site: Let's tune the model hyperparameters with scikit-learn!
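Both search strategies share nearly the same interface. The sketch below tunes an `SVC` on the iris dataset. The estimator, the candidate values, and `n_iter=4` are assumptions chosen for the example. Note that in current scikit-learn both classes live in `sklearn.model_selection`.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Candidate values (hypothetical choices for this sketch).
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}

# Grid search evaluates every combination with 5-fold cross-validation.
grid = GridSearchCV(SVC(), param_grid, cv=5)
grid.fit(X, y)

# Random search evaluates only n_iter randomly sampled combinations.
rand = RandomizedSearchCV(SVC(), param_grid, n_iter=4, cv=5, random_state=0)
rand.fit(X, y)

print(grid.best_params_, grid.best_score_)
print(rand.best_params_, rand.best_score_)
```

Grid search is exhaustive and therefore slow on large grids. Random search trades completeness for speed.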

- **b. Splitting data (training data and validation data)**
- b-1. Holdout method

Function to use: `sklearn.model_selection.train_test_split`

- b-2. Cross-validation method

Function / class to use: `sklearn.model_selection.cross_val_score`, `sklearn.model_selection.KFold`

- b-3. Leave-one-out method

Class to use: `sklearn.model_selection.LeaveOneOut`

Reference site: [About the method of dividing learning data and test data in machine learning and deep learning](https://newtechnologylifestyle.net/%E6%A9%9F%E6%A2%B0%E5%AD%A6%E7%BF%92%E3%80%81%E3%83%87%E3%82%A3%E3%83%BC%E3%83%97%E3%83%A9%E3%83%BC%E3%83%8B%E3%83%B3%E3%82%B0%E3%81%A7%E3%81%AE%E5%AD%A6%E7%BF%92%E3%83%87%E3%83%BC%E3%82%BF%E3%81%A8/)
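The three ways of splitting data can be compared side by side. In this sketch the iris dataset and the k-nearest neighbor classifier are my own choices, and any estimator would work in their place.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import (
    KFold,
    LeaveOneOut,
    cross_val_score,
    train_test_split,
)
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
model = KNeighborsClassifier(n_neighbors=5)

# Holdout: a single split into training data and test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
holdout_score = model.fit(X_train, y_train).score(X_test, y_test)

# K-fold cross-validation: every sample is used for validation exactly once.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
kfold_scores = cross_val_score(model, X, y, cv=kfold)

# Leave-one-out: as many folds as samples, each validating on a single sample.
loo_scores = cross_val_score(model, X, y, cv=LeaveOneOut())

print(holdout_score, kfold_scores.mean(), loo_scores.mean())
```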

- **c. Regularization**
- c-1. Ridge regression

Class to use: `sklearn.linear_model.Ridge`

- c-2. Lasso regression

Class to use: `sklearn.linear_model.Lasso`

Reference site: Explanation of ridge regression and lasso regression in the shortest time (learning of machine learning # 3)
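The practical difference between the two penalties shows up in the coefficients. In the sketch below the data is synthetic (only the first two of ten features actually influence the target) and the `alpha` values are assumptions. Ridge (L2 penalty) shrinks coefficients toward zero, while lasso (L1 penalty) can set them exactly to zero.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Hypothetical data: 10 features, but only the first two matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = 5.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(0, 0.5, size=100)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.5).fit(X, y)

# Count coefficients that are exactly zero in each model.
n_zero_ridge = int(np.sum(ridge.coef_ == 0))
n_zero_lasso = int(np.sum(lasso.coef_ == 0))

print(np.round(ridge.coef_, 2), n_zero_ridge)
print(np.round(lasso.coef_, 2), n_zero_lasso)
```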

Very roughly: unlike supervised learning, there is no objective variable. Instead, the structure of the feature data is extracted by transforming it into another form or by finding subsets. Techniques include dimensionality reduction and clustering.

Principal component analysis (PCA) summarizes a large number of quantitative explanatory variables into a smaller number of composite variables (principal components), reducing the number of variables in the data.

- Class to use: `sklearn.decomposition.PCA`

- Keywords: Covariance matrix, eigenvalue problem, cumulative contribution rate
- Reference site: Principal component analysis and eigenvalue problem
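The link between PCA, the covariance matrix, and the eigenvalue problem can be checked numerically. The sketch below uses the iris dataset (my own choice) after standardization. The cumulative sum of `explained_variance_ratio_` corresponds to the cumulative contribution rate.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)
X_std = StandardScaler().fit_transform(X)

pca = PCA(n_components=4).fit(X_std)

# Cumulative contribution rate of the principal components.
cumulative = np.cumsum(pca.explained_variance_ratio_)

# Eigenvalues of the covariance matrix, sorted in descending order.
eigenvalues = np.sort(np.linalg.eigvalsh(np.cov(X_std, rowvar=False)))[::-1]

print(cumulative)
print(eigenvalues)
print(pca.explained_variance_)
```

The eigenvalues of the covariance matrix should match `pca.explained_variance_`, which is why PCA is often described as an eigenvalue problem.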

The k-means method divides the data into a given number of clusters, grouping similar data points together.

- Class to use: `sklearn.cluster.KMeans`

- Example: Marketing data analysis, image classification
- Keywords: within-cluster sum of squares, elbow method, silhouette analysis, k-means++, k-medoids method
- Reference site: How to find the optimum number of clusters for k-means
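Choosing the number of clusters can be sketched as follows. The three-cluster synthetic data is an assumption for the example. `inertia_` is the within-cluster sum of squares used by the elbow method, and `silhouette_score` gives the average silhouette coefficient.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Hypothetical data with three well-separated clusters.
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=1.0,
    random_state=0,
)

inertias = {}
silhouettes = {}
for k in range(2, 7):
    km = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=0).fit(X)
    inertias[k] = km.inertia_                         # within-cluster sum of squares
    silhouettes[k] = silhouette_score(X, km.labels_)  # higher is better

# Pick the k with the highest average silhouette coefficient.
best_k = max(silhouettes, key=silhouettes.get)
print(inertias)
print(silhouettes)
print(best_k)
```

With the elbow method you would instead plot `inertias` against k and look for the point where the decrease flattens out.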

Latent semantic analysis (LSA) works on text data: by reducing the number of features from the number of words to the number of latent topics, it obtains the similarity between words and between documents.

- Class to use: `sklearn.decomposition.TruncatedSVD`

- Keywords: Singular value decomposition, topic model, tf-idf
- Reference site: Machine Learning Latent Semantics Theory
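A minimal sketch of this pipeline is tf-idf followed by truncated SVD. The four short documents below are invented, with two about pets and two about the stock market, and two latent topics is an assumption that fits this toy corpus.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy corpus.
docs = [
    "cat dog pet animal",
    "dog cat pet food",
    "stock market price trade",
    "market stock price investor",
]

# Documents x words matrix weighted by tf-idf.
tfidf = TfidfVectorizer().fit_transform(docs)

# Reduce the word dimensions to 2 latent topics by singular value decomposition.
svd = TruncatedSVD(n_components=2, random_state=0)
doc_topics = svd.fit_transform(tfidf)

# Compare documents in the reduced topic space.
sim = cosine_similarity(doc_topics)
sim_same = sim[0, 1]  # two pet documents
sim_diff = sim[0, 2]  # a pet document and a stock document
print(sim_same, sim_diff)
```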

Non-negative matrix factorization (NMF) is a dimensionality reduction method with the property that all input and output data values are non-negative.

- Class to use: `sklearn.decomposition.NMF`

- Example: Recommendation, text mining
- Reference site: Understanding non-negative matrix factorization (NMF) softly
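The following sketch factorizes a small user-by-item rating matrix into two non-negative matrices. The ratings and the choice of two components are made up for illustration.

```python
import numpy as np
from sklearn.decomposition import NMF

# Hypothetical ratings: rows are users, columns are items, all values non-negative.
X = np.array(
    [
        [5, 4, 0, 0],
        [4, 5, 0, 1],
        [0, 0, 5, 4],
        [0, 1, 4, 5],
        [5, 5, 0, 0],
    ],
    dtype=float,
)

nmf = NMF(n_components=2, init="nndsvda", max_iter=1000, random_state=0)
W = nmf.fit_transform(X)  # users x components
H = nmf.components_       # components x items

# W @ H approximates X using only non-negative values.
relative_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(np.round(W @ H, 1))
print(relative_error)
```

In a recommendation setting, the entries of `W @ H` at positions where the original rating was 0 can be read as predicted preferences.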

Latent Dirichlet allocation (LDA) creates topics from the words in the documents and estimates which topics each document consists of.

- Class to use: `sklearn.decomposition.LatentDirichletAllocation`

- Example: Natural language processing
- Keywords: Topic model, Dirichlet distribution
- Reference site: Explanation of points that are difficult for beginners to understand in the topic model (LDA)
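As a rough sketch, LDA takes word counts and returns a topic distribution for each document. The toy corpus and the choice of two topics below are assumptions for the example, and a real corpus would need far more text.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus.
docs = [
    "cat dog pet animal",
    "dog cat pet food",
    "stock market price trade",
    "market stock price investor",
]

# LDA works on raw word counts rather than tf-idf weights.
counts = CountVectorizer().fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)  # each row is a topic distribution

# components_ holds the word weights of each topic.
topic_words = lda.components_
print(doc_topics)
```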

A Gaussian mixture model performs clustering by representing the data as a linear combination of multiple Gaussian distributions.

- Class to use: `sklearn.mixture.GaussianMixture`

- Keyword: Gaussian distribution
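Unlike k-means, a Gaussian mixture gives each point a probability of belonging to each cluster. The sketch below uses synthetic data drawn from two Gaussians whose centers are my own choice.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical data: two Gaussian clusters centered at (-5, 0) and (5, 0).
rng = np.random.default_rng(0)
X = np.vstack(
    [
        rng.normal([-5.0, 0.0], 1.0, size=(100, 2)),
        rng.normal([5.0, 0.0], 1.0, size=(100, 2)),
    ]
)

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

labels = gmm.predict(X)     # hard cluster assignment
resp = gmm.predict_proba(X) # soft assignment: probability per component

# Sort the estimated means by their x coordinate for a stable comparison.
means_sorted = gmm.means_[np.argsort(gmm.means_[:, 0])]
print(means_sorted)
```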

Locally linear embedding (LLE) performs dimensionality reduction on data with a non-linear structure.

- Class to use: `sklearn.manifold.LocallyLinearEmbedding`
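A common test case for non-linear dimensionality reduction is the swiss roll, a 2-D sheet rolled up in 3-D. The sketch below is only an illustration, and the dataset and `n_neighbors=10` are my own assumptions.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

# 3-D points lying on a rolled-up 2-D sheet; t is the position along the roll.
X, t = make_swiss_roll(n_samples=500, random_state=0)

lle = LocallyLinearEmbedding(n_neighbors=10, n_components=2, random_state=0)
X_2d = lle.fit_transform(X)  # unrolled 2-D representation

print(X.shape, X_2d.shape)
```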

t-SNE is a method of reducing high-dimensional data to two or three dimensions, and is used for data visualization.

- Class to use: `sklearn.manifold.TSNE`
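For visualization, a typical use is to compress image data to two dimensions and color the points by label. The sketch below stops at the embedding step. The digits dataset, the subset of 300 samples, and the `perplexity` value are assumptions for this example.

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)
X_small, y_small = X[:300], y[:300]  # a subset keeps the run time short

# Each 64-dimensional image is mapped to a point in 2-D.
tsne = TSNE(n_components=2, perplexity=30.0, init="pca", random_state=0)
X_2d = tsne.fit_transform(X_small)

print(X_2d.shape)
```

The resulting coordinates can be passed to a scatter plot, using `y_small` for the colors. Note that `TSNE` has no `transform` method, so it cannot embed new data after fitting.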
