[PYTHON] [Translation] scikit-learn 0.18 User Guide Table of Contents

Google translated http://scikit-learn.org/0.18/user_guide.html Tutorial here

User guide

1. Supervised learning

1.1. Generalized linear model : us: untranslated

Least squares
The complexity of the least squares method
Ridge regression
Ridge complexity
Normalization parameter setting: Generalized mutual validation
Minimum absolute contraction and selection operator --Lasso
Setting regularization parameters
Use of mutual verification
Information-based model selection
Multitasking Lasso
Elastic Net
Multitasking Elastic Net
Minimum Angle Regression-LARS
LARS Lasso
Mathematical prescription
Orthogonal Matching Pursuit (OMP)
Bayesian regression
Bayesian ridge regression
Automatic validation --ARD
Logistic regression
Stochastic Gradient Descent-SGD
Perceptron
Passive aggressive algorithm
Robustness regression: outliers and modeling errors
Different scenarios and useful concepts 2. RANSAC：RANdom SAmple Consensus
Algorithm details
Theil-Sen Estimator: Generalization-Median-based estimator
Theoretical consideration
Hoover regression
Notes
Polynomial regression: Extension of linear model with basis functions

1.2. Linear and quadratic discriminant analysis untranslated

Dimensionality reduction using linear discriminant analysis
Mathematical formulation of LDA and QDA classifiers
Mathematical formulation of LDA dimensionality reduction
Shrinkage
Estimating algorithm

1.3. Kernel Ridge Regression : us: Untranslated

1.4. Support Vector Machine: jp:

Classification
Multi-class classification
Score and probability
Imbalanced problem
Regression
Density estimation, novelty detection
Complex
Practical tips
Kernel function
Custom kernel
Use Python functions as kernel
Use of Gram matrix
RBF kernel parameters
Mathematical prescription
1. SVC
2. NuSVC
3. SVR
Implementation details

1.5. Stochastic Gradient Descent : us: Untranslated

Classification
Regression
Stochastic gradient descent of sparse data
Complex
Practical tips
Mathematical prescription
1. SGD
Implementation details

1.6. Nearest neighbor method : us: untranslated

Unsupervised nearest neighbor method
Find the closest neighbor
KDTree class and BallTree class
Nearest neighbor classification
Nearest neighbor regression
Nearest neighbor algorithm
Brute force
K-D tree
Ball tree
Selection of nearest neighbor algorithm
Effect of leaf_size
Nearest center of gravity classifier
Closest Shrinken Centroid
Approximate neighborhood nearby
Community sensitive hashing forest
Mathematical description of local sensitivity hash

1.7. Gaussian process : us: untranslated

Gaussian process regression (GPR)
GPR example
GPR with noise level estimation
Comparison of GPR and kernel ridge regression
GPR of Mauna Loa CO2 data
Gaussian process classification (GPC)
GPC example
Probabilistic prediction by GPC
Diagram of GPC on XOR dataset
Gaussian process classification (GPC) in the iris dataset
Gaussian process kernel
Gaussian process kernel API
Basic kernel
Kernel operator
Radial basis function (RBF) kernel
Matteran kernel
Reasonable secondary kernel
Exp-Sine-Squared kernel
Dot product kernel
References
Legacy Gaussian process
Example of introduction regression
Fitting noisy data
Mathematical prescription
First assumption
Best Linear Unbiased Predictions (BLUP)
Empirically Best Linear Bias Predictor (EBLUP)
Correlation model
Regression model
Implementation details

1.8. Cross decomposition : us: untranslated

1.9. Naive Bayes : us: Untranslated

Gauss Naive Bayes
Polynomial naive bayes
Bernoulli Naive Bayes
Out-of-core naive Bayes model fitting

1.10. Decision tree : us: untranslated

Classification
Regression
Multi-output problem
Complex
Practical tips
Tree algorithm: ID3, C4.5, C5.0 and CART
Mathematical prescription
Classification criteria
Regression criteria

1.11. Ensemble method: jp:

Bagging meta estimator
Randomized tree forest
Random forest
Very random tree
Parameters
Parallelization
Functional importance evaluation
Fully random tree embedding
AdaBoost
Usage
Gradient Tree Boost
Classification
Regression
Fit additional weak learners
Tree size control
Mathematical prescription
Loss function
Normalization
Shrinkage
Subsampling
Interpretation
Importance of function
Partially dependent
VotingClassifier
Most class labels (majority vote / carefully selected)
Usage
Weighted average probability (soft voting)
Use Voting Classifier with Grid Search
Usage

1.12. Multi-class algorithm and multi-label algorithm: jp:

Multi-label classification format
One rest
Multi-class learning
Multi-label learning
One-on-one
Multi-class learning
Error correction output code
Multi-class learning
Multi-output regression
Classification of multiple outputs

1.13. Feature selection: jp:

Delete features with low variance
Selection of univariate function
Recursive feature removal
Feature selection using SelectFromModel
L1-based feature selection
Random sparse model
Tree-based feature selection
Selection of functions as part of the pipeline

1.14. Semi-supervised : us: untranslated

Label propagation

1.15. Isotonic regression: jp:

1.16. Probability calibration: jp:

1.17. Neural network model (supervised) : us: untranslated

Multilayer perceptron
Classification
Regression
Normalization
Algorithm
Complex
Mathematical prescription
Practical tips
More control with warm_start

2. Unsupervised learning

2.1. Gaussian mixed model : us: untranslated

Gauss mixed
Advantages and disadvantages of Gaussian Mixture
Advantages
Disadvantages 2 Selection of the number of components in a classical Gaussian mixed model
Estimate algorithm maximize expected value
Variational Bayes Gauss mixed
Estimating algorithm: variational inference
Advantages and disadvantages of transformation reasoning with BayesianGaussianMixture
Advantages
Disadvantages
Dirichlet process

2.2. Manifold learning : us: untranslated

Introduction
Isomap
Complex
Locally linear embedding
Complex
Locally modified linear embedding
Complex
Unique mapping of Hessian matrix
Complex
Spectrum embedding
Complex
Local tangent space alignment
Complex
Multidimensional scaling (MDS)
Metric MDS
Non-metric MDS
t-Distributed Stochastic Neighborhood Embedding (t-SNE)
t-SNE optimization
Burns hat t-SNE
Practical tips

2.3. Clustering : us: Untranslated

2.4. Biclustering : us: Untranslated

Spectral co-clustering
Mathematical prescription
Spectrum by clustering
Mathematical prescription
Bi-clustering evaluation

2.5. Decompose the signal in the component (matrix factorization problem): jp:

Principal component analysis (PCA)
Accurate PCA and stochastic interpretation
Incremental PCA
PCA with randomized SVD
Kernel PCA
Sparse PCA and MiniBatchSparsePCA
Truncation singular value decomposition and latent semantic analysis
Dictionary learning
Sparse coding with a pre-computed dictionary
General dictionary learning
Mini-batch dictionary learning
Factor analysis
Independent Component Analysis (ICA)
Non-negative matrix factorization (NMF or NNMF)
Latent Dirichlet Allocation (LDA)

2.6. Covariance estimation : us: untranslated

Empirical covariance
Reduced covariance
Basic contraction
Ledoit-Wolf Shrink
Oracle approximate contraction
Sparse inverse covariance
Robust covariance estimation
Minimum covariance determinant

2.7. Detection of novelty and outliers: jp:

Detection of novelty
Outlier detection
Install the oval envelope
Isolation forest 3.1 Class SVM vs. Elliptical Envelope vs. Isolation Forest

2.8. Density estimation: jp:

Density estimation: Histogram
Kernel density estimation

2.9. Neural network model (unsupervised) : us: untranslated

Limited Boltzmann machine
Graphical model and parameterization
Bernoulli restricted Boltzmann machine
Stochastic maximum likelihood learning

3. Model selection and evaluation

3.1. Cross-validation: Evaluate estimator performance: jp:

Calculation of cross-validated metrics
Obtaining forecasts by cross-validation
Cross-validation iterator
i.i.d cross-validation iterator data
K times 2. Leave One Out（LOO） 3. Leave P Out（LPO）
Random Permutation Mutual Verification a.k.a. Shuffle & Split
Mutual validation iterator with hierarchy based on class label
Layered K times
Layered shuffle split
Mutual validation iterator for grouped data
Group k times
Leave one group
Leave P group
Group shuffle split
Predefined Fold-Splits / Validation-Sets
Mutual verification of time series data
Time series division
Shuffle precautions
Mutual validation and model selection

3.2. Tuning of hyperparameters of estimator: jp:

Complete grid search
Randomized parameter optimization
Parameter search tips
Objective metric specification
Composite estimates and parameter space
Model selection: development and evaluation
Parallel
Robustness to disability
Alternative to brute force parameter search
Model-specific mutual validation
Information standards
Other estimators

3.3. Model evaluation: Quantify the quality of prediction: jp:

3.4. Model persistence: flag_jp:

Persistence example
Security and maintainability limits

3.5. Verification curve: Plot the score to evaluate the model: jp:

Verification curve
Learning curve

4. Data set conversion

4.1. Pipeline and Feature Union: Estimator combination: jp:

Pipeline: Chain estimator
Usage
Note
FeatureUnion: Composite feature space
Usage

4.2. Feature extraction: jp:

Loading features from dicts
Feature hash
Implementation details
Text feature extraction
Word notation
Rarity
How to use the common vectorizer
Weighting of Tf-idf terms
Decoding text files
Applications and samples
Limitations of expression in Bag of Words
Vectorize a large text corpus using hash tricks
Perform out-of-core scaling with HashingVectorizer
Customized vectorizer class
Image feature extraction
Patch extraction
Image connectivity graph

4.3. Data preprocessing: jp:

Standardization, averaging and variance scaling
Scaling features to range
Sparse data scaling
Scaling data containing outliers
Centering kernel matrix
Normalization
Binarization
Feature binarization
Encode the function of the category
Completion of missing values
Generate polynomial features
Custom transformer

4.4. Unsupervised dimensionality reduction: jp:

PCA: Principal component analysis
Random projection
Feature agglomeration

4.5. Random projection: jp:

Johnson-Lindenstrauss lemma
Gauss random projection
Sparse random projection

4.6. Kernel approximation : us: untranslated

Nystroem method for kernel approximation
Radial basis function kernel
Additive Chi Squared Kernel
Skewed chi-square kernel
Math details

4.7. Pairwise Metrics, Similarities and Kernels

Cosine similarity
Linear kernel
Polynomial kernel
Sigmoid kernel
RBF kernel
Laplacian kernel
Chi-square kernel

4.8. Transform the prediction target (y): jp:

Label binarization
Label encoding

5. Dataset reading utility : us: untranslated

6. Computational Expansion Strategy: Larger Data : us: Untranslated

Scaling instances using out-of-core learning
Streaming instance
Feature extraction
Incremental learning
Example
Notes

7. Computation performance : us: untranslated

Predicted latency
Bulk vs. atomic mode
Impact of number of features
Impact of input data representation
Impact of model complexity
Feature extraction latency
Predicted throughput
Tips and techniques
Linear algebra library
Model compression
Model shape change
Link

© 2010 --2016, scikit-learn developers (BSD license).

Recommended Posts

[Translation] scikit-learn 0.18 User Guide Table of Contents

[Translation] scikit-learn 0.18 Tutorial Table of Contents

[Translation] scikit-learn 0.18 User Guide 2.7. Detection of novelty and outliers

[Translation] scikit-learn 0.18 User Guide 4.5. Random projection

[Translation] scikit-learn 0.18 User Guide 1.11. Ensemble method

[Translation] scikit-learn 0.18 User Guide 1.15. Isotonic regression

[Translation] scikit-learn 0.18 User Guide 4.2 Feature extraction

[Translation] scikit-learn 0.18 User Guide 1.16. Probability calibration

[Translation] scikit-learn 0.18 User Guide 1.13 Feature selection

[Translation] scikit-learn 0.18 User Guide 3.4. Model persistence

[Translation] scikit-learn 0.18 User Guide 2.8. Density estimation

[Translation] scikit-learn 0.18 User Guide 4.3. Data preprocessing

[Translation] scikit-learn 0.18 User Guide 3.1. Cross-validation: Evaluate the performance of the estimator

[Translation] scikit-learn 0.18 User Guide 4.4. Unsupervised dimensionality reduction

[Translation] scikit-learn 0.18 User Guide 1.4. Support Vector Machine

[Translation] scikit-learn 0.18 User Guide 3.3. Model evaluation: Quantify the quality of prediction

[Translation] scikit-learn 0.18 User Guide 4.1. Pipeline and Feature Union: Combination of estimators

[Translation] scikit-learn 0.18 User Guide 1.12. Multi-class algorithm and multi-label algorithm

taichi's Torisetsu ⓪ Table of contents

[Translation] scikit-learn 0.18 User Guide 4.8. Convert the prediction target (y)

[Linux] [Initial Settings] Table of Contents

[Python] [Table of Contents Links] Python Programming

Python Math Series ⓪ Table of Contents

Github Interesting Repository ⓪ Table of Contents

Introductory table of contents for python3

Pandas User Guide "Table Formatting and PivotTables" (Official Document Japanese Translation)

Pandas User Guide "merge, join and concatenate" (Japanese translation of official documentation)

[Translation] scikit-learn 0.18 User Guide 3.5. Verification curve: Plot the score to evaluate the model

[Translation] scikit-learn 0.18 User Guide 2.5. Decompose the signal in the component (matrix factorization problem)

Nogizaka recognition program (using Yolov5) Table of contents

Let Code Table of Contents Starting from Zero

Japanese translation of self-study "A Beginner's Guide to Getting User Input in Python"

Automating simple tasks with Python Table of contents

Translation of scsi_mid_low_api.txt

[Translation] scikit-learn 0.18 Tutorial Introduction of machine learning by scikit-learn

Contents of __name__

Create a table of contents with IPython notebook

Creating BINGO "Web Tools" with Python (Table of Contents)

Obtained contents of sosreport

[Note] Contents of shape [0], shape [1], shape [2]

Analyzing user dissatisfaction very easily from the contents of inquiries

Pandas User Guide "Multi-Index / Advanced Index" (Official document Japanese translation)

Deploy Django + React from scratch to GKE: Table of Contents

Pandas User Guide "Manipulating Missing Data" (Official Document Japanese Translation)

[Linux] [Initial Settings] Table of Contents for Development Environment Setup