[PYTHON] [Translation] scikit-learn 0.18 User Guide Table of Contents

This is a Google translation of the scikit-learn 0.18 User Guide (http://scikit-learn.org/0.18/user_guide.html). The tutorial table of contents is here.


User guide

1. Supervised learning

1.1. Generalized Linear Models (untranslated)

  1. Ordinary least squares
  2. Ordinary least squares complexity
  3. Ridge regression
  4. Ridge complexity
  5. Setting the regularization parameter: generalized cross-validation
  6. Lasso (least absolute shrinkage and selection operator)
  7. Setting the regularization parameter
  8. Using cross-validation
  9. Information-criteria based model selection
  10. Multi-task Lasso
  11. Elastic Net
  12. Multi-task Elastic Net
  13. Least angle regression (LARS)
  14. LARS Lasso
  15. Mathematical formulation
  16. Orthogonal matching pursuit (OMP)
  17. Bayesian regression
  18. Bayesian ridge regression
  19. Automatic relevance determination (ARD)
  20. Logistic regression
  21. Stochastic gradient descent (SGD)
  22. Perceptron
  23. Passive aggressive algorithms
  24. Robustness regression: outliers and modeling errors
  25. Different scenarios and useful concepts
  26. RANSAC: RANdom SAmple Consensus
  27. Details of the algorithm
  28. Theil-Sen estimator: generalized-median-based estimator
  29. Theoretical considerations
  30. Huber regression
  31. Notes
  32. Polynomial regression: extending linear models with basis functions
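
As a quick illustration of the estimators in the list above, here is a minimal sketch (added for this translation, not part of the original guide), assuming scikit-learn 0.18's API; the toy data and alpha values are arbitrary examples.

```python
# Minimal sketch: fitting a few of the linear models listed above.
from sklearn.linear_model import LinearRegression, Ridge, Lasso

X = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]  # toy design matrix
y = [0.0, 1.0, 2.0]                       # toy targets

for model in (LinearRegression(), Ridge(alpha=0.5), Lasso(alpha=0.1)):
    model.fit(X, y)
    print(type(model).__name__, model.coef_, model.intercept_)
```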

1.2. Linear and Quadratic Discriminant Analysis (untranslated)

  1. Dimensionality reduction using linear discriminant analysis
  2. Mathematical formulation of the LDA and QDA classifiers
  3. Mathematical formulation of LDA dimensionality reduction
  4. Shrinkage
  5. Estimation algorithms

1.3. Kernel Ridge Regression (untranslated)

1.4. Support Vector Machines (translated)

  1. Classification
  2. Multi-class classification
  3. Scores and probabilities
  4. Unbalanced problems
  5. Regression
  6. Density estimation, novelty detection
  7. Complexity
  8. Practical tips
  9. Kernel functions
  10. Custom kernels
  11. Using Python functions as kernels
  12. Using the Gram matrix
  13. Parameters of the RBF kernel
  14. Mathematical formulation
    1. SVC
    2. NuSVC
    3. SVR
  15. Implementation details
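
A minimal usage sketch for the classifiers listed above (added for illustration), assuming scikit-learn 0.18's API; the kernel and C value are arbitrary examples.

```python
# Minimal sketch: SVC classification on the iris toy dataset.
from sklearn import datasets
from sklearn.svm import SVC

iris = datasets.load_iris()
clf = SVC(kernel='rbf', C=1.0)  # arbitrary example parameters
clf.fit(iris.data, iris.target)
print(clf.predict(iris.data[:3]))
```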

1.5. Stochastic Gradient Descent (untranslated)

  1. Classification
  2. Regression
  3. Stochastic gradient descent for sparse data
  4. Complexity
  5. Practical tips
  6. Mathematical formulation
    1. SGD
  7. Implementation details
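
A minimal sketch of the SGD classifier covered above (not part of the original guide), assuming scikit-learn 0.18's API; hinge loss yields a linear SVM trained by SGD.

```python
# Minimal sketch: linear SVM fitted with stochastic gradient descent.
from sklearn.linear_model import SGDClassifier

X = [[0., 0.], [1., 1.]]
y = [0, 1]
clf = SGDClassifier(loss='hinge', penalty='l2')
clf.fit(X, y)
print(clf.predict([[2., 2.]]))
```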

1.6. Nearest Neighbors (untranslated)

  1. Unsupervised nearest neighbors
  2. Finding the nearest neighbors
  3. KDTree and BallTree classes
  4. Nearest neighbors classification
  5. Nearest neighbors regression
  6. Nearest neighbor algorithms
  7. Brute force
  8. K-D tree
  9. Ball tree
  10. Choice of nearest neighbors algorithm
  11. Effect of leaf_size
  12. Nearest centroid classifier
  13. Nearest shrunken centroid
  14. Approximate nearest neighbors
  15. Locality sensitive hashing forest
  16. Mathematical description of locality sensitive hashing
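
A minimal sketch of nearest-neighbors classification from the list above (added for illustration), assuming scikit-learn 0.18's API; k=3 is an arbitrary example.

```python
# Minimal sketch: k-nearest neighbors classification on toy data.
from sklearn.neighbors import KNeighborsClassifier

X = [[0], [1], [2], [3]]
y = [0, 0, 1, 1]
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y)
print(knn.predict([[1.1]]))
```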

1.7. Gaussian Processes (untranslated)

  1. Gaussian process regression (GPR)
  2. GPR examples
  3. GPR with noise-level estimation
  4. Comparison of GPR and kernel ridge regression
  5. GPR on Mauna Loa CO2 data
  6. Gaussian process classification (GPC)
  7. GPC examples
  8. Probabilistic predictions with GPC
  9. Illustration of GPC on the XOR dataset
  10. Gaussian process classification (GPC) on the iris dataset
  11. Gaussian process kernels
  12. The Gaussian process kernel API
  13. Basic kernels
  14. Kernel operators
  15. Radial basis function (RBF) kernel
  16. Matérn kernel
  17. Rational quadratic kernel
  18. Exp-Sine-Squared kernel
  19. Dot-product kernel
  20. References
  21. Legacy Gaussian processes
  22. An introductory regression example
  23. Fitting noisy data
  24. Mathematical formulation
  25. The initial assumption
  26. The best linear unbiased prediction (BLUP)
  27. The empirical best linear unbiased predictor (EBLUP)
  28. Correlation models
  29. Regression models
  30. Implementation details
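
A minimal sketch of the GPR interface introduced in 0.18 (added for illustration); the training points and length scale are arbitrary examples.

```python
# Minimal sketch: Gaussian process regression with an RBF kernel.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

X = np.array([[1.0], [3.0], [5.0], [6.0]])
y = np.sin(X).ravel()
gpr = GaussianProcessRegressor(kernel=RBF(length_scale=1.0))
gpr.fit(X, y)
mean, std = gpr.predict(np.array([[2.0]]), return_std=True)
print(mean, std)  # predictive mean and standard deviation
```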

1.8. Cross decomposition (untranslated)

1.9. Naive Bayes (untranslated)

  1. Gaussian naive Bayes
  2. Multinomial naive Bayes
  3. Bernoulli naive Bayes
  4. Out-of-core naive Bayes model fitting
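
A minimal sketch of the Gaussian variant from the list above (not part of the original guide), assuming scikit-learn 0.18's API.

```python
# Minimal sketch: Gaussian naive Bayes on the iris dataset.
from sklearn import datasets
from sklearn.naive_bayes import GaussianNB

iris = datasets.load_iris()
gnb = GaussianNB()
y_pred = gnb.fit(iris.data, iris.target).predict(iris.data)
print((iris.target != y_pred).sum(), "mislabeled points")
```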

1.10. Decision Trees (untranslated)

  1. Classification
  2. Regression
  3. Multi-output problems
  4. Complexity
  5. Practical tips
  6. Tree algorithms: ID3, C4.5, C5.0 and CART
  7. Mathematical formulation
  8. Classification criteria
  9. Regression criteria
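
A minimal sketch covering both the classification and regression trees listed above (added for illustration), assuming scikit-learn 0.18's API; the toy data is arbitrary.

```python
# Minimal sketch: CART decision trees for classification and regression.
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

X = [[0, 0], [1, 1]]
clf = DecisionTreeClassifier().fit(X, [0, 1])
reg = DecisionTreeRegressor().fit(X, [0.0, 2.5])
print(clf.predict([[2, 2]]), reg.predict([[2, 2]]))
```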

1.11. Ensemble methods (translated)

  1. Bagging meta-estimator
  2. Forests of randomized trees
  3. Random forests
  4. Extremely randomized trees
  5. Parameters
  6. Parallelization
  7. Feature importance evaluation
  8. Totally random trees embedding
  9. AdaBoost
  10. Usage
  11. Gradient tree boosting
  12. Classification
  13. Regression
  14. Fitting additional weak learners
  15. Controlling the tree size
  16. Mathematical formulation
  17. Loss functions
  18. Regularization
  19. Shrinkage
  20. Subsampling
  21. Interpretation
  22. Feature importance
  23. Partial dependence
  24. VotingClassifier
  25. Majority class labels (majority/hard voting)
  26. Usage
  27. Weighted average probabilities (soft voting)
  28. Using the VotingClassifier with GridSearch
  29. Usage
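
A minimal sketch of two ensembles from the list above (added for illustration), assuming scikit-learn 0.18's API; the forest size and voting members are arbitrary examples.

```python
# Minimal sketch: a random forest and a hard-voting ensemble.
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

iris = datasets.load_iris()
forest = RandomForestClassifier(n_estimators=10)
voter = VotingClassifier(
    estimators=[('lr', LogisticRegression()), ('nb', GaussianNB())],
    voting='hard')
for model in (forest, voter):
    model.fit(iris.data, iris.target)
    print(type(model).__name__, model.score(iris.data, iris.target))
```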

1.12. Multiclass and multilabel algorithms (translated)

  1. Multilabel classification format
  2. One-vs-the-rest
  3. Multiclass learning
  4. Multilabel learning
  5. One-vs-one
  6. Multiclass learning
  7. Error-correcting output codes
  8. Multiclass learning
  9. Multioutput regression
  10. Multioutput classification

1.13. Feature selection (translated)

  1. Removing features with low variance
  2. Univariate feature selection
  3. Recursive feature elimination
  4. Feature selection using SelectFromModel
  5. L1-based feature selection
  6. Randomized sparse models
  7. Tree-based feature selection
  8. Feature selection as part of a pipeline
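
A minimal sketch of two selectors from the list above (not part of the original guide), assuming scikit-learn 0.18's API; the threshold and k are arbitrary examples.

```python
# Minimal sketch: variance thresholding and univariate feature selection.
from sklearn import datasets
from sklearn.feature_selection import VarianceThreshold, SelectKBest, chi2

iris = datasets.load_iris()
X_var = VarianceThreshold(threshold=0.2).fit_transform(iris.data)
X_best = SelectKBest(chi2, k=2).fit_transform(iris.data, iris.target)
print(X_var.shape, X_best.shape)
```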

1.14. Semi-Supervised (untranslated)

  1. Label propagation

1.15. Isotonic regression (translated)

1.16. Probability calibration (translated)

1.17. Neural network models (supervised) (untranslated)

  1. Multilayer perceptron
  2. Classification
  3. Regression
  4. Regularization
  5. Algorithm
  6. Complexity
  7. Mathematical formulation
  8. Practical tips
  9. More control with warm_start

2. Unsupervised learning

2.1. Gaussian mixture models (untranslated)

  1. Gaussian Mixture
  2. Advantages and disadvantages of Gaussian Mixture
  3. Advantages
  4. Disadvantages
  5. Selection of the number of components in a classical Gaussian mixture model
  6. Estimation algorithm: expectation-maximization
  7. Variational Bayesian Gaussian Mixture
  8. Estimation algorithm: variational inference
  9. Advantages and disadvantages of variational inference with BayesianGaussianMixture
  10. Advantages
  11. Disadvantages
  12. The Dirichlet process

2.2. Manifold learning (untranslated)

  1. Introduction
  2. Isomap
  3. Complexity
  4. Locally linear embedding
  5. Complexity
  6. Modified locally linear embedding
  7. Complexity
  8. Hessian eigenmapping
  9. Complexity
  10. Spectral embedding
  11. Complexity
  12. Local tangent space alignment
  13. Complexity
  14. Multi-dimensional scaling (MDS)
  15. Metric MDS
  16. Nonmetric MDS
  17. t-distributed Stochastic Neighbor Embedding (t-SNE)
  18. Optimizing t-SNE
  19. Barnes-Hut t-SNE
  20. Practical tips

2.3. Clustering (untranslated)

  1. Overview of clustering methods
  2. K-means
  3. Mini-batch K-means
  4. Affinity propagation
  5. Mean shift
  6. Spectral clustering
  7. Different label assignment strategies
  8. Hierarchical clustering
  9. Different linkage types: Ward, complete and average linkage
  10. Adding connectivity constraints
  11. Varying the metric
  12. Density-based spatial clustering (DBSCAN)
  13. Balanced iterative reducing and clustering using hierarchies (BIRCH)
  14. Clustering performance evaluation
  15. Adjusted Rand index
  16. Advantages
  17. Disadvantages
  18. Mathematical formulation
  19. Mutual information based scores
  20. Advantages
  21. Disadvantages
  22. Mathematical formulation
  23. Homogeneity, completeness and V-measure
  24. Advantages
  25. Disadvantages
  26. Mathematical formulation
  27. Fowlkes-Mallows scores
  28. Advantages
  29. Disadvantages
  30. Silhouette coefficient
  31. Advantages
  32. Disadvantages
  33. Calinski-Harabaz index
  34. Advantages
  35. Disadvantages
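
A minimal sketch combining clustering with one of the evaluation metrics listed above (added for illustration), assuming scikit-learn 0.18's API; three clusters is an arbitrary example.

```python
# Minimal sketch: K-means clustering scored with the adjusted Rand index.
from sklearn import datasets
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

iris = datasets.load_iris()
labels = KMeans(n_clusters=3, random_state=0).fit_predict(iris.data)
print(adjusted_rand_score(iris.target, labels))
```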

2.4. Biclustering (untranslated)

  1. Spectral co-clustering
  2. Mathematical formulation
  3. Spectral biclustering
  4. Mathematical formulation
  5. Biclustering evaluation

2.5. Decomposing signals in components (matrix factorization problems) (translated)

  1. Principal component analysis (PCA)
  2. Exact PCA and probabilistic interpretation
  3. Incremental PCA
  4. PCA using randomized SVD
  5. Kernel PCA
  6. Sparse PCA and MiniBatchSparsePCA
  7. Truncated singular value decomposition and latent semantic analysis
  8. Dictionary learning
  9. Sparse coding with a precomputed dictionary
  10. Generic dictionary learning
  11. Mini-batch dictionary learning
  12. Factor analysis
  13. Independent Component Analysis (ICA)
  14. Non-negative matrix factorization (NMF or NNMF)
  15. Latent Dirichlet Allocation (LDA)
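
A minimal sketch of the PCA decomposition heading the list above (not part of the original guide), assuming scikit-learn 0.18's API; two components is an arbitrary example.

```python
# Minimal sketch: projecting iris onto its two leading principal components.
from sklearn import datasets
from sklearn.decomposition import PCA

iris = datasets.load_iris()
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(iris.data)
print(X_reduced.shape, pca.explained_variance_ratio_)
```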

2.6. Covariance estimation (untranslated)

  1. Empirical covariance
  2. Shrunk covariance
  3. Basic shrinkage
  4. Ledoit-Wolf shrinkage
  5. Oracle approximating shrinkage
  6. Sparse inverse covariance
  7. Robust covariance estimation
  8. Minimum covariance determinant

2.7. Novelty and outlier detection (translated)

  1. Novelty detection
  2. Outlier detection
  3. Fitting an elliptic envelope
  4. Isolation forest
  5. One-class SVM versus elliptic envelope versus isolation forest

2.8. Density estimation (translated)

  1. Density estimation: histograms
  2. Kernel density estimation

2.9. Neural network models (unsupervised) (untranslated)

  1. Restricted Boltzmann machines
  2. Graphical model and parametrization
  3. Bernoulli restricted Boltzmann machines
  4. Stochastic maximum likelihood learning

3. Model selection and evaluation

3.1. Cross-validation: evaluating estimator performance (translated)

  1. Computing cross-validated metrics
  2. Obtaining predictions by cross-validation
  3. Cross-validation iterators
  4. Cross-validation iterators for i.i.d. data
  5. K-fold
  6. Leave One Out (LOO)
  7. Leave P Out (LPO)
  8. Random permutations cross-validation a.k.a. Shuffle & Split
  9. Cross-validation iterators with stratification based on class labels
  10. Stratified K-fold
  11. Stratified Shuffle Split
  12. Cross-validation iterators for grouped data
  13. Group K-fold
  14. Leave One Group Out
  15. Leave P Groups Out
  16. Group Shuffle Split
  17. Predefined fold-splits / validation-sets
  18. Cross-validation of time series data
  19. Time Series Split
  20. A note on shuffling
  21. Cross-validation and model selection
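
A minimal sketch of computing cross-validated metrics with the model_selection module introduced in 0.18 (added for illustration); five folds and SVC are arbitrary examples.

```python
# Minimal sketch: 5-fold cross-validation scores for a classifier.
from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score

iris = datasets.load_iris()
scores = cross_val_score(svm.SVC(), iris.data, iris.target, cv=5)
print(scores.mean(), scores.std())
```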

3.2. Tuning the hyper-parameters of an estimator (translated)

  1. Exhaustive grid search
  2. Randomized parameter optimization
  3. Tips for parameter search
  4. Specifying an objective metric
  5. Composite estimators and parameter spaces
  6. Model selection: development and evaluation
  7. Parallelism
  8. Robustness to failure
  9. Alternatives to brute-force parameter search
  10. Model-specific cross-validation
  11. Information criteria
  12. Out-of-bag estimates
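
A minimal sketch of the exhaustive grid search from the list above (not part of the original guide), assuming scikit-learn 0.18's model_selection API; the parameter grid is an arbitrary example.

```python
# Minimal sketch: exhaustive grid search over SVC hyper-parameters.
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()
param_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf']}
search = GridSearchCV(svm.SVC(), param_grid, cv=3)
search.fit(iris.data, iris.target)
print(search.best_params_, search.best_score_)
```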

3.3. Model evaluation: quantifying the quality of predictions (translated)

  1. The scoring parameter: defining model evaluation rules
  2. Common cases: predefined values
  3. Defining your scoring strategy from metric functions
  4. Implementing your own scoring object
  5. Classification metrics
  6. From binary to multiclass and multilabel
  7. Accuracy score
  8. Cohen's kappa
  9. Confusion matrix
  10. Classification report
  11. Hamming loss
  12. Jaccard similarity coefficient score
  13. Precision, recall and F-measures
  14. Binary classification
  15. Multiclass and multilabel classification
  16. Hinge loss
  17. Log loss
  18. Matthews correlation coefficient
  19. Receiver operating characteristic (ROC)
  20. Zero one loss
  21. Brier score loss
  22. Multilabel ranking metrics
  23. Coverage error
  24. Label ranking average precision
  25. Ranking loss
  26. Regression metrics
  27. Explained variance score
  28. Mean absolute error
  29. Mean squared error
  30. Median absolute error
  31. R² score, the coefficient of determination
  32. Clustering metrics
  33. Dummy estimators

3.4. Model persistence (translated)

  1. Persistence example
  2. Security and maintainability limitations

3.5. Validation curves: plotting scores to evaluate models (translated)

  1. Validation curve
  2. Learning curve

4. Dataset transformations

4.1. Pipeline and FeatureUnion: combining estimators (translated)

  1. Pipeline: chaining estimators
  2. Usage
  3. Notes
  4. FeatureUnion: composite feature spaces
  5. Usage
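
A minimal sketch of estimator chaining from the list above (added for illustration), assuming scikit-learn 0.18's API; the two steps are arbitrary examples.

```python
# Minimal sketch: chaining a scaler and a classifier into one estimator.
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

iris = datasets.load_iris()
pipe = Pipeline([('scale', StandardScaler()), ('svc', SVC())])
pipe.fit(iris.data, iris.target)
print(pipe.score(iris.data, iris.target))
```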

4.2. Feature extraction (translated)

  1. Loading features from dicts
  2. Feature hashing
  3. Implementation details
  4. Text feature extraction
  5. The bag-of-words representation
  6. Sparsity
  7. Common vectorizer usage
  8. Tf-idf term weighting
  9. Decoding text files
  10. Applications and examples
  11. Limitations of the bag-of-words representation
  12. Vectorizing a large text corpus with the hashing trick
  13. Performing out-of-core scaling with HashingVectorizer
  14. Customizing the vectorizer classes
  15. Image feature extraction
  16. Patch extraction
  17. Connectivity graph of an image
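
A minimal sketch of bag-of-words text vectorization with tf-idf weighting from the list above (not part of the original guide), assuming scikit-learn 0.18's API; the corpus is a toy example.

```python
# Minimal sketch: tf-idf weighted bag-of-words features from a toy corpus.
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = ['the cat sat', 'the dog sat', 'the cat ran']
vec = TfidfVectorizer()
X = vec.fit_transform(corpus)  # sparse document-term matrix
print(X.shape, vec.get_feature_names())
```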

4.3. Preprocessing data (translated)

  1. Standardization, or mean removal and variance scaling
  2. Scaling features to a range
  3. Scaling sparse data
  4. Scaling data with outliers
  5. Centering kernel matrices
  6. Normalization
  7. Binarization
  8. Feature binarization
  9. Encoding categorical features
  10. Imputation of missing values
  11. Generating polynomial features
  12. Custom transformers
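
A minimal sketch of the scaling transformers listed above (added for illustration), assuming scikit-learn 0.18's API; the matrix is a toy example.

```python
# Minimal sketch: standardization and min-max scaling of a toy matrix.
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler

X = np.array([[1., -1., 2.], [2., 0., 0.], [0., 1., -1.]])
print(StandardScaler().fit_transform(X))  # zero mean, unit variance
print(MinMaxScaler().fit_transform(X))    # features scaled to [0, 1]
```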

4.4. Unsupervised dimensionality reduction (translated)

  1. PCA: Principal component analysis
  2. Random projection
  3. Feature agglomeration

4.5. Random projection (translated)

  1. The Johnson-Lindenstrauss lemma
  2. Gaussian random projection
  3. Sparse random projection

4.6. Kernel approximation (untranslated)

  1. Nystroem method for kernel approximation
  2. Radial basis function kernel
  3. Additive Chi Squared Kernel
  4. Skewed chi squared kernel
  5. Mathematical details

4.7. Pairwise Metrics, Similarities and Kernels

  1. Cosine similarity
  2. Linear kernel
  3. Polynomial kernel
  4. Sigmoid kernel
  5. RBF kernel
  6. Laplacian kernel
  7. Chi-square kernel

4.8. Transforming the prediction target (y) (translated)

  1. Label binarization
  2. Label encoding

5. Dataset loading utilities (untranslated)

  1. General dataset API
  2. Toy datasets
  3. Sample images
  4. Sample generators
  5. Generators for classification and clustering
  6. Single label
  7. Multilabel
  8. Biclustering
  9. Regression generators
  10. Generators for manifold learning
  11. Generators for decomposition
  12. Datasets in svmlight / libsvm format
  13. Loading from external datasets
  14. The Olivetti faces dataset
  15. The 20 newsgroups text dataset
  16. Usage
  17. Converting text to vectors
  18. Filtering text for more realistic training
  19. Downloading datasets from the mldata.org repository
  20. The Labeled Faces in the Wild face recognition dataset
  21. Usage
  22. Examples
  23. Forest covertypes
  24. RCV1 dataset
  25. Boston house prices dataset
  26. Notes
  27. Breast Cancer Wisconsin (Diagnostic) Database
  28. Notes
  29. References
  30. Diabetes dataset
  31. Notes
  32. Optical Recognition of Handwritten Digits Data Set
  33. Notes
  34. References
  35. Iris Plants Database
  36. Notes
  37. References
  38. Linnerrud dataset
  39. Notes
  40. References
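
A minimal sketch of the general dataset API from the list above (added for illustration), assuming scikit-learn 0.18's API.

```python
# Minimal sketch: loading a toy dataset through the general dataset API.
from sklearn import datasets

iris = datasets.load_iris()
print(iris.data.shape, iris.target.shape)  # feature matrix and labels
print(iris.target_names)
```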

6. Strategies to scale computationally: bigger data (untranslated)

  1. Scaling with instances using out-of-core learning
  2. Streaming instances
  3. Extracting features
  4. Incremental learning
  5. Examples
  6. Notes
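
A minimal sketch of the incremental learning pattern from the list above (not part of the original guide), assuming scikit-learn 0.18's API; the mini-batches are toy examples.

```python
# Minimal sketch: out-of-core style incremental learning with partial_fit.
import numpy as np
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier()
classes = np.array([0, 1])  # all classes must be declared on the first call
for X_batch, y_batch in [(np.array([[0.], [1.]]), np.array([0, 1])),
                         (np.array([[0.2], [0.9]]), np.array([0, 1]))]:
    clf.partial_fit(X_batch, y_batch, classes=classes)
print(clf.predict(np.array([[0.8]])))
```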

7. Computational performance (untranslated)

  1. Prediction latency
  2. Bulk versus atomic mode
  3. Influence of the number of features
  4. Influence of the input data representation
  5. Influence of the model complexity
  6. Feature extraction latency
  7. Prediction throughput
  8. Tips and tricks
  9. Linear algebra libraries
  10. Model compression
  11. Model reshaping
  12. Links

The tutorial table of contents is here.

© 2010–2016, scikit-learn developers (BSD License).
