[PYTHON] Ensemble learning and basket analysis

Chapters 7 and 8 of Practical Machine Learning System were close to what I wanted to do, so I summarized them quickly. (No formulas or code, just an overview.) I did not write up association rule mining because well-organized material on it already exists (see the references at the bottom).

Ensemble learning

Ensemble learning combines multiple individually trained learners into a single learner in order to improve generalization ability (predictive ability on data not seen during training).

One way to look at it: the output of each learner can be treated as a new feature, and the way to combine those features is itself learned from the training data.

As the saying goes, **three people together have the wisdom of Manjushri** (roughly, "two heads are better than one"). Discrimination ability tends to improve as the number of learners increases. Besides high discrimination ability, ensemble learning has the advantages of simplicity (just prepare multiple learners) and versatility (it can be applied to any kind of learner).

To combine the learners, each one must be given a weight (it is rare for all learners to be weighted equally). The final predicted rating is obtained by multiplying the rating predicted by each learner by that learner's weight and adding up all the products (a weighted average). The optimal **weights** are learned from the data.
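
As a rough illustration of the weighted combination described above, here is a minimal sketch in plain Python. The ratings, the three learners, and the helper names (`weighted_prediction`, `fit_weights`, `mse`) are all made up for this example, and gradient descent on squared error is just one assumed way of "learning the weights from the data"; the book may do it differently. The weights are also not constrained to sum to one here.

```python
def weighted_prediction(predictions, weights):
    """Combine per-learner predictions into one value by a weighted sum."""
    return sum(w * p for w, p in zip(weights, predictions))


def fit_weights(pred_rows, targets, lr=0.01, epochs=2000):
    """Learn one weight per learner by gradient descent on mean squared error."""
    n_learners = len(pred_rows[0])
    weights = [1.0 / n_learners] * n_learners  # start from a uniform average
    for _ in range(epochs):
        grads = [0.0] * n_learners
        for preds, y in zip(pred_rows, targets):
            err = weighted_prediction(preds, weights) - y
            for j, p in enumerate(preds):
                grads[j] += 2 * err * p / len(pred_rows)
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights


def mse(pred_rows, targets, weights):
    """Mean squared error of the weighted combination."""
    return sum((weighted_prediction(p, weights) - y) ** 2
               for p, y in zip(pred_rows, targets)) / len(targets)


# Made-up ratings predicted by three hypothetical learners (one row per item).
pred_rows = [
    [4.0, 3.0, 5.0],
    [2.0, 2.5, 1.0],
    [5.0, 4.0, 4.5],
    [3.0, 3.5, 2.0],
]
targets = [4.2, 1.8, 4.8, 2.9]  # made-up true ratings

uniform = [1.0 / 3] * 3
learned = fit_weights(pred_rows, targets)
print("uniform weights error:", mse(pred_rows, targets, uniform))
print("learned weights error:", mse(pred_rows, targets, learned))
```

On this toy data the learned weights should fit the training ratings better than the uniform average, which is the point of not fixing all weights to be equal.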

Figure (like a multi-learner version of a neural network)

Ensemble methods are also available in Python's scikit-learn: see "1.9. Ensemble methods" in the scikit-learn 0.15.2 documentation.

Typical ensemble learning algorithm

Bagging, Bootstrap Aggregating

How it works
Feature
Feature
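
To make the name concrete, here is a minimal sketch of bagging (bootstrap aggregating) in plain Python: each learner is trained on a bootstrap sample (drawn with replacement from the training data), and the predictions are aggregated by majority vote. The base learner (a one-dimensional threshold "stump"), the data, and all function names are assumptions chosen for brevity, not something taken from the book.

```python
import random
from collections import Counter


def fit_stump(samples):
    """Fit a 1-D threshold classifier that predicts 1 when x >= threshold."""
    best_threshold, best_errors = None, None
    for threshold, _ in samples:
        errors = sum((x >= threshold) != bool(label) for x, label in samples)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def bagging_fit(samples, n_learners=25, seed=0):
    """Train each stump on its own bootstrap sample of the training data."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_learners):
        bootstrap = [rng.choice(samples) for _ in samples]  # with replacement
        stumps.append(fit_stump(bootstrap))
    return stumps


def bagging_predict(stumps, x):
    """Aggregate the individual stump predictions by majority vote."""
    votes = Counter(int(x >= threshold) for threshold in stumps)
    return votes.most_common(1)[0][0]


# Made-up 1-D data: (feature value, class label), with two noisy points.
samples = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0), (2.2, 1),
           (3.0, 1), (3.5, 1), (4.0, 1), (1.8, 1), (2.6, 0)]
stumps = bagging_fit(samples)
print([bagging_predict(stumps, x) for x in (0.0, 2.4, 5.0)])
```

Because every stump sees a slightly different resample, the individual thresholds differ, and the vote smooths out the effect of the noisy points.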

[Random Forest](http://ibisforest.org/index.php?%E3%83%A9%E3%83%B3%E3%83%80%E3%83%A0%E3%83%95%E3%82%A9%E3%83%AC%E3%82%B9%E3%83%88)

How it works
Feature

[Boosting](http://ibisforest.org/index.php?%E3%83%96%E3%83%BC%E3%82%B9%E3%83%86%E3%82%A3%E3%83%B3%E3%82%B0)

How it works
Method
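
As one concrete instance of boosting, here is a small sketch of AdaBoost with threshold stumps in plain Python. AdaBoost is my choice of example (it appears in the references); the data, the stump learner, and the function names are assumptions for illustration. Each round fits a stump to the weighted data, then increases the weights of the samples it got wrong so that the next stump focuses on them.

```python
import math


def stump_predict(x, threshold, sign):
    """Return `sign` when x >= threshold, otherwise the opposite label."""
    return sign if x >= threshold else -sign


def fit_weighted_stump(xs, ys, sample_weights):
    """Pick the threshold and sign with the smallest weighted error."""
    best = None
    for threshold in xs:
        for sign in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, sample_weights)
                      if stump_predict(x, threshold, sign) != y)
            if best is None or err < best[0]:
                best = (err, threshold, sign)
    return best


def adaboost_fit(xs, ys, n_rounds=3):
    """Train `n_rounds` stumps, reweighting the samples after each round."""
    n = len(xs)
    sample_weights = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        err, threshold, sign = fit_weighted_stump(xs, ys, sample_weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)  # vote strength of this stump
        ensemble.append((alpha, threshold, sign))
        # Misclassified samples get heavier, correctly classified ones lighter.
        sample_weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, sign))
                          for x, y, w in zip(xs, ys, sample_weights)]
        total = sum(sample_weights)
        sample_weights = [w / total for w in sample_weights]
    return ensemble


def adaboost_predict(ensemble, x):
    """Classify by the sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, threshold, sign)
                for alpha, threshold, sign in ensemble)
    return 1 if score >= 0 else -1


# Made-up data that no single threshold can separate (labels are +1 / -1).
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
ensemble = adaboost_fit(xs, ys)
print([adaboost_predict(ensemble, x) for x in xs])
```

No single stump can classify this toy data correctly, but the weighted vote of three stumps can, which illustrates how weak learners are combined into a stronger one.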

Basket analysis

Basket analysis is another analytical method for building recommender systems. The only data it handles is which items were purchased together; it does not require information such as whether the user liked an item. (It is similar in spirit to item-based collaborative filtering.)

Basket analysis is not limited to "shopping carts". It can be applied to anything that can be grouped into sets, whenever you want to recommend items from those sets. For example, it can recommend web pages to a user based on the browser's browsing history.
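
The idea of using only "what appeared together" can be sketched in a few lines of plain Python. The baskets, item names, and function names below are made up for illustration, and simple co-occurrence counting is only a rough stand-in for full association rule mining, which this post does not cover.

```python
from collections import Counter
from itertools import combinations


def count_pairs(baskets):
    """Count how often each unordered pair of items shares a basket."""
    pair_counts = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def recommend(item, baskets, top_n=2):
    """Rank the other items by how many baskets they share with `item`."""
    scores = Counter()
    for (a, b), count in count_pairs(baskets).items():
        if a == item:
            scores[b] += count
        elif b == item:
            scores[a] += count
    return [other for other, _ in scores.most_common(top_n)]


# Made-up baskets: only co-purchase information is recorded, no ratings.
baskets = [
    ["beer", "diapers", "chips"],
    ["beer", "diapers"],
    ["milk", "diapers", "beer"],
    ["milk", "bread"],
    ["bread", "chips"],
]
print(recommend("beer", baskets))
```

The same functions would work unchanged if each "basket" were the set of pages visited in one browsing session instead of the items in one purchase.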

The famous "beer and diapers" story in basket analysis

References

Practical machine learning system
Ensemble learning
A story about studying ensemble learning (miscellaneous notes?)
Implementation of AdaBoost by Splus
[Data science by R] Group learning
Random forest
[What is out of bag error in Random Forests?](http://stackoverflow.com/questions/18541923/what-is-out-of-bag-error-in-random-forests)
2nd: How to find out what products sell well with a certain product - The idea of market basket analysis
6. Product analysis method (ABC analysis, association analysis)
2nd: Association Analysis ~ "Statistics You Want to Use" Series ~
Association analysis
