[PYTHON] [Recommendation] Content-based filtering and collaborative filtering

About the recommendation system

A recommendation system selects items, information, or products that are likely to be useful to a user and presents them in a form that suits the user's purpose. In recent years, such systems have been implemented in many web services, such as Amazon, and have become familiar to many people.

Amazon and similar services appear to build their recommendation systems by combining multiple complex algorithms. This article summarizes **content-based filtering** and **collaborative filtering**, which form the basis of recommendation.

Content-based filtering

This method searches for and recommends similar products based on the tag information of products a user has purchased. From those tags, the system accumulates which fields the user is interested in, searches for products with similar tags, and proposes them.

slide1.PNG

While this method allows you to make a variety of recommendations based on tag information, it has the disadvantage that **you must tag every product you want to recommend**. For example, "Introduction to Python 3" in the image above is recommended after another Python-related book is sold because the tag "#python" was added when the product was registered. Without the tag "#python", it would not be recommended. You also have to consider whether the tag "#python" is appropriate in the first place. It is necessary to **design the tags by conducting detailed marketing research that takes users' tastes into account**. As a result, content-based filtering tends to be time-consuming and costly.
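The tag-matching idea described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any real service works: the catalog contents, the function names `build_profile` and `recommend_by_tags`, and the scoring rule (summing how often each shared tag appears in the user's purchases) are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical catalog: product name -> set of tags (illustrative data only).
CATALOG = {
    "Introduction to Python 3": {"python", "programming"},
    "Python Data Analysis": {"python", "data"},
    "Cooking Basics": {"cooking"},
    "Untagged Python Book": set(),  # no tags, so it can never match
}


def build_profile(purchased, catalog):
    """Count how often each tag appears among the user's purchases."""
    profile = Counter()
    for name in purchased:
        profile.update(catalog[name])
    return profile


def recommend_by_tags(purchased, catalog, top_n=3):
    """Rank unpurchased products by the summed weight of shared tags."""
    profile = build_profile(purchased, catalog)
    scores = {}
    for name, tags in catalog.items():
        if name in purchased:
            continue
        score = sum(profile[tag] for tag in tags)
        if score > 0:  # products sharing no tag are not recommended
            scores[name] = score
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda n: (-scores[n], n))[:top_n]


recommendations = recommend_by_tags({"Python Data Analysis"}, CATALOG)
print(recommendations)
```

In this sketch, "Untagged Python Book" is never recommended even though it is about Python, which mirrors the disadvantage described above: a product without suitable tags is invisible to content-based filtering.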

Collaborative filtering

Based on users' behavior history and similar data, this method **finds users with similar purchase patterns and recommends products** that those users bought. A distinguishing feature is that item properties, tag information, and the like are not considered at all.

slide2.PNG

In the image above, the products are not tagged at all. Instead, the DB records which products each user has bought in the past. The system then selects users with high similarity based on each user's purchase history (in the example, only one similar user is selected). As a result, products purchased in the past by Mr. D, whose purchase preferences are similar to Mr. A's, are recommended to Mr. A. Because the user's purchase history is the axis, tag information for each product is not required.

However, collaborative filtering has the major disadvantage that products that are not purchased by anyone are not recommended.
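As a rough sketch of this user-based approach, the example below picks the single most similar user and recommends what that user bought. The purchase data, the function names, and the choice of Jaccard similarity (shared purchases divided by all purchases of the two users) are assumptions for illustration; the article does not specify a similarity measure.

```python
# Hypothetical purchase history: user -> set of purchased products.
PURCHASES = {
    "A": {"book1", "book2", "book3"},
    "B": {"book4"},
    "C": {"book1", "book5"},
    "D": {"book1", "book2", "book3", "book6"},
}


def jaccard(a, b):
    """Similarity of two purchase sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar_user(target, purchases):
    """Return the other user whose purchase set is closest to the target's."""
    others = [user for user in purchases if user != target]
    return max(others, key=lambda user: jaccard(purchases[target], purchases[user]))


def recommend_for(target, purchases):
    """Recommend products the most similar user bought but the target has not."""
    neighbor = most_similar_user(target, purchases)
    return sorted(purchases[neighbor] - purchases[target])


print(most_similar_user("A", PURCHASES))
print(recommend_for("A", PURCHASES))
```

No tags appear anywhere in this sketch. A product that is absent from every purchase set can never be returned by `recommend_for`, which corresponds to the disadvantage noted above.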

Hybrid filtering

Content-based filtering and collaborative filtering each have major disadvantages in practical use. Therefore, when building a system, it is common to combine the strengths of both approaches. Such a combination is called hybrid filtering.
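One simple way to combine the two approaches is to blend their scores with a weight. The sketch below assumes that each method has already produced a score per product, normalized to the range 0 to 1. The function name `hybrid_scores`, the weight `alpha`, and the sample scores are hypothetical, and weighted blending is only one of several possible hybrid strategies, not necessarily the one used by any particular service.

```python
def hybrid_scores(content_scores, collab_scores, alpha=0.5):
    """Blend two score dictionaries; a product missing from one side scores 0 there."""
    items = set(content_scores) | set(collab_scores)
    return {
        item: alpha * content_scores.get(item, 0.0)
        + (1 - alpha) * collab_scores.get(item, 0.0)
        for item in items
    }


# Hypothetical normalized scores in [0, 1] from each method.
content_scores = {"new_python_book": 0.8, "book6": 0.5}
collab_scores = {"book6": 1.0, "untagged_book": 0.5}

blended = hybrid_scores(content_scores, collab_scores, alpha=0.5)
# Highest blended score first; ties are broken alphabetically.
ranking = sorted(blended, key=lambda item: (-blended[item], item))
print(ranking)
```

Here a product that nobody has bought yet ("new_python_book") can still be ranked through its tags, and a product without tags ("untagged_book") can still be ranked through purchase history, so each method covers the other's blind spot.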
