# Summary of Recommendation Algorithms

## 1. What Is a Recommendation Algorithm

A recommendation algorithm takes a user's past behavior and applies mathematical methods to infer what that user is likely to enjoy.

The concept of personalized recommendation first appeared in March 1995 at a meeting of the American Association for Artificial Intelligence, where Robert Armstrong of Carnegie Mellon University proposed the personalized navigation system WebWatcher. At the same time, Marko Balabanovic of Stanford University introduced a personalized recommendation system called LIRA. Since then, research on personalized recommendation has flourished.

## 2. Basic Conditions of Recommendation Algorithms

Recommendation algorithms come in many forms today, but all of them rest on a few basic conditions:

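1. Recommend based on people who share your preferences.
2. Recommend items similar to the ones you already like.
3. Recommend based on keywords you supply, which in effect degenerates into a search algorithm.
4. Recommend based on a combination of the conditions above.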

## 3. Categories of Recommendation Algorithms

Recommendation algorithms are often divided into three broad categories: content-based, collaborative filtering, and knowledge-based. The approaches covered below are the ones most often seen in practice: content-based, collaborative filtering, association-rule-based, model-based, and hybrid.

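1. Content-based recommendation. The principle is that users like items whose content resembles items they have already paid attention to. For example, if you watched Harry Potter I, a content-based algorithm finds that Harry Potter II-VI are strongly related in content to what you watched (they share many keywords) and recommends them to you. This approach avoids the item cold-start problem (cold start: an item that has never received any attention is rarely recommended by other algorithms, whereas a content-based algorithm can analyze the relationships between items and still recommend it). One drawback is that the recommended items may be redundant. News is the typical case: if you read a story about MH370, the recommended stories are likely to have the same content as the one you already read. Another drawback is that multimedia items (music, movies, images, and so on) are hard to recommend because content features are difficult to extract from them; one workaround is to tag these items manually.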
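
As a rough illustration of the content-based idea described above, here is a minimal Python sketch. It assumes every item is described by a hand-assigned keyword list; the `catalog` data and the function names `cosine` and `recommend_by_content` are invented for this example and do not come from any particular library. Unseen items are ranked by the cosine similarity between their keywords and the keywords of the items the user liked.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two keyword-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend_by_content(liked: list[str], catalog: dict[str, list[str]], top_n: int = 2) -> list[str]:
    """Rank items the user has not seen by keyword overlap with liked items."""
    # The user profile is simply the sum of the keywords of all liked items.
    profile = Counter()
    for item in liked:
        profile.update(catalog[item])
    scores = {
        item: cosine(profile, Counter(keywords))
        for item, keywords in catalog.items()
        if item not in liked
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical catalog with manually assigned keywords (tags).
catalog = {
    "Harry Potter I": ["wizard", "hogwarts", "magic", "school"],
    "Harry Potter II": ["wizard", "hogwarts", "magic", "chamber"],
    "MH370 news report": ["aviation", "flight", "search"],
    "Cooking basics": ["kitchen", "recipe"],
}

print(recommend_by_content(["Harry Potter I"], catalog, top_n=1))
# → ['Harry Potter II']
```

Because the score depends only on item content, a brand-new item with no ratings can still be recommended, which is the cold-start advantage mentioned above.
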
2. Collaborative filtering recommendation

Collaborative filtering is one of the most widely used methods in recommendation systems. It rests on the assumption that "birds of a feather flock together": users who like the same items are likely to share the same interests. Collaborative filtering is generally used in systems where users rate items, with the rating describing the user's preference for an item. It is regarded as a classic application of collective intelligence: it requires no special processing of the items themselves and instead relates items to one another through users. Collaborative filtering recommenders currently fall into two types: user-based and item-based.

a. User-based recommendation

User-based collaborative filtering uses every user's preferences (ratings) for items or information to find a group of "neighbor" users whose tastes resemble the current user's. In typical applications a K-nearest-neighbors algorithm finds these neighbors, and recommendations for the current user are then based on the historical preferences of the K neighbors. The advantage is that recommended items may be entirely unrelated in content to what the user has seen, so the system can surface latent interests and produce personalized results for each user. The disadvantage is that in a typical Web system the number of users grows far faster than the number of items, so the amount of computation grows rapidly and system performance easily becomes a bottleneck. For this reason, purely user-based collaborative filtering systems are rare in industry.
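
The K-nearest-neighbor procedure described above can be sketched in a few lines of Python. This is only a toy version under stated assumptions: ratings live in a nested dictionary, user similarity is cosine similarity, and an unseen item's score is the similarity-weighted average of the neighbors' ratings. The data and the names `user_similarity` and `recommend_user_based` are made up for illustration.

```python
from math import sqrt


def user_similarity(r_u: dict[str, float], r_v: dict[str, float]) -> float:
    """Cosine similarity between two users' rating vectors."""
    common = r_u.keys() & r_v.keys()
    if not common:
        return 0.0
    dot = sum(r_u[i] * r_v[i] for i in common)
    norm_u = sqrt(sum(x * x for x in r_u.values()))
    norm_v = sqrt(sum(x * x for x in r_v.values()))
    return dot / (norm_u * norm_v)


def recommend_user_based(ratings, target, k=2, top_n=3):
    """Recommend unseen items from the k users most similar to `target`."""
    sims = {
        user: user_similarity(ratings[target], rated)
        for user, rated in ratings.items()
        if user != target
    }
    neighbors = sorted(sims, key=sims.get, reverse=True)[:k]

    weighted_sum, weight_total = {}, {}
    for neighbor in neighbors:
        if sims[neighbor] <= 0:
            continue  # ignore neighbors with no overlap
        for item, score in ratings[neighbor].items():
            if item in ratings[target]:
                continue  # only score items the target has not rated
            weighted_sum[item] = weighted_sum.get(item, 0.0) + sims[neighbor] * score
            weight_total[item] = weight_total.get(item, 0.0) + sims[neighbor]

    predicted = {item: weighted_sum[item] / weight_total[item] for item in weighted_sum}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "alice": {"A": 5, "B": 4},
    "bob": {"A": 5, "B": 5, "C": 4},
    "carol": {"A": 1, "D": 5},
    "dave": {"B": 4, "C": 5, "E": 2},
}

print(recommend_user_based(ratings, "alice", k=2))
# → ['C', 'E']
```

Note that the similarity of the target user to every other user is recomputed per request, which is the scaling cost the paragraph above warns about when the user base grows quickly.
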

b. Item-based recommendation

Item-based collaborative filtering resembles the user-based variant: it uses all users' preferences (ratings) for items or information to compute the similarity between items, then recommends items similar to those in the user's preference history. Item-based collaborative filtering can be seen as a degenerate form of association-rule recommendation, but because it takes users' actual ratings into account and only computes similarities instead of mining frequent itemsets, it can be considered to offer higher accuracy and higher coverage. Compared with user-based recommendation, item-based recommendation is more widely applied, scales better, and performs better: because the number of items usually grows slowly, the cost of the computation changes little over time. The disadvantage is that its results tend to be less personalized.
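
A comparable sketch for the item-based variant follows. It assumes the same nested-dictionary rating format, computes cosine similarity between item rating columns once (this step can be precomputed offline), and scores each unseen item by summing its similarity to the items the user already rated, weighted by those ratings. The function names and data are illustrative assumptions, not an established API.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(ratings):
    """Cosine similarity between every pair of items that share a rater."""
    # Invert user -> {item: score} into item -> {user: score}.
    by_item = defaultdict(dict)
    for user, rated in ratings.items():
        for item, score in rated.items():
            by_item[item][user] = score

    sims = defaultdict(dict)
    for i, users_i in by_item.items():
        norm_i = sqrt(sum(x * x for x in users_i.values()))
        for j, users_j in by_item.items():
            common = users_i.keys() & users_j.keys()
            if i == j or not common:
                continue
            norm_j = sqrt(sum(x * x for x in users_j.values()))
            dot = sum(users_i[u] * users_j[u] for u in common)
            sims[i][j] = dot / (norm_i * norm_j)
    return sims


def recommend_item_based(ratings, sims, target, top_n=3):
    """Score unseen items by similarity to the items `target` already rated."""
    seen = ratings[target]
    scores = defaultdict(float)
    for item, score in seen.items():
        for other, sim in sims.get(item, {}).items():
            if other not in seen:
                scores[other] += sim * score
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "alice": {"A": 5, "B": 4},
    "bob": {"A": 5, "B": 5, "C": 4},
    "carol": {"A": 1, "D": 5},
    "dave": {"B": 4, "C": 5, "E": 2},
}

sims = item_similarities(ratings)  # can be computed offline and cached
print(recommend_item_based(ratings, sims, "alice"))
# → ['C', 'E', 'D']
```

Separating `item_similarities` from `recommend_item_based` reflects the point made above: when the item set is stable, the similarity table changes slowly and the per-request work stays small.
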

How should one choose between the user-based and item-based strategies? The item-based mechanism is in fact a strategy Amazon developed as an improvement on the user-based mechanism: on most Web sites the number of items is far smaller than the number of users, and both the item set and the similarities between items are relatively stable. The item-based mechanism also handles real-time recommendation better than the user-based one. This does not hold in every scenario, however. In some news recommendation systems the number of items (news stories) may exceed the number of users, and stories are updated so quickly that item similarities remain unstable. The choice of strategy therefore depends heavily on the specific application scenario.

Collaborative filtering is the most widely used recommendation mechanism today, and it has the following significant advantages:

a. It does not require strict modeling of items or users, and it does not require item descriptions to be machine-understandable, so the approach is domain-independent.
b. Its recommendations are open-ended: they draw on the experience of other users and help users discover latent interests and preferences.

It also has the following shortcomings:

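a. The method is built on historical data, so it suffers from a "cold start" problem for both new items and new users.
b. The quality of recommendations depends on the amount and accuracy of users' historical preference data.
c. In most implementations, users' historical preferences are stored in a sparse matrix, and computation on sparse matrices has some obvious problems; for example, the erroneous preferences of a small number of users can strongly affect recommendation accuracy.
d. It cannot give good recommendations to users with unusual tastes.
e. Because it is based on historical data, once users' preferences have been captured and modeled it is hard to adapt to how those preferences evolve, which makes the method inflexible.

3. Association-rule-based recommendation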

Recommendation based on association rules is common in e-commerce systems and has proven effective. Its practical meaning is that users who have bought certain items are more likely to buy certain other items. The primary goal of such a system is to mine association rules, that is, sets of items that many users purchase together; items within such a set can then be recommended for one another. Current algorithms for mining association rules are mostly derived from Apriori and FP-Growth. Recommendation systems based on association rules generally achieve a higher conversion rate, because a user who has bought several items in a frequent itemset is more likely to buy the remaining items in it.
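
To make the idea of mining rules from co-purchases concrete, here is a deliberately simplified Python sketch. It is not a full Apriori or FP-Growth implementation: it only counts itemsets of size two, keeps the pairs whose support reaches a threshold, and derives one-to-one rules with their confidence. The basket data, the thresholds, and the names `pair_rules` and `recommend_from_rules` are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.5, min_confidence=0.6):
    """Mine rules lhs -> rhs from frequent item pairs (size-2 itemsets only)."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # sort so each pair has one canonical form
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # the pair is not frequent enough
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


def recommend_from_rules(rules, cart):
    """Suggest items implied by the cart, ordered by rule confidence."""
    best = {}
    for lhs, rhs, _support, confidence in rules:
        if lhs in cart and rhs not in cart:
            best[rhs] = max(best.get(rhs, 0.0), confidence)
    return sorted(best, key=best.get, reverse=True)


# Hypothetical purchase baskets.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
]

rules = pair_rules(baskets)
print(recommend_from_rules(rules, {"butter"}))
# → ['bread']
```

Rule mining runs over the whole purchase history and is typically done offline; only the cheap lookup in `recommend_from_rules` needs to happen at request time.
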

This mechanism has the following disadvantages:

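a. The computation is expensive, but it can be done offline, so the impact is small.
b. Because it relies on user data, cold-start and sparsity problems are unavoidable.
c. Popular items tend to be over-recommended.

4. Model-based recommendation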

There are many model-based methods. They use common machine learning algorithms to build a recommendation model for the target user, then predict the user's preferences and rank the candidate results. Commonly used models include the Aspect Model, pLSA, LDA, clustering, SVD, Matrix Factorization, LR, and GBDT. Training takes a relatively long time, but once it is complete, generating recommendations is fast and accurate, so this approach suits real-time services such as news and advertising. To achieve better results, however, attributes must be combined and filtered manually over many iterations, which is what is usually called feature engineering. Because news is time-sensitive, the system must also retrain and update the online model repeatedly to keep up with changes.
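
As one example of the model-based family, the sketch below trains a small matrix factorization model with stochastic gradient descent in plain Python. It is a teaching sketch under several assumptions: there are no bias terms, the hyperparameters (`n_factors`, `lr`, `reg`, `epochs`) are arbitrary, and the rating triples are invented. Production systems would normally rely on an optimized library instead.

```python
import random
from math import sqrt


def predict(P, Q, user, item):
    """Predicted rating = dot product of user and item factor vectors."""
    return sum(p * q for p, q in zip(P[user], Q[item]))


def rmse(P, Q, ratings):
    """Root-mean-square error over the observed (user, item, rating) triples."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return sqrt(total / len(ratings))


def train_mf(ratings, n_factors=2, lr=0.02, reg=0.02, epochs=1000, seed=0):
    """Learn user factors P and item factors Q by SGD on squared error."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    P = {u: [rng.uniform(0.0, 0.5) for _ in range(n_factors)] for u in users}
    Q = {i: [rng.uniform(0.0, 0.5) for _ in range(n_factors)] for i in items}

    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step with L2 regularization on both factors.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


# Hypothetical observed ratings as (user, item, rating) triples.
ratings = [
    ("alice", "A", 5), ("alice", "B", 4),
    ("bob", "A", 5), ("bob", "B", 5), ("bob", "C", 4),
    ("carol", "A", 1), ("carol", "D", 5),
    ("dave", "B", 4), ("dave", "C", 5), ("dave", "D", 2),
]

P, Q = train_mf(ratings)
print("training RMSE:", round(rmse(P, Q, ratings), 3))
print("predicted rating of alice for C:", round(predict(P, Q, "alice", "C"), 2))
```

The split between the slow `train_mf` step and the cheap `predict` call mirrors the trade-off described above: training is expensive, but scoring a user-item pair afterwards is a single dot product.
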

5. Hybrid recommendation

Real applications rarely rely on a single recommendation algorithm. The recommendation systems of large, mature websites are therefore "hybrid algorithms" that weigh the strengths and weaknesses of the individual algorithms and combine them to fit the scenario. Hybrid strategies are varied: for example, weighting the outputs of different algorithms, or using different algorithms for different scenarios and stages. How exactly to mix them has to be decided by analyzing the actual application scenario.
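
The weighting strategy mentioned above can be sketched as follows. This example assumes each underlying recommender returns a dictionary of item scores on its own scale, rescales each dictionary to the range [0, 1] with min-max normalization, and then adds the scores using fixed weights. The scores, the weights, and the name `weighted_hybrid` are hypothetical.

```python
def weighted_hybrid(score_sets, weights):
    """Blend several recommenders' scores into one ranking."""
    combined = {}
    for name, scores in score_sets.items():
        if not scores:
            continue
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        for item, score in scores.items():
            # Min-max normalize so recommenders with different scales are comparable.
            normalized = (score - lo) / span if span else 1.0
            combined[item] = combined.get(item, 0.0) + weights[name] * normalized
    return sorted(combined, key=combined.get, reverse=True)


# Hypothetical outputs of a content-based and a collaborative filtering recommender.
score_sets = {
    "content": {"A": 0.9, "B": 0.2, "C": 0.5},
    "cf": {"B": 4.5, "C": 4.0, "D": 1.0},
}
weights = {"content": 0.4, "cf": 0.6}

print(weighted_hybrid(score_sets, weights))
# → ['C', 'B', 'A', 'D']
```

Item C wins even though neither recommender ranks it first, because it scores reasonably well in both; in practice the weights would be tuned per scenario rather than fixed by hand.
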

Original: https://blog.csdn.net/qq_40394960/article/details/105868978
Author: Q&Cui
Title: 推荐算法总结
