Introduction to Recommendation Algorithms

1. Knowledge Structure of Recommendation Algorithms

Recommendation algorithms come in many varieties and can be grouped into the following categories:

2. Collaborative Filtering (CF)

Collaborative filtering achieves good recommendation quality through statistics-based machine learning and is easy to implement in engineering terms, so the vast majority of recommendation algorithms in use are CF. CF can be implemented in the following ways:

Generally speaking, if the number of items is not large (for example, no more than 100,000) and is not growing significantly, use item-based CF. A small, slowly growing catalogue means that the relationships between items stay relatively stable over time compared with the relationships between users. The item-similarity table therefore rarely needs real-time updates, which saves the recommender system a great deal of computation and improves its efficiency. Conversely, when the number of items is large, user-based CF is recommended.
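The item-based approach described above can be sketched roughly as follows. This is a minimal illustration, not the article's implementation: the toy `ratings` data, the choice of cosine similarity, and the function names are all assumptions made for the example. The point it tries to show is that the item-item similarity table is computed once and then reused for every request.

```python
from math import sqrt

# Hypothetical toy data (user -> {item: rating}); not from the article.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 4, "B": 2, "C": 5, "D": 4},
    "carol": {"A": 1, "B": 5, "D": 2},
}

def item_vectors(ratings):
    """Invert user->item ratings into item->{user: rating}."""
    items = {}
    for user, rated in ratings.items():
        for item, score in rated.items():
            items.setdefault(item, {})[user] = score
    return items

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def item_similarities(ratings):
    """Precompute the item-item table; with a stable catalogue it can be cached."""
    items = item_vectors(ratings)
    return {a: {b: cosine(items[a], items[b]) for b in items if b != a}
            for a in items}

def recommend(user, ratings, sims, top_n=3):
    """Score each unseen item by a similarity-weighted average of the user's ratings."""
    seen = ratings[user]
    scores = {}
    for candidate in sims:
        if candidate in seen:
            continue
        weighted = sum(sims[candidate][i] * r for i, r in seen.items())
        total = sum(sims[candidate][i] for i in seen)
        if total > 0:
            scores[candidate] = weighted / total
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

sims = item_similarities(ratings)
print(recommend("alice", ratings, sims))  # ['D'], the only item alice has not rated
```

A user-based variant would compare rows of users instead of columns of items, so the cached table grows with the number of users rather than the number of items.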

As a classic recommendation algorithm, collaborative filtering is widely used in industry. It has many advantages: the model is general-purpose, it requires little domain expertise about the underlying data, it is simple to implement, and it gives good results. These are the reasons for its popularity.

Of course, collaborative filtering also has problems it cannot avoid, such as the notorious "cold start": when there is no data at all for a new user, no items can be recommended to that user. It also ignores contextual differences, such as the situation the user is in and the user's current mood. Finally, it cannot capture niche, individual preferences, which is where content-based recommendation excels.

3. Content-Based Filtering (CB)

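The idea behind CB is to recommend content similar to what the user has liked in the past. The key to CB is how content similarity is measured, which is the core of applying it in practice. The CB process generally consists of the following three steps: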
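To make the similarity idea concrete, here is a small sketch of one possible way to measure content similarity. It is an assumption-laden illustration and does not reproduce the article's three steps: the keyword lists in `items`, the use of raw term counts, cosine similarity, and all function names are hypothetical choices for the example.

```python
from collections import Counter
from math import sqrt

# Hypothetical item descriptions as keyword lists; not from the article.
items = {
    "doc1": ["python", "machine", "learning"],
    "doc2": ["python", "web", "framework"],
    "doc3": ["deep", "learning", "machine"],
    "doc4": ["cooking", "recipes"],
}

def cosine(a, b):
    """Cosine similarity between two keyword-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_profile(liked, items):
    """Represent the user as the summed keyword counts of the items they liked."""
    profile = Counter()
    for item_id in liked:
        profile.update(items[item_id])
    return profile

def recommend(liked, items, top_n=2):
    """Rank items the user has not liked yet by similarity to the user profile."""
    profile = build_profile(liked, items)
    scores = {i: cosine(profile, Counter(words))
              for i, words in items.items() if i not in liked}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [i for i in ranked if scores[i] > 0][:top_n]

print(recommend(["doc1"], items))  # ['doc3', 'doc2']
```

Here `doc3` shares two keywords with the liked item and `doc2` shares one, while `doc4` shares none and is dropped. A production system would more likely use weighted features such as TF-IDF rather than raw counts.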

Both CF and CB have their own limitations. At present, most recommender systems use an algorithm other than CB (such as CF) as the primary method and CB as a supplement, forming a hybrid recommender system.

4. Demographic-Based Recommendation (DB)

Demographic-based recommendation is probably the easiest approach to implement. It uses only basic user information, such as age and gender, to measure similarity between users, and then recommends to the current user the items preferred by other users who are similar to them.
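A rough sketch of this idea follows. Everything in it is assumed for illustration: the `users` and `liked` data, the similarity formula (an equal-weight average of age closeness and gender match), the `age_scale` parameter, and the simple vote counting.

```python
# Hypothetical user profiles and preferences; not from the article.
users = {
    "u1":  {"age": 25, "gender": "F"},
    "u2":  {"age": 27, "gender": "F"},
    "u3":  {"age": 52, "gender": "M"},
    "new": {"age": 26, "gender": "F"},
}
liked = {
    "u1": {"sneakers", "headphones"},
    "u2": {"headphones", "backpack"},
    "u3": {"golf clubs"},
    "new": set(),
}

def demographic_similarity(a, b, age_scale=50.0):
    """Average of an age-closeness term and a gender-match term, each in [0, 1].

    The equal weighting and age_scale are assumptions made for this sketch.
    """
    age_term = max(0.0, 1.0 - abs(a["age"] - b["age"]) / age_scale)
    gender_term = 1.0 if a["gender"] == b["gender"] else 0.0
    return (age_term + gender_term) / 2.0

def recommend(target, users, liked, k=2):
    """Recommend items liked by the k users most demographically similar to target."""
    neighbours = sorted(
        (u for u in users if u != target),
        key=lambda u: demographic_similarity(users[target], users[u]),
        reverse=True,
    )[:k]
    votes = {}
    for u in neighbours:
        for item in liked[u] - liked[target]:
            votes[item] = votes.get(item, 0) + 1
    # Most-voted first; ties broken alphabetically for a stable order.
    return sorted(votes, key=lambda item: (-votes[item], item))

print(recommend("new", users, liked))  # ['headphones', 'backpack', 'sneakers']
```

Note that the target user `new` has no preference history at all; the recommendation relies only on profile attributes.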

5. Hybrid Recommendation (HR)

The CF, CB, and DB algorithms described above each have their own strengths and weaknesses, so combining several of them to obtain a better recommender is a natural idea. In theory, a hybrid of multiple algorithms should perform no worse than any single one of them. However, the complexity of HR increases accordingly, so in practice HR is not as widely used as CF.
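One common way to combine recommenders is a weighted sum of their scores. The sketch below assumes that form of hybrid; the article does not say which combination method it has in mind. The per-algorithm scores, the weights, and the min-max normalization are all hypothetical choices for the example.

```python
def normalize(scores):
    """Min-max scale one recommender's scores to [0, 1] so they are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {item: 1.0 for item in scores}
    return {item: (s - lo) / (hi - lo) for item, s in scores.items()}

def hybrid_scores(score_sets, weights):
    """Weighted sum of normalized scores; items a recommender omits contribute 0."""
    combined = {}
    for name, scores in score_sets.items():
        w = weights.get(name, 0.0)
        for item, s in normalize(scores).items():
            combined[item] = combined.get(item, 0.0) + w * s
    return combined

# Hypothetical outputs of three recommenders for the same user.
cf_scores = {"A": 4.5, "B": 3.6, "C": 1.5}
cb_scores = {"B": 0.9, "C": 0.2, "D": 0.6}
db_scores = {"A": 2, "D": 5}
weights = {"cf": 0.6, "cb": 0.3, "db": 0.1}  # assumed, would normally be tuned

combined = hybrid_scores(
    {"cf": cf_scores, "cb": cb_scores, "db": db_scores}, weights
)
ranking = sorted(combined, key=combined.get, reverse=True)
print(ranking)  # ['B', 'A', 'D', 'C']
```

Setting one weight to 1 and the others to 0 reproduces that single recommender's ranking, which is one way to read the "no worse in theory" claim. The cost shows up as extra components to run and extra weights to tune.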

Original: https://blog.csdn.net/u010451780/article/details/109495597
Author: jack_201316888
Title: 推荐算法介绍

