Machine Learning Interview Questions Summary (Notes)

In an interview, the interviewer will ask follow-up questions based on your answers and your resume, so when writing your resume you must ask yourself whether you truly understand the material you list.

An interview is also a test of communication skills, so stay flexible during the conversation.

On some questions, if you don't know the answer, simply say so; but on key questions about an algorithm you don't know, it is better to briefly mention related algorithms you do know and answer flexibly.

When recruiting for machine learning / artificial intelligence positions, the main qualities assessed are:

① Algorithmic thinking

② Understanding of basic algorithm principles

③ Programming ability

④ Data structures (extended knowledge)

1. Please introduce a machine learning model or algorithm you are familiar with.

2. Please explain the principle of the ** algorithm or model. (usually one listed on your resume)

3. What is the difference between algorithm A and algorithm B? (usually algorithms from your resume, or ones that came up during the interview)

4. Have you actually used these algorithms and models? In which application scenarios?

5. When you applied the algorithm in those scenarios, what problems did you run into, and how did you eventually solve them?

6. Why did you not use algorithm X in that scenario?

7. How well do you think the algorithm performed in that scenario?

1. What is overfitting in machine learning?

Overfitting means that the model fits the training set well but predicts poorly on the test set.

2. How can overfitting be avoided?

  1. Bootstrap resampling

  2. L1/L2 regularization

  3. Pruning (for decision trees)

  4. Cross-validation
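As a rough illustration of how L2 regularization curbs overfitting, the NumPy-only sketch below fits a degree-9 polynomial to noisy samples with and without a ridge penalty. All function names, the penalty value, and the toy data are illustrative assumptions, not from the original notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples of a simple underlying function.
x_train = rng.uniform(-1, 1, 20)
x_test = rng.uniform(-1, 1, 100)
f = lambda x: np.sin(2 * np.pi * x)
y_train = f(x_train) + rng.normal(0, 0.3, x_train.size)
y_test = f(x_test)

def design(x, degree=9):
    # Polynomial feature matrix [1, x, x^2, ..., x^degree].
    return np.vander(x, degree + 1, increasing=True)

def fit(x, y, lam=0.0):
    # Closed-form ridge regression: w = (X^T X + lam*I)^-1 X^T y.
    # lam = 0 recovers plain least squares.
    X = design(x)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def mse(w, x, y):
    return float(np.mean((design(x) @ w - y) ** 2))

w_plain = fit(x_train, y_train, lam=0.0)   # unregularized, prone to overfit
w_ridge = fit(x_train, y_train, lam=1e-2)  # small L2 penalty

print("train:", mse(w_plain, x_train, y_train), mse(w_ridge, x_train, y_train))
print("test: ", mse(w_plain, x_test, y_test), mse(w_ridge, x_test, y_test))
```

The unregularized fit always has the lowest training error (it is the least-squares minimizer), while the penalty trades a little training accuracy for smaller weights, which typically narrows the train/test gap.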

3. What is underfitting in machine learning?

Underfitting means that the model is too simple (or the data set too small) to fit the data well, so the model performs poorly even on the training set.

4. How can underfitting be avoided?

1. Increase the number of samples

2. Increase the number of sample features

3. Expand the feature dimensions
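Point 3 can be sketched with a degree-2 polynomial feature expansion: appending squares and pairwise products gives a simple model more capacity. The helper below is a hypothetical illustration, not from the original notes.

```python
import numpy as np

def poly_expand(X):
    # Append all degree-2 terms (squares and pairwise products) to the
    # original columns, mimicking polynomial feature expansion.
    cols = [X]
    n = X.shape[1]
    for i in range(n):
        for j in range(i, n):
            cols.append((X[:, i] * X[:, j])[:, None])
    return np.hstack(cols)

X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
print(poly_expand(X))
# Columns: [x1, x2, x1*x1, x1*x2, x2*x2] -> shape (2, 5)
```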

5. What is cross-validation, and what is it for?

Cross-validation splits the original data set into parts: one part is used to train the model, and the other part serves as a test set to evaluate the model.

Purpose: 1) Cross-validation estimates how well the model will predict on new data, and can also reduce overfitting to some extent.

2) It extracts as much useful information as possible from limited data.

The main cross-validation methods are:

① Hold-out: the original data set is simply split into a training set, a validation set, and a test set.

② k-fold cross-validation (usually 5-fold or 10-fold).

③ Leave-one-out: a single sample is held out as the test set and all remaining samples are used for training; only practical for small data sets.

④ Bootstrap (introduces sampling bias).
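The k-fold procedure in ② can be sketched in a few lines: shuffle the indices, split them into k folds, and let each fold play the test set exactly once. The helper names and the toy "model" (just the training-set mean) are illustrative assumptions.

```python
import numpy as np

def kfold_indices(n, k, seed=0):
    # Shuffle indices, then split them into k nearly equal folds.
    idx = np.random.default_rng(seed).permutation(n)
    return np.array_split(idx, k)

def cross_val_scores(X, y, fit, score, k=5):
    # Generic k-fold loop: each fold is held out as the test set once.
    scores = []
    for fold in kfold_indices(len(X), k):
        mask = np.ones(len(X), dtype=bool)
        mask[fold] = False                     # train on everything else
        model = fit(X[mask], y[mask])
        scores.append(score(model, X[fold], y[fold]))
    return scores

# Toy usage: the "model" is just the training-set mean, scored by MSE.
X = np.arange(20, dtype=float).reshape(-1, 1)
y = X.ravel() * 2.0
fit = lambda X, y: y.mean()
score = lambda m, X, y: float(np.mean((y - m) ** 2))
print(cross_val_scores(X, y, fit, score, k=5))  # 5 per-fold scores
```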

6. What is the difference between supervised and unsupervised algorithms?

Algorithms trained on labeled data are called supervised algorithms; algorithms trained on unlabeled data are called unsupervised algorithms.

7. What are the common distance metrics?

1) Minkowski distance (generalizes Manhattan and Euclidean distance)

2) TF-IDF (for text, usually combined with cosine similarity)
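The Minkowski distance is defined as D(x, y) = (Σᵢ |xᵢ − yᵢ|ᵖ)^(1/p); p = 1 gives the Manhattan distance and p = 2 the Euclidean distance. A minimal sketch (the function name is illustrative):

```python
import numpy as np

def minkowski(x, y, p=2):
    # D(x, y) = (sum_i |x_i - y_i|^p)^(1/p)
    # p=1 -> Manhattan distance, p=2 -> Euclidean distance.
    diff = np.abs(np.asarray(x, float) - np.asarray(y, float))
    return float(np.sum(diff ** p) ** (1.0 / p))

print(minkowski([0, 0], [3, 4], p=2))  # Euclidean: 5.0
print(minkowski([0, 0], [3, 4], p=1))  # Manhattan: 7.0
```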

2) Newton's method and quasi-Newton methods

Newton's method -> second-order convergence, so it converges quickly. It is an iterative algorithm, but every step requires the inverse of the Hessian matrix of the objective function, which is computationally expensive.

Quasi-Newton methods -> remove Newton's method's need to invert the Hessian at every step: they approximate the inverse Hessian with a positive definite matrix, which simplifies the computation.
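In one dimension, the Newton update described above reduces to x ← x − f′(x)/f″(x); in n dimensions the division becomes multiplication by the inverse Hessian, which is the expensive step the text mentions. A minimal sketch under that assumption:

```python
def newton_minimize(grad, hess, x0, steps=20):
    # 1-D Newton iteration for minimization: x <- x - f'(x) / f''(x).
    x = x0
    for _ in range(steps):
        x = x - grad(x) / hess(x)
    return x

# Minimize f(x) = (x - 3)^2 + 1: f'(x) = 2(x - 3), f''(x) = 2.
x_star = newton_minimize(lambda x: 2 * (x - 3), lambda x: 2.0, x0=10.0)
print(x_star)  # for a quadratic, Newton's method lands on 3.0 in one step
```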

3) Conjugate gradient method (Conjugate Gradient)

The conjugate gradient method sits between the steepest descent method and Newton's method. It uses only first-derivative information, yet it overcomes the slow convergence of steepest descent and avoids Newton's method's need to store, compute, and invert the Hessian matrix. It needs little storage, converges in a finite number of steps (at most n for an n-dimensional quadratic), is stable, and requires no external parameters.

4) Heuristic methods

There are many heuristic optimization methods, including the classical simulated annealing, genetic algorithms, ant colony optimization, and particle swarm optimization.

There are also multi-objective optimization algorithms (NSGA-II, MOEA/D, and artificial immune algorithms).

5) Handling constraints — the Lagrange multiplier method

11. Describe the Lagrange multiplier method and the KKT conditions.

The Lagrange multiplier method transforms a constrained objective function into an unconstrained one. For inequality constraints, the KKT conditions must additionally be satisfied.
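As a sketch, for the problem of minimizing f(x) subject to gᵢ(x) ≤ 0 and hⱼ(x) = 0, the generalized Lagrangian and the KKT conditions are:

```latex
% Generalized Lagrangian:
L(x, \mu, \lambda) = f(x) + \sum_i \mu_i\, g_i(x) + \sum_j \lambda_j\, h_j(x)

% KKT conditions at an optimum x^*:
\nabla_x L(x^*, \mu, \lambda) = 0          % stationarity
g_i(x^*) \le 0, \quad h_j(x^*) = 0         % primal feasibility
\mu_i \ge 0                                % dual feasibility
\mu_i\, g_i(x^*) = 0                       % complementary slackness
```

For equality-only constraints the conditions reduce to the classical Lagrange multiplier equations (stationarity plus feasibility).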

Original: https://www.cnblogs.com/mfryf/p/15293527.html
Author: 知识天地
Title: 机器学习面试题总结(笔记)

