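What Are the Main Types of Learning?

January 1, 2024, 8:45 AM • Artificial Intelligence

Question: What are the main types of learning?

Machine learning has four main types of learning methods: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. The sections below describe the principle, formulas, and computation steps of each type, and give a Python code example for each.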

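I. Supervised Learning

Supervised learning trains a model on samples whose inputs and outputs are both known, then uses the model to predict the output for new inputs. It has two phases: training and prediction.

Algorithm Principle

Supervised learning defines a loss function that measures the difference between the model's predictions and the true outputs, then uses an optimization algorithm to minimize that loss and obtain the best model parameters.

Formula Derivation

Common supervised models include linear regression and logistic regression. Take linear regression as an example: let the input variable be x and the output variable be y, and let the model be a linear equation with weight w and bias b.

The linear regression model can be written as:
$$y = wx + b$$

The loss function is usually the mean squared error (MSE), the mean of the squared differences between the true and predicted values:
$$MSE = \frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2$$

where N is the number of training samples, $y_i$ is the true value, and $\hat{y}_i = wx_i + b$ is the predicted value.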
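As a quick numeric check of the MSE formula, the sketch below evaluates it on three made-up value pairs. The arrays `y_true` and `y_pred` are illustrative assumptions, not data from this article.

```python
import numpy as np

# Hypothetical true values and predictions, chosen only for illustration
y_true = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.5, 5.0, 8.0])

# Squared errors per sample: 0.25, 0.0, 1.0
squared_errors = (y_true - y_pred) ** 2

# MSE is the mean of the squared errors: 1.25 / 3
mse = squared_errors.mean()

print("Squared errors:", squared_errors)
print("MSE:", mse)
```

Because the errors are squared, the single miss of 1.0 contributes four times as much as the miss of 0.5, which is why MSE is sensitive to outliers.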

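Computation Steps

Supervised learning proceeds as follows:

1. Prepare the training set: the inputs x and the corresponding outputs y.
2. Initialize the model parameters w and b.
3. Define the loss function.
4. Minimize the loss with an optimization algorithm such as gradient descent to obtain the best parameters.
5. Use the trained parameters to make predictions.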
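
For the gradient-descent step in the list above, the partial derivatives of the MSE follow from the chain rule applied to $\hat{y}_i = wx_i + b$. The learning-rate symbol $\eta$ is notation assumed here; it does not appear in the original text.

$$\frac{\partial MSE}{\partial w} = \frac{2}{N}\sum_{i=1}^{N}(\hat{y}_i - y_i)\,x_i, \qquad \frac{\partial MSE}{\partial b} = \frac{2}{N}\sum_{i=1}^{N}(\hat{y}_i - y_i)$$

Each iteration then moves the parameters a small step against the gradient:

$$w \leftarrow w - \eta\,\frac{\partial MSE}{\partial w}, \qquad b \leftarrow b - \eta\,\frac{\partial MSE}{\partial b}$$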

Python Code Example

The following Python code implements linear regression:

import numpy as np

# Generate a synthetic dataset: y = 2x + 1 plus Gaussian noise
x = np.random.randn(100, 1)
y = 2 * x + 1 + np.random.randn(100, 1)

# Initialize the parameters
w = np.random.randn()
b = np.random.randn()

# Mean squared error loss
def mse_loss(x, y, w, b):
    y_pred = w * x + b
    loss = np.mean((y_pred - y) ** 2)
    return loss

# Gradients of the MSE with respect to w and b
def gradient(x, y, w, b):
    n = len(y)
    y_pred = w * x + b
    dw = (2 / n) * np.sum(x * (y_pred - y))  # scalar, so w stays a scalar
    db = (2 / n) * np.sum(y_pred - y)
    return dw, db

# Gradient descent
def gradient_descent(x, y, w, b, learning_rate, num_iterations):
    for i in range(num_iterations):
        dw, db = gradient(x, y, w, b)
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b

# Train the model
w, b = gradient_descent(x, y, w, b, learning_rate=0.01, num_iterations=1000)

# Predict with the trained model
x_test = np.array([[2.0], [3.0], [4.0]])
y_pred = w * x_test + b

# Print the results
print("Model parameter w:", w)
print("Model parameter b:", b)
print("Final training loss:", mse_loss(x, y, w, b))
print("Predictions:", y_pred)

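II. Unsupervised Learning

Unsupervised learning discovers hidden patterns and structure in unlabeled data in order to learn the data's distribution or internal relationships. It applies to datasets that have no label information.

Algorithm Principle

The main families of unsupervised algorithms are clustering and dimensionality reduction. Clustering partitions a dataset into groups by assigning similar samples to the same cluster. Dimensionality reduction projects high-dimensional data into a lower-dimensional space while preserving the key information in the original data.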
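
To make the projection idea concrete, here is a minimal sketch of one common dimensionality-reduction method, principal component analysis (PCA), computed with NumPy's SVD. PCA is not named in the original text; it is used here only as an assumed example, and the function name `pca_project` and the toy data are hypothetical.

```python
import numpy as np

def pca_project(X, n_components):
    """Project X onto its top principal components.

    Returns the projected data and the fraction of variance retained.
    """
    # Center the data so the components describe variance around the mean
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained = (S[:n_components] ** 2).sum() / (S ** 2).sum()
    return X_centered @ components.T, explained

# Hypothetical 3-D data that really varies along a single direction
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, 2 * t, -t]) + 0.01 * rng.normal(size=(200, 3))

Z, explained = pca_project(X, n_components=1)
print("Projected shape:", Z.shape)
print("Fraction of variance retained:", explained)
```

Since the three columns are nearly multiples of one another, a single component retains almost all of the variance, which is the sense in which the low-dimensional projection "preserves the key information".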

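Formula Derivation

A common clustering algorithm is K-means. Its goal is to partition n samples into K clusters so that samples within a cluster are as similar as possible and samples in different clusters are as dissimilar as possible.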
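
With Euclidean distance, within-cluster similarity is commonly scored by the sum of squared distances from each sample to its assigned cluster center, and K-means tries to make that sum small. The sketch below computes this score; the function name `kmeans_objective` and the toy points are assumptions made for illustration.

```python
import numpy as np

def kmeans_objective(X, labels, centers):
    """Sum of squared Euclidean distances from each sample to its assigned center."""
    diffs = X - centers[labels]  # each row minus its own cluster center
    return float(np.sum(diffs ** 2))

# Hypothetical toy data: two tight pairs of points
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
centers = np.array([[0.0, 1.0], [10.0, 1.0]])

good_labels = np.array([0, 0, 1, 1])  # each point assigned to the nearby center
bad_labels = np.array([1, 1, 0, 0])   # assignments swapped

good = kmeans_objective(X, good_labels, centers)
bad = kmeans_objective(X, bad_labels, centers)
print("Objective with sensible assignment:", good)
print("Objective with swapped assignment:", bad)
```

The sensible assignment gives a much smaller value than the swapped one, so lowering this score corresponds to tighter clusters.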

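The derivation is omitted here.

Computation Steps

Using K-means as the example, unsupervised learning proceeds as follows:

1. Prepare an unlabeled dataset.
2. Initialize the cluster centers.
3. Compute the distance from each sample to each cluster center and assign each sample to the nearest one.
4. Update each cluster center to the mean of the samples assigned to it.
5. Repeat steps 3 and 4 until the cluster centers stop changing or the iteration limit is reached.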

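Python Code Example

The following Python code implements K-means clustering:

import numpy as np

# Generate a synthetic dataset
X = np.random.randn(100, 2)

# Initialize the cluster centers
K = 3
centers = np.random.randn(K, 2)

# Euclidean distance between two points
def euclidean_distance(x1, x2):
    return np.sqrt(np.sum((x1 - x2) ** 2))

# Run the clustering
num_iterations = 10
for _ in range(num_iterations):
    # Assignment step: label each sample with its nearest center
    labels = []
    for x in X:
        distances = [euclidean_distance(x, center) for center in centers]
        label = np.argmin(distances)
        labels.append(label)
    labels = np.array(labels)

    # Update step: move each center to the mean of its members
    for i in range(K):
        members = X[labels == i]
        if len(members) > 0:  # keep the old center if the cluster is empty
            centers[i] = np.mean(members, axis=0)

# Print the results
print("Cluster centers:", centers)
print("Sample labels:", labels)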
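
The example above runs for a fixed number of iterations. The sketch below is one possible variant that stops once the centers stop moving, and that uses NumPy broadcasting for the distance computation. The function name `kmeans`, the tolerance `tol`, the choice to initialize centers from data points, and the two-blob test data are all assumptions made for illustration.

```python
import numpy as np

def kmeans(X, K, max_iterations=100, tol=1e-6, seed=0):
    """K-means that stops when the centers move less than `tol`."""
    rng = np.random.default_rng(seed)
    # Assumption: start from K distinct data points instead of random normals
    centers = X[rng.choice(len(X), size=K, replace=False)].copy()
    for iteration in range(max_iterations):
        # distances[i, k] is the Euclidean distance from sample i to center k
        distances = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(distances, axis=1)
        new_centers = centers.copy()
        for k in range(K):
            members = X[labels == k]
            if len(members) > 0:  # keep the old center for an empty cluster
                new_centers[k] = members.mean(axis=0)
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:
            break
    return centers, labels, iteration + 1

# Hypothetical data: two well-separated blobs of 50 points each
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(loc=(-5.0, -5.0), scale=0.5, size=(50, 2)),
    rng.normal(loc=(5.0, 5.0), scale=0.5, size=(50, 2)),
])

centers, labels, n_iter = kmeans(X, K=2)
print("Cluster centers:", centers)
print("Iterations used:", n_iter)
```

On well-separated data like this, the loop typically exits after a handful of iterations, well before the iteration limit.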

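III. Semi-Supervised Learning

Semi-supervised learning trains on labeled and unlabeled data at the same time. It uses the information in the unlabeled data to improve the model's generalization.

Algorithm Principle

Semi-supervised learning takes into account both the labels of the labeled samples and the distribution of the unlabeled samples. It defines an objective function that couples the labeled and unlabeled data, then optimizes that objective.

Formula Derivation

Common semi-supervised methods include self-training and pseudo-labeling. In self-training, an already-trained model predicts labels for the unlabeled data, those predictions are treated as pseudo-labels, and the model is retrained on the pseudo-labeled data combined with the labeled data.

The derivation is omitted here.

Computation Steps

Using self-training as the example, semi-supervised learning proceeds as follows:

1. Prepare the labeled and unlabeled data.
2. Train an initial model on the labeled data.
3. Use the current model to predict on the unlabeled data and generate pseudo-labels.
4. Combine the pseudo-labeled data with the labeled data to form an augmented dataset.
5. Retrain the model on the augmented dataset.
6. Repeat steps 3 to 5 until the model converges or the iteration limit is reached.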

Python Code Example

The following Python code implements self-training:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Generate a synthetic dataset
X, y = make_classification(n_samples=100, n_features=2, n_informative=2, n_redundant=0, flip_y=0.1, random_state=0)

# Randomly split the samples into labeled and unlabeled sets
np.random.seed(0)
labeled_indices = np.random.choice(np.arange(len(y)), size=10, replace=False)
unlabeled_indices = np.setdiff1d(np.arange(len(y)), labeled_indices)

# Initialize the model; a linear kernel is required for model.coef_ below
model = SVC(kernel="linear")

# Train the initial model on the labeled data
model.fit(X[labeled_indices], y[labeled_indices])

# Self-training
num_iterations = 10
for _ in range(num_iterations):
    # Predict on the unlabeled data with the current model to get pseudo-labels
    pseudo_labels = model.predict(X[unlabeled_indices])

    # Combine the pseudo-labeled data with the labeled data
    mixed_indices = np.concatenate((labeled_indices, unlabeled_indices))
    mixed_X = X[mixed_indices]
    mixed_y = np.concatenate((y[labeled_indices], pseudo_labels))

    # Retrain the model on the augmented dataset
    model.fit(mixed_X, mixed_y)

    # Evaluate the model on the labeled data
    accuracy = model.score(X[labeled_indices], y[labeled_indices])
    print("Accuracy:", accuracy)

# Print the results
print("Model parameters:", model.coef_)
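
The example above accepts every pseudo-label, including ones the model is unsure about. A common variant, sketched below, adds only predictions whose estimated probability clears a threshold. This is an assumption layered on top of the article's method: `LogisticRegression` is swapped in for `SVC` because it provides `predict_proba`, and the `threshold` value of 0.9 and the dataset sizes are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Hypothetical dataset: 200 samples, of which only 20 keep their labels
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)
rng = np.random.default_rng(0)
labeled = rng.choice(len(y), size=20, replace=False)
pool = np.setdiff1d(np.arange(len(y)), labeled)  # indices still unlabeled

X_train, y_train = X[labeled], y[labeled]
threshold = 0.9  # assumed confidence cutoff for accepting a pseudo-label
model = LogisticRegression()

for _ in range(10):
    model.fit(X_train, y_train)
    if len(pool) == 0:
        break
    # Class probabilities for every sample that is still unlabeled
    proba = model.predict_proba(X[pool])
    confident = proba.max(axis=1) >= threshold
    if not confident.any():
        break  # nothing confident enough to add
    pseudo = model.classes_[proba.argmax(axis=1)]
    # Move the confident samples from the pool into the training set
    X_train = np.vstack([X_train, X[pool][confident]])
    y_train = np.concatenate([y_train, pseudo[confident]])
    pool = pool[~confident]

print("Training set size after self-training:", len(y_train))
print("Samples left unlabeled:", len(pool))
```

Filtering by confidence limits how many wrong pseudo-labels enter the training set, at the cost of leaving some samples unused.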

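IV. Reinforcement Learning

Reinforcement learning lets an agent learn by interacting with an environment so that its behavior improves step by step. It suits settings where the best behavior policy is learned through many interactions with the environment.

Algorithm Principle

In reinforcement learning, the agent takes an action based on the current state, the environment returns a reward signal, and the agent learns and makes decisions based on that reward. The interaction between agent and environment is modeled as a Markov decision process (MDP).

Formula Derivation

The mathematical model of reinforcement learning consists of the state space, the action space, the state transition probabilities, the reward function, and the policy. Take the value function as an example: it measures how good it is for the agent to take actions in a given state.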
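
One standard way to write the optimal value function is the Bellman optimality equation. The form below assumes that the reward depends only on the state being entered; the symbols $V$, $P$, $R$, $\gamma$, and $\pi$ are notation introduced here rather than taken from the original text.

$$V(s) = \max_{a} \sum_{s'} P(s' \mid s, a)\,\bigl[R(s') + \gamma V(s')\bigr]$$

Here $P(s' \mid s, a)$ is the probability of moving to state $s'$ after taking action $a$ in state $s$, $R(s')$ is the reward for entering $s'$, and $\gamma \in [0, 1)$ is the discount factor. A greedy policy then picks the action that attains the maximum:

$$\pi(s) = \arg\max_{a} \sum_{s'} P(s' \mid s, a)\,\bigl[R(s') + \gamma V(s')\bigr]$$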

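The derivation is omitted here.

Computation Steps

Using value iteration as the example, reinforcement learning proceeds as follows:

1. Define the state space, action space, state transition probabilities, reward function, and policy.
2. Initialize the value function and the policy.
3. Iterate until the value function converges.
4. Choose actions according to the updated value function and policy.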

Python Code Example

The following Python code implements a reinforcement learning algorithm based on value iteration:

import numpy as np

# Define the state space and action space
states = [0, 1, 2, 3, 4]
actions = [0, 1]  # nominally 0 = left, 1 = right; the table below defines the actual moves

# State transition probabilities: transition_probs[state][action][next_state]
transition_probs = np.array([[[0, 1, 0, 0, 0], [0, 1, 0, 0, 0]],
                             [[1, 0, 0, 0, 0], [0, 0, 0, 1, 0]],
                             [[0, 0, 0, 1, 0], [0, 0, 1, 0, 0]],
                             [[0, 0, 1, 0, 0], [0, 0, 0, 0, 1]],
                             [[0, 0, 0, 0, 1], [1, 0, 0, 0, 0]]])

# Reward received on entering each state
rewards = np.array([-1, -1, -1, -1, 10])

# Initialize the value function and the policy
values = np.zeros(len(states))
policy = np.random.choice(actions, size=len(states))

# Value iteration
def value_iteration(transition_probs, rewards, values, policy, gamma=0.9, num_iterations=100):
    # Value update: back up the best action value for every state
    for _ in range(num_iterations):
        new_values = []
        for state in states:
            q_values = []
            for action in actions:
                q_value = np.sum(transition_probs[state][action] * (rewards + gamma * values))
                q_values.append(q_value)
            new_values.append(max(q_values))
        values = np.array(new_values)

    # Policy extraction: act greedily with respect to the final values
    for state in states:
        q_values = []
        for action in actions:
            q_value = np.sum(transition_probs[state][action] * (rewards + gamma * values))
            q_values.append(q_value)
        policy[state] = np.argmax(q_values)

    return values, policy

# Obtain the value function and policy once the iterations finish
values, policy = value_iteration(transition_probs, rewards, values, policy)

# Print the results
print("Optimal value function:", values)
print("Optimal policy:", policy)
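
The example above runs a fixed 100 iterations. The sketch below is one possible variant that stops when the largest change in the value function falls below a tolerance, matching the "iterate until the value function converges" step more directly. The function name `value_iteration_tol`, the tolerance `tol`, and the two-state toy MDP are assumptions made for illustration.

```python
import numpy as np

def value_iteration_tol(P, R, gamma=0.9, tol=1e-8, max_iterations=10_000):
    """Value iteration with a convergence check.

    P[s, a, s2] is the probability of moving from s to s2 under action a.
    R[s2] is the reward for entering state s2.
    """
    V = np.zeros(P.shape[0])
    for _ in range(max_iterations):
        Q = P @ (R + gamma * V)  # Q[s, a], expected return of each action
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    policy = (P @ (R + gamma * V)).argmax(axis=1)
    return V, policy

# Hypothetical two-state MDP: action 0 stays put, action 1 switches state
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
R = np.array([0.0, 1.0])  # entering state 1 pays 1, entering state 0 pays 0

V, policy = value_iteration_tol(P, R)
print("Value function:", V)
print("Greedy policy:", policy)
```

In this toy problem the best plan is to switch into state 1 and then stay there, so both states end up with a value close to 1 / (1 - 0.9) = 10.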

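This concludes the overview of the main types of learning, covering the algorithm principles, formulas, computation steps, and Python code examples. Hope it helps!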

Original articles are protected by copyright. Please credit the source when reposting: https://www.johngo689.com/822499/


