数值优化：经典二阶确定性算法与对偶方法

2023年6月3日上午3:59 • 大数据 • 阅读 83

1 牛顿法

牛顿法[1]的基本思想是将目标函数在当前迭代点处进行二阶泰勒展开，然后最小化这个近似目标函数，即

[\underset{w\in \mathcal{W}}{\text{min}} f(w) \approx \underset{w \in W}{\text{min}} f(w^t) + \nabla f(w^t)^T(w – w^t) + \frac{1}{2}(w – w^t)^T\nabla^2f(w^t)(w-w^t) ]

此处(\nabla^2f(w^t))是目标函数在当前迭代点(w^t)处的海森矩阵。如果该海森矩阵是正定的，则上述问题的最优值 (w^t -[\nabla^2f(w^t)]^{-1}\nabla f(w^t))处取到，牛顿法将其做为下一时刻的状态，即

[w^{t+1} = w^t -[\nabla^2f(w^t)]^{-1}\nabla f(w^t) ]

下图给了一个简单实例：

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:c5ab9fb1-d6e9-4199-8e37-c766442f4b63

[En]

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:3b9259eb-8d98-475c-9b36-f4ece2971724

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:c8b8ed4e-f51b-4261-98ae-2520adcc72c8

[En]

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:be7af184-b54e-4e58-8d25-6110d67216f8

牛顿法收敛性 假设目标函数(f: \mathbb{R}^d \rightarrow \mathbb{R})的导数(\nabla f(w))是光滑的，存在二阶导数，并且在其最优点(x^)处满足(\nabla f\left(x^{}\right)=0, \nabla^{2} f\left(x^{}\right) \succ 0)且初始点(x^0)与最优点(x^)足够近，那么牛顿法具有(\mathcal{O}(e^{-2^t}))的 超线性（二阶）收敛速率：

[\lVert w^t – w^*\rVert \leqslant \mathcal{O}(e^{-2^t}) ]

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:1d96a1f3-a440-41cd-90df-151e3e26b274

[En]

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:79b409ae-45a5-4917-9010-7e25e07e86c5

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:3dddd34b-2ebe-4b39-9188-48e801ab1f31

[En]

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:22343861-0d17-4e5a-8bd6-aef0ad19aea6

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:88223c70-35fe-4bfa-a3ab-a6fc4266122d

[En]

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:59622fdf-cd91-4143-a32e-cc16caf65e27

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:886d1289-2f21-45d5-8b01-ae16e7813b2e

[En]

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:2d77091d-dfce-427c-8825-d642cb0b936e

2 拟牛顿法

既然海森矩阵不一定正定，那就构造一个与海森矩阵相差不太远的正定矩阵作为其替代品，这正是拟牛顿法[5]的主要思想。此外，逆牛顿法可以迭代更新海森逆矩阵，而不是每次迭代都需要重新进行逆矩阵的计算。

记(H_t = \nabla^2 f(w^t))，(M^t = [\nabla^2 f(w^t)]^{-1})。在第(t+1)轮迭代，对第(t)轮迭代的海森矩阵(H_t)加上一个或者两个秩为1的矩阵作为对(t+1)时刻的海森矩阵(H^{t+1})的估计。例如：

[H^{t+1} = H^t + aa^T + bb^T ]

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:4ffa3eac-a39c-4618-b209-e77a1ad74507

[En]

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:878a48e7-a6e6-4461-9fc2-e9af0bcba28d

对(f(w))在(w^t)处的二阶泰勒展开的左右两边计算梯度，可得

[\nabla f(w) \approx \nabla f(w^t) + H^t(w-w^t) ]

令(\delta^t_1 = w^{t+1}-w^t)为模型的更新量，(\delta^t_2=\nabla f(w^{t+1}) – \nabla f(w^t))为目标函数导数的更新量，则我们有：

[(H^t)^{-1}\delta^t_2 \approx \delta^t_1 ]

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:a1ff0be8-2b6d-4b77-b761-23456fcd0016

[En]

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:a8d18394-0c54-4498-b6ae-90fc7e4c5e67

[\begin{aligned} &H^0 = I\ &H^{t+1}= H^t + \frac{\delta^t_2 (\delta^t_2)^{T}}{(\delta^t_2)^T\delta^t_1} – \frac{H^t\delta^t_1(H^t\delta^t_1)^T }{(\delta^t_1)^TH^t\delta^t_1} \ &M^{t+1} = \left(I – \frac{\delta^t_1(\delta^t_2)^T}{(\delta^t_2)^T\delta^t_1}\right)M^t(I – \frac{\delta^t_2 (\delta^t_1)^T}{(\delta^t_2)^T\delta^t_1}) + \frac{\delta_1^t (\delta_1)^T}{(\delta^t_2)^T\delta^t_1} \end{aligned} ]

此时，所对应的算法被称作BFGS算法，如下面伪代码所示：

在实际应用中基于(M^t)的拟牛顿法更实用，应为根据(M^t)计算下降方向的方法不需要对(H^t)矩阵求逆（解线性方程，其复杂度为(\mathcal{O}(d^3))，非常耗时）。但基于(H^t)的拟牛顿法有比较好的理论性质，产生的迭代序列比较稳定，但如果有办法快速求解线性方程组，我们也可以采用基于(H^t)的拟牛顿法。此外在某些场景下，比如有些带约束的优化问题的算法设计，由于需要用到海森矩阵的近似，(H^t)的使用也很常见。

当使用其它的海森矩阵近似方式和约束条件时，我们还可以设计出其他拟牛顿法，比如DFP[3][4]、Broyden[5]、SR1[6]等。

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:61a28642-51e3-4ca1-8cf5-36518fc94068

[En]

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:7119ca93-5467-4891-88c8-32c79c16c9cb

3 对偶方法

我们到目前为止提到的各种优化算法都是直接求解原始优化问题。某些时候，如果把原始优化问题转化成对偶优化问题，会更容易求解。比如，当原始问题的变量维度很高，但是约束条件个数不太多时，对偶问题的复杂度（对偶变量的维度对应于约束条件的个数）会远小于原始问题的复杂度，因而更容易求解（这也是支持向量机高效的主要原因）。本节我们将介绍与此相关的对偶理论以及常见的对偶优化算法[7]。

考虑下述原始优化问题(P_0)：

[\underset{w}{\text{min}} f(w) \ \begin{aligned} \quad\quad\quad\quad\quad\text{s.t.} \quad\quad\quad &g_i(w) \leqslant 0, \quad i=1,\cdots,m_1 \ &h_j(w) = 0, \quad j=1,\cdots,m_2 \end{aligned} ]

包含(m_1)个不等式约束和(m_2)个等式约束。假设问题(P_0)的最优解为(p^*)。

首先，我们将有约束问题转化为无约束优化问题，即构造(P_0)的拉格朗日函数，将约束条件转化到目标函数中。考虑原始问题(P_0)中带有的约束条件，例如，当(g_1(w^0)\geqslant0)时，(w^0)就不是一个可行解，这时我们需要把对应的目标函数之补充定义为正无穷。于是可将原始优化问题转换为如下的无约束优化形式：

[\underset{w}{\text{min}} f(w) + \infty \sum_{i=1}^{m_1} I_{g_i(w)>0} + \infty \sum_{j=1}^{m_2}I_{h_j(w)\neq 0} ]

进一步地，我们用线性函数(\lambda x (\lambda > 0))和(\nu x)来近似指示函数(I_{[x>0]})和(I_{[x\neq 0]})，得到原始问题(P_0)所对应的拉格朗日函数：

[L(w, \lambda, \nu) \triangleq f(w) + \sum_{i=1}^{m_1}\lambda_ig_i(w) + \sum_{j=1}^{m_2}\nu_jh_j(w) ]

其中，(\lambda_i\geqslant 0)，(\nu_j\in \mathbb{R})称为拉格朗日乘子。

接下来，我们讨论拉格朗日函数和原目标函数取值的大小关系，引出拉格朗日对偶函数的定义。如果假设(w^0)满足所有约束条件，那额拉格朗日函数中的第二项小于等于零，第三项等于零，即

[L(w, \lambda, \nu) \leqslant f(w),\quad \text{对任意可行解}w ]

在可行域(\mathcal{W})上对上述不等式取最小值，得到

[\underset{w\in \mathcal{W}}{\text{inf}}L(w, \lambda, v) \leqslant \underset{w\in\mathcal{W}}{\text{min}} f(w) = p^* ]

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:725eabfc-f54c-4754-bf30-2e865b62648a

[En]

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:efb4fbc3-b131-41c0-a0ad-b610df0950c3

[h(\lambda, \nu) \triangleq \underset{w\in \mathcal{W}}{\text{inf}}L(w, \lambda, v) \leqslant p^*, \quad 其中\lambda_i\geqslant0, \nu_j \in \mathbb{R} ]

由于拉格朗日对偶函数(h(\lambda, \nu))是一族关于((\lambda, \nu))仿射函数的逐点下确界，所以即使原始问题(P_0)不是凸的，拉格朗日对偶函数(h(\lambda, \nu))也是凹的。

接下来我们定义对偶优化问题。既然拉格朗日对偶函数的取值是原始物体最优值的下界，在拉格朗日乘子(\lambda)，(\nu)的取值空间对函数(h(\lambda, \nu))取最大值，将会得到原始问题最优解的最大下界。这个问题通常被定义为原始问题(P_0)的对偶问题，记为(D_0)，具体如下：

[\underset{\lambda, \nu}{\text{max}} \space h(\lambda, \nu) \ \begin{aligned} \quad\quad\quad\text{s.t.} \quad\quad\quad &\lambda_i \geqslant 0, \quad i=1,\cdots,m \ \end{aligned} ]

对偶问题是一个对凹函数在可行域（凸集）内求解最大值的问题，记最优值为(d^)。通过上面的讨论，(d^

最后，可通过求解对偶问题来得到原始问题的解。在求解对偶问题的过程中，我们可以使用前面的各种一阶或二阶优化方法。假设对偶问题求得的解为((\lambda^, \nu^))，将其代入拉格朗日函数中，对原始变量求拉格朗日函数的最小值，即:

[\underset{w}{\text{min}}\space L(w, \lambda^, \nu^) ]

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:0a6d96b1-440a-4234-8159-6b8f8e8ed8d8

[En]

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:905118ec-5f53-40b5-8fae-e33b3742959f

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:27db1583-22be-4973-9898-9e99c52d3723

[En]

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:967aa7fb-2417-4698-a568-3128d1e51131

对于带有线性约束的凸优化问题，对偶坐标上升法被证明至少具有线性收敛速率[6]。

4 对确定性算法方法的小结

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:b97b3fa3-81b0-4c60-a424-2a2b012d30b7

[En]

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:3e466456-563e-4d98-9085-6f7bc6ad5a88

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:24fcd858-e71e-4230-9c3d-56163dad02ba

[En]

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:073fe00c-4b73-419d-ac5d-d71f62e9ce30

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:11cc7b28-0d27-4733-a656-f493378437dc

[En]

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:ed57b9a0-05ab-4cfb-8229-71d7f73a7065

* Nesterov加速法将梯度下降法速率中关于条件数的阶数进一步改进（但仍然是线性收敛速率）。
* 投影次梯度法和Frank-Wolfe方法都可以用于解决带有约束的优化问题，两种方法的收敛速率都为次线性，具体可以依据投影操作的难易程度来选择。
* 一些拟牛顿法（比如BFGS算法）也可以和牛顿法一样达到二阶收敛速率。

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:35ccbf47-5c96-418f-b9a9-ee494629ced3

[En]

[TencentCloudSDKException] code:FailedOperation.ServiceIsolate message:service is stopped due to arrears, please recharge your account in Tencent Cloud requestId:7eed50d3-0581-4d39-8cdc-6b47e7c1952a

[1] Polyak B T. Newton’s method and its use in optimization[J]. European Journal of Operational Research, 2007, 181(3): 1086-1096.
[2] Dennis, Jr J E, Moré J J. Quasi-Newton methods, motivation and theory[J]. SIAM review, 1977, 19(1): 46-89.
[3] Davidon W C. Variable metric method for minimization[J]. SIAM Journal on Optimization, 1991, 1(1): 1-17.
[4] Fletcher R, Powell M J D. A rapidly convergent descent method for minimization[J]. The computer journal, 1963, 6(2): 163-168.
[5] Broyden C G. A class of methods for solving nonlinear simultaneous equations[J]. Mathematics of computation, 1965, 19(92): 577-593.
[6] Luo Z Q, Tseng P. Error bounds and convergence analysis of feasible descent methods: a general approach[J]. Annals of Operations Research, 1993, 46(1): 157-178.
[7] Numerical optimization[M]. New York, NY: Springer New York, 1999.
[8] 刘浩洋，户将等. 最优化：建模、算法与理论[M]. 高教出版社, 2020.
[9] 刘铁岩，陈薇等. 分布式机器学习：算法、理论与实践[M]. 机械工业出版社, 2018.
[10] Stanford CME 323: Distributed Algorithms and Optimization (Lecture 7)

Original: https://www.cnblogs.com/orion-orion/p/16376453.html
Author: orion-orion
Title: 数值优化：经典二阶确定性算法与对偶方法

原创文章受到原创版权保护。转载请注明出处：https://www.johngo689.com/562901/

转载文章受原作者版权保护。转载请注明原作者出处！

大数据

【自取】最近整理的，有需要可以领取学习：

Linux核心资料大放送~

全栈面试题汇总（持续更新&可下载）

一个提高学习100%效率的工具！

【超详细】深度学习面试题目！

LeetCode Python刷题答案下载！

LeetCode Java版刷题答案下载！

LeetCode C++ 版本，抓紧保存！

LeetCode GO语言刷题答案下载！

Sqlite 记事本

Sqlite 记事本-支持标签搜索内容搜索主要功能：引导页用户登录注册搜索按照标签查询按照关键字查询标签数据库增删改查显示方式瀑布流显示 *线性显…

大数据 2023年11月11日
0031
mapreduce输出结果到mysql

Mapreduce程序在reduce阶段执行后，默认会把结果输出到hdfs文件系统上(使用context.wirte())，如果要把结果写到mysql中，需要重写Writable、…

大数据 2023年5月26日
00122
解决hudi hms catalog中flink建表，spark无法写入问题

大数据 2023年11月13日
0052
oracle分析函数——over

1、分析函数： 1）分析函数： Oracle从8.1.6开始提供分析函数，专门用于解决复杂报表统计需求的功能强大的函数，它可以在数据中进行分组然后计算基于组的某种统计值，并且每一组…

大数据 2023年5月26日
0077
saless.csv数据的导入，清洗，分析，导出，可视化（利用hive，sqoop，mysql工具）

目录题目描述 hadoop配置 hive配置 mysql配置完成问题1，2 zookeeper、hbase配置 sqoop配置利用sqoop将hive数据导出到mysql 注…

大数据 2023年11月13日
0044
Java如何实现大文件分片上传,断点续传和秒传

大数据 2023年11月15日
0043
DolphinScheduler——流程调度工具

目录一、平台简介二、竞品软件分析三、安装部署 3.1 基础环境 3.2 安装前操作 3.3具体安装操作步骤 3.4 名词解释四.功能介绍 4.1 首页&项目管理 4…

大数据 2023年11月13日
0058
GraphScope v0.12.0 版本发布

GraphScope 每月进行常规版本的迭代与发布，GraphScope v0.12.0 全新版本在四月如期而至。v0.12.0 为交互式图查询 GAIA 引入全新的 IR 层以及…

大数据 2023年6月3日
0091
Admin3，一个灵活高效不臃肿的后台管理框架

项目使用最新的技术栈（后端Java17、SpringBoot3.0，前端TypeScript、Vite3、Vue3、Element Plus），只提供了用户和权限管理的核心功能最佳…

大数据 2023年11月13日
0041
andorid定时器应用

404. 抱歉，您访问的资源不存在。可能是网址有误，或者对应的内容被删除，或者处于私有状态。代码改变世界，联系邮箱 contact@cnblogs.com 园子的商业化努力-困…

大数据 2023年6月3日
0075
消息队列与kafka

为什么用消息队列 比如在一个企&#x4E…

大数据 2023年5月28日
0080
Linux 单用户模式修改密码 Centos、Ubuntu

Centos 7 开机进入选择系统界面按e选择进入系统，一般默认第一个（进入系统过程中可以持续按上下键，避免跳过此页面）修改Grud 进入界面后往下找，找到以linux16开头…

大数据 2023年5月27日
0071
【PyTorch基础教程29】DIN模型

学习总结推荐系统排序部分中的损失函数大部分都是二分类的交叉熵损失函数，但是召回的模型很多都不是。召回模型那块常见的还有sampled softmax损失函数；模型训练时，在se…

大数据 2023年5月24日
00110
Linux安装Node.js

1、下载node.js 源码包 2、检查是否安装了python。要求是python2.6或者python2.6以上版本查看python版本如果没有，先安装python 3、检…

大数据 2023年6月3日
0051
燕千云AITSM重塑IT服务管理

IT服务经历了三个阶段,缘起于设备管理，兴起于灾难恢复，发展于IT服务管理。IT服务发展到目前的阶段，企业所使用的系统功能也由孤立的系统转变为综合的集成系统，IT服务所管理的对象也…

大数据 2023年6月3日
0062
Kafka简介和特点 *

什么是Kafka Kafka是一款分布式消息发布和订阅系统，它的特点是高性能、高吞吐量。最早设计的目的是作为LinkedIn的活动流和运营数据的处理管道。这些数据主要是用来对用户做…

大数据 2023年5月25日
0062

2024 年 4 月
一	二	三	四	五	六	日
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30

数值优化：经典二阶确定性算法与对偶方法

1 牛顿法

2 拟牛顿法

3 对偶方法

4 对确定性算法方法的小结

大家都在看