ArcFace Explained (Thorough and Clear)

May 28, 2023, 5:00 AM • Artificial Intelligence • 90 views

Contents

(1) Research Background
(2) Paper Walkthrough
1.1 Abstract
1.2 Introduction
(3) ArcFace Loss Code Walkthrough

Related reading: the evolution of softmax-based losses in face recognition
Paper: ArcFace: Additive Angular Margin Loss for Deep Face Recognition

(1) Research Background

1. Problem statement

1. Face features need to be discriminative.
2. Earlier methods such as triplet loss, center loss, L-Softmax, A-Softmax, and CosFace each have shortcomings.
3. A more effective method is needed (ArcFace).

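2. Shortcomings of the main earlier methods

The main earlier methods:

1. Center loss
2. Triplet loss
3. SphereFace (A-Softmax)
4. CosFace

Shortcomings:

1. Center loss: it forcibly changes the intrinsic distribution of the features, has to be trained jointly with softmax, and is hard to train when there are many classes.
2. Triplet loss: selecting training data is expensive, and semi-hard sample selection is itself complicated, which greatly reduces training efficiency.
3. SphereFace: it is computationally expensive, needs an auxiliary function (to keep the target logit monotonic) and the computation of cos(m*theta), and training is unstable and hard to converge, so it needs extra tricks such as interpolating with the softmax loss at the start.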

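Contributions and significance

1. The code change is simple.
2. Training is efficient, with almost no extra computational overhead.
3. It has an intuitive geometric interpretation.
4. It achieves strong performance.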

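(2) Paper Walkthrough

1.1 Abstract

The main challenge in learning features with deep networks is designing a loss function that makes the features discriminative.
Recent methods such as center loss and SphereFace address this problem.
ArcFace is proposed to obtain highly discriminative features for face recognition, and it has a clear geometric interpretation.
It achieves strong results on many face recognition benchmarks.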

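1.2 Introduction

1. A DCNN learns the face representation, mapping a face image into a feature space in which intra-class distances are small and inter-class distances are large.
2. There are two main lines of work, represented by the softmax loss and the triplet loss, and both have drawbacks.
3. Some methods, such as center loss, strengthen feature discriminability but also fall short.
4. The strengths and weaknesses of SphereFace and CosFace.
5. ArcFace is proposed, together with its algorithmic framework and a summary of its advantages.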

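The two main approaches to training DCNNs for face recognition, and their drawbacks

1. The Softmax Classifier
Drawbacks:

(1) The size of the linear transformation matrix $W \in \mathbb{R}^{d \times n}$ grows linearly with the number of identities $n$.
(2) The learned features are separable for closed-set classification but not discriminative enough for open-set face recognition.

The softmax loss does not explicitly optimise the feature embedding to enforce higher similarity for intra-class samples and greater diversity for inter-class samples. This leads to a performance gap for deep face recognition under large intra-class appearance variations (e.g. pose variations [28, 44] and age gaps [19, 45]) and in large-scale test scenarios (e.g. million [12, 37, 18] or trillion pairs [1]).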

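2. The Triplet Loss
Drawbacks:

(1) The number of face triplets explodes combinatorially, especially for large-scale datasets, which significantly increases the number of iteration steps.
(2) Semi-hard sample mining, which effective model training depends on, is a difficult problem in its own right.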
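To make the two scaling arguments above concrete, here is a small back-of-the-envelope sketch. The function names are hypothetical, and the triplet count assumes every identity has the same number of images, which real datasets do not satisfy; it is only meant to show linear versus combinatorial growth.

```python
def softmax_weight_count(embedding_dim: int, num_identities: int) -> int:
    """Number of entries in the classifier matrix W (d x n); linear in n."""
    return embedding_dim * num_identities


def triplet_count(num_identities: int, images_per_identity: int) -> int:
    """Number of (anchor, positive, negative) triplets.

    Assumption (illustrative only): every identity has exactly
    `images_per_identity` images.
    """
    k = images_per_identity
    anchor_positive_pairs = num_identities * k * (k - 1)  # ordered pairs within a class
    negatives_per_pair = (num_identities - 1) * k         # any image of another class
    return anchor_positive_pairs * negatives_per_pair


if __name__ == "__main__":
    print(softmax_weight_count(512, 1000))  # 512000
    print(triplet_count(1000, 10))          # 899100000
```

With 1,000 identities and 10 images each, the softmax classifier needs 512,000 weights for a 512-dimensional embedding, while the number of possible triplets is already close to 900 million.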

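Several variants of the softmax loss have been proposed to improve its discriminative power.

1. Centre loss
For details, see my other blog post: "CenterLoss Principles Explained (Thorough)".
(1) It penalises the Euclidean distance between each feature vector and its class centre to obtain intra-class compactness, while inter-class dispersion is guaranteed by the joint penalisation of the softmax loss.
(2) Nevertheless, updating the actual centres during training is extremely difficult as the number of face classes available for training has recently dramatically increased.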
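The intra-class term described in (1) can be sketched in a few lines of plain Python. This is a minimal illustration, assuming the usual form of the centre loss, half the sum of squared Euclidean distances to the class centres; the function name and the list-based data layout are hypothetical, and the centre update and the joint softmax term are left out.

```python
def center_loss(features, labels, centers):
    """0.5 * sum_i ||x_i - c_{y_i}||^2 over a mini-batch.

    features: list of feature vectors (lists of floats)
    labels:   list of class indices, one per feature
    centers:  list of class centres, indexed by class (assumed given here;
              in training they are updated alongside the network)
    """
    total = 0.0
    for x, y in zip(features, labels):
        c = centers[y]
        total += sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return 0.5 * total


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0]]
    labels = [0, 1]
    centers = [[0.0, 0.0], [0.0, 1.0]]
    print(center_loss(feats, labels, centers))  # 0.5
```

Every class needs its own centre, so the number of centres to maintain grows with the number of identities, which is the difficulty point (2) refers to.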

2. SphereFace
The angle $\theta$ is multiplied by the decision margin $m$ (a multiplicative angular margin), the weights are normalised, and the bias is set to zero ($\|W_i\| = 1$, $b_i = 0$).

3. CosFace
CosFace [35, 33] directly adds cosine margin penalty to the target logit, which obtains better performance compared to SphereFace but admits much easier implementation and relieves the need for joint supervision from the softmax loss.
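The difference between the two margins can be seen on the target logit alone. The sketch below is an illustration under stated assumptions: the function names are hypothetical, `m = 4` and `m = 0.35` are example values rather than values taken from this article, and `sphereface_psi` uses the piecewise form $(-1)^k \cos(m\theta) - 2k$ for $\theta \in [k\pi/m, (k+1)\pi/m]$ that SphereFace relies on to keep the logit monotonic.

```python
import math


def sphereface_psi(theta: float, m: int = 4) -> float:
    """Multiplicative angular margin with the monotonic auxiliary function.

    Plain cos(m * theta) is not monotonic on [0, pi], hence the piecewise
    form (-1)^k * cos(m * theta) - 2k. m = 4 is an illustrative value.
    """
    k = min(int(theta * m / math.pi), m - 1)
    return (-1) ** k * math.cos(m * theta) - 2 * k


def cosface_logit(theta: float, m: float = 0.35) -> float:
    """Additive cosine margin: cos(theta) - m. m = 0.35 is illustrative."""
    return math.cos(theta) - m


if __name__ == "__main__":
    for deg in (0, 30, 60, 90):
        t = math.radians(deg)
        print(deg, round(math.cos(t), 4),
              round(sphereface_psi(t), 4), round(cosface_logit(t), 4))
```

The CosFace version is a single subtraction, which is why it is so much easier to implement than the piecewise SphereFace function.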

4. ArcFace
Performance: ArcFace aims to further improve the discriminative power of the face recognition model and to stabilise the training process.
As the paper's framework figure shows, the dot product between the DCNN feature and the last fully connected layer equals the cosine distance after feature and weight normalisation.
The paper describes the handling of $x_i$ and $W$ as follows:
For simplicity, we fix the bias $b_j = 0$ as in [15]. Then, we transform the logit [24] as $W_j^T x_i = \|W_j\| \|x_i\| \cos\theta_j$, where $\theta_j$ is the angle between the weight $W_j$ and the feature $x_i$. Following [15, 35, 34], we fix the individual weight $\|W_j\| = 1$ by $L_2$ normalisation. Following [26, 35, 34, 33], we also fix the embedding feature $\|x_i\|$ by $L_2$ normalisation and re-scale it to $s$. The normalisation step on features and weights makes the predictions only depend on the angle between the feature and the weight. The learned embedding features are thus distributed on a hypersphere with a radius of $s$.
The processing of $x_i$ and $W$ can be written as:

1. $x_i \rightarrow \frac{x_i}{\|x_i\|} \rightarrow s\frac{x_i}{\|x_i\|}$
2. $\left\| s\frac{x_i}{\|x_i\|} \right\| = \frac{s}{\|x_i\|}\|x_i\| = s$
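The two steps above can be checked numerically. This is a minimal sketch in plain Python; the helper names are hypothetical and `s = 64.0` is just an example scale.

```python
import math


def l2_norm(x):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(v * v for v in x))


def normalize_and_rescale(x, s):
    """x -> x / ||x|| -> s * x / ||x||, as in step 1 above."""
    n = l2_norm(x)
    return [s * v / n for v in x]


if __name__ == "__main__":
    x = [3.0, 4.0]                       # ||x|| = 5
    y = normalize_and_rescale(x, 64.0)   # example scale s = 64
    print(l2_norm(y))                    # norm equals s up to rounding (step 2)
```

Whatever the norm of the input, the output always lies on the hypersphere of radius `s`, so only its direction carries information.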

$$L_2 = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s\cos(\theta_{y_i}+m)}}{e^{s\cos(\theta_{y_i}+m)}+\sum_{j=1,\,j\neq y_i}^{n}e^{s\cos\theta_j}}$$
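As a worked example of the loss above, the sketch below evaluates it directly from the angles $\theta_j$ in plain Python. It is an illustration, not the reference implementation: the function names are hypothetical, the angles are taken as given instead of being computed from features and weights, and it assumes $\theta_{y_i} + m \le \pi$ so that no fallback branch is needed.

```python
import math


def arcface_sample_loss(thetas, target, s=64.0, m=0.5):
    """Loss of one sample.

    thetas[j] is the angle between the feature x_i and the class weight W_j;
    `target` is the ground-truth class index y_i.
    Assumption: thetas[target] + m <= pi.
    """
    logits = [
        s * math.cos(t + m) if j == target else s * math.cos(t)
        for j, t in enumerate(thetas)
    ]
    # log-sum-exp with the maximum subtracted for numerical stability
    mx = max(logits)
    log_sum_exp = mx + math.log(sum(math.exp(v - mx) for v in logits))
    return log_sum_exp - logits[target]


def arcface_loss(batch_thetas, targets, s=64.0, m=0.5):
    """Mean of the per-sample losses over a batch of N samples."""
    losses = [arcface_sample_loss(t, y, s, m) for t, y in zip(batch_thetas, targets)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    thetas = [0.3, 1.5, 1.6]
    print(arcface_sample_loss(thetas, 0, s=4.0, m=0.0))  # plain normalised softmax
    print(arcface_sample_loss(thetas, 0, s=4.0, m=0.5))  # larger: the margin makes the target harder
```

Setting `m = 0` recovers the normalised softmax loss, which makes it easy to see that the margin only changes the target-class logit.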

(3) ArcFace Loss Code Walkthrough

Why is ArcFace so easy to implement?

Here is how the paper puts it:
Easy: ArcFace only needs several lines of code as given in Algorithm 1 and is extremely easy to implement in the computational-graph-based deep learning frameworks, e.g. MxNet [5], Pytorch [23] and Tensorflow [2].

Official ArcMarginProduct source code

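import math

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn import Parameter


class ArcMarginProduct(nn.Module):
    r"""Implementation of the large-margin arc distance.

        Args:
            in_features: size of each input sample
            out_features: size of each output sample
            s: norm of input feature
            m: margin

            cos(theta + m)
    """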
    def __init__(self, in_features, out_features, s=30.0, m=0.50, easy_margin=False):
        super(ArcMarginProduct, self).__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.s = s
        self.m = m
        self.weight = Parameter(torch.FloatTensor(out_features, in_features))
        nn.init.xavier_uniform_(self.weight)

        self.easy_margin = easy_margin
        self.cos_m = math.cos(m)
        self.sin_m = math.sin(m)
        self.th = math.cos(math.pi - m)
        self.mm = math.sin(math.pi - m) * m

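    def forward(self, input, label):
        # cos(theta): dot product of L2-normalised features and class weights
        cosine = F.linear(F.normalize(input), F.normalize(self.weight))
        # sin(theta) >= 0 because theta lies in [0, pi]
        sine = torch.sqrt((1.0 - torch.pow(cosine, 2)).clamp(0, 1))
        # cos(theta + m) = cos(theta)cos(m) - sin(theta)sin(m)
        phi = cosine * self.cos_m - sine * self.sin_m

        if self.easy_margin:
            # apply the margin only where cos(theta) > 0
            phi = torch.where(cosine > 0, phi, cosine)
        else:
            # where theta + m would exceed pi, fall back to cos(theta) - m*sin(m)
            phi = torch.where(cosine > self.th, phi, cosine - self.mm)

        # one-hot mask of the ground-truth class, on the same device as the logits
        # (instead of a hard-coded 'cuda')
        one_hot = torch.zeros(cosine.size(), device=cosine.device)
        one_hot.scatter_(1, label.view(-1, 1).long(), 1)

        # margin logit for the target class, plain cosine for the others, scaled by s
        output = (one_hot * phi) + ((1.0 - one_hot) * cosine)
        output *= self.s

        return output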

I worked through a detailed derivation of the official ArcMarginProduct source code in a figure to make it easier to follow.
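The key steps behind `phi`, `th`, and `mm` in the code above can be sketched as follows. This is reconstructed from the code itself, assuming $\theta \in [0, \pi]$ (so $\sin\theta \ge 0$) and the default branch with `easy_margin=False`:

```latex
\begin{aligned}
\cos(\theta + m) &= \cos\theta\,\cos m - \sin\theta\,\sin m,
  \qquad \sin\theta = \sqrt{1 - \cos^2\theta} \\[4pt]
\theta + m < \pi
  &\iff \theta < \pi - m
  \iff \cos\theta > \cos(\pi - m) = \texttt{th} \\[4pt]
\texttt{mm} &= m\,\sin(\pi - m) = m\,\sin m \\[4pt]
\phi &=
  \begin{cases}
    \cos(\theta + m), & \cos\theta > \texttt{th} \\
    \cos\theta - \texttt{mm}, & \text{otherwise}
  \end{cases}
\end{aligned}
```

Once $\theta + m$ passes $\pi$, $\cos(\theta + m)$ starts to increase again, so the second branch replaces it with $\cos\theta - m\sin m$, which keeps the target logit decreasing in $\theta$. The two branches do not meet exactly at the threshold: there the first gives $-1$ and the second gives $-\cos m - m\sin m$, which is lower.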

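The version below is nearly the same as the source above, but it is easier to get running than the official code because no files are missing.
Code from bubbliiiing/arcface-pytorch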

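import math

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn import Module, Parameter


class Arcface_Head(Module):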
    def __init__(self, embedding_size=128, num_classes=10575, s=64., m=0.5):
        super(Arcface_Head, self).__init__()
        self.s = s
        self.m = m
        self.weight = Parameter(torch.FloatTensor(num_classes, embedding_size))
        nn.init.xavier_uniform_(self.weight)

        self.cos_m = math.cos(m)
        self.sin_m = math.sin(m)
        self.th = math.cos(math.pi - m)
        self.mm = math.sin(math.pi - m) * m

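    def forward(self, input, label):
        # `input` is not normalised here, so it is assumed to be L2-normalised
        # by the caller; otherwise `cosine` is not a true cosine
        cosine  = F.linear(input, F.normalize(self.weight))
        sine    = torch.sqrt((1.0 - torch.pow(cosine, 2)).clamp(0, 1))
        # cos(theta + m) = cos(theta)cos(m) - sin(theta)sin(m)
        phi     = cosine * self.cos_m - sine * self.sin_m

        # fall back to cos(theta) - m*sin(m) where theta + m would exceed pi
        phi     = torch.where(cosine.float() > self.th, phi.float(), cosine.float() - self.mm)

        # one-hot mask of the ground-truth class (same device as phi)
        one_hot = torch.zeros(cosine.size()).type_as(phi).long()
        one_hot.scatter_(1, label.view(-1, 1).long(), 1)
        # margin logit for the target class, plain cosine for the others, scaled by s
        output  = (one_hot * phi) + ((1.0 - one_hot) * cosine)
        output  *= self.s
        return output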
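To check the margin logic by hand without PyTorch, the per-element computation of `phi` in the two classes above can be mirrored on a single scalar. This is only a sketch: `margin_logit` is a hypothetical helper, not part of either repository, and it returns the target-class value before scaling by `s`.

```python
import math


def margin_logit(cosine: float, m: float = 0.5, easy_margin: bool = False) -> float:
    """Scalar mirror of the `phi` selection in the classes above (before * s)."""
    sine = math.sqrt(min(max(1.0 - cosine * cosine, 0.0), 1.0))
    phi = cosine * math.cos(m) - sine * math.sin(m)  # cos(theta + m)
    if easy_margin:
        return phi if cosine > 0 else cosine
    th = math.cos(math.pi - m)       # threshold on cos(theta)
    mm = math.sin(math.pi - m) * m   # penalty used in the fallback branch
    return phi if cosine > th else cosine - mm


if __name__ == "__main__":
    for deg in (0, 60, 120, 150, 180):
        c = math.cos(math.radians(deg))
        print(deg, round(c, 4), round(margin_logit(c), 4))
```

With `easy_margin=False` the returned value decreases as the angle grows from 0 to 180 degrees, including across the threshold at $\theta = \pi - m$, which is the property the fallback branch is there to preserve.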

Original: https://blog.csdn.net/weixin_54546190/article/details/124606131
Author: ☞源仔
Title: ArcFace Explained (Thorough and Clear)

