Learning to Cluster Faces (translation)

May 31, 2023, 11:17 AM • Artificial Intelligence • 113 reads

This repo provides an official implementation of [1, 2] and a re-implementation of [3].

conda install faiss-gpu -c pytorch
pip install -r requirements.txt

1、Data format

Directory layout

.
├── data
│   ├── features
│   │   └── xxx.bin
│   ├── labels
│   │   └── xxx.meta
│   ├── knns
│   │   └── …

features currently supports binary files only. (We plan to support np.save files in the near future.)
labels supports plain text, where each line is the label of the corresponding entry in the feature file.
knns is optional, as it can be built with the provided functions.
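To make concrete what a knns file stands for, here is a minimal brute-force sketch in NumPy. It is not the repo's own function: the name `build_knn`, the use of cosine similarity on L2-normalised features, and the (indices, similarities) return layout are all assumptions made for illustration.

```python
import numpy as np


def build_knn(features: np.ndarray, k: int):
    """Return (indices, similarities) of the k most similar rows for every row.

    Hypothetical helper: brute-force cosine similarity, O(n^2) memory.
    """
    # L2-normalise each row so that a dot product equals cosine similarity.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    feats = features / np.clip(norms, 1e-12, None)
    sims = feats @ feats.T  # (n, n) similarity matrix
    # Sort every row by descending similarity and keep the first k columns.
    idx = np.argsort(-sims, axis=1, kind="stable")[:, :k]
    top = np.take_along_axis(sims, idx, axis=1)
    return idx, top
```

Each sample is normally its own nearest neighbour, so the first column of the index array is the sample itself. The full similarity matrix only fits in memory for small inputs; at the scale of the datasets discussed here an approximate nearest-neighbour library is the practical choice.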

Taking MS-Celeb-1M (Part0 and Part1) as an example, the data directory looks like this:

data
├── features
│   ├── part0_train.bin                  # acbbc780948e7bfaaee093ef9fce2ccb
│   ├── part1_test.bin                   # ced42d80046d75ead82ae5c2cdfba621
├── labels
│   ├── part0_train.meta                 # class_num=8573, inst_num=576494
│   ├── part1_test.meta                  # class_num=8573, inst_num=584013
├── knns
│   ├── part0_train/faiss_k_80.npz       # 5e4f6c06daf8d29c9b940a851f28a925
│   ├── part1_test/faiss_k_80.npz        # d4a7f95b09f80b0167d893f2ca0f5be5
├── pretrained_models
│   ├── pretrained_gcn_d_ms1m.pth        # 213598e70ddbc50f5e3661a6191a8be1
│   ├── pretrained_gcn_s_ms1m.pth        # 3251d6e7d4f9178f504b02d8238726f7
│   ├── pretrained_gcn_d_iop_ms1m.pth    # 314fba47b5156dcc91383ad611d5bd96
│   ├── pretrained_gcn_v_ms1m.pth        # 020236d4e8dbff975360f08cb47109c0
│   ├── pretrained_gcn_e_ms1m.pth        # 315ff08f28f14bc494dd36158c11e900
│   ├── pretrained_lgcn_ms1m.pth         # 97fc6e52d1b5e907eabeb01e7b0825f9
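The 32-character hexadecimal strings next to the files look like MD5 digests, although the source does not say so explicitly. Assuming they are, a downloaded file can be checked with the standard library alone; `md5_of` and `matches` are helper names introduced here, not part of the repo.

```python
import hashlib


def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with an expected hex string (case-insensitive)."""
    return md5_of(path) == expected_hex.strip().lower()
```

For example, `matches("data/features/part0_train.bin", "acbbc780948e7bfaaee093ef9fce2ccb")` would be expected to return True for an intact download, under the MD5 assumption above.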

To experiment with a custom dataset, you need to provide extracted features and labels. For training, the number of features must equal the number of labels. For testing, the F-score is evaluated if labels are provided; otherwise only clustering results are generated.

Note that labels are required only for training a clustering model; they are not needed for clustering unlabeled data. There are two ways to cluster unlabeled data without a meta file. (1) Do not pass label_path in the config file; no loss or evaluation results will be generated. (2) Make a pseudo meta file, e.g., with all labels set to -1, and simply ignore the reported loss and evaluation results.
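As an illustration of the two file types and of option (2), the sketch below reads a feature file and a meta file and writes a pseudo meta file with every label set to -1. It assumes the binary file is a raw, row-major float32 array with a known feature dimension, and that the meta file holds one integer label per line; the helper names are hypothetical and are not the repo's API.

```python
import numpy as np


def read_features(path, dim, dtype=np.float32):
    """Load a raw binary feature file as an (n, dim) array (assumed layout)."""
    feats = np.fromfile(path, dtype=dtype)
    if feats.size % dim != 0:
        raise ValueError(f"{feats.size} values cannot be split into rows of {dim}")
    return feats.reshape(-1, dim)


def read_labels(path):
    """Load a plain-text meta file with one integer label per line."""
    with open(path) as f:
        return np.array([int(line) for line in f if line.strip()], dtype=np.int64)


def write_pseudo_meta(path, num_instances):
    """Write a meta file whose labels are all -1, one per feature."""
    with open(path, "w") as f:
        f.write("\n".join(["-1"] * num_instances) + "\n")


def check_same_length(features, labels):
    """Training requires exactly one label per feature."""
    if len(features) != len(labels):
        raise ValueError(f"{len(features)} features but {len(labels)} labels")
```

With a pseudo meta file in place, any loss or F-score computed against it is meaningless and should be disregarded, as the note above says.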

2、Supported datasets

You can download the datasets from the links above or with the following script:

python tools/download_data.py

We provide pretrained clustering models on different datasets.

| Models | MS-Celeb-1M | DeepFashion |
| --- | --- | --- |
| GCN-D + GCN-S | (passwd: v4zv) | (passwd: 3rhf) |
| GCN-V + GCN-E | (passwd: anw6) | (passwd: 6kzj) |
| LGCN | (passwd: dr9p) | (passwd: 6jf5) |

0、Fetch code & create soft link

git clone git@github.com:yl-1993/learn-to-cluster.git
cd learn-to-cluster
ln -s xxx/data data

1、Run algorithms

Follow the instructions in dsgcn, vegcn and lgcn to run the algorithms.

1、Results on part1_test (584K)

| Method | Precision | Recall | F-score |
| --- | --- | --- | --- |
| Chinese Whispers (k=80, th=0.6, iters=20) | 55.49 | 52.46 | 53.93 |
| Approx Rank Order (k=80, th=0) | 99.77 | 7.2 | 13.42 |
| MiniBatchKmeans (ncluster=5000, bs=100) | 45.48 | 80.98 | 58.25 |
| KNN DBSCAN (k=80, th=0.7, eps=0.25, min=1) | 95.25 | 52.79 | 67.93 |
| FastHAC (dist=0.72, single) | 92.07 | 57.28 | 70.63 |
| (ncluster=8573, affinity='rbf') | 78.75 | 66.59 | 72.16 |
| CDP (single model, th=0.7) | 80.19 | 70.47 | 75.02 |
| L-GCN (k_at_hop=[200, 10], active_conn=10, step=0.6, maxsz=300) | 74.38 | 83.51 | 78.68 |
| GCN-D (2 prpsls) | 95.41 | 67.77 | 79.25 |
| GCN-D (5 prpsls) | 94.62 | 72.59 | 82.15 |
| GCN-D (8 prpsls) | 94.23 | 79.69 | 86.35 |
| GCN-D (20 prpsls) | 94.54 | 81.62 | 87.61 |
| GCN-D + GCN-S (2 prpsls) | 99.07 | 67.22 | 80.1 |
| GCN-D + GCN-S (5 prpsls) | 98.84 | 72.01 | 83.31 |
| GCN-D + GCN-S (8 prpsls) | 97.93 | 78.98 | 87.44 |
| GCN-D + GCN-S (20 prpsls) | 97.91 | 80.86 | 88.57 |
| GCN-V | 92.45 | 82.42 | 87.14 |
| GCN-V + GCN-E | 92.56 | 83.74 | 87.93 |

Note that prpsls in the table above indicates the number of parameters used for generating proposals, not the actual number of proposals. For example, 2 prpsls generates 34578 proposals and 20 prpsls generates 283552 proposals.

2、Benchmarks (5.21M)

1, 3, 5, 7, 9 denote different scales of clustering. Details can be found in Face Clustering Benchmarks.

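| Pairwise F-score | 1 | 3 | 5 | 7 | 9 |
| --- | --- | --- | --- | --- | --- |
| CDP (single model, th=0.7) | 75.02 | 70.75 | 69.51 | 68.62 | 68.06 |
| LGCN | 78.68 | 75.83 | 74.29 | 73.7 | 72.99 |
| GCN-D (2 prpsls) | 79.25 | 75.72 | 73.90 | 72.62 | 71.63 |
| GCN-D (5 prpsls) | 82.15 | 77.71 | 75.5 | 73.99 | 72.89 |
| GCN-D (8 prpsls) | 86.35 | 82.41 | 80.32 | 78.98 | 77.87 |
| GCN-D (20 prpsls) | 87.61 | 83.76 | 81.62 | 80.33 | 79.21 |
| GCN-V | 87.14 | 83.49 | 81.51 | 79.97 | 78.77 |
| GCN-V + GCN-E | 87.93 | 84.04 | 82.1 | 80.45 | 79.3 |

| BCubed F-score | 1 | 3 | 5 | 7 | 9 |
| --- | --- | --- | --- | --- | --- |
| CDP (single model, th=0.7) | 78.7 | 75.82 | 74.58 | 73.62 | 72.92 |
| LGCN | 84.37 | 81.61 | 80.11 | 79.33 | 78.6 |
| GCN-D (2 prpsls) | 78.89 | 76.05 | 74.65 | 73.57 | 72.77 |
| GCN-D (5 prpsls) | 82.56 | 78.33 | 76.39 | 75.02 | 74.04 |
| GCN-D (8 prpsls) | 86.73 | 83.01 | 81.1 | 79.84 | 78.86 |
| GCN-D (20 prpsls) | 87.76 | 83.99 | 82 | 80.72 | 79.71 |
| GCN-V | 85.81 | 82.63 | 81.05 | 79.92 | 79.08 |
| GCN-V + GCN-E | 86.09 | 82.84 | 81.24 | 80.09 | 79.25 |

| NMI | 1 | 3 | 5 | 7 | 9 |
| --- | --- | --- | --- | --- | --- |
| CDP (single model, th=0.7) | 94.69 | 94.62 | 94.63 | 94.62 | 94.61 |
| LGCN | 96.12 | 95.78 | 95.63 | 95.57 | 95.49 |
| GCN-D (2 prpsls) | 94.68 | 94.66 | 94.63 | 94.59 | 94.55 |
| GCN-D (5 prpsls) | 95.64 | 95.19 | 95.03 | 94.91 | 94.83 |
| GCN-D (8 prpsls) | 96.75 | 96.29 | 96.08 | 95.95 | 95.85 |
| GCN-D (20 prpsls) | 97.04 | 96.55 | 96.33 | 96.18 | 96.07 |
| GCN-V | 96.37 | 96.01 | 95.83 | 95.69 | 95.6 |
| GCN-V + GCN-E | 96.41 | 96.03 | 95.85 | 95.71 | 95.62 |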
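For readers unfamiliar with the first metric: pairwise precision and recall are computed over pairs of samples, where a pair counts as a true positive when both the prediction and the ground truth place its two samples in the same cluster. The sketch below implements that standard definition by counting pairs; it is not the repo's evaluation code, and `pairwise_fscore` is a name introduced here.

```python
from collections import Counter


def _same_cluster_pairs(labels):
    """Number of unordered sample pairs that share the same label."""
    return sum(n * (n - 1) // 2 for n in Counter(labels).values())


def pairwise_fscore(gt_labels, pred_labels):
    """Return (precision, recall, f_score) over all sample pairs, each in [0, 1]."""
    gt_labels, pred_labels = list(gt_labels), list(pred_labels)
    if len(gt_labels) != len(pred_labels):
        raise ValueError("ground truth and prediction must have the same length")
    # Pairs that are together in BOTH partitions share a (gt, pred) label pair.
    true_pos = _same_cluster_pairs(zip(gt_labels, pred_labels))
    pred_pairs = _same_cluster_pairs(pred_labels)  # pairs the prediction links
    gt_pairs = _same_cluster_pairs(gt_labels)      # pairs that truly belong together
    precision = true_pos / pred_pairs if pred_pairs else 0.0
    recall = true_pos / gt_pairs if gt_pairs else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score
```

The tables report these values as percentages, so the returned fractions would be multiplied by 100 for comparison.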

3、Results on YouTube-Faces

| Method | Pairwise F-score | BCubed F-score | NMI |
| --- | --- | --- | --- |
| Chinese Whispers (k=160, th=0.75, iters=20) | 72.9 | 70.55 | 93.25 |
| Approx Rank Order (k=200, th=0) | 76.45 | 75.45 | 94.34 |
| Kmeans (ncluster=1436) | 67.86 | 75.77 | 93.99 |
| KNN DBSCAN (k=160, th=0., eps=0.3, min=1) | 91.35 | 89.34 | 97.52 |
| FastHAC (dist=0.72, single) | 93.07 | 87.98 | 97.19 |
| GCN-D (4 prpsls) | 94.44 | 91.33 | 97.97 |

4、Results on DeepFashion

| Method | Pairwise F-score | BCubed F-score | NMI |
| --- | --- | --- | --- |
| Chinese Whispers (k=5, th=0.7, iters=20) | 31.22 | 53.25 | 89.8 |
| Approx Rank Order (k=10, th=0) | 25.04 | 52.77 | 88.71 |
| Kmeans (ncluster=3991) | 32.02 | 53.3 | 88.91 |
| KNN DBSCAN (k=4, th=0., eps=0.1, min=2) | 25.07 | 53.23 | 90.75 |
| FastHAC (dist=0.4, single) | 22.54 | 48.77 | 90.44 |
| Meanshift (bandwidth=0.5) | 31.61 | 56.73 | 89.29 |
| Spectral (ncluster=3991, affinity='rbf') | 29.6 | 47.12 | 86.95 |
| DaskSpectral (ncluster=3991, affinity='rbf') | 24.25 | 44.11 | 86.21 |
| CDP (single model, k=2, th=0.5, maxsz=200) | 28.28 | 57.83 | 90.93 |
| L-GCN (k_at_hop=[5, 5], active_conn=5, step=0.5, maxsz=50) | 30.7 | 60.13 | 90.67 |
| GCN-D (2 prpsls) | 29.14 | 59.09 | 89.48 |
| GCN-D (8 prpsls) | 32.52 | 57.52 | 89.54 |
| GCN-D (20 prpsls) | 33.25 | 56.83 | 89.36 |
| GCN-V | 33.59 | 59.41 | 90.88 |
| GCN-V + GCN-E | 38.47 | 60.06 | 90.5 |

For training face recognition and feature extraction models, you may use any of the frameworks below, including but not limited to:

Please cite the following papers if you use this repository in your research.

@inproceedings{yang2019learning,
title={Learning to Cluster Faces on an Affinity Graph},
author={Yang, Lei and Zhan, Xiaohang and Chen, Dapeng and Yan, Junjie and Loy, Chen Change and Lin, Dahua},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2019}
}
@inproceedings{yang2020learning,
title={Learning to Cluster Faces via Confidence and Connectivity Estimation},
author={Yang, Lei and Chen, Dapeng and Zhan, Xiaohang and Zhao, Rui and Loy, Chen Change and Lin, Dahua},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
year={2020}
}

Original: https://blog.csdn.net/dwf1354046363/article/details/121205096
Author: 易小侠
Title: Learning to Cluster Faces (translation)
