Learning to Cluster Faces (translation)

This repo provides an official implementation of [1, 2] and a re-implementation of [3].

conda install faiss-gpu -c pytorch
pip install -r requirements.txt

1、Data format

Directory layout

.
├── data
│   ├── features
│   │   └── xxx.bin
│   ├── labels
│   │   └── xxx.meta
│   ├── knns
│   │   └── …

  • features: currently supports binary files. (We plan to support np.save files in the near future.)
  • labels: supports plain text, where each line is the label of the corresponding entry in the feature file.
  • knns: not required, as it can be built with the provided functions.
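As a rough illustration of the two file formats described above, here is a minimal sketch of reading them. It assumes the binary file is a flat dump of float32 values with a known feature dimension, and that labels are integers; neither detail is stated in the text, and read_features and read_labels are hypothetical helper names, not functions from the repo.

```python
import os
import tempfile

import numpy as np


def read_features(path, feat_dim, dtype=np.float32):
    """Read a raw binary feature file into an (N, feat_dim) array.

    feat_dim and dtype are assumptions: a raw binary dump does not store them.
    """
    feats = np.fromfile(path, dtype=dtype)
    if feats.size % feat_dim != 0:
        raise ValueError(
            f"{path}: {feats.size} values is not a multiple of feat_dim={feat_dim}"
        )
    return feats.reshape(-1, feat_dim)


def read_labels(path):
    """Read one integer label per line; blank lines are skipped."""
    with open(path) as f:
        return [int(line) for line in f if line.strip()]


# Round-trip demo on throwaway files named like the layout above.
with tempfile.TemporaryDirectory() as tmp:
    feat_path = os.path.join(tmp, "xxx.bin")
    meta_path = os.path.join(tmp, "xxx.meta")
    demo_feats = np.arange(12, dtype=np.float32).reshape(3, 4)
    demo_feats.tofile(feat_path)
    with open(meta_path, "w") as f:
        f.write("7\n7\n9\n")
    features = read_features(feat_path, feat_dim=4)
    labels = read_labels(meta_path)

print(features.shape, labels)
```

The i-th row of the feature array pairs with the i-th label, so the two lengths should match whenever labels are used.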

Take MS-Celeb-1M (Part0 and Part1) as an example. The data directory is as follows:

data
├── features
│   ├── part0_train.bin # acbbc780948e7bfaaee093ef9fce2ccb
│   └── part1_test.bin # ced42d80046d75ead82ae5c2cdfba621
├── labels
│   ├── part0_train.meta # class_num=8573, inst_num=576494
│   └── part1_test.meta # class_num=8573, inst_num=584013
├── knns
│   ├── part0_train/faiss_k_80.npz # 5e4f6c06daf8d29c9b940a851f28a925
│   └── part1_test/faiss_k_80.npz # d4a7f95b09f80b0167d893f2ca0f5be5
└── pretrained_models
    ├── pretrained_gcn_d_ms1m.pth # 213598e70ddbc50f5e3661a6191a8be1
    ├── pretrained_gcn_s_ms1m.pth # 3251d6e7d4f9178f504b02d8238726f7
    ├── pretrained_gcn_d_iop_ms1m.pth # 314fba47b5156dcc91383ad611d5bd96
    ├── pretrained_gcn_v_ms1m.pth # 020236d4e8dbff975360f08cb47109c0
    ├── pretrained_gcn_e_ms1m.pth # 315ff08f28f14bc494dd36158c11e900
    └── pretrained_lgcn_ms1m.pth # 97fc6e52d1b5e907eabeb01e7b0825f9
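The knns entries above are named faiss_k_80.npz, which suggests a k-nearest-neighbour graph with k=80 built with faiss. To show the idea on a small scale, the following sketch builds a cosine-similarity kNN with brute-force NumPy instead of faiss. build_knn is a hypothetical helper, not the repo's function, and the sketch does not reproduce the repo's .npz layout.

```python
import numpy as np


def build_knn(features, k):
    """Brute-force cosine-similarity kNN.

    Returns (nbrs, sims), each of shape (N, k): for every sample, the indices
    of its k most similar samples (itself included) and their similarities.
    Memory grows as N^2, so this is only practical for small N.
    """
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim_matrix = normed @ normed.T
    nbrs = np.argsort(-sim_matrix, axis=1)[:, :k]
    sims = np.take_along_axis(sim_matrix, nbrs, axis=1)
    return nbrs, sims


toy = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]], dtype=np.float32)
nbrs, sims = build_knn(toy, k=2)
print(nbrs.tolist())
```

At the scale of part1_test (584013 instances) a dense N x N similarity matrix is not feasible, which is presumably why an approximate or GPU-backed library such as faiss is used in practice.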

To experiment with a custom dataset, you need to provide extracted features and labels. For training, the number of features must equal the number of labels. For testing, the F-score is evaluated if labels are provided; otherwise only clustering results are generated.

Note that labels are required only for training a clustering model; they are not mandatory for clustering unlabeled data. There are two ways to cluster unlabeled data without a meta file: (1) Do not pass label_path in the config file, in which case no loss or evaluation results are produced. (2) Make a pseudo meta file, e.g., with all labels set to -1, and ignore the loss and evaluation results.
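Option (2) can be sketched as follows: count the features in the binary file and write one placeholder label per feature. The sketch assumes float32 features (4 bytes per value) with a known dimension; count_features and write_pseudo_meta are hypothetical helper names, and the file names are placeholders.

```python
import os
import tempfile


def count_features(bin_path, feat_dim, bytes_per_value=4):
    """Infer the number of feature vectors from the file size.

    Assumes a flat dump of fixed-size values (float32 by default).
    """
    size = os.path.getsize(bin_path)
    row_bytes = feat_dim * bytes_per_value
    if size % row_bytes != 0:
        raise ValueError(f"{bin_path}: size {size} is not a multiple of {row_bytes}")
    return size // row_bytes


def write_pseudo_meta(meta_path, num_features, placeholder=-1):
    """Write one placeholder label per line, one line per feature."""
    with open(meta_path, "w") as f:
        f.writelines(f"{placeholder}\n" for _ in range(num_features))


# Demo: 5 zero-valued features of dimension 8 stored as float32.
with tempfile.TemporaryDirectory() as tmp:
    bin_path = os.path.join(tmp, "unlabeled.bin")
    meta_path = os.path.join(tmp, "unlabeled.meta")
    with open(bin_path, "wb") as f:
        f.write(bytes(5 * 8 * 4))
    num_features = count_features(bin_path, feat_dim=8)
    write_pseudo_meta(meta_path, num_features)
    with open(meta_path) as f:
        pseudo_labels = f.read().split()

print(num_features, pseudo_labels)
```

Because every label is the same placeholder, any loss or F-score computed against this file carries no meaning and should be ignored, as noted above.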

2、Supported datasets

You can download the datasets using the links above or the following script:

python tools/download_data.py
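After downloading, it can be worth checking file integrity. The 32-character hex strings listed next to the files in the data directory look like MD5 checksums; that is an assumption, since the text does not name the hash. Under that assumption, a minimal check could look like this (md5sum is a hypothetical helper, not part of the repo):

```python
import hashlib
import os
import tempfile


def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Expected value copied from the directory listing; the path is relative to
# wherever the data folder was linked, which is an assumption.
expected = {"data/features/part0_train.bin": "acbbc780948e7bfaaee093ef9fce2ccb"}
for path, checksum in expected.items():
    if os.path.exists(path):
        status = "ok" if md5sum(path) == checksum else "MISMATCH"
        print(f"{path}: {status}")

# Self-check on a throwaway file so the sketch runs without the dataset.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.bin")
    with open(demo_path, "wb") as f:
        f.write(b"hello")
    demo_digest = md5sum(demo_path)
```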

We provide pretrained clustering models on different datasets.

| Models | MS-Celeb-1M | DeepFashion |
|---|---|---|
| GCN-D + GCN-S | (passwd: v4zv) | (passwd: 3rhf) |
| GCN-V + GCN-E | (passwd: anw6) | (passwd: 6kzj) |
| LGCN | (passwd: dr9p) | (passwd: 6jf5) |

0、Fetch code & Create soft link


git clone git@github.com:yl-1993/learn-to-cluster.git
cd learn-to-cluster
ln -s xxx/data data

1、Run algorithms

Follow the instructions in dsgcn, vegcn and lgcn to run algorithms.

1、Results on part1_test (584K)

| Method | Precision | Recall | F-score |
|---|---|---|---|
| Chinese Whispers (k=80, th=0.6, iters=20) | 55.49 | 52.46 | 53.93 |
| Approx Rank Order (k=80, th=0) | 99.77 | 7.2 | 13.42 |
| MiniBatchKmeans (ncluster=5000, bs=100) | 45.48 | 80.98 | 58.25 |
| KNN DBSCAN (k=80, th=0.7, eps=0.25, min=1) | 95.25 | 52.79 | 67.93 |
| FastHAC (dist=0.72, single) | 92.07 | 57.28 | 70.63 |
| (ncluster=8573, affinity='rbf') | 78.75 | 66.59 | 72.16 |
| CDP (single model, th=0.7) | 80.19 | 70.47 | 75.02 |
| L-GCN (k_at_hop=[200, 10], active_conn=10, step=0.6, maxsz=300) | 74.38 | 83.51 | 78.68 |
| GCN-D (2 prpsls) | 95.41 | 67.77 | 79.25 |
| GCN-D (5 prpsls) | 94.62 | 72.59 | 82.15 |
| GCN-D (8 prpsls) | 94.23 | 79.69 | 86.35 |
| GCN-D (20 prpsls) | 94.54 | 81.62 | 87.61 |
| GCN-D + GCN-S (2 prpsls) | 99.07 | 67.22 | 80.1 |
| GCN-D + GCN-S (5 prpsls) | 98.84 | 72.01 | 83.31 |
| GCN-D + GCN-S (8 prpsls) | 97.93 | 78.98 | 87.44 |
| GCN-D + GCN-S (20 prpsls) | 97.91 | 80.86 | 88.57 |
| GCN-V | 92.45 | 82.42 | 87.14 |
| GCN-V + GCN-E | 92.56 | 83.74 | 87.93 |

Note that prpsls in the table above indicates the number of parameters used for generating proposals, not the actual number of proposals. For example, 2 prpsls generates 34578 proposals and 20 prpsls generates 283552 proposals.
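The F-score column in the part1_test results is consistent with the usual harmonic mean of precision and recall, F = 2PR / (P + R). A small sketch, using the GCN-D (2 prpsls) row as a check (f_score is an illustrative helper name):

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall; 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# GCN-D (2 prpsls): precision 95.41, recall 67.77, reported F-score 79.25.
gcn_d_2 = f_score(95.41, 67.77)
print(round(gcn_d_2, 2))
```

This also makes the trade-off in the table easier to read: Approx Rank Order has very high precision (99.77) but its low recall (7.2) drags the F-score down to 13.42.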

2、Benchmarks (5.21M)

1, 3, 5, 7, 9 denote different scales of clustering. Details can be found in Face Clustering Benchmarks.

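| Pairwise F-score | 1 | 3 | 5 | 7 | 9 |
|---|---|---|---|---|---|
| CDP (single model, th=0.7) | 75.02 | 70.75 | 69.51 | 68.62 | 68.06 |
| LGCN | 78.68 | 75.83 | 74.29 | 73.7 | 72.99 |
| GCN-D (2 prpsls) | 79.25 | 75.72 | 73.90 | 72.62 | 71.63 |
| GCN-D (5 prpsls) | 82.15 | 77.71 | 75.5 | 73.99 | 72.89 |
| GCN-D (8 prpsls) | 86.35 | 82.41 | 80.32 | 78.98 | 77.87 |
| GCN-D (20 prpsls) | 87.61 | 83.76 | 81.62 | 80.33 | 79.21 |
| GCN-V | 87.14 | 83.49 | 81.51 | 79.97 | 78.77 |
| GCN-V + GCN-E | 87.93 | 84.04 | 82.1 | 80.45 | 79.3 |

| BCubed F-score | 1 | 3 | 5 | 7 | 9 |
|---|---|---|---|---|---|
| CDP (single model, th=0.7) | 78.7 | 75.82 | 74.58 | 73.62 | 72.92 |
| LGCN | 84.37 | 81.61 | 80.11 | 79.33 | 78.6 |
| GCN-D (2 prpsls) | 78.89 | 76.05 | 74.65 | 73.57 | 72.77 |
| GCN-D (5 prpsls) | 82.56 | 78.33 | 76.39 | 75.02 | 74.04 |
| GCN-D (8 prpsls) | 86.73 | 83.01 | 81.1 | 79.84 | 78.86 |
| GCN-D (20 prpsls) | 87.76 | 83.99 | 82 | 80.72 | 79.71 |
| GCN-V | 85.81 | 82.63 | 81.05 | 79.92 | 79.08 |
| GCN-V + GCN-E | 86.09 | 82.84 | 81.24 | 80.09 | 79.25 |

| NMI | 1 | 3 | 5 | 7 | 9 |
|---|---|---|---|---|---|
| CDP (single model, th=0.7) | 94.69 | 94.62 | 94.63 | 94.62 | 94.61 |
| LGCN | 96.12 | 95.78 | 95.63 | 95.57 | 95.49 |
| GCN-D (2 prpsls) | 94.68 | 94.66 | 94.63 | 94.59 | 94.55 |
| GCN-D (5 prpsls) | 95.64 | 95.19 | 95.03 | 94.91 | 94.83 |
| GCN-D (8 prpsls) | 96.75 | 96.29 | 96.08 | 95.95 | 95.85 |
| GCN-D (20 prpsls) | 97.04 | 96.55 | 96.33 | 96.18 | 96.07 |
| GCN-V | 96.37 | 96.01 | 95.83 | 95.69 | 95.6 |
| GCN-V + GCN-E | 96.41 | 96.03 | 95.85 | 95.71 | 95.62 |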
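The text does not define the Pairwise and BCubed F-scores reported in the benchmark. Assuming the standard definitions, they can be sketched as below. Pairwise counts pairs of samples placed in the same cluster; BCubed averages a per-sample precision and recall. The function names are illustrative and the code is not taken from the repo's evaluation scripts.

```python
from collections import Counter
from math import comb


def _f(precision, recall):
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def pairwise_f_score(gt, pred):
    """Pairwise F-score: a pair counts as positive when both samples share a cluster."""
    joint = Counter(zip(gt, pred))
    same_both = sum(comb(c, 2) for c in joint.values())
    same_pred = sum(comb(c, 2) for c in Counter(pred).values())
    same_gt = sum(comb(c, 2) for c in Counter(gt).values())
    precision = same_both / same_pred if same_pred else 0.0
    recall = same_both / same_gt if same_gt else 0.0
    return _f(precision, recall)


def bcubed_f_score(gt, pred):
    """BCubed F-score: per-sample precision and recall, averaged over samples."""
    joint = Counter(zip(gt, pred))
    gt_sizes, pred_sizes = Counter(gt), Counter(pred)
    n = len(gt)
    # Each (class, cluster) cell holds c samples, each with the same
    # precision c / cluster_size and recall c / class_size.
    precision = sum(c * c / pred_sizes[p] for (_, p), c in joint.items()) / n
    recall = sum(c * c / gt_sizes[g] for (g, _), c in joint.items()) / n
    return _f(precision, recall)


gt_labels = [0, 0, 1, 1]
pred_labels = [0, 0, 0, 1]
pairwise = pairwise_f_score(gt_labels, pred_labels)
bcubed = bcubed_f_score(gt_labels, pred_labels)
print(pairwise, bcubed)
```

Both functions return values in [0, 1]; the tables report them as percentages.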

3、Results on YouTube-Faces

| Method | Pairwise F-score | BCubed F-score | NMI |
|---|---|---|---|
| Chinese Whispers (k=160, th=0.75, iters=20) | 72.9 | 70.55 | 93.25 |
| Approx Rank Order (k=200, th=0) | 76.45 | 75.45 | 94.34 |
| Kmeans (ncluster=1436) | 67.86 | 75.77 | 93.99 |
| KNN DBSCAN (k=160, th=0., eps=0.3, min=1) | 91.35 | 89.34 | 97.52 |
| FastHAC (dist=0.72, single) | 93.07 | 87.98 | 97.19 |
| GCN-D (4 prpsls) | 94.44 | 91.33 | 97.97 |
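NMI (normalized mutual information) is reported alongside the two F-scores. The text does not say which normalization is used; the sketch below assumes the arithmetic mean of the two entropies, which is one common convention, and is not taken from the repo. The names entropy and nmi are illustrative.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(gt, pred):
    """Mutual information normalized by the mean of the two entropies.

    The arithmetic-mean normalization is an assumption; other variants exist.
    """
    n = len(gt)
    gt_sizes, pred_sizes = Counter(gt), Counter(pred)
    joint = Counter(zip(gt, pred))
    mutual_info = sum(
        (c / n) * math.log(c * n / (gt_sizes[g] * pred_sizes[p]))
        for (g, p), c in joint.items()
    )
    denom = (entropy(gt) + entropy(pred)) / 2
    # Both assignments are a single cluster: treat them as identical.
    return mutual_info / denom if denom > 0 else 1.0


nmi_same = nmi([0, 0, 1, 1], [3, 3, 8, 8])
nmi_indep = nmi([0, 0, 1, 1], [0, 1, 0, 1])
print(nmi_same, nmi_indep)
```

NMI is invariant to relabeling of clusters, which is why a prediction with different cluster ids but the same grouping scores the maximum.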

4、Results on DeepFashion

| Method | Pairwise F-score | BCubed F-score | NMI |
|---|---|---|---|
| Chinese Whispers (k=5, th=0.7, iters=20) | 31.22 | 53.25 | 89.8 |
| Approx Rank Order (k=10, th=0) | 25.04 | 52.77 | 88.71 |
| Kmeans (ncluster=3991) | 32.02 | 53.3 | 88.91 |
| KNN DBSCAN (k=4, th=0., eps=0.1, min=2) | 25.07 | 53.23 | 90.75 |
| FastHAC (dist=0.4, single) | 22.54 | 48.77 | 90.44 |
| Meanshift (bandwidth=0.5) | 31.61 | 56.73 | 89.29 |
| Spectral (ncluster=3991, affinity='rbf') | 29.6 | 47.12 | 86.95 |
| DaskSpectral (ncluster=3991, affinity='rbf') | 24.25 | 44.11 | 86.21 |
| CDP (single model, k=2, th=0.5, maxsz=200) | 28.28 | 57.83 | 90.93 |
| L-GCN (k_at_hop=[5, 5], active_conn=5, step=0.5, maxsz=50) | 30.7 | 60.13 | 90.67 |
| GCN-D (2 prpsls) | 29.14 | 59.09 | 89.48 |
| GCN-D (8 prpsls) | 32.52 | 57.52 | 89.54 |
| GCN-D (20 prpsls) | 33.25 | 56.83 | 89.36 |
| GCN-V | 33.59 | 59.41 | 90.88 |
| GCN-V + GCN-E | 38.47 | 60.06 | 90.5 |

For training face recognition and feature extraction, you may use any of the frameworks below, including but not limited to:

Please cite the following papers if you use this repository in your research.

@inproceedings{yang2019learning,
title={Learning to Cluster Faces on an Affinity Graph},
author={Yang, Lei and Zhan, Xiaohang and Chen, Dapeng and Yan, Junjie and Loy, Chen Change and Lin, Dahua},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2019}
}
@inproceedings{yang2020learning,
title={Learning to Cluster Faces via Confidence and Connectivity Estimation},
author={Yang, Lei and Chen, Dapeng and Zhan, Xiaohang and Zhao, Rui and Loy, Chen Change and Lin, Dahua},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
year={2020}
}

Original: https://blog.csdn.net/dwf1354046363/article/details/121205096
Author: 易小侠
Title: 人脸聚类Learning to Cluster Faces(翻译)

