MediaPipe Basics (3): Iris

May 25, 2023, 2:08 PM • Artificial Intelligence • 79 views

1. Abstract

Many real-world applications, including computational photography (glint reflections) and augmented reality (virtual avatars), rely on accurately tracking the iris within the eye. This is hard to do on mobile devices because of limited computing resources, variable lighting conditions, and occlusions such as hair or squinting. Iris tracking can also be used to determine the metric distance from the camera to the user, which benefits use cases ranging from virtual try-on of correctly sized glasses and hats to accessibility features that set the font size according to the viewer's distance. Metric distance is usually computed with sophisticated dedicated hardware, which limits the range of devices the solution can run on.

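MediaPipe Iris is an ML solution for accurate iris estimation. It tracks landmarks for the iris, pupil, and eye contours in real time using a single RGB camera, with no specialized hardware. Using the iris landmarks, it can also determine the metric distance between the subject and the camera with a relative error of less than 10%. Note that iris tracking does not infer where people are looking, nor does it provide any form of identity recognition. Thanks to the cross-platform MediaPipe framework, MediaPipe Iris runs on most modern mobile phones, desktops/laptops, and even on the web.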

2. ML Pipeline

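The first step of the pipeline uses MediaPipe Face Mesh, which generates a mesh approximating the face geometry. From this mesh, we isolate the eye region in the original image for use in the subsequent iris tracking step.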
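The eye-region step can be sketched as follows. This is only an illustration, not MediaPipe's actual cropping logic: the function `eye_square_roi`, its `margin` factor, and the sample landmark coordinates are all hypothetical. It takes the pixel coordinates of the landmarks around one eye and returns a square box, enlarged by a margin and clipped to the image.

```python
import numpy as np

def eye_square_roi(eye_points_xy, image_shape, margin=1.5):
    """Return (x0, y0, x1, y1) of a square box around the eye landmarks.

    eye_points_xy: (N, 2) pixel coordinates as (x, y).
    image_shape: (height, width) of the source image.
    margin: hypothetical enlargement factor applied to the tight box.
    """
    pts = np.asarray(eye_points_xy, dtype=np.float64)
    x_min, y_min = pts.min(axis=0)
    x_max, y_max = pts.max(axis=0)
    # Centre of the tight bounding box.
    cx, cy = (x_min + x_max) / 2.0, (y_min + y_max) / 2.0
    # Half side of the square: the longer box edge, enlarged by the margin.
    half = max(x_max - x_min, y_max - y_min) * margin / 2.0
    h, w = image_shape[:2]
    # Clip the square to the image bounds.
    x0 = int(max(0, round(cx - half)))
    y0 = int(max(0, round(cy - half)))
    x1 = int(min(w, round(cx + half)))
    y1 = int(min(h, round(cy + half)))
    return x0, y0, x1, y1

# Made-up landmark positions around one eye, in pixels.
eye_pts = [(300, 200), (340, 190), (380, 200), (340, 212)]
roi = eye_square_roi(eye_pts, image_shape=(480, 640))
print(roi)
```

The resulting box can be used to slice the image (`img[y0:y1, x0:x1]`) before resizing the patch for the iris model.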

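The pipeline is implemented as a MediaPipe graph that uses the face landmark subgraph from the face landmark module and the iris landmark subgraph from the iris landmark module, and renders the result with a dedicated iris-and-depth renderer subgraph. The face landmark subgraph internally uses the face detection subgraph from the face detection module.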

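The pipeline outputs a set of 478 3D landmarks: the 468 face landmarks from MediaPipe Face Mesh, with those around the eyes further refined, followed by 10 additional iris landmarks (5 per eye).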
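As a small sketch of that layout, the snippet below splits a (478, 3) array into the face part and the two iris parts. It assumes the iris landmarks are simply appended after the 468 face landmarks, as described above. The text does not say which eye comes first, so the two groups are given neutral names. `split_landmarks` is a hypothetical helper and the input is placeholder data.

```python
import numpy as np

NUM_FACE_LANDMARKS = 468   # landmarks from MediaPipe Face Mesh
NUM_IRIS_PER_EYE = 5       # iris landmarks per eye

def split_landmarks(landmarks):
    """Split a (478, 3) landmark array into face and per-eye iris parts."""
    landmarks = np.asarray(landmarks)
    expected = NUM_FACE_LANDMARKS + 2 * NUM_IRIS_PER_EYE
    if landmarks.shape != (expected, 3):
        raise ValueError(f"expected shape ({expected}, 3), got {landmarks.shape}")
    face = landmarks[:NUM_FACE_LANDMARKS]
    iris_a = landmarks[NUM_FACE_LANDMARKS:NUM_FACE_LANDMARKS + NUM_IRIS_PER_EYE]
    iris_b = landmarks[NUM_FACE_LANDMARKS + NUM_IRIS_PER_EYE:]
    return face, iris_a, iris_b

dummy = np.zeros((478, 3), dtype=np.float32)  # placeholder data, not real output
face, iris_a, iris_b = split_landmarks(dummy)
print(face.shape, iris_a.shape, iris_b.shape)
```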

3. Models

Face detection model: the face detector is the same BlazeFace model used in MediaPipe Face Detection.
Face landmark model: the same model as in MediaPipe Face Mesh.
Iris landmark model: takes an image patch of the eye region and estimates both the eye landmarks (along the eyelid) and the iris landmarks (along the iris contour).

Eye landmarks (red) and iris landmarks (green).

4. Depth from Iris

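MediaPipe Iris can determine the metric distance from a subject to the camera with less than 10% error, without any specialized hardware. It relies on the fact that the horizontal iris diameter of the human eye stays roughly constant at 11.7±0.5 mm across a wide population, combined with some simple geometric arguments. For more details, see the Google AI Blog post.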
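The geometric argument can be illustrated with similar triangles under a pinhole camera model: distance ≈ focal length (in pixels) × real iris diameter ÷ iris diameter in the image (in pixels). The sketch below is not MediaPipe's implementation. It assumes the focal length in pixels is known (for example from calibration or image metadata) and that the iris faces the camera roughly head-on. The function name and the sample numbers are hypothetical.

```python
IRIS_DIAMETER_MM = 11.7  # average horizontal iris diameter cited above

def distance_from_iris_mm(iris_diameter_px, focal_length_px,
                          iris_diameter_mm=IRIS_DIAMETER_MM):
    """Estimate camera-to-eye distance in mm via similar triangles."""
    if iris_diameter_px <= 0 or focal_length_px <= 0:
        raise ValueError("iris diameter and focal length must be positive")
    return focal_length_px * iris_diameter_mm / iris_diameter_px

# Made-up values: an iris 20 px wide seen by a camera with f = 1000 px.
d = distance_from_iris_mm(iris_diameter_px=20.0, focal_length_px=1000.0)
print(f"{d:.1f} mm")
```

Because the estimate is proportional to the assumed iris diameter, the ±0.5 mm spread in the population alone accounts for a relative uncertainty of about 4%.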

5. Solutions

There is no official Python code for iris detection. I found TensorFlow and PyTorch implementations on GitHub:

Link: https://pan.baidu.com/s/13-Kh_2pGUZGn7Fz-Bifw6Q
Extraction code: 123a

5.1 TensorFlow implementation

import numpy as np
import tensorflow as tf
import cv2
import matplotlib.pyplot as plt

def centerCropSquare(img, center, side=None, scaleWRTHeight=None):
    """Crop a square around center=(row, col).

    Pass exactly one of side (pixels) or scaleWRTHeight (fraction of image height).
    """
    assert (side is None) != (scaleWRTHeight is None)
    if side is None:
        half = int(img.shape[0] * scaleWRTHeight / 2)
    else:
        half = int(side / 2)

    return img[(center[0] - half):(center[0] + half), (center[1] - half):(center[1] + half), :]

# Load the TFLite iris landmark model.
interpreter = tf.lite.Interpreter(model_path="iris_landmark.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

img = cv2.imread("test.jpg")
if img is None:
    raise FileNotFoundError("test.jpg could not be read")

# Crop centres as (row, col) for the two eyes in test.jpg; only the right one is used here.
centerRight = [485, 332]
centerLeft = [479, 638]
img = centerCropSquare(img, centerRight, side=400)

# OpenCV loads images as BGR: convert to RGB, then resize to the 64x64 model input.
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (64, 64))
# Scale pixels to [-1, 1] and add a batch dimension.
input_data = np.expand_dims(img.astype(np.float32) / 127.5 - 1.0, axis=0)

interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()

# Each output is a flat vector with 3 values per landmark; x and y are the first two.
eyes = interpreter.get_tensor(output_details[0]['index'])
print(eyes.shape)
iris = interpreter.get_tensor(output_details[1]['index'])
print(iris.shape)

# Plot eye landmarks (blue) and iris landmarks (red) over the 64x64 crop.
plt.imshow(img, zorder=1)
x, y = eyes[0, ::3], eyes[0, 1::3]
plt.scatter(x, y, zorder=2, s=1.0, c="b")

x, y = iris[0, ::3], iris[0, 1::3]
plt.scatter(x, y, zorder=2, s=1.0, c="r")
plt.show()
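The landmarks above are expressed in the 64×64 crop, so they need to be scaled and shifted to be drawn on the full image. The helper below is a hypothetical sketch of that mapping. It assumes the same (row, col) centre and `side` that were passed to `centerCropSquare`, a flat x, y, z layout per landmark, and a crop that was not cut off by the image border.

```python
import numpy as np

def to_original_coords(flat_landmarks, center, side, model_size=64):
    """Map model-space landmarks back to full-image pixel coordinates.

    flat_landmarks: 1-D array laid out as x0, y0, z0, x1, y1, z1, ...
    center: (row, col) of the crop centre in the full image.
    side: side length of the square crop in pixels.
    """
    pts = np.asarray(flat_landmarks, dtype=np.float32).reshape(-1, 3)
    half = int(side / 2)
    scale = (2 * half) / model_size          # crop pixels per model pixel
    cols = pts[:, 0] * scale + (center[1] - half)
    rows = pts[:, 1] * scale + (center[0] - half)
    return np.stack([cols, rows], axis=1)    # (N, 2) as (x, y)

# Made-up landmarks: the centre of the 64x64 patch and its top-left corner.
flat = [32.0, 32.0, 0.0, 0.0, 0.0, 0.0]
mapped = to_original_coords(flat, center=[485, 332], side=400)
print(mapped)
```

With the TensorFlow output, the call would be `to_original_coords(iris[0], centerRight, 400)`.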

5.2 PyTorch implementation

import torch
from irislandmarks import IrisLandmarks
import matplotlib.pyplot as plt
import cv2

def centerCropSquare(img, center, side=None, scaleWRTHeight=None):
    """Crop a square around center=(row, col).

    Pass exactly one of side (pixels) or scaleWRTHeight (fraction of image height).
    """
    assert (side is None) != (scaleWRTHeight is None)
    if side is None:
        half = int(img.shape[0] * scaleWRTHeight / 2)
    else:
        half = int(side / 2)

    return img[(center[0] - half):(center[0] + half), (center[1] - half):(center[1] + half), :]

print("PyTorch version:", torch.__version__)
print("CUDA version:", torch.version.cuda)
print("cuDNN version:", torch.backends.cudnn.version())

# Use the GPU when available, otherwise fall back to the CPU.
gpu = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

net = IrisLandmarks().to(gpu)
net.load_weights("irislandmarks.pth")

img = cv2.imread("test.jpg")
if img is None:
    raise FileNotFoundError("test.jpg could not be read")

# Crop centres as (row, col) for the two eyes in test.jpg; only the right one is used here.
centerRight = [485, 332]
centerLeft = [479, 638]
img = centerCropSquare(img, centerRight, side=400)
# OpenCV loads images as BGR; convert to RGB (yields a contiguous array).
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img)
plt.show()

# Resize to the 64x64 model input.
img = cv2.resize(img, (64, 64))

eye_gpu, iris_gpu = net.predict_on_image(img)
eye = eye_gpu.cpu().numpy()
iris = iris_gpu.cpu().numpy()

print(eye.shape)
print(iris.shape)

# Plot eye landmarks (blue) and iris landmarks (red) over the 64x64 crop.
plt.imshow(img, zorder=1)
x, y = eye[0, :, 0], eye[0, :, 1]
plt.scatter(x, y, zorder=2, s=1.0, c='b')
x, y = iris[0, :, 0], iris[0, :, 1]
plt.scatter(x, y, zorder=2, s=1.0, c='r')
plt.show()
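To connect these landmarks with the depth estimate in section 4, one needs the iris diameter in pixels. The sketch below approximates it as the largest distance between any two iris landmarks, which avoids relying on the order of the points. This is an assumption made for illustration, not the method MediaPipe uses, and `iris_diameter_px` is a hypothetical helper.

```python
import numpy as np

def iris_diameter_px(iris_points):
    """Approximate the iris diameter as the largest pairwise landmark distance.

    iris_points: (N, 2) or (N, 3) array; only x and y are used.
    """
    pts = np.asarray(iris_points, dtype=np.float64)[:, :2]
    # Pairwise differences between all landmarks, shape (N, N, 2).
    diffs = pts[:, None, :] - pts[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    return float(dists.max())

# Made-up landmarks: a centre plus four points on a circle of radius 5.
sample = [(32, 32), (37, 32), (32, 27), (27, 32), (32, 37)]
diameter = iris_diameter_px(sample)
print(diameter)
```

With the PyTorch output, `iris_diameter_px(iris[0])` gives the diameter in the 64×64 model space. Multiplying by the crop side divided by 64 (here 400 / 64) converts it to pixels of the original image.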

References

https://github.com/cedriclmenard/irislandmarks.pytorch
https://google.github.io/mediapipe/solutions/iris

Original: https://blog.csdn.net/weixin_43229348/article/details/120529083
Author: 求则得之，舍则失之
Title: MediaPipe基础（3）虹膜(Iris)
