opencv和mediapipe实现手势识别

本篇文章只是手势识别的一个demo,想要识别的精度更高,还需要添加其他的约束条件,这里只是根据每个手指关键点和手掌根部的距离来判断手指是伸展开还是弯曲的。关于mediapipe的简介,可以去看官网:Home – mediapipe,官网有现成的demo程序,直接拷贝应用就可以实现手掌21个关键点的识别,这21个关键点的分布如下:

opencv和mediapipe实现手势识别

而且,检测的实时性也非常的不错:

opencv和mediapipe实现手势识别

当然,mediapipe不止可以检测手势,面部检测,姿态检测都可以:

opencv和mediapipe实现手势识别

下面说下这个手势识别的demo的大体思路:

首先,要import必要的库和导入必要的函数方法:

import cv2 as cv
import numpy as np
import mediapipe as mp
from numpy import linalg

# Hand-detection solution: locates the 21 hand landmarks per detected hand.
mpHands = mp.solutions.hands
hands = mpHands.Hands()

# Drawing utilities for rendering the landmarks and their connections.
mpDraw = mp.solutions.drawing_utils
handLmsStyle = mpDraw.DrawingSpec(color=(0, 0, 255), thickness=int(5))   # landmark style (red in BGR)
handConStyle = mpDraw.DrawingSpec(color=(0, 255, 0), thickness=int(10))  # connection style (green in BGR)

其中,handLmsStyle和handConStyle分别是关键点和连接线的特征,包括颜色和关键点(连接线)的宽度。

如果画面中有手,就可以通过如下函数将关键点和连接线表示出来

if result.multi_hand_landmarks:
    # Both hands can be detected and drawn at the same time.
    for i, handLms in enumerate(result.multi_hand_landmarks):
        mpDraw.draw_landmarks(frame, handLms, mpHands.HAND_CONNECTIONS,
                              landmark_drawing_spec=handLmsStyle,
                              connection_drawing_spec=handConStyle)

opencv和mediapipe实现手势识别

有了这21个关键点,可以做的事情就太多了,比如控制电脑的音量,鼠标、键盘,如果有个完善的手势姿态库,还可以做比如手语识别等等。因为实际生活中,手的摆放不一定是正好手心面向摄像头的,所以约束条件越苛刻,精度就会越高,这里的做法就没有考虑这么多,就只是用手指不同姿态时的向量L2范数(就是向量的模,忘记了就看线性代数或者机器学习)不同,来粗略的检测,比如说食指,伸直的时候和弯曲的时候,指尖(点8)到手掌根部(点0)的向量模dist1肯定是大于点6到点0的向量模dist2的,如果食指弯曲的时候,则有dist1 &lt; dist2,食指、中指、无名指和小拇指的判断都是如此,仅大拇指是用点17代替点0,代码如下:

# Loop over the five fingers (k=0 thumb .. k=4 pinky).
for k in range (5):
    if k == 0:
        # Thumb: measure relative to landmark 17 instead of the wrist (landmark 0).
        figure_ = finger_stretch_detect(landmark[17],landmark[4*k+2],
                                        landmark[4*k+4])
    else:
        figure_ = finger_stretch_detect(landmark[0],landmark[4*k+2],
                                        landmark[4*k+4])

然后通过五个手指的状态,来判断当前的手势,我这里列举了一些,简单粗暴:

def detect_hands_gesture(result):
    """Map the five per-finger stretch flags (thumb..pinky, 1 = stretched) to a gesture name."""
    # Lookup table keyed by the (thumb, index, middle, ring, pinky) pattern.
    known_gestures = {
        (1, 0, 0, 0, 0): "good",
        (0, 1, 0, 0, 0): "one",
        (0, 0, 1, 0, 0): "please civilization in testing",
        (0, 1, 1, 0, 0): "two",
        (0, 1, 1, 1, 0): "three",
        (0, 1, 1, 1, 1): "four",
        (1, 1, 1, 1, 1): "five",
        (1, 0, 0, 0, 1): "six",
        (0, 0, 1, 1, 1): "OK",
        (0, 0, 0, 0, 0): "stone",
    }
    # Float flags (e.g. 1.0) hash/compare equal to the int keys, so lookup works
    # for the np.zeros(5) array the caller passes in.
    return known_gestures.get(tuple(result), "not in detect range...")

然后根据判断的结果输出即可,效果如下:

opencv和mediapipe实现手势识别

完整代码如下:

import cv2 as cv
import numpy as np
import mediapipe as mp
from numpy import linalg

# Camera device index passed to cv.VideoCapture (0 is usually the built-in webcam).
DEVICE_NUM = 0

# Finger-stretch detection.
# NOTE: the two lines above this function in the published article were bare
# (un-commented) Chinese text, which made the script a SyntaxError; they are
# restored here as proper comments.
def finger_stretch_detect(point1, point2, point3):
    """Return 1 if the finger is stretched, else 0.

    Per the call sites: point1 is the reference point (wrist landmark 0, or
    landmark 17 for the thumb), point2 is a mid-finger joint (landmark 4k+2)
    and point3 is the fingertip (landmark 4k+4). A finger counts as stretched
    when the fingertip is farther from the reference than the mid joint.
    """
    result = 0
    # L2 norm (Euclidean length) of each joint-to-reference vector.
    dist1 = np.linalg.norm((point2 - point1), ord=2)
    dist2 = np.linalg.norm((point3 - point1), ord=2)
    if dist2 > dist1:
        result = 1

    return result

# Gesture detection. (The line above this function in the published article was
# bare un-commented Chinese text — a SyntaxError — restored here as a comment.)
def detect_hands_gesture(result):
    """Return a gesture name for result = [thumb, index, middle, ring, pinky]
    stretch flags (1 = stretched, 0 = bent), or a fallback string when the
    pattern is not recognized."""
    if (result[0] == 1) and (result[1] == 0) and (result[2] == 0) and (result[3] == 0) and (result[4] == 0):
        gesture = "good"
    elif (result[0] == 0) and (result[1] == 1)and (result[2] == 0) and (result[3] == 0) and (result[4] == 0):
        gesture = "one"
    elif (result[0] == 0) and (result[1] == 0)and (result[2] == 1) and (result[3] == 0) and (result[4] == 0):
        gesture = "please civilization in testing"
    elif (result[0] == 0) and (result[1] == 1)and (result[2] == 1) and (result[3] == 0) and (result[4] == 0):
        gesture = "two"
    elif (result[0] == 0) and (result[1] == 1)and (result[2] == 1) and (result[3] == 1) and (result[4] == 0):
        gesture = "three"
    elif (result[0] == 0) and (result[1] == 1)and (result[2] == 1) and (result[3] == 1) and (result[4] == 1):
        gesture = "four"
    elif (result[0] == 1) and (result[1] == 1)and (result[2] == 1) and (result[3] == 1) and (result[4] == 1):
        gesture = "five"
    elif (result[0] == 1) and (result[1] == 0)and (result[2] == 0) and (result[3] == 0) and (result[4] == 1):
        gesture = "six"
    elif (result[0] == 0) and (result[1] == 0)and (result[2] == 1) and (result[3] == 1) and (result[4] == 1):
        gesture = "OK"
    elif(result[0] == 0) and (result[1] == 0) and (result[2] == 0) and (result[3] == 0) and (result[4] == 0):
        gesture = "stone"
    else:
        gesture = "not in detect range..."

    return gesture

def detect():
    """Capture video frames, detect hand landmarks with MediaPipe and overlay
    the recognized gesture label on each frame until 'q' is pressed."""
    # When using a USB camera, remember to adjust the capture-device index (DEVICE_NUM).
    cap = cv.VideoCapture(DEVICE_NUM)
    # Load the hand-detection solution.
    mpHands = mp.solutions.hands
    hands = mpHands.Hands()
    # Load the drawing utilities and set the shape/color of the hand landmarks
    # and of the connection lines between them.
    mpDraw = mp.solutions.drawing_utils
    handLmsStyle = mpDraw.DrawingSpec(color=(0, 0, 255), thickness=int(5))
    handConStyle = mpDraw.DrawingSpec(color=(0, 255, 0), thickness=int(10))

    figure = np.zeros(5)          # per-finger stretch flags (thumb..pinky)
    landmark = np.empty((21, 2))  # pixel coordinates of the 21 hand landmarks

    if not cap.isOpened():
        print("Can not open camera.")
        exit()

    while True:
        ret, frame = cap.read()
        if not ret:
            print("Can not receive frame (stream end?). Exiting...")
            break

        # MediaPipe expects RGB input, so convert from OpenCV's BGR format here.
        frame_RGB = cv.cvtColor(frame, cv.COLOR_BGR2RGB)
        result = hands.process(frame_RGB)
        # Read the height and width of the video frame.
        frame_height = frame.shape[0]
        frame_width  = frame.shape[1]

        #print(result.multi_hand_landmarks)
        # If at least one hand was detected:
        if result.multi_hand_landmarks:
            # Draw the landmarks and connections for every detected hand.
            for i, handLms in enumerate(result.multi_hand_landmarks):
                mpDraw.draw_landmarks(frame,
                                      handLms,
                                      mpHands.HAND_CONNECTIONS,
                                      landmark_drawing_spec=handLmsStyle,
                                      connection_drawing_spec=handConStyle)

                # Convert the normalized landmark coordinates to pixel positions.
                for j, lm in enumerate(handLms.landmark):
                    xPos = int(lm.x * frame_width)
                    yPos = int(lm.y * frame_height)
                    landmark_ = [xPos, yPos]
                    landmark[j,:] = landmark_

                # Decide whether each finger is stretched by comparing the
                # fingertip and mid-joint distances to landmark 0 (the thumb
                # uses landmark 17 as its reference instead).
                for k in range (5):
                    if k == 0:
                        figure_ = finger_stretch_detect(landmark[17],landmark[4*k+2],landmark[4*k+4])
                    else:
                        figure_ = finger_stretch_detect(landmark[0],landmark[4*k+2],landmark[4*k+4])

                    figure[k] = figure_
                print(figure,'\n')

                gesture_result = detect_hands_gesture(figure)
                cv.putText(frame, f"{gesture_result}", (30, 60*(i+1)), cv.FONT_HERSHEY_COMPLEX, 2, (255 ,255, 0), 5)

        cv.imshow('frame', frame)
        if cv.waitKey(1) == ord('q'):
            break

    cap.release()
    cv.destroyAllWindows()

# Script entry point.
if __name__ == '__main__':
    detect()

我的公众号:

opencv和mediapipe实现手势识别
​​​​​​​

Original: https://blog.csdn.net/weixin_41747193/article/details/122117629
Author: 王三思
Title: opencv和mediapipe实现手势识别

原创文章受到原创版权保护。转载请注明出处:https://www.johngo689.com/638113/

转载文章受原作者版权保护。转载请注明原作者出处!

(0)

大家都在看

亲爱的 Coder【最近整理,可免费获取】👉 最新必读书单  | 👏 面试题下载  | 🌎 免费的AI知识星球