YOLOv5 Pedestrian Detection

May 26, 2023, 11:49 AM • Artificial Intelligence

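1. Data preparation
    1. Download the dataset
    2. Extract the JPG and XML files
2. Deploying and training YOLOv5
    1. Split the dataset
    2. Generate the YOLO txt files
    3. Configure the dataset file
    4. Cluster to find anchors
    5. Configure the model file
    6. Train the model
3. Testing and application
4. Problems
5. References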

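1. Data preparation

1. Download the dataset

(Project 1: WiderPerson) First, download a pedestrian dataset. I used the WiderPerson dataset here.

2. Extract the JPG and XML files

Organize the dataset into images and XML files by running the Python script below three times. On the first run, point it at train.txt. On the second run, point it at val.txt and comment out the call to make_voc_dir; you can also filter the dataset to suit your own scenario. On the third run, point it at test.txt and comment out the bodies of the with open(label_path) as file and with open(xml_path, 'wb') as f blocks, so that no XML is written for the test images.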

import os
import numpy as np
import scipy.io as sio
import shutil
from lxml.etree import Element, SubElement, tostring
from xml.dom.minidom import parseString
import cv2

def make_voc_dir():

    if not os.path.exists('../VOC2007/Annotations'):
        os.makedirs('../VOC2007/Annotations')
    if not os.path.exists('../VOC2007/ImageSets'):
        os.makedirs('../VOC2007/ImageSets')
        os.makedirs('../VOC2007/ImageSets/Main')
    if not os.path.exists('../VOC2007/JPEGImages'):
        os.makedirs('../VOC2007/JPEGImages')

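if __name__ == '__main__':
    # WiderPerson class ids mapped to the class names written into the XML
    classes = {'1': 'pedestrians',
               '2': 'riders',
               '3': 'partially',
               '4': 'ignore',
               '5': 'crowd'
               }
    VOCRoot = '../VOC2007'
    widerDir = 'C:/Users/邓卓/Desktop/WiderPerson'
    # Image-id list for this run: train.txt, then val.txt, then test.txt
    wider_path = 'C:/Users/邓卓/Desktop/WiderPerson/train.txt'
    # Create the VOC2007 folders (needed on the first run)
    make_voc_dir()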

    with open(wider_path, 'r') as f:
        imgIds = [x for x in f.read().splitlines()]

    for imgId in imgIds:
        objCount = 0
        filename = imgId + '.jpg'
        img_path = '../WiderPerson/images/' + filename
        print('Img :%s' % img_path)
        img = cv2.imread(img_path)
        width = img.shape[1]
        height = img.shape[0]

        node_root = Element('annotation')
        node_folder = SubElement(node_root, 'folder')
        node_folder.text = 'JPEGImages'
        node_filename = SubElement(node_root, 'filename')
        node_filename.text = 'VOC2007/JPEGImages/%s' % filename
        node_size = SubElement(node_root, 'size')
        node_width = SubElement(node_size, 'width')
        node_width.text = '%s' % width
        node_height = SubElement(node_size, 'height')
        node_height.text = '%s' % height
        node_depth = SubElement(node_size, 'depth')
        node_depth.text = '3'

        label_path = img_path.replace('images', 'Annotations') + '.txt'
        with open(label_path) as file:
            line = file.readline()
            count = int(line.split('\n')[0])
            line = file.readline()
            while line:
                cls_id = line.split(' ')[0]
                xmin = int(line.split(' ')[1]) + 1
                ymin = int(line.split(' ')[2]) + 1
                xmax = int(line.split(' ')[3]) + 1
                ymax = int(line.split(' ')[4].split('\n')[0]) + 1
                line = file.readline()

                cls_name = classes[cls_id]

                obj_width = xmax - xmin
                obj_height = ymax - ymin

                difficult = 0
                # Mark very small boxes as difficult so they are skipped later
                if obj_height <= 6 or obj_width <= 6:
                    difficult = 1

                node_object = SubElement(node_root, 'object')
                node_name = SubElement(node_object, 'name')
                node_name.text = cls_name
                node_difficult = SubElement(node_object, 'difficult')
                node_difficult.text = '%s' % difficult
                node_bndbox = SubElement(node_object, 'bndbox')
                node_xmin = SubElement(node_bndbox, 'xmin')
                node_xmin.text = '%s' % xmin
                node_ymin = SubElement(node_bndbox, 'ymin')
                node_ymin.text = '%s' % ymin
                node_xmax = SubElement(node_bndbox, 'xmax')
                node_xmax.text = '%s' % xmax
                node_ymax = SubElement(node_bndbox, 'ymax')
                node_ymax.text = '%s' % ymax
                node_name = SubElement(node_object, 'pose')
                node_name.text = 'Unspecified'
                node_name = SubElement(node_object, 'truncated')
                node_name.text = '0'

        image_path = VOCRoot + '/JPEGImages/' + filename
        xml = tostring(node_root, pretty_print=True)
        dom = parseString(xml)
        xml_name = filename.replace('.jpg', '.xml')
        xml_path = VOCRoot + '/Annotations/' + xml_name
        with open(xml_path, 'wb') as f:
            f.write(xml)

        shutil.copy(img_path, '../VOC2007/JPEGImages/' + filename)

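After the three runs, a VOC2007 folder exists next to your working directory (at ../VOC2007). It contains the train, val, and test images, plus the XML files for train and val.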
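The annotation layout the script expects can be read off its parsing loop: the first line holds the object count, and each following line holds a class id and four pixel coordinates. Below is a minimal sketch of that parsing step as a standalone function, assuming exactly that layout. The name parse_wider_annotation and the sample text are made up for illustration.

```python
def parse_wider_annotation(text):
    """Parse one WiderPerson-style annotation file given as a string.

    Assumed layout (taken from the conversion script above):
    line 1 is the object count, each later line is
    "class_id xmin ymin xmax ymax".
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    count = int(lines[0])
    boxes = []
    for ln in lines[1:]:
        cls_id, xmin, ymin, xmax, ymax = ln.split()
        boxes.append((cls_id, int(xmin), int(ymin), int(xmax), int(ymax)))
    # Fail loudly if the header count and the number of box lines disagree
    if len(boxes) != count:
        raise ValueError("expected %d boxes, found %d" % (count, len(boxes)))
    return boxes


# Hypothetical annotation text with two objects
sample = "2\n1 10 20 50 120\n3 200 40 230 100\n"
for box in parse_wider_annotation(sample):
    print(box)
```

Splitting on whitespace with split() rather than on a single space tolerates trailing newlines and repeated spaces, which the index-based parsing in the script does not.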

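2. Deploying and training YOLOv5

1. Split the dataset

(Project 2: yolov5) Download the official YOLOv5 code and create a people_data folder inside the project (the name is up to you). Copy the three folders inside VOC2007 into it. One pitfall: rename the JPEGImages folder to images, and update every later reference to it accordingly. To split the dataset, create split_train_val.py and change the XML and txt directories to your own.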

import random
import os
import argparse

def get_opt():
    parser = argparse.ArgumentParser()
    parser.add_argument('--xml_path', default='C:/Users/邓卓/Desktop/yolov5-master/people_data/Annotations/',
                        type=str, help='input xml file ')
    parser.add_argument('--txt_path', default="C:/Users/邓卓/Desktop/yolov5-master/people_data/ImageSets/Main/",
                        type=str, help='output txt file')
    opt = parser.parse_args()
    return opt

opt = get_opt()

xml_file = opt.xml_path

save_txt_file = opt.txt_path

if not os.path.exists(save_txt_file):
    os.makedirs(save_txt_file)

total_xml = os.listdir(xml_file)

num = len(total_xml)

list_index = range(num)

train_val_percent = 1

train_percent = 0.99

tv = int(num * train_val_percent)

tr = int(tv * train_percent)

train_val = random.sample(list_index, tv)

train = random.sample(train_val, tr)

file_train_vale = open(save_txt_file + 'train_val.txt', 'w')
file_train = open(save_txt_file + "train.txt", 'w')
file_test = open(save_txt_file + "test.txt", 'w')
file_val = open(save_txt_file + "val.txt", 'w')

for i in list_index:

    data_name = total_xml[i][:-4] + '\n'

    if i in train_val:
        file_train_vale.write(data_name)
        if i in train:
            file_train.write(data_name)
        else:
            file_val.write(data_name)
    else:
        file_test.write(data_name)

file_train_vale.close()
file_train.close()
file_test.close()
file_val.close()

Run the script from this directory:

python split_train_val.py
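The script above draws a fresh random sample on every run, so the train and val lists change each time. If a repeatable split is wanted, one option is to shuffle with a fixed seed. The sketch below is an alternative under that assumption, not the author's method; split_ids, its default fraction, and the sample ids are hypothetical.

```python
import random


def split_ids(ids, train_fraction=0.99, seed=0):
    """Shuffle ids with a fixed seed and split them into (train, val)."""
    shuffled = sorted(ids)  # sort first so the input order does not matter
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical image ids standing in for the XML file names
ids = ["%06d" % i for i in range(200)]
train_ids, val_ids = split_ids(ids)
print(len(train_ids), len(val_ids))
```

Using a private random.Random instance keeps the seed from affecting any other code that relies on the global random state.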

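2. Generate the YOLO txt files

Create voc_label.py, which converts each XML file into a YOLO label file and writes the txt files that list the image paths.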


import xml.etree.ElementTree as ET
import os
from os import getcwd

sets = ['train', 'val', 'test']
classes = ["pedestrians", "riders",'partially','ignore','crowd']
abs_path = os.getcwd()
print(abs_path)

def convert(size, box):
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    x = (box[0] + box[1]) / 2.0 - 1
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return x, y, w, h

def convert_annotation(image_id):
    in_file = open('C:/Users/邓卓/Desktop/yolov5-master/people_data/Annotations/%s.xml' % (image_id), encoding='UTF-8')
    out_file = open('C:/Users/邓卓/Desktop/yolov5-master/people_data/labels/%s.txt' % (image_id), 'w')
    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    for obj in root.iter('object'):

        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text),
             float(xmlbox.find('ymax').text))
        b1, b2, b3, b4 = b

        if b2 > w:
            b2 = w
        if b4 > h:
            b4 = h
        b = (b1, b2, b3, b4)
        bb = convert((w, h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')

wd = getcwd()
for image_set in sets:
    if not os.path.exists('C:/Users/邓卓/Desktop/yolov5-master/people_data/labels/'):
        os.makedirs('C:/Users/邓卓/Desktop/yolov5-master/people_data/labels/')
    image_ids = open('C:/Users/邓卓/Desktop/yolov5-master/people_data/ImageSets/Main/%s.txt' % (image_set)).read().strip().split()
    list_file = open('people_data/%s.txt' % (image_set), 'w')
    for image_id in image_ids:
        # The image folder was renamed from JPEGImages to images (see step 1)
        list_file.write('C:/Users/邓卓/Desktop/yolov5-master/people_data/images/%s.jpg\n' % (image_id))
        convert_annotation(image_id)
    list_file.close()
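To make the coordinate conversion concrete, here is a small worked sketch that mirrors the arithmetic of convert() above, including its 1-pixel offset on the box center, together with the inverse mapping. The function names voc_to_yolo and yolo_to_voc and the sample box are made up for illustration.

```python
def voc_to_yolo(img_w, img_h, xmin, ymin, xmax, ymax):
    """Return (x_center, y_center, width, height), each divided by the image size.

    Mirrors convert() in voc_label.py, including its 1-pixel offset on the center.
    """
    x_center = ((xmin + xmax) / 2.0 - 1) / img_w
    y_center = ((ymin + ymax) / 2.0 - 1) / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


def yolo_to_voc(img_w, img_h, x_center, y_center, width, height):
    """Inverse of voc_to_yolo: back to pixel corner coordinates."""
    cx = x_center * img_w + 1
    cy = y_center * img_h + 1
    half_w = width * img_w / 2.0
    half_h = height * img_h / 2.0
    return cx - half_w, cy - half_h, cx + half_w, cy + half_h


# A 200 x 200 pixel box inside a 640 x 480 image
yolo_box = voc_to_yolo(640, 480, 100, 50, 300, 250)
print(yolo_box)
print(yolo_to_voc(640, 480, *yolo_box))
```

The second function is handy when drawing predicted boxes back onto the original image.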

3. Configure the dataset file
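A YOLOv5 dataset config is a small YAML file that names the training and validation image lists, the number of classes (nc), and the class names. The sketch below builds such a file as text, assuming the list files written by voc_label.py and the five class names used in this post. The helper build_dataset_yaml and the file name people.yaml are hypothetical.

```python
def build_dataset_yaml(train_list, val_list, class_names):
    """Return the text of a YOLOv5-style dataset config."""
    lines = [
        "train: %s" % train_list,  # txt file listing training image paths
        "val: %s" % val_list,      # txt file listing validation image paths
        "nc: %d" % len(class_names),
        "names: [%s]" % ", ".join("'%s'" % n for n in class_names),
    ]
    return "\n".join(lines) + "\n"


classes = ["pedestrians", "riders", "partially", "ignore", "crowd"]
text = build_dataset_yaml("people_data/train.txt", "people_data/val.txt", classes)
print(text)
# To save it (hypothetical file name): open("data/people.yaml", "w").write(text)
```

The order of names has to match the order of the classes list in voc_label.py, because the label files store class indices rather than names.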

4. Cluster to find anchors

Use k-means to find the 9 best anchors.


import os
import numpy as np
import xml.etree.cElementTree as et

def iou(box, clusters):
    """
    Calculates the Intersection over Union (IoU) between a box and k clusters.

    :param box: tuple or array, shifted to the origin (i.e. width and height)
    :param clusters: numpy array of shape (k, 2) where k is the number of clusters
    :return: numpy array of shape (k,) where k is the number of clusters
    """
    x = np.minimum(clusters[:, 0], box[0])
    y = np.minimum(clusters[:, 1], box[1])
    if np.count_nonzero(x == 0) > 0 or np.count_nonzero(y == 0) > 0:

        pass
    intersection = x * y
    box_area = box[0] * box[1]
    cluster_area = clusters[:, 0] * clusters[:, 1]

    iou_ = intersection / (box_area + cluster_area - intersection)

    return iou_

def avg_iou(boxes, clusters):
    """
    Calculates the average Intersection over Union (IoU) between a numpy array of boxes and k clusters.

    :param boxes: numpy array of shape (r, 2), where r is the number of rows
    :param clusters: numpy array of shape (k, 2) where k is the number of clusters
    :return: average IoU as a single float
    """
    return np.mean([np.max(iou(boxes[i], clusters)) for i in range(boxes.shape[0])])

def translate_boxes(boxes):
    """
    Translates all the boxes to the origin.

    :param boxes: numpy array of shape (r, 4)
    :return: numpy array of shape (r, 2)
    """
    new_boxes = boxes.copy()
    for row in range(new_boxes.shape[0]):
        new_boxes[row][2] = np.abs(new_boxes[row][2] - new_boxes[row][0])
        new_boxes[row][3] = np.abs(new_boxes[row][3] - new_boxes[row][1])
    return np.delete(new_boxes, [0, 1], axis=1)

def kmeans(boxes, k, dist=np.median):
    """
    Calculates k-means clustering with the Intersection over Union (IoU) metric.

    :param boxes: numpy array of shape (r, 2), where r is the number of rows
    :param k: number of clusters
    :param dist: distance function
    :return: numpy array of shape (k, 2)
    """
    rows = boxes.shape[0]

    distances = np.empty((rows, k))
    last_clusters = np.zeros((rows,))

    np.random.seed()

    clusters = boxes[np.random.choice(rows, k, replace=False)]

    while True:
        for row in range(rows):
            distances[row] = 1 - iou(boxes[row], clusters)

        nearest_clusters = np.argmin(distances, axis=1)

        if (last_clusters == nearest_clusters).all():
            break

        for cluster in range(k):
            clusters[cluster] = dist(boxes[nearest_clusters == cluster], axis=0)

        last_clusters = nearest_clusters

    return clusters

a = np.array([[1, 2, 3, 4], [5, 7, 6, 8]])
print(translate_boxes(a))

FILE_ROOT = "C:/Users/邓卓/Desktop/yolov5-master/people_data/"
ANNOTATION_ROOT = "Annotations"
ANNOTATION_PATH = FILE_ROOT + ANNOTATION_ROOT

ANCHORS_TXT_PATH = "C:/Users/邓卓/Desktop/yolov5-master/data/anchors.txt"

CLUSTERS = 9
CLASS_NAMES = ["pedestrians", "riders",'partially','ignore','crowd']

def load_data(anno_dir, class_names):
    xml_names = os.listdir(anno_dir)
    boxes = []
    for xml_name in xml_names:
        xml_pth = os.path.join(anno_dir, xml_name)
        tree = et.parse(xml_pth)

        width = float(tree.findtext("./size/width"))
        height = float(tree.findtext("./size/height"))

        for obj in tree.findall("./object"):
            cls_name = obj.findtext("name")
            if cls_name in class_names:
                xmin = float(obj.findtext("bndbox/xmin")) / width
                ymin = float(obj.findtext("bndbox/ymin")) / height
                xmax = float(obj.findtext("bndbox/xmax")) / width
                ymax = float(obj.findtext("bndbox/ymax")) / height

                box = [xmax - xmin, ymax - ymin]
                boxes.append(box)
            else:
                continue
    return np.array(boxes)

if __name__ == '__main__':

    anchors_txt = open(ANCHORS_TXT_PATH, "w")

    train_boxes = load_data(ANNOTATION_PATH, CLASS_NAMES)
    count = 1
    best_accuracy = 0
    best_anchors = []
    best_ratios = []

    for i in range(10):
        print(i)
        anchors_tmp = []
        clusters = kmeans(train_boxes, k=CLUSTERS)
        idx = clusters[:, 0].argsort()
        clusters = clusters[idx]

        for j in range(CLUSTERS):
            anchor = [round(clusters[j][0] * 640, 2), round(clusters[j][1] * 640, 2)]
            anchors_tmp.append(anchor)
            print(f"Anchors:{anchor}")

        temp_accuracy = avg_iou(train_boxes, clusters) * 100
        print("Train_Accuracy:{:.2f}%".format(temp_accuracy))

        ratios = np.around(clusters[:, 0] / clusters[:, 1], decimals=2).tolist()
        ratios.sort()
        print("Ratios:{}".format(ratios))
        print(20 * "*" + " {} ".format(count) + 20 * "*")

        count += 1

        if temp_accuracy > best_accuracy:
            best_accuracy = temp_accuracy
            best_anchors = anchors_tmp
            best_ratios = ratios

    anchors_txt.write("Best Accuracy = " + str(round(best_accuracy, 2)) + '%' + "\r\n")
    anchors_txt.write("Best Anchors = " + str(best_anchors) + "\r\n")
    anchors_txt.write("Best Ratios = " + str(best_ratios))
    anchors_txt.close()
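The clustering above assigns each box to the cluster with the smallest distance 1 - IoU, where both boxes are treated as (width, height) pairs anchored at the origin. A pure-Python sketch of that distance for one box and two clusters follows; wh_iou and the sample values are made up for illustration.

```python
def wh_iou(box, cluster):
    """IoU of two boxes given as (width, height), both anchored at the origin."""
    inter = min(box[0], cluster[0]) * min(box[1], cluster[1])
    union = box[0] * box[1] + cluster[0] * cluster[1] - inter
    return inter / union


box = (2.0, 2.0)
clusters = [(1.0, 1.0), (2.0, 4.0)]
# Distance used by the k-means loop: small distance means high overlap
distances = [1 - wh_iou(box, c) for c in clusters]
nearest = distances.index(min(distances))
print(distances, nearest)
```

Here the box overlaps the second cluster more (IoU 0.5 against 0.25), so it is assigned to cluster index 1.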

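5. Configure the model file

In the project's models folder, pick the YAML file for the model you want (for example yolov5s.yaml) and change nc and anchors.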
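The anchors entry in a YOLOv5 model YAML is three rows of three width,height pairs, one row per detection scale. The sketch below formats nine clustered anchors into that layout. Sorting by area so that the smallest boxes land on the first row is an assumption of this sketch, the helper name anchors_to_yaml is hypothetical, and the sample values are for illustration only, not results from this post.

```python
def anchors_to_yaml(anchors):
    """Format 9 (width, height) anchors as the three rows of a YOLOv5 model yaml.

    Sorting by area (an assumption of this sketch) puts the smallest boxes on
    the first row (P3/8) and the largest on the last row (P5/32).
    """
    if len(anchors) != 9:
        raise ValueError("expected 9 anchors")
    ordered = sorted(anchors, key=lambda wh: wh[0] * wh[1])
    levels = ["P3/8", "P4/16", "P5/32"]
    rows = ["anchors:"]
    for i, level in enumerate(levels):
        group = ordered[3 * i:3 * i + 3]
        flat = ", ".join("%d,%d" % (round(w), round(h)) for w, h in group)
        rows.append("  - [%s]  # %s" % (flat, level))
    return "\n".join(rows)


# Sample anchor values in shuffled order, for illustration only
example = [(116, 90), (10, 13), (33, 23), (16, 30), (62, 45),
           (30, 61), (156, 198), (59, 119), (373, 326)]
print(anchors_to_yaml(example))
```

The anchors written to anchors.txt by the clustering script are already scaled to a 640-pixel input, so they can be passed in directly.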

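6. Train the model

Download the .pt weights file for your chosen model from the official YOLOv5 site, create a weights folder in the project, and put the .pt file in it. Then modify train.py.

Start training:

python train.py --device '0'

I hit a problem here: the path must not contain Chinese characters, so I had to change my user name. For details, see the blog post https://blog.csdn.net/weixin_43267344/article/details/109582664.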
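A quick way to spot this problem before training is to check the project path for non-ASCII components. This is a minimal sketch; the helper name non_ascii_parts is made up, and it only inspects the path string, not how any particular library handles it.

```python
import os


def non_ascii_parts(path):
    """Return the components of path that contain non-ASCII characters."""
    normalized = path.replace("\\", "/")
    return [part for part in normalized.split("/")
            if part and not part.isascii()]


# The user folder from the paths in this post is flagged
print(non_ascii_parts("C:/Users/邓卓/Desktop/yolov5-master"))
# Check the directory the script is being run from
print(non_ascii_parts(os.getcwd()))
```

An empty list means every folder name on the path is plain ASCII.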

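3. Testing and application

Training produces a .pt file (under runs/train/exp). I ran detection directly on a camera feed, using an EasyN network camera; see the blog post for the specific settings.
Change 1 when using a camera: modify the datasets.py file (shown in a figure in the original post).

Output the box coordinates and the image size.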

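4. Problems

1. Unable to find a valid cuDNN algorithm to run convolution
Reduce the batch size.
2. A gray screen appeared during video testing; switching to a smaller model fixed it.
3. Training was very slow. The cause: the YOLOv5 source loads images with multiple CPU workers by default, which made it slow for me. Change this in the source so that --workers is 0.
4. Ways to reduce stutter when running detect on video:
In datasets, increase workers and num_threads so that more threads are used; in detect, change strides so that the frame rate drops slightly.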
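The first and third fixes can also be applied from the command line instead of editing the source, since train.py accepts --batch-size and --workers. The sketch below only builds the argument list; build_train_command is a hypothetical helper and its default values are illustrative, not tuned recommendations.

```python
import shlex


def build_train_command(batch_size=8, workers=0, device="0"):
    """Build an argument list for YOLOv5's train.py.

    The default values are illustrative, not tuned recommendations.
    """
    return [
        "python", "train.py",
        "--batch-size", str(batch_size),  # lower this if cuDNN or GPU memory errors appear
        "--workers", str(workers),        # 0 loads images in the main process
        "--device", device,
    ]


cmd = build_train_command()
print(" ".join(shlex.quote(arg) for arg in cmd))
# To launch it: subprocess.run(cmd, check=True) from the yolov5 project folder
```

Keeping the command as a list avoids quoting problems when it is later passed to subprocess.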

5. References

[1]https://blog.csdn.net/qq_36756866/article/details/109111065
[2]https://blog.csdn.net/qq_42495740/article/details/118577143

Original: https://blog.csdn.net/weixin_38226321/article/details/120530939
Author: 我是小z呀
Title: YOLOv5行人检测
