Modifying YOLOv5's detect.py to detect and save videos one at a time, with separate per-video parameters

June 19, 2023, 6:11 AM • Artificial Intelligence

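I never really managed to follow the logic of YOLOv5's detect.py. I had read the detect logic of YOLOv3 and YOLOv4, which basically use OpenCV to operate on each video directly, and that felt clearer and easier to understand. The YOLOv5 author does not seem to call OpenCV directly in detect.py, or has wrapped the OpenCV video handling in a separate .py file, which makes it rather obscure. So I used the dumbest method: os.listdir reads every video in the video directory, and each one is run through detection in turn. I also rewrote the box-drawing code, because I need to save the content of one key frame. The detection command is python detect.py --exist-ok --nosave. With the nosave option the stock script skips drawing boxes, so I dug a little into the author's box-drawing logic and found that it still uses OpenCV's rectangle method (the author has hidden it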

import numpy as np
import argparse
import os
import sys
from pathlib import Path
import time
import shutil
from PIL import Image
import cv2
import torch
import torch.backends.cudnn as cudnn

FILE = Path(__file__).resolve()
ROOT = FILE.parents[0]  # YOLOv5 root directory
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))  # add ROOT to PATH
ROOT = Path(os.path.relpath(ROOT, Path.cwd()))  # relative

from models.common import DetectMultiBackend
from utils.datasets import IMG_FORMATS, VID_FORMATS, LoadImages, LoadStreams
from utils.general import (LOGGER, check_file, check_img_size, check_imshow, check_requirements, colorstr,
                           increment_path, non_max_suppression, print_args, scale_coords, strip_optimizer, xyxy2xywh)
from utils.plots import Annotator, colors, save_one_box
from utils.torch_utils import select_device, time_sync

@torch.no_grad()
def run(weights=ROOT / 'yolov5s.pt',  # model.pt path(s)
        vidpath='/home/ccf_disk/animal/test/',  # directory that contains the videos to process
        data=ROOT / 'data/coco128.yaml',  # dataset.yaml path
        imgsz=(640, 640),  # inference size (height, width)
        conf_thres=0.6,  # confidence threshold
        iou_thres=0.45,  # NMS IOU threshold
        max_det=1000,  # maximum detections per image
        device='',  # cuda device, i.e. 0 or 0,1,2,3 or cpu
        view_img=False,  # show results
        save_txt=False,  # save results to *.txt
        save_conf=False,  # save confidences in --save-txt labels
        save_crop=False,  # save cropped prediction boxes
        nosave=True,  # do not save images/videos
        classes=None,  # filter by class: --class 0, or --class 0 2 3
        agnostic_nms=False,  # class-agnostic NMS
        augment=False,  # augmented inference
        visualize=False,  # visualize features
        update=False,  # update all models
        project='/home/ccf_disk/animal/video_animal',  # save results to project/name
        name='test_1',  # save results to project/name
        exist_ok=True,  # existing project/name ok, do not increment
        line_thickness=3,  # bounding box thickness (pixels)
        hide_labels=False,  # hide labels
        hide_conf=False,  # hide confidences
        half=False,  # use FP16 half-precision inference
        dnn=False,  # use OpenCV DNN for ONNX inference
        ):
    vidpath = str(vidpath)
    videos = os.listdir(vidpath)
    number = 0
    for video_name in videos:
        time1_start = time.time()
        so = os.path.join(vidpath, video_name)  # full path of the current video
        number = number + 1
        print("Processing video %d" % number)
        source = str(so)
        save_c = 0  # consecutive sampled frames that contained a detection
        keep = 0  # becomes 1 once the key frame of this video has been saved
        save_img = not nosave and not source.endswith('.txt')  # save inference images
        is_file = Path(source).suffix[1:] in (IMG_FORMATS + VID_FORMATS)
        is_url = source.lower().startswith(('rtsp://', 'rtmp://', 'http://', 'https://'))
        webcam = source.isnumeric() or source.endswith('.txt') or (is_url and not is_file)
        if is_url and is_file:
            source = check_file(source)  # download

        # Directories
        save_dir = increment_path(Path(project) / name, exist_ok=exist_ok)  # increment run
        (save_dir / 'labels' if save_txt else save_dir).mkdir(parents=True, exist_ok=True)  # make dir

        # Load model
        device = select_device(device)
        model = DetectMultiBackend(weights, device=device, dnn=dnn, data=data)
        stride, names, pt, jit, onnx, engine = model.stride, model.names, model.pt, model.jit, model.onnx, model.engine
        imgsz = check_img_size(imgsz, s=stride)  # check image size

        # Half
        half &= (pt or jit or onnx or engine) and device.type != 'cpu'  # FP16 supported on limited backends with CUDA
        if pt or jit:
            model.model.half() if half else model.model.float()

        # Dataloader
        if webcam:
            view_img = check_imshow()
            cudnn.benchmark = True  # set True to speed up constant image size inference
            dataset = LoadStreams(source, img_size=imgsz, stride=stride, auto=pt)
            bs = len(dataset)  # batch_size
        else:
            dataset = LoadImages(source, img_size=imgsz, stride=stride, auto=pt)
            bs = 1  # batch_size
        vid_path, vid_writer = [None] * bs, [None] * bs

        # Run inference
        model.warmup(imgsz=(1 if pt else bs, 3, *imgsz), half=half)  # warmup
        dt, seen = [0.0, 0.0, 0.0], 0
        for path, im, im0s, vid_cap, s in dataset:
            flag = 0
            c = 1
            time1 = 6  # sampling interval in frames for the key-frame check
            t1 = time_sync()
            im = torch.from_numpy(im).to(device)
            im = im.half() if half else im.float()  # uint8 to fp16/32
            im /= 255  # 0 - 255 to 0.0 - 1.0
            if len(im.shape) == 3:
                im = im[None]  # expand for batch dim
            t2 = time_sync()
            dt[0] += t2 - t1

            # Inference
            visualize = increment_path(save_dir / Path(path).stem, mkdir=True) if visualize else False
            pred = model(im, augment=augment, visualize=visualize)
            t3 = time_sync()
            dt[1] += t3 - t2

            # NMS
            pred = non_max_suppression(pred, conf_thres, iou_thres, classes, agnostic_nms, max_det=max_det)
            dt[2] += time_sync() - t3

            # Second-stage classifier (optional)
            # pred = utils.general.apply_classifier(pred, classifier_model, im, im0s)

            # Process predictions
            for i, det in enumerate(pred):  # per image
                seen += 1
                count = 0  # becomes 1 if this frame has at least one detection
                if webcam:  # batch_size >= 1
                    p, im0, frame = path[i], im0s[i].copy(), dataset.count
                    s += f'{i}: '
                else:
                    p, im0, frame = path, im0s.copy(), getattr(dataset, 'frame', 0)

                p = Path(p)  # to Path
                save_path = str(save_dir / p.name)  # im.jpg
                txt_path = str(save_dir / 'labels' / p.stem) + (
                    '' if dataset.mode == 'image' else f'_{frame}')  # im.txt
                s += '%gx%g ' % im.shape[2:]  # print string
                gn = torch.tensor(im0.shape)[[1, 0, 1, 0]]  # normalization gain whwh
                imc = im0.copy() if save_crop else im0  # for save_crop
                annotator = Annotator(im0, line_width=line_thickness, example=str(names))
                if len(det):
                    # Rescale boxes from img_size to im0 size
                    det[:, :4] = scale_coords(im.shape[2:], det[:, :4], im0.shape).round()

                    # Print results
                    for c in det[:, -1].unique():
                        n = (det[:, -1] == c).sum()  # detections per class
                        s += f"{n} {names[int(c)]}{'s' * (n > 1)}, "  # add to string

                    # Write results
                    for *xyxy, conf, cls in reversed(det):
                        count = 1
                        if save_txt:  # Write to file
                            xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
                            line = (cls, *xywh, conf) if save_conf else (cls, *xywh)  # label format
                            with open(txt_path + '.txt', 'a') as f:
                                f.write(('%g ' * len(line)).rstrip() % line + '\n')

                        if save_img or save_crop or view_img:  # Add bbox to image
                            c = int(cls)  # integer class
                            label = None if hide_labels else (names[c] if hide_conf else f'{names[c]} {conf:.2f}')
                            annotator.box_label(xyxy, label, color=colors(c, True))
                            if save_crop:
                                save_one_box(xyxy, imc, file=save_dir / 'crops' / names[c] / f'{p.stem}.jpg', BGR=True)

                        # Draw the box and label directly with OpenCV so that they appear even with --nosave
                        box = xyxy
                        c = int(cls)  # integer class
                        p1, p2 = (int(box[0]), int(box[1])), (int(box[2]), int(box[3]))
                        lw = max(round(sum(im0.shape) / 2 * 0.003), 2)  # line width
                        cv2.rectangle(im0, p1, p2, color=(0, 0, 255), thickness=lw, lineType=cv2.LINE_AA)
                        label = f'{names[c]} {conf:.2f}'
                        tf = max(lw - 1, 1)  # font thickness
                        w, h = cv2.getTextSize(label, 0, fontScale=lw / 3, thickness=tf)[0]  # text width, height
                        outside = p1[1] - h - 3 >= 0  # label fits outside box
                        cv2.putText(im0, label, (p1[0], p1[1] - 2 if outside else p1[1] + h + 2), 0, lw / 3,
                                    (0, 0, 255),
                                    thickness=tf, lineType=cv2.LINE_AA)

                # Stream results
                im0 = annotator.result()
                if view_img:
                    cv2.imshow(str(p), im0)
                    cv2.waitKey(1)  # 1 millisecond

                # Key-frame check: on every time1-th frame, count consecutive sampled frames with a detection
                if seen % time1 == 0:
                    if count == 0:
                        save_c = 0  # streak broken
                    else:
                        save_c = save_c + 1
                    if save_c >= 4 and keep == 0:
                        rgb = cv2.cvtColor(im0, cv2.COLOR_BGR2RGB)  # PIL expects RGB, OpenCV frames are BGR
                        key_frame = Image.fromarray(np.uint8(rgb))
                        print(save_path)
                        key_frame.save(str(Path(save_path).with_suffix('.jpg')))
                        keep = 1
                        shutil.copy(so, save_path)  # copy the source video next to its key frame
                        print('have animal')

                # Save results (image with detections)
                if save_img:
                    if dataset.mode == 'image':
                        cv2.imwrite(save_path, im0)
                    else:  # 'video' or 'stream'
                        if vid_path[i] != save_path:  # new video
                            vid_path[i] = save_path
                            if isinstance(vid_writer[i], cv2.VideoWriter):
                                vid_writer[i].release()  # release previous video writer
                            if vid_cap:  # video
                                fps = vid_cap.get(cv2.CAP_PROP_FPS)
                                w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
                                h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
                            else:  # stream
                                fps, w, h = 30, im0.shape[1], im0.shape[0]
                            save_path = str(Path(save_path).with_suffix('.mp4'))  # force .mp4 suffix on results videos
                            vid_writer[i] = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
                        vid_writer[i].write(im0)

            # Print time (inference-only)
            LOGGER.info(f'{s}Done. ({t3 - t2:.3f}s)')

            if keep:  # the key frame has been saved, so stop reading this video
                break

        # Print results
        t = tuple(x / seen * 1E3 for x in dt)  # speeds per image
        LOGGER.info(f'Speed: %.1fms pre-process, %.1fms inference, %.1fms NMS per image at shape {(1, 3, *imgsz)}' % t)
        if save_txt or save_img:
            s = f"\n{len(list(save_dir.glob('labels/*.txt')))} labels saved to {save_dir / 'labels'}" if save_txt else ''
            LOGGER.info(f"Results saved to {colorstr('bold', save_dir)}{s}")
        if update:
            strip_optimizer(weights)  # update model (to fix SourceChangeWarning)

        time1_end = time.time()
        print('Video %d processing time: ' % number + str(time1_end - time1_start))

def parse_opt():
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'weights/best.pt', help='model path(s)')
    parser.add_argument('--vidpath', type=str, default='/home/ccf_disk/animal/video/4-3/',
                        help='directory that contains the videos to process')
    parser.add_argument('--data', type=str, default=ROOT / 'data/myvoc.yaml', help='(optional) dataset.yaml path')
    parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[640], help='inference size h,w')
    parser.add_argument('--conf-thres', type=float, default=0.75, help='confidence threshold')
    parser.add_argument('--iou-thres', type=float, default=0.45, help='NMS IoU threshold')
    parser.add_argument('--max-det', type=int, default=1000, help='maximum detections per image')
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--view-img', action='store_true', help='show results')
    parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
    parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
    parser.add_argument('--save-crop', action='store_true', help='save cropped prediction boxes')
    parser.add_argument('--nosave', action='store_true', help='do not save images/videos')
    parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --classes 0, or --classes 0 2 3')
    parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
    parser.add_argument('--augment', action='store_true', help='augmented inference')
    parser.add_argument('--visualize', action='store_true', help='visualize features')
    parser.add_argument('--update', action='store_true', help='update all models')
    parser.add_argument('--project', default='/home/ccf_disk/animal/video_animal_yolov5/', help='save results to project/name')
    parser.add_argument('--name', default='4-3', help='save results to project/name')
    parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
    parser.add_argument('--line-thickness', default=3, type=int, help='bounding box thickness (pixels)')
    parser.add_argument('--hide-labels', default=False, action='store_true', help='hide labels')
    parser.add_argument('--hide-conf', default=False, action='store_true', help='hide confidences')
    parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
    parser.add_argument('--dnn', action='store_true', help='use OpenCV DNN for ONNX inference')
    opt = parser.parse_args()
    opt.imgsz *= 2 if len(opt.imgsz) == 1 else 1  # expand [640] to [640, 640]
    print_args(FILE.stem, opt)
    return opt

def main(opt):
    check_requirements(exclude=('tensorboard', 'thop'))
    run(**vars(opt))

if __name__ == "__main__":
    opt = parse_opt()
    main(opt)

fairly deep). This is my first blog post, so this is only a brief record.
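The two ideas in the script, walking a directory of videos and saving one key frame after several consecutive sampled detections, can be tried without YOLOv5. The sketch below is only an illustration under assumptions: `list_videos`, `KeyFrameTrigger` and the `VIDEO_SUFFIXES` set are hypothetical names that are not part of YOLOv5, and the defaults (every 6th frame, 4 hits in a row) simply mirror the `time1 = 6` and `save_c >= 4` values in the script.

```python
from pathlib import Path

# Assumed subset of video extensions for illustration only;
# YOLOv5 keeps its own list in VID_FORMATS.
VIDEO_SUFFIXES = {".mp4", ".avi", ".mov", ".mkv"}


def list_videos(folder):
    """Return the video files directly inside `folder`, sorted by name."""
    return sorted(p for p in Path(folder).iterdir()
                  if p.is_file() and p.suffix.lower() in VIDEO_SUFFIXES)


class KeyFrameTrigger:
    """Fire once after `needed` consecutive sampled frames contain a detection."""

    def __init__(self, sample_every=6, needed=4):
        self.sample_every = sample_every  # only every n-th frame is checked
        self.needed = needed              # length of the required streak
        self.seen = 0                     # frames processed so far
        self.streak = 0                   # consecutive sampled frames with a detection
        self.fired = False                # True once the key frame has been chosen

    def update(self, has_detection):
        """Feed one frame; return True only on the frame that should be saved."""
        self.seen += 1
        if self.fired or self.seen % self.sample_every != 0:
            return False
        # A sampled frame without a detection resets the streak.
        self.streak = self.streak + 1 if has_detection else 0
        if self.streak >= self.needed:
            self.fired = True
            return True
        return False
```

In a loop like the one in the script, a fresh `KeyFrameTrigger()` would be created for each video returned by `list_videos(vidpath)`, and `trigger.update(len(det) > 0)` would be called once per frame. When it returns True, the frame is saved and the rest of the video can be skipped. Filtering by extension also keeps stray non-video files in the directory out of the loop, which a plain os.listdir does not do.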

Original: https://blog.csdn.net/Xiashawuyanzu/article/details/126310868
Author: Xiashawuyanzu
Title: Modifying YOLOv5's detect.py to detect and save videos one at a time, with separate per-video parameters
