Deep Learning: Training nanodet on Your Own Dataset and Testing the Model, Explained Simply for Beginners

July 9, 2023, 7:33 AM • Artificial Intelligence

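There are many object detection models. nanodet is small and lightweight, which makes it well suited to porting and deployment on hardware devices, so it is well worth learning. I wrote this tutorial after running into a long list of pitfalls myself. It is aimed at readers who are new to nanodet and want to get started quickly. The steps are detailed, so please read them patiently.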

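I. Preparing the nanodet framework
II. Installing the libraries required for training
    1. Required libraries
    2. Matching torch and torchvision versions
    3. Pitfalls
III. Installing the nanodet package
IV. Building the dataset
    1. VOC-format dataset
    2. Data augmentation
    3. Converting the VOC-format dataset to COCO format
    4. Pitfalls
V. Training on the dataset
    1. Editing the configuration file
    2. Starting training
VI. Testing the trained model
    1. Detecting on an image
    2. Detecting on a video
    3. Using the computer's webcam
VII. Summary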

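I. Preparing the nanodet framework

I provide a source package that I have already debugged, and the rest of this tutorial is based on it. Download it from: [link], extraction code: 4g6w. After downloading and unpacking it, the files look like this:

The official nanodet source is also available for download at: [link], as shown below:

Note: on choosing a source package, I recommend using the one I provide. It already works around many pitfalls. If you use the official one, there are still many problems you will have to fix yourself.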

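II. Installing the libraries required for training

1. Required libraries

Python 3.7, 3.8, or any later version works. The other required packages are listed below:

2. Matching torch and torchvision versions

The torch and torchvision versions must match each other. If they do not, you will hit all kinds of frustrating errors once training starts. For the version correspondence, see: [link], as shown below: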

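3. Pitfalls

At first my machine had torch 1.12.0, torchvision 0.13.0, and Python 3.7. According to the official table these versions match, yet training failed with an error saying that the torch and torchvision versions did not correspond. I had originally installed both with the command shown on the official website's install page: [link], which gave the following:

The install command taken directly from that page was the problem, so I uninstalled the original torch and torchvision. I then used a different install command from the official site, which installed a matching pair successfully: [link], as shown below:

That page lists many install commands. When installing, check your graphics card model and CUDA version. If you do not know how to check your graphics card model, see my other post: [link]

After installing torch and torchvision, open a command prompt, activate the corresponding virtual environment, start Python, and run import torch and import torchvision. If neither raises an error, the installation succeeded and the two packages load together as a matching pair, as shown below: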
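The import check described above can also be scripted. The following is a minimal sketch, not part of the original source package: the helper names are made up for illustration, and the small version table only lists a few torch/torchvision pairs from the official compatibility table. Comparing the build suffix (for example `+cu113` versus `+cpu`) is my own assumption about one common cause of a mismatch error.

```python
import importlib

# A few torch -> torchvision minor-version pairs from the official table.
KNOWN_PAIRS = {"1.10": "0.11", "1.11": "0.12", "1.12": "0.13",
               "1.13": "0.14", "2.0": "0.15"}


def minor(version):
    """Reduce a version such as '1.12.0+cu113' to '1.12'."""
    return ".".join(version.split("+")[0].split(".")[:2])


def build_tag(version):
    """Return the build suffix after '+', e.g. 'cu113', or '' if absent."""
    return version.split("+")[1] if "+" in version else ""


def check_pair(torch_version, torchvision_version):
    """True/False for a torch version in the table, None if it is not listed."""
    expected = KNOWN_PAIRS.get(minor(torch_version))
    if expected is None:
        return None
    same_build = build_tag(torch_version) == build_tag(torchvision_version)
    return minor(torchvision_version) == expected and same_build


def installed_version(module_name):
    """Import a module and return its __version__, or None if import fails."""
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # a broken install can raise more than ImportError
        print("could not import {}: {}".format(module_name, exc))
        return None
    return getattr(module, "__version__", None)


if __name__ == "__main__":
    torch_v = installed_version("torch")
    vision_v = installed_version("torchvision")
    print("torch:", torch_v, "torchvision:", vision_v)
    if torch_v and vision_v:
        print("pair matches the table:", check_pair(torch_v, vision_v))
```

Run it inside the virtual environment used for training. A result of `None` only means the installed torch version is not in the small table above, not that the installation is broken.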

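III. Installing the nanodet package

Find the setup.py file in the project root directory and install the package from it, as follows:

First activate the virtual environment, then change into the project root directory and run the install command below to install the nanodet package:

python setup.py develop

If the output looks like the screenshot below, the nanodet package was installed successfully: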

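IV. Building the dataset

1. VOC-format dataset

nanodet supports both VOC and COCO datasets. The source package I provide mainly targets COCO, so I first built a VOC dataset and then converted it directly to COCO with a short script. I annotated the VOC dataset with labelimg. For how to build a VOC dataset with labelimg, see my other post, which covers it in detail: [link]

In a VOC dataset each image has a matching label file, as shown below: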
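To make the image/label pairing concrete, here is a small sketch of what a VOC label file contains and how to read it. The XML content (file name, class name, coordinates) is invented for illustration; the tag layout follows what labelimg writes for Pascal VOC.

```python
import xml.etree.ElementTree as ET

# Hypothetical label file 1.xml that belongs to the image 1.jpg.
SAMPLE_XML = """<annotation>
    <folder>train2017</folder>
    <filename>1.jpg</filename>
    <size><width>640</width><height>480</height><depth>3</depth></size>
    <segmented>0</segmented>
    <object>
        <name>cat</name>
        <bndbox>
            <xmin>48</xmin><ymin>60</ymin><xmax>210</xmax><ymax>300</ymax>
        </bndbox>
    </object>
</annotation>"""


def read_voc_boxes(xml_text):
    """Return (image file name, [(class, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        coords = tuple(int(box.findtext(tag))
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((obj.findtext("name"),) + coords)
    return root.findtext("filename"), boxes


if __name__ == "__main__":
    file_name, boxes = read_voc_boxes(SAMPLE_XML)
    print(file_name, boxes)
```

Each `<object>` element holds one labeled box, so an image with several targets has several `<object>` entries in the same file.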

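2. Data augmentation

Once part of the dataset has been labeled, we can augment it. Augmentation adds variety to the dataset and makes the trained model more robust. The augmentation method is described in detail in my other post: [link]

The places to modify in the augmentation code are the input and output directories and the number of augmented copies per image (IMG_DIR, XML_DIR, AUG_XML_DIR, AUG_IMG_DIR, and AUGLOOP in the script), as shown below:

The full code is as follows: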

import os
import shutil
import xml.etree.ElementTree as ET

import numpy as np
from PIL import Image

import imgaug as ia
from imgaug import augmenters as iaa

# Fix the random seed so that repeated runs give the same augmentations.
ia.seed(1)

def read_xml_annotation(root, image_id):
    """Return every box in the VOC file root/image_id as [xmin, ymin, xmax, ymax]."""
    tree = ET.parse(os.path.join(root, image_id))
    xmlroot = tree.getroot()
    bndboxlist = []

    for obj in xmlroot.findall('object'):
        bndbox = obj.find('bndbox')

        xmin = int(bndbox.find('xmin').text)
        xmax = int(bndbox.find('xmax').text)
        ymin = int(bndbox.find('ymin').text)
        ymax = int(bndbox.find('ymax').text)

        bndboxlist.append([xmin, ymin, xmax, ymax])

    return bndboxlist

def change_xml_annotation(root, image_id, new_target):
    """Overwrite the first object's box in root/<image_id>.xml with new_target."""
    new_xmin = new_target[0]
    new_ymin = new_target[1]
    new_xmax = new_target[2]
    new_ymax = new_target[3]

    xml_path = os.path.join(root, str(image_id) + '.xml')
    tree = ET.parse(xml_path)
    xmlroot = tree.getroot()
    bndbox = xmlroot.find('object').find('bndbox')
    bndbox.find('xmin').text = str(new_xmin)
    bndbox.find('ymin').text = str(new_ymin)
    bndbox.find('xmax').text = str(new_xmax)
    bndbox.find('ymax').text = str(new_ymax)
    # Write back to the same file (assumed intent; this helper is not called below).
    tree.write(xml_path)

def change_xml_list_annotation(root, image_id, new_target, saveroot, aug_id, img_name):
    """Save a copy of root/<image_id>.xml with all boxes replaced by new_target.

    The copy is written to saveroot as <img_name>_<aug_id, 6 digits>.xml and its
    <filename> tag is updated to the matching augmented .jpg name.
    """
    tree = ET.parse(os.path.join(root, str(image_id) + '.xml'))
    new_stem = img_name + str("_%06d" % int(aug_id))
    elem = tree.find('filename')
    elem.text = new_stem + '.jpg'
    xmlroot = tree.getroot()
    index = 0

    # Boxes in new_target are in the same order as the <object> tags.
    for obj in xmlroot.findall('object'):
        bndbox = obj.find('bndbox')

        new_xmin = new_target[index][0]
        new_ymin = new_target[index][1]
        new_xmax = new_target[index][2]
        new_ymax = new_target[index][3]

        bndbox.find('xmin').text = str(new_xmin)
        bndbox.find('ymin').text = str(new_ymin)
        bndbox.find('xmax').text = str(new_xmax)
        bndbox.find('ymax').text = str(new_ymax)

        index = index + 1

    tree.write(os.path.join(saveroot, new_stem + '.xml'))

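def mkdir(path):
    """Create path if needed. Return True if created, False if it already existed."""
    # Strip surrounding whitespace and a trailing backslash.
    path = path.strip()
    path = path.rstrip("\\")

    if not os.path.exists(path):
        os.makedirs(path)
        print(path + ' created')
        return True
    else:
        print(path + ' already exists')
        return False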

if __name__ == "__main__":

    # Input folders: the original images and their VOC .xml labels.
    IMG_DIR = "Images/val/val2017"
    XML_DIR = "Images/val/valxml"

    # Output folder for the augmented labels (recreated on every run).
    AUG_XML_DIR = "Images/val Enhance/valxml"
    try:
        shutil.rmtree(AUG_XML_DIR)
    except FileNotFoundError:
        pass
    mkdir(AUG_XML_DIR)

    # Output folder for the augmented images (recreated on every run).
    AUG_IMG_DIR = "Images/val Enhance/val2017"
    try:
        shutil.rmtree(AUG_IMG_DIR)
    except FileNotFoundError:
        pass
    mkdir(AUG_IMG_DIR)

    # Number of augmented copies generated per original image.
    AUGLOOP = 10

    # Augmentation pipeline applied to every copy.
    seq = iaa.Sequential([
        iaa.Flipud(0.5),                   # flip vertically with probability 0.5
        iaa.Fliplr(0.5),                   # flip horizontally with probability 0.5
        iaa.Multiply((1.2, 1.5)),          # brighten by a factor of 1.2 to 1.5
        iaa.GaussianBlur(sigma=(0, 3.0)),  # blur with sigma between 0 and 3.0
        iaa.Affine(
            translate_px={"x": 15, "y": 15},  # shift by 15 px on both axes
            scale=(0.8, 0.95),                # shrink to 80-95% of the size
            rotate=(-30, 30)                  # rotate by -30 to 30 degrees
        )
    ])

    for root, sub_folders, files in os.walk(XML_DIR):

        for name in files:
            if not name.endswith('.xml'):
                continue
            print(name)
            stem = name[:-4]
            bndbox = read_xml_annotation(XML_DIR, name)
            # Keep the original sample next to its augmented copies.
            shutil.copy(os.path.join(XML_DIR, name), AUG_XML_DIR)
            shutil.copy(os.path.join(IMG_DIR, stem + '.jpg'), AUG_IMG_DIR)

            # Load once as RGB so the result can always be saved as .jpg.
            img = np.asarray(Image.open(os.path.join(IMG_DIR, stem + '.jpg')).convert('RGB'))

            for epoch in range(AUGLOOP):
                # Freeze the random parameters so the image and its boxes
                # receive exactly the same transformation.
                seq_det = seq.to_deterministic()
                new_bndbox_list = []

                for box in bndbox:
                    bbs = ia.BoundingBoxesOnImage([
                        ia.BoundingBox(x1=box[0], y1=box[1], x2=box[2], y2=box[3]),
                    ], shape=img.shape)

                    bbs_aug = seq_det.augment_bounding_boxes([bbs])[0]
                    aug_box = bbs_aug.bounding_boxes[0]

                    # Clamp the transformed box to the image area.
                    n_x1 = int(max(1, min(img.shape[1], aug_box.x1)))
                    n_y1 = int(max(1, min(img.shape[0], aug_box.y1)))
                    n_x2 = int(max(1, min(img.shape[1], aug_box.x2)))
                    n_y2 = int(max(1, min(img.shape[0], aug_box.y2)))
                    if n_x1 == 1 and n_x1 == n_x2:
                        n_x2 += 1
                    if n_y1 == 1 and n_y2 == n_y1:
                        n_y2 += 1
                    if n_x1 >= n_x2 or n_y1 >= n_y2:
                        # Degenerate box: reported, but still written out.
                        print('error', name)
                    new_bndbox_list.append([n_x1, n_y1, n_x2, n_y2])

                # Save the augmented image without drawing the boxes on it.
                image_aug = seq_det.augment_images([img])[0]
                aug_name = stem + str("_%06d" % (epoch + 1))
                Image.fromarray(image_aug).save(os.path.join(AUG_IMG_DIR, aug_name + '.jpg'))

                # Save the matching label file with the transformed boxes.
                change_xml_list_annotation(XML_DIR, stem, new_bndbox_list, AUG_XML_DIR,
                                           epoch + 1, stem)
                print(aug_name + '.jpg')

After augmentation, both the images and their label files have grown in number, as shown below:
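A quick way to confirm that augmentation produced a consistent dataset is to check that every image has a label file and that the totals are what the script should produce. This is a sketch with hypothetical helper names. It assumes .jpg images and that, as in the script above, each original is copied once and augmented AUGLOOP times.

```python
import os


def stems(directory, extension):
    """File names in directory with the given extension, without the extension."""
    return {os.path.splitext(name)[0]
            for name in os.listdir(directory)
            if name.lower().endswith(extension)}


def find_unpaired(img_dir, xml_dir, img_ext=".jpg"):
    """Return (images without a label, labels without an image)."""
    images = stems(img_dir, img_ext)
    labels = stems(xml_dir, ".xml")
    return sorted(images - labels), sorted(labels - images)


def expected_total(num_originals, augloop):
    """Each original is kept once and augmented augloop times."""
    return num_originals * (augloop + 1)
```

For example, `find_unpaired("Images/val Enhance/val2017", "Images/val Enhance/valxml")` should return two empty lists, and with AUGLOOP = 10 a folder of 20 originals should end up with `expected_total(20, 10)` files of each kind.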

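3. Converting the VOC-format dataset to COCO format

Once the VOC-format dataset above is ready, the short script below converts it to a COCO dataset. The places that need changing in the code (the input xml_path and the output json_file) are shown below:

The full code is as follows: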


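import xml.etree.ElementTree as ET
import os
import json

# The COCO annotation file is a single dictionary with these top-level entries.
coco = dict()
coco['images'] = []
coco['type'] = 'instances'
coco['annotations'] = []
coco['categories'] = []

# category name -> category id, and the image file names added so far
category_set = dict()
image_set = set()

# Running counters used to assign ids.
category_item_id = 0
id_num = 0
annotation_id = 0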

def addCatItem(name):
    """Register a new category and return its id (ids start at 1)."""
    global category_item_id
    category_item = dict()
    category_item['supercategory'] = 'none'
    category_item_id += 1
    category_item['id'] = category_item_id
    category_item['name'] = name
    coco['categories'].append(category_item)
    category_set[name] = category_item_id
    return category_item_id

def addImgItem(file_name, size):
    """Register an image and return its id (ids start at 0)."""
    global id_num
    if file_name is None:
        raise Exception('Could not find filename tag in xml file.')
    if size['width'] is None:
        raise Exception('Could not find width tag in xml file.')
    if size['height'] is None:
        raise Exception('Could not find height tag in xml file.')

    image_item = dict()
    image_item['id'] = int(id_num)
    id_num += 1
    image_item['file_name'] = file_name
    image_item['width'] = size['width']
    image_item['height'] = size['height']
    coco['images'].append(image_item)
    image_set.add(file_name)
    return image_item['id']

def addAnnoItem(object_name, image_id, category_id, bbox):
    """Register one annotation; bbox is [x, y, width, height]."""
    global annotation_id
    annotation_item = dict()
    annotation_item['segmentation'] = []

    # Use the four corners of the box as a rectangular segmentation polygon:
    # top-left, bottom-left, bottom-right, top-right.
    seg = []
    seg.append(bbox[0])
    seg.append(bbox[1])

    seg.append(bbox[0])
    seg.append(bbox[1] + bbox[3])

    seg.append(bbox[0] + bbox[2])
    seg.append(bbox[1] + bbox[3])

    seg.append(bbox[0] + bbox[2])
    seg.append(bbox[1])

    annotation_item['segmentation'].append(seg)

    annotation_item['area'] = bbox[2] * bbox[3]
    annotation_item['iscrowd'] = 0
    annotation_item['ignore'] = 0
    annotation_item['image_id'] = image_id
    annotation_item['bbox'] = bbox
    annotation_item['category_id'] = category_id
    annotation_id += 1
    annotation_item['id'] = annotation_id
    coco['annotations'].append(annotation_item)

def parseXmlFiles(xml_path):
    """Parse every VOC .xml file in xml_path and fill the global coco dict."""
    for f in sorted(os.listdir(xml_path)):
        if not f.endswith('.xml'):
            continue

        # Per-file state, reset for every .xml file.
        bndbox = dict()
        size = dict()
        current_image_id = None
        current_category_id = None
        file_name = None
        size['width'] = None
        size['height'] = None
        size['depth'] = None

        xml_file = os.path.join(xml_path, f)
        print(xml_file)

        tree = ET.parse(xml_file)
        root = tree.getroot()
        if root.tag != 'annotation':
            raise Exception('pascal voc xml root element should be annotation, rather than {}'.format(root.tag))

        # Walk the direct children of <annotation>: folder, filename, size, object, ...
        for elem in root:
            current_parent = elem.tag
            current_sub = None
            object_name = None

            if elem.tag == 'folder':
                continue

            if elem.tag == 'filename':
                file_name = elem.text
                # Each image file name may appear only once in the dataset.
                if file_name in image_set:
                    raise Exception('file_name duplicated')

            # Register the image at the first element seen after both
            # <filename> and <size> have been read.
            elif current_image_id is None and file_name is not None and size['width'] is not None:
                if file_name not in image_set:
                    current_image_id = addImgItem(file_name, size)
                    print('add image with {} and {}'.format(file_name, size))
                else:
                    raise Exception('duplicated image: {}'.format(file_name))

            for subelem in elem:
                bndbox['xmin'] = None
                bndbox['xmax'] = None
                bndbox['ymin'] = None
                bndbox['ymax'] = None

                current_sub = subelem.tag
                if current_parent == 'object' and subelem.tag == 'name':
                    # Look up the category id, creating the category on first use.
                    object_name = subelem.text
                    if object_name not in category_set:
                        current_category_id = addCatItem(object_name)
                    else:
                        current_category_id = category_set[object_name]

                elif current_parent == 'size':
                    if size[subelem.tag] is not None:
                        raise Exception('xml structure broken at size tag.')
                    size[subelem.tag] = int(subelem.text)

                # Read xmin/ymin/xmax/ymax when the sub-element is <bndbox>.
                for option in subelem:
                    if current_sub == 'bndbox':
                        if bndbox[option.tag] is not None:
                            raise Exception('xml structure corrupted at bndbox tag.')
                        bndbox[option.tag] = int(option.text)

                if bndbox['xmin'] is not None:
                    if object_name is None:
                        raise Exception('xml structure broken at bndbox tag')
                    if current_image_id is None:
                        raise Exception('xml structure broken at bndbox tag')
                    if current_category_id is None:
                        raise Exception('xml structure broken at bndbox tag')

                    # COCO boxes are [x, y, width, height] rather than two corners.
                    bbox = []
                    bbox.append(bndbox['xmin'])
                    bbox.append(bndbox['ymin'])
                    bbox.append(bndbox['xmax'] - bndbox['xmin'])
                    bbox.append(bndbox['ymax'] - bndbox['ymin'])
                    print('add annotation with {},{},{},{}'.format(object_name, current_image_id, current_category_id,
                                                                   bbox))
                    addAnnoItem(object_name, current_image_id, current_category_id, bbox)

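if __name__ == '__main__':

    # Folder with the VOC .xml labels, and the COCO .json file to write.
    xml_path = "coco/trainxml"
    json_file = 'coco/annotations/instances_train2017.json'

    parseXmlFiles(xml_path)
    # Create the output folder if needed and close the file after writing.
    os.makedirs(os.path.dirname(json_file), exist_ok=True)
    with open(json_file, 'w') as out_file:
        json.dump(coco, out_file)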

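The code above converts all the .xml label files of the training set or the validation set into a single .json file. Run it once per set. The result of the conversion is shown below:

The labels inside the converted .json file look like this: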
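As a rough illustration of that structure, the sketch below builds a tiny COCO-style dictionary with the same keys the conversion script writes, and runs a basic consistency check on it. The sample values and the helper names are invented for illustration. The check is my own addition, not something nanodet requires.

```python
def voc_to_coco_bbox(xmin, ymin, xmax, ymax):
    """VOC stores two corners; COCO stores [x, y, width, height]."""
    return [xmin, ymin, xmax - xmin, ymax - ymin]


# Hypothetical one-image, one-box dataset in the converted layout.
SAMPLE_COCO = {
    "images": [{"id": 0, "file_name": "1.jpg", "width": 640, "height": 480}],
    "type": "instances",
    "categories": [{"supercategory": "none", "id": 1, "name": "cat"}],
    "annotations": [{
        "id": 1, "image_id": 0, "category_id": 1,
        "bbox": voc_to_coco_bbox(48, 60, 210, 300),
        "area": 162 * 240, "iscrowd": 0, "ignore": 0,
        "segmentation": [[48, 60, 48, 300, 210, 300, 210, 60]],
    }],
}


def check_coco(coco):
    """Return a list of human-readable problems; an empty list means none found."""
    image_ids = {image["id"] for image in coco["images"]}
    category_ids = {category["id"] for category in coco["categories"]}
    problems = []
    for ann in coco["annotations"]:
        x, y, width, height = ann["bbox"]
        if ann["image_id"] not in image_ids:
            problems.append("annotation {} has unknown image_id".format(ann["id"]))
        if ann["category_id"] not in category_ids:
            problems.append("annotation {} has unknown category_id".format(ann["id"]))
        if width <= 0 or height <= 0:
            problems.append("annotation {} has an empty box".format(ann["id"]))
    return problems


if __name__ == "__main__":
    print(check_coco(SAMPLE_COCO))
```

To check a real file, load it with `json.load` and pass the resulting dictionary to `check_coco`.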

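4. Pitfalls

When building your own dataset, name the images 1.jpg, 2.jpg, and so on (see the figure below). Do not use names such as (1).jpg or (2).jpg. With the latter style you will run into all sorts of troublesome errors later during training.

Note: other image formats also work, it does not have to be .jpg. What matters is the file name.

If you want to rename the images in a dataset quickly in bulk, see my other post: [link]. To rename the label files of a VOC-format dataset in bulk, that is, the .xml files, see my other post: [link]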
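For a dataset that already uses names like (1).jpg, a small script can strip the parentheses. This is a sketch under my own assumptions, not the method from the linked posts: the helper names are hypothetical, and it only renames files. The `<filename>` tag inside each .xml file would still need to be updated separately.

```python
import os
import re


def clean_name(file_name):
    """Remove parentheses and whitespace from the stem, e.g. '(1).jpg' -> '1.jpg'."""
    stem, extension = os.path.splitext(file_name)
    return re.sub(r"[()\s]", "", stem) + extension


def rename_in_dir(directory, dry_run=True):
    """Rename every file whose cleaned name differs; return (old, new) pairs.

    With dry_run=True nothing is changed on disk, so the plan can be reviewed.
    """
    changes = []
    for file_name in sorted(os.listdir(directory)):
        new_name = clean_name(file_name)
        if new_name == file_name:
            continue
        target = os.path.join(directory, new_name)
        if os.path.exists(target):
            raise FileExistsError("would overwrite " + target)
        changes.append((file_name, new_name))
        if not dry_run:
            os.rename(os.path.join(directory, file_name), target)
    return changes
```

Run it with `dry_run=True` first on both the image folder and the label folder, check the printed pairs, and only then call it again with `dry_run=False`.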

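V. Training on the dataset

1. Editing the configuration file

Before training, a few places in the configuration file need to be changed. The exact locations are shown below: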
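Since the screenshot is not reproduced here, the sketch below lists the fields that usually need attention. The key names are my recollection of the upstream nanodet-m.yml layout and should be treated as assumptions: verify each one against the .yml file in your own source package. The class names are placeholders. The one rule that does hold regardless is that the number of classes in the head must equal the number of entries in the class name list.

```python
# Assumed key names (dotted paths into the YAML file) and example values.
OVERRIDES = {
    "save_dir": "workspace/nanodet_m",
    "model.arch.head.num_classes": 2,
    "data.train.img_path": "coco/train2017",
    "data.train.ann_path": "coco/annotations/instances_train2017.json",
    "data.val.img_path": "coco/val2017",
    "data.val.ann_path": "coco/annotations/instances_val2017.json",
    "class_names": ["cat", "dog"],  # placeholder class names
}


def set_by_path(config, dotted_key, value):
    """Set config['a']['b']['c'] = value for the dotted key 'a.b.c'."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def apply_overrides(config, overrides):
    """Apply all dotted-key overrides to a nested config dictionary."""
    for dotted_key, value in overrides.items():
        set_by_path(config, dotted_key, value)
    return config


def classes_consistent(config):
    """num_classes in the head must match the length of class_names."""
    return config["model"]["arch"]["head"]["num_classes"] == len(config["class_names"])


if __name__ == "__main__":
    config = apply_overrides({}, OVERRIDES)
    print("classes consistent:", classes_consistent(config))
```

In practice the same values are typed directly into the .yml file. The dictionary form is only a compact way to list them in one place.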

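2. Starting training

Enter the command below in the console and press Enter to start training:

python tools/train.py ./config/legacy_v0.x_configs/nanodet-m.yml

If the output looks like the screenshot below, training is running normally. Wait for it to finish:

During training, the model weight files are saved to the workspace folder under the project root directory, at the location shown below: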
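If you are unsure which weight file to use, a few lines of Python can list the .pth files under the workspace folder, newest first. This is a generic sketch with a hypothetical helper name. It assumes only that the weights are .pth files somewhere below the save directory.

```python
import glob
import os


def find_checkpoints(save_dir):
    """Return all .pth files below save_dir, most recently modified first."""
    pattern = os.path.join(save_dir, "**", "*.pth")
    paths = glob.glob(pattern, recursive=True)
    return sorted(paths, key=os.path.getmtime, reverse=True)


if __name__ == "__main__":
    for path in find_checkpoints("workspace"):
        print(path)
```

The commands in the next section use ./workspace/nanodet_m/model_best/nanodet_model_best.pth, which should appear in this list once training has saved a best model.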

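VI. Testing the trained model

After the long wait for training to finish, let us test how well the model actually detects.

1. Detecting on an image

To test a single image, use the command below, where CONFIG_PATH is the config file used for training, MODEL_PATH is the .pth file just produced by training, and IMAGE_PATH is the path to the image.

python demo/demo.py image --config CONFIG_PATH --model MODEL_PATH --path IMAGE_PATH

For example, to run my own trained model I enter:

python demo/demo.py image --config ./config/legacy_v0.x_configs/nanodet-m.yml --model ./workspace/nanodet_m/model_best/nanodet_model_best.pth --path ./coco/val2017/53.jpg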

2. Detecting on a video

Here CONFIG_PATH is the config file used for training, MODEL_PATH is the .pth file just produced by training, and VIDEO_PATH is the path to the video.

python demo/demo.py video --config CONFIG_PATH --model MODEL_PATH --path VIDEO_PATH

For example, to run my own trained model I enter:

python demo/demo.py video --config ./config/legacy_v0.x_configs/nanodet-m.yml --model ./workspace/nanodet_m/model_best/nanodet_model_best.pth --path ./Video/dai.avi

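3. Using the computer's webcam

Here CONFIG_PATH is the config file used for training, MODEL_PATH is the .pth file just produced by training, and 0 selects the computer's built-in camera.

Enter the command below to run detection on the webcam:

python demo/demo.py webcam --config CONFIG_PATH --model MODEL_PATH --path 0

For example, to run my own trained model I enter:

python demo/demo.py webcam --config ./config/legacy_v0.x_configs/nanodet-m.yml --model ./workspace/nanodet_m/model_best/nanodet_model_best.pth --path 0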
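The three commands differ only in the mode and the value passed to --path, so they can be assembled by one helper if you prefer to launch the demo from a script. This is a sketch: `build_demo_command` is a hypothetical name, and the flags are exactly the ones used in the commands above.

```python
import subprocess

MODES = ("image", "video", "webcam")


def build_demo_command(mode, config_path, model_path, path):
    """Return the demo command as an argument list for subprocess."""
    if mode not in MODES:
        raise ValueError("mode must be one of {}".format(MODES))
    return ["python", "demo/demo.py", mode,
            "--config", config_path,
            "--model", model_path,
            "--path", str(path)]  # for webcam mode, path is the camera index


def run_demo(mode, config_path, model_path, path):
    """Run the demo from the project root; raises if the command fails."""
    subprocess.run(build_demo_command(mode, config_path, model_path, path), check=True)
```

For example, `run_demo("webcam", "./config/legacy_v0.x_configs/nanodet-m.yml", "./workspace/nanodet_m/model_best/nanodet_model_best.pth", 0)` is equivalent to the last command above when executed from the project root directory.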

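VII. Summary

In this tutorial I have provided a small dataset and a trained model. Follow the steps above: prepare the dataset first, then train, and finally test. Everyone runs into different problems along the way. If there is one you really cannot solve, leave a comment below and I will reply as soon as I see it.

I hope this tutorial helps you get started with nanodet quickly. Writing it up took effort, so your support is appreciated. Thank you!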

Original: https://blog.csdn.net/qq_40280673/article/details/126097822
Author: 佐咖
Title: Deep Learning: Training nanodet on Your Own Dataset and Testing the Model, Explained Simply for Beginners


