目标检测—全卷积实现
文章目录
首先照例分享学习资源:
带你逐行手写单目标检测算法,从数据到模型搭建、训练、预测_哔哩哔哩_bilibili
一.相关知识点的学习
二分类交叉熵:
其实现的公式:
上式中带上标的y表示预测值(0-1之间),是网络的预测结果,y是真实值,因为是二分类,所以y的值只分0和1
; MSE:
MSE(均方误差)函数一般用来检测模型的预测值和真实值之间的偏差。
其实现公式为:
其值越大,表明预测的效果越差,MSE的值越小,说明预测模型描述实验数据具有更好的精确度。
多分类交叉熵:
其实现公式为:
在上式中,C是损失值
n是求平均用的,所以是样本数量,也就是batchsize(每批训练的数量);
x是预测向量维度,因为需要在输出的特征向量维度上一个个计算并求和;
y是onehot编码后的真实值 对应x维度上的标签,是1或0;
a是onehot格式输出的预测标签,是0~1的值,a经过了softmax激活,所以a的和值为1。
其中(onehot编码)独热编码即 One-Hot 编码,又称一位有效编码。其方法是使用 N位 状态寄存器来对 N个状态 进行编码,每个状态都有它独立的寄存器位,并且在任意时候,其中只有一位有效。
; FCN网络:
首先,FCN和CNN的区别在于把CNN最后的全连接转换成卷积层,其次,FCN网络可以接受任意尺寸的输入图像,并采用反卷积层对最后一个卷积层的feature map(特征图)进行上采样, 使它恢复到输入图像相同的尺寸,从而可以对每个像素都产生了一个预测, 同时保留了原始输入图像中的空间信息, 最后在上采样的特征图上进行逐像素分类。
二.代码实现目标检测
二分类:sigmoid函数
回归问题:不需要激活
多分类问题:输出层使用softmax
(1)数据预处理
import os.path
import torch
import cv2
from torch.utils.data import Dataset
import numpy as np
class MyDataset(Dataset):
    """Single-object-detection dataset whose annotations are encoded in the
    image file name.

    Expected file-name format (dot-separated):
        <index>.<label>.<x1>.<y1>.<x2>.<y2>.<sort>.<ext>
    where label is 0/1 (object present), the four coordinates are pixels on a
    300x300 image, and sort is a 1-based class id (1-20).
    """

    def __init__(self, root, is_Train=True):
        # Collect the absolute path of every image under <root>/train or <root>/test.
        self.dataset = []
        dir = 'train' if is_Train else "test"
        sub_dir = os.path.join(root, dir)
        print("The current picture is from ", sub_dir)
        for name in os.listdir(sub_dir):
            self.dataset.append(os.path.join(sub_dir, name))

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, index):
        data = self.dataset[index]
        # BGR image scaled to [0, 1]; HWC -> CHW for the convolution layers.
        img = cv2.imread(data) / 255
        new_img = torch.tensor(img).permute(2, 0, 1)
        # FIX: split only the basename so a dot anywhere in the directory path
        # cannot shift the annotation fields (same indices as before otherwise).
        data_list = os.path.basename(data).split('.')
        label = int(data_list[1])
        position = data_list[2:6]
        position = [int(i) / 300 for i in position]  # normalize coords to [0, 1]
        sort = int(data_list[6]) - 1                 # 1-20 -> 0-19 for CrossEntropyLoss
        # FIX: np.int was deprecated and removed in NumPy 1.24; plain int is correct.
        return np.float32(new_img), np.float32(label), np.float32(position), int(sort)
if __name__ == '__main__':
    # FIX: raw string — the original non-raw literal relied on '\A', '\T', '\y'
    # not being escape sequences, which emits DeprecationWarnings on modern
    # Python. The runtime characters are identical.
    data = MyDataset(r'F:\Artificial Intelligence\Target detection\yellow_data\yellow_data', is_Train=False)
    for i in data:
        print(i)
上述代码中,主要注意将sort的标签从1-20转换为0-19
(2).网络搭建
from torch import nn
import torch
class My_net(nn.Module):
    """Fully-convolutional single-object detector for 300x300 inputs.

    A shared convolutional trunk feeds three 19x19-kernel conv heads that
    collapse the 19x19 feature map to 1x1:
      * label head  -> (N,)    objectness logit
      * position head -> (N, 4) box coordinates
      * sort head   -> (N, 20) class logits
    """

    def __init__(self):
        super(My_net, self).__init__()
        # Shared trunk: 300 -> 298 -> 99 -> 97 -> 48 -> 46 -> 23 -> 21 -> 19.
        trunk = [
            nn.Conv2d(3, 11, 3), nn.LeakyReLU(), nn.MaxPool2d(3),
            nn.Conv2d(11, 22, 3), nn.LeakyReLU(), nn.MaxPool2d(2),
            nn.Conv2d(22, 32, 3), nn.LeakyReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3), nn.LeakyReLU(),
            nn.Conv2d(64, 128, 3), nn.LeakyReLU(),
        ]
        self.layers = nn.Sequential(*trunk)
        # Each head uses a 19x19 kernel, acting like a per-task fully
        # connected layer expressed as a convolution (FCN style).
        self.label_layer = nn.Sequential(nn.Conv2d(128, 1, 19), nn.LeakyReLU())
        self.position_layer = nn.Sequential(nn.Conv2d(128, 4, 19), nn.LeakyReLU())
        self.sort_layer = nn.Sequential(nn.Conv2d(128, 20, 19), nn.LeakyReLU())

    def forward(self, x):
        features = self.layers(x)
        # Drop the singleton spatial dims (and channel dim for the label head)
        # with the same squeeze sequence the heads require.
        label = self.label_layer(features)
        for axis in (2, 2, 1):
            label = torch.squeeze(label, dim=axis)   # (N,1,1,1) -> (N,)
        position = self.position_layer(features)
        for axis in (2, 2):
            position = torch.squeeze(position, dim=axis)  # (N,4,1,1) -> (N,4)
        sort = self.sort_layer(features)
        for axis in (2, 2):
            sort = torch.squeeze(sort, dim=axis)     # (N,20,1,1) -> (N,20)
        return label, position, sort
if __name__ == '__main__':
    net = My_net()
    x = torch.randn(3, 3, 300, 300)
    # FIX: the original called net(x) three times (three full forward passes)
    # just to print three shapes; run the forward pass once instead.
    label, position, sort = net(x)
    print(label.shape)
    print(position.shape)
    print(sort.shape)
注意对应维度的转换。
(3).训练函数编写
from net import My_net
from data import MyDataset
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
from torch import nn,optim
import torch
import datetime
import os
import warnings
warnings.filterwarnings('ignore')
Device = 'cuda'
class train:
    """Train/evaluate My_net on the dataset rooted at *root*.

    Loads *weight* if it exists, then __call__ runs up to 1000 epochs of
    training (loss = 0.2*label + 0.6*position + 0.2*sort) followed by an
    evaluation pass, logging losses and accuracies to TensorBoard.
    """

    def __init__(self, root, weight):
        self.summaryWriter = SummaryWriter('logs')
        self.train_dataset = MyDataset(root=root, is_Train=True)
        self.test_dataset = MyDataset(root=root, is_Train=False)
        self.train_dataLoader = DataLoader(self.train_dataset, batch_size=50, shuffle=True)
        self.test_dataLoader = DataLoader(self.test_dataset, batch_size=50, shuffle=True)
        self.net = My_net().to(Device)
        # Resume from a checkpoint only when one is present on disk.
        if os.path.exists(weight):
            self.net.load_state_dict(torch.load(weight))
        self.opt = optim.Adam(self.net.parameters())
        self.label_loss_fun = nn.BCEWithLogitsLoss()   # objectness: raw logit in, sigmoid inside
        self.position_loss_fun = nn.MSELoss()          # box regression
        self.sort_loss_fun = nn.CrossEntropyLoss()     # 20-way classification
        self.train = True
        self.text = True  # NOTE: kept original attribute name (likely a typo for "test")

    def _forward_losses(self, img, label, position, sort):
        """Run one batch through the net and compute the three losses.

        Returns (out_label, out_sort, sort, label_loss, position_loss,
        sort_loss) where out_sort/sort are already filtered to valid entries.
        """
        out_label, out_position, out_sort = self.net(img)
        label_loss = self.label_loss_fun(out_label, label)
        position_loss = self.position_loss_fun(out_position, position)
        # FIX: the original filtered `sort` first and then reused the
        # ALREADY-FILTERED tensor to build the mask for `out_sort`, so the two
        # tensors went out of alignment. Build the mask once, apply to both.
        mask = sort >= 0
        sort = sort[mask]
        out_sort = out_sort[mask]
        sort_loss = self.sort_loss_fun(out_sort, sort)
        return out_label, out_sort, sort, label_loss, position_loss, sort_loss

    def __call__(self):
        index1, index2 = 0, 0
        for epoch in range(1000):
            if self.train:
                self.net.train()
                for i, (img, label, position, sort) in enumerate(self.train_dataLoader):
                    img, label, position, sort = img.to(Device), label.to(Device), position.to(Device), sort.to(Device)
                    _, _, _, label_loss, position_loss, sort_loss = self._forward_losses(img, label, position, sort)
                    train_loss = 0.2 * label_loss + 0.6 * position_loss + 0.2 * sort_loss
                    self.opt.zero_grad()
                    train_loss.backward()
                    self.opt.step()
                    if i % 10 == 0:
                        # FIX: original string lacked the f-prefix, printing '{i}' literally.
                        print(f'train_loss{i}=====>', train_loss.item())
                        self.summaryWriter.add_scalar('train_loss', train_loss, index1)
                        index1 += 1
                date_time = str(datetime.datetime.now()).replace(' ', '-').replace(':', '_').replace('.', '_')
                torch.save(self.net.state_dict(), f'param/{date_time}-{epoch}.pt')
            if self.text:
                # FIX: the original evaluated on train_dataLoader in train()
                # mode with gradients enabled; use the test loader, eval mode,
                # and no_grad.
                self.net.eval()
                sum_sort_acc, sum_label_acc = 0, 0
                batches = 0
                with torch.no_grad():
                    for i, (img, label, position, sort) in enumerate(self.test_dataLoader):
                        img, label, position, sort = img.to(Device), label.to(Device), position.to(Device), sort.to(Device)
                        out_label, out_sort, sort, label_loss, position_loss, sort_loss = \
                            self._forward_losses(img, label, position, sort)
                        test_loss = label_loss + position_loss + sort_loss
                        # FIX: no torch.tensor() wrapper around an existing tensor.
                        out_label = torch.sigmoid(out_label)
                        out_label = (out_label >= 0.5).float()
                        # FIX: argmax needs dim=1; without it the batch was
                        # flattened to a single scalar index.
                        pred_sort = torch.argmax(torch.softmax(out_sort, dim=1), dim=1)
                        sum_label_acc += torch.mean(torch.eq(out_label, label).float())
                        sum_sort_acc += torch.mean(torch.eq(pred_sort, sort).float())
                        batches += 1
                        if i % 10 == 0:
                            print(f'test_loss{i}=====>', test_loss.item())
                            self.summaryWriter.add_scalar('test_loss', test_loss, index2)
                            index2 += 1
                # FIX: divide by the number of batches, not the last index i
                # (off-by-one, and crashes when the loader yields one batch).
                avg_sort_acc = sum_sort_acc / batches
                avg_label_acc = sum_label_acc / batches
                print(f'avg_sort_acc{epoch}===>', avg_sort_acc)
                # FIX: original printed avg_sort_acc under the label-accuracy tag.
                print(f'avg_label_acc{epoch}===>', avg_label_acc)
                self.summaryWriter.add_scalar('avg_sort_acc', avg_sort_acc, epoch)
                self.summaryWriter.add_scalar('avg_label_acc', avg_label_acc, epoch)
if __name__ == '__main__':
    # FIX: the original bound the instance to the name `train`, shadowing the
    # class itself; also use a raw string for the Windows path (same runtime
    # characters, no invalid-escape warnings).
    trainer = train(r'F:\Artificial Intelligence\Target detection\yellow_data\yellow_data',
                    'param/2021-10-19-23_38_55_992635-0.pt')
    trainer()
若未有weight文件的话,先注释掉weight部分,否则代码无法运行,具体可以看分享的学习资料。
(4).预测函数
import os
import torch
import cv2
from net import My_net
if __name__ == '__main__':
    test_dir = r'F:\Artificial Intelligence\\Target detection\yellow_data\yellow_data\\test'
    # FIX: build and load the model ONCE outside the loop (the original
    # reloaded the checkpoint from disk for every image) and switch to eval.
    model = My_net()
    model.load_state_dict(torch.load('param/2021-10-20-08_13_42_378776-0.pt'))
    model.eval()
    for i in img_name if False else os.listdir(test_dir):
        img_dir = os.path.join(test_dir, i)
        img = cv2.imread(img_dir)
        # Ground-truth annotation parsed from the file name.
        position = [int(j) for j in i.split('.')[2:6]]
        sort = i.split('.')[6]
        # FIX: build the network input BEFORE drawing on the image — the
        # original drew the ground-truth box first, so the model was fed the
        # annotated image.
        new_img = torch.unsqueeze(torch.tensor(img).permute(2, 0, 1), dim=0) / 255
        with torch.no_grad():
            # FIX: My_net.forward returns (label, position, sort); the original
            # unpacked (label, sort, position), swapping boxes and classes.
            out_label, out_position, out_sort = model(new_img)
        out_label = torch.sigmoid(out_label)
        # FIX: argmax needs dim=1 to select the class for each sample.
        out_sort = torch.argmax(torch.softmax(out_sort, dim=1), dim=1)
        out_position = [int(j) for j in out_position[0] * 300]
        # Draw ground truth (green box + class text) ...
        cv2.rectangle(img, (position[0], position[1]), (position[2], position[3]), (0, 255, 0), thickness=2)
        cv2.putText(img, sort, (position[0], position[1] - 3), cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 255, 0), thickness=2)
        # ... and the prediction when the objectness score passes 0.5.
        if out_label.item() > 0.5:
            cv2.rectangle(img, (out_position[0], out_position[1]), (out_position[2], out_position[3]), (0, 255, 0))
            cv2.putText(img, str(out_sort.item()), (out_position[0], out_position[1]), cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 0, 255))
        cv2.imshow('img', img)
        cv2.waitKey(500)
    cv2.destroyAllWindows()
10.20学习总结
Original: https://blog.csdn.net/qq_52533790/article/details/120872424
Author: 风声向寂
Title: 目标检测—全卷积实现
原创文章受到原创版权保护。转载请注明出处:https://www.johngo689.com/687710/
转载文章受原作者版权保护。转载请注明原作者出处!