NanoDet Code Line-by-Line Walkthrough and Modification (Part 2): FPN/PAN

July 12, 2023, 2:05 AM • Artificial Intelligence

–neozng1@hnu.edu.cn

2. Neck

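To push inference speed as far as possible, the previous version of NanoDet used a convolution-free PAN: both the top-down and bottom-up paths were implemented directly as bilinear-interpolation up/down-sampling followed by an element-wise add, which predictably cost accuracy. In NanoDet-Plus, the author uses the Ghost module for feature fusion and builds Ghost-PAN, which improves multi-scale object detection without adding much to the parameter count or the amount of computation.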

Ghost PAN reuses several modules from GhostNet; see Part 1 of this series for the details.

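Building on the Ghost bottleneck, the author adds a reduce layer (reduce_conv in the code below) to reduce the number of channels, forming GhostBlocks. This is the operation used for fusion in both the top-down and bottom-up paths. A residual connection can optionally be enabled. The Ghost bottleneck inside GhostBlocks uses a 5×5 kernel, which enlarges the receptive field and helps fuse features from different scales.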

import torch.nn as nn

# ConvModule and GhostBottleneck are NanoDet's own modules (GhostBottleneck is covered in Part 1).


class GhostBlocks(nn.Module):
    """Stack of GhostBottleneck used in GhostPAN.

    Args:
        in_channels (int): Number of input channels.
        out_channels (int): Number of output channels.
        expand (int): Expand ratio of GhostBottleneck. Default: 1.
        kernel_size (int): Kernel size of depthwise convolution. Default: 5.
        num_blocks (int): Number of GhostBottleneck blocks. Default: 1.
        use_res (bool): Whether to use residual connection. Default: False.
        activation (str): Name of activation function. Default: LeakyReLU.
    """

    def __init__(
        self,
        in_channels,
        out_channels,
        expand=1,
        kernel_size=5,
        num_blocks=1,
        use_res=False,
        activation="LeakyReLU",
    ):
        super(GhostBlocks, self).__init__()
        self.use_res = use_res
        if use_res:
            # If a residual connection is used, a pointwise (1x1) conv aligns the channel count
            self.reduce_conv = ConvModule(
                in_channels,
                out_channels,
                kernel_size=1,
                stride=1,
                padding=0,
                activation=activation,
            )
        blocks = []
        for _ in range(num_blocks):
            blocks.append(
                GhostBottleneck(
                    in_channels,
                    int(out_channels * expand),  # hidden channels; with the default expand=1 the first ghost module does not expand them
                    out_channels,
                    dw_kernel_size=kernel_size,
                    activation=activation,
                )
            )
        self.blocks = nn.Sequential(*blocks)

    def forward(self, x):
        out = self.blocks(x)
        if self.use_res:
            out = out + self.reduce_conv(x)
        return out

Replacing the convolutions that PAN uses for feature fusion with Ghost Blocks gives Ghost PAN.

* Initialization and parameters

class GhostPAN(nn.Module):
    """Path Aggregation Network with Ghost block.

    Ghost blocks replace the plain convolutions used for feature fusion.

    Args:
        in_channels (List[int]): Number of input channels per scale.
        out_channels (int): Number of output channels (used at each scale).
            All scales share the same channel count, so the detection
            heads receive uniform inputs.
        use_depthwise (bool): Whether to use depthwise separable convolution in
            blocks. Default: False
        kernel_size (int): Kernel size of depthwise convolution. Default: 5.
        expand (int): Expand ratio of GhostBottleneck. Default: 1.
        num_blocks (int): Number of GhostBottleneck blocks. Default: 1.
        use_res (bool): Whether to use residual connection. Default: False.
        num_extra_level (int): Number of extra conv layers for more feature levels.
            Default: 0.
        upsample_cfg (dict): Config dict for interpolate layer.
            Default: dict(scale_factor=2, mode='bilinear')
        norm_cfg (dict): Config dict for normalization layer.
            Default: dict(type='BN')
        activation (str): Activation layer name.
            Default: LeakyReLU.
    """

    def __init__(
        self,
        in_channels,
        out_channels,
        use_depthwise=False,
        kernel_size=5,
        expand=1,
        num_blocks=1,
        use_res=False,
        num_extra_level=0,
        upsample_cfg=dict(scale_factor=2, mode="bilinear"),
        norm_cfg=dict(type="BN"),
        activation="LeakyReLU",
    ):
        super(GhostPAN, self).__init__()
        assert num_extra_level >= 0
        assert num_blocks >= 1
        self.in_channels = in_channels
        self.out_channels = out_channels
        # DepthwiseConvModule and ConvModule are basic modules from MMDetection: a depthwise separable convolution and a plain conv + norm + act module, respectively
        conv = DepthwiseConvModule if use_depthwise else ConvModule
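
To get a feel for what the use_depthwise switch buys, here is a rough weight count for one 5×5 layer. It is only a sketch: it assumes DepthwiseConvModule is a depthwise convolution followed by a 1×1 pointwise convolution (the usual meaning of "depthwise separable"), it ignores bias and normalization parameters, and the channel count of 96 is an illustrative value that does not come from this article. The helper conv_weights is hypothetical.

```python
def conv_weights(in_ch, out_ch, kernel_size, groups=1):
    """Weight count of a bias-free 2D convolution."""
    return (in_ch // groups) * out_ch * kernel_size * kernel_size


channels, kernel_size = 96, 5  # illustrative values (assumptions)

# Plain convolution: every output channel sees every input channel.
standard = conv_weights(channels, channels, kernel_size)

# Depthwise separable: one k x k filter per channel, then a 1x1 pointwise conv.
depthwise = conv_weights(channels, channels, kernel_size, groups=channels)
pointwise = conv_weights(channels, channels, 1)
separable = depthwise + pointwise

print(standard, separable)  # 230400 11616
```

Under these assumptions the separable variant needs roughly one twentieth of the weights, which is why it is attractive for the stride-2 downsampling layers of a lightweight neck.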

* top-down connections

        # build top-down blocks
        self.upsample = nn.Upsample(**upsample_cfg)
        # Reduce the channel count of each stage's features before they enter the FPN, to cut computation
        self.reduce_layers = nn.ModuleList()
        for idx in range(len(in_channels)):
            self.reduce_layers.append(
                ConvModule(
                    in_channels[idx],
                    out_channels,
                    1,
                    norm_cfg=norm_cfg,
                    activation=activation,
                )
            )
        self.top_down_blocks = nn.ModuleList()
        # Note the indexing: idx runs from the last level down to 1, giving len(in_channels) - 1 blocks
        for idx in range(len(in_channels) - 1, 0, -1):
            self.top_down_blocks.append(
                GhostBlocks(
                    # the input channel count is out_channels * 2 because features are fused with cat, not add
                    out_channels * 2,
                    out_channels,
                    expand,
                    kernel_size=kernel_size,
                    num_blocks=num_blocks,
                    use_res=use_res,
                    activation=activation,
                )
            )
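
The indexing of the top-down blocks is easy to check without any framework. The sketch below only does bookkeeping on integers; in_channels = [116, 232, 464] and out_channels = 96 are illustrative assumptions, not values given in this article.

```python
# Illustrative values only (assumed, not taken from the article).
in_channels = [116, 232, 464]
out_channels = 96

# After reduce_layers every level has out_channels channels.
reduced = [out_channels for _ in in_channels]

# Same loop header as in GhostPAN.__init__: visits the levels from top to bottom.
visit_order = list(range(len(in_channels) - 1, 0, -1))

# The forward pass picks the block with index len(in_channels) - 1 - idx.
block_index = [len(in_channels) - 1 - idx for idx in visit_order]

# Concatenating two reduced maps doubles the channels fed to each GhostBlocks.
fused_in = [reduced[idx] + reduced[idx - 1] for idx in visit_order]

print(visit_order)  # [2, 1]
print(block_index)  # [0, 1]
print(fused_in)     # [192, 192]
```

So for three input levels there are two top-down blocks, block 0 fuses levels 2 and 1, and each block receives out_channels * 2 channels.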

* bottom-up connections

        # build bottom-up blocks
        self.downsamples = nn.ModuleList()
        self.bottom_up_blocks = nn.ModuleList()
        for idx in range(len(in_channels) - 1):
            self.downsamples.append(
                conv(
                    out_channels,
                    out_channels,
                    kernel_size,
                    stride=2,
                    padding=kernel_size // 2,
                    norm_cfg=norm_cfg,
                    activation=activation,
                )
            )
            self.bottom_up_blocks.append(
                GhostBlocks(
                    out_channels * 2,  # again because fusion uses cat
                    out_channels,
                    expand,
                    kernel_size=kernel_size,
                    num_blocks=num_blocks,
                    use_res=use_res,
                    activation=activation,
                )
            )

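The bottom-up connections work like the top-down ones, only in the opposite direction. This becomes clear in the forward() method shown later, so no diagram is drawn here (drawing figures in PPT is really slow and exhausting).
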
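
The downsamples layers use stride=2 with padding=kernel_size // 2. A quick check with the standard convolution output-size formula shows that this halves even spatial sizes for odd kernels such as 3 or 5. The function name conv_out_size and the sizes 40 and 20 are illustrative assumptions.

```python
def conv_out_size(size, kernel_size=5, stride=2):
    """Output length of a convolution along one spatial axis (dilation 1)."""
    padding = kernel_size // 2  # same padding rule as in self.downsamples
    return (size + 2 * padding - kernel_size) // stride + 1


results = {
    (k, size): conv_out_size(size, kernel_size=k)
    for k in (3, 5)
    for size in (40, 20)
}
print(results)  # {(3, 40): 20, (3, 20): 10, (5, 40): 20, (5, 20): 10}
```

Because each downsampled map ends up with the same size as the next higher level, the two can be concatenated along the channel axis in the bottom-up path.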
* extra layers

Extra layers are additional levels on top of the PAN, obtained by applying convolutions to the PAN's topmost level.

        self.extra_lvl_in_conv = nn.ModuleList()
        self.extra_lvl_out_conv = nn.ModuleList()
        for i in range(num_extra_level):
            self.extra_lvl_in_conv.append(
                conv(
                    out_channels,
                    out_channels,
                    kernel_size,
                    stride=2,
                    padding=kernel_size // 2,
                    norm_cfg=norm_cfg,
                    activation=activation,
                )
            )
            self.extra_lvl_out_conv.append(
                conv(
                    out_channels,
                    out_channels,
                    kernel_size,
                    stride=2,
                    padding=kernel_size // 2,
                    norm_cfg=norm_cfg,
                    activation=activation,
                )
            )

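The extra layer is the element-wise sum of two tensors: the topmost feature map after reduce_layers passed through extra_lvl_in_conv, and the topmost feature map output by the bottom-up path passed through extra_lvl_out_conv. It is later used by the detection head with the largest scale (the highest downsampling rate).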
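
The element-wise add only works if both branches produce the same spatial size. The sketch below follows the sizes through the loop that builds the extra levels, using the same output-size arithmetic as the stride-2 convolutions above. The top-level size of 10 (for example a 320-pixel input at stride 32) is an assumption, and halve is a hypothetical helper.

```python
def halve(size, kernel_size=5):
    """Spatial size after a stride-2 conv with padding = kernel_size // 2."""
    return (size + 2 * (kernel_size // 2) - kernel_size) // 2 + 1


top_input = 10   # assumed size of inputs[-1] (e.g. 320 / 32)
top_output = 10  # outs[-1] has the same size after the bottom-up path

sizes = []
for level in range(2):  # pretend num_extra_level = 2
    from_inputs = halve(top_input)   # extra_lvl_in_conv(inputs[-1])
    from_outs = halve(top_output)    # extra_lvl_out_conv(outs[-1])
    sizes.append((from_inputs, from_outs))
    top_output = from_outs           # outs[-1] is now the new extra level

print(sizes)  # [(5, 5), (5, 3)]
```

For one extra level both branches give a 5×5 map, so the sum is well defined. In this arithmetic a second level would pair a 5×5 map with a 3×3 one, which suggests the code as shown is meant to be used with num_extra_level = 1; this is an observation from the sketch, not something the article states.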

* Forward pass

    def forward(self, inputs):
        """
        Args:
            inputs (tuple[Tensor]): input features.

        Returns:
            tuple[Tensor]: multi level features.
        """
        assert len(inputs) == len(self.in_channels)
        # Pass the feature of each stage through its corresponding reduce layer
        inputs = [
            reduce(input_x) for input_x, reduce in zip(inputs, self.reduce_layers)
        ]
        # top-down path
        inner_outs = [inputs[-1]]  # the topmost level of the top-down path needs no processing

        for idx in range(len(self.in_channels) - 1, 0, -1):
            # fuse the features of two adjacent levels
            feat_high = inner_outs[0]
            feat_low = inputs[idx - 1]

            inner_outs[0] = feat_high  # no-op, kept from the original code

            # upsample feat_high to the resolution of feat_low
            upsample_feat = self.upsample(feat_high)

            # concatenate and feed into the matching top_down_block; the result is used for further fusion
            inner_out = self.top_down_blocks[len(self.in_channels) - 1 - idx](
                torch.cat([upsample_feat, feat_low], 1)
            )
            # insert the new feature at the front of inner_outs for the next round of fusion
            inner_outs.insert(0, inner_out)

        # bottom-up path, analogous to the top-down path
        outs = [inner_outs[0]]  # inner_outs[0] is the lowest-level feature
        for idx in range(len(self.in_channels) - 1):
            feat_low = outs[-1]  # take the last element; each iteration appends the newly generated feature to the list
            feat_high = inner_outs[idx + 1]
            downsample_feat = self.downsamples[idx](feat_low)  # downsample
            # concatenate and feed into the fusion block to get the output
            out = self.bottom_up_blocks[idx](
                torch.cat([downsample_feat, feat_high], 1)
            )
            outs.append(out)

        # extra layers
        # feed the topmost feature after reduce_layers directly into extra_in_layer,
        # feed the topmost GhostPAN output into extra_out_layer,
        # then append their element-wise sum to the PAN outputs
        for extra_in_layer, extra_out_layer in zip(
            self.extra_lvl_in_conv, self.extra_lvl_out_conv
        ):
            outs.append(extra_in_layer(inputs[-1]) + extra_out_layer(outs[-1]))

        return tuple(outs)
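
To see the whole forward pass at a glance, the following framework-free sketch traces only (channels, height, width) tuples through the same steps. It is not the NanoDet implementation: GhostBlocks is reduced to "maps 2 * out_channels to out_channels", the upsample is assumed to use scale_factor=2, and the input shapes in the usage line are illustrative assumptions. The function name trace_ghost_pan is hypothetical.

```python
def trace_ghost_pan(in_shapes, out_channels, num_extra_level=0, kernel_size=5):
    """Follow (channels, height, width) through the steps of forward()."""

    def halve(size):
        # stride-2 conv with padding = kernel_size // 2
        return (size + 2 * (kernel_size // 2) - kernel_size) // 2 + 1

    # reduce_layers: 1x1 conv, only the channel count changes
    inputs = [(out_channels, h, w) for _, h, w in in_shapes]

    # top-down path
    inner_outs = [inputs[-1]]
    for idx in range(len(inputs) - 1, 0, -1):
        _, h, w = inner_outs[0]
        up = (out_channels, h * 2, w * 2)  # self.upsample with scale_factor=2
        low = inputs[idx - 1]
        assert up[1:] == low[1:], "spatial sizes must match before torch.cat"
        assert up[0] + low[0] == out_channels * 2  # channels after cat
        inner_outs.insert(0, (out_channels, low[1], low[2]))  # block output

    # bottom-up path
    outs = [inner_outs[0]]
    for idx in range(len(inputs) - 1):
        _, h, w = outs[-1]
        down = (out_channels, halve(h), halve(w))  # self.downsamples[idx]
        high = inner_outs[idx + 1]
        assert down[1:] == high[1:], "spatial sizes must match before torch.cat"
        outs.append((out_channels, high[1], high[2]))

    # extra levels: both branches must agree for the element-wise add
    for _ in range(num_extra_level):
        a = (out_channels, halve(inputs[-1][1]), halve(inputs[-1][2]))
        b = (out_channels, halve(outs[-1][1]), halve(outs[-1][2]))
        assert a == b, "element-wise add needs identical shapes"
        outs.append(a)

    return tuple(outs)


shapes = trace_ghost_pan(
    [(116, 40, 40), (232, 20, 20), (464, 10, 10)], 96, num_extra_level=1
)
print(shapes)  # ((96, 40, 40), (96, 20, 20), (96, 10, 10), (96, 5, 5))
```

Every output level ends up with out_channels channels, and the single extra level adds one more scale at half the resolution of the topmost input.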

Original: https://blog.csdn.net/NeoZng/article/details/123300977
Author: HNU跃鹿战队
Title: NanoDet Code Line-by-Line Walkthrough and Modification (Part 2): FPN/PAN

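Original articles are protected by copyright. When reposting, please credit the source: https://www.johngo689.com/686729/

Reposted articles are protected by the original author's copyright. When reposting, please credit the original author and source.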
