MobileNetV3 Explained: Building the Model in PyTorch and Training with Transfer Learning

1. MobileNetV3 Network Details

The paper introduces two network sizes, MobileNetV3-Large and MobileNetV3-Small; they differ mainly in channel counts and in the number of bneck blocks.

Innovations of the network:
(1) Updated block (bneck)
(2) Parameters found with NAS (Neural Architecture Search)
(3) Redesigned time-consuming layers

(1) Updated block (bneck)

The MobileNetV2 inverted residual block:

The MobileNetV3 inverted residual block (NL denotes a nonlinear activation function):
Compared with MobileNetV2, the MobileNetV3 inverted residual block adds a lightweight attention (squeeze-and-excitation) module.
The attention module works by re-weighting each channel: every channel of the feature map is first reduced to a single value by global average pooling.

Weighting procedure: the pooled 1-D vector has as many elements as the feature map has channels, and it is passed through two fully connected layers. The first FC layer has 1/4 as many nodes as there are channels; the second FC layer has exactly as many nodes as channels. The output vector can be read as a per-channel weight: channels judged important receive larger weights, and each channel of the feature map is then multiplied by its weight.
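A minimal numeric sketch of this re-weighting (illustrative sizes; plain Linear layers and a standard sigmoid are used here for clarity, whereas the actual module in model_v3.py below uses 1×1 convolutions and a hard sigmoid):

import torch
from torch import nn

c = 64                                       # channel count of the feature map (illustrative)
x = torch.randn(1, c, 28, 28)
v = x.mean(dim=(2, 3))                       # global average pool: one value per channel -> (1, 64)
fc1 = nn.Linear(c, c // 4)                   # first FC layer: C/4 nodes
fc2 = nn.Linear(c // 4, c)                   # second FC layer: back to C nodes
w = torch.sigmoid(fc2(torch.relu(fc1(v))))   # one weight in (0, 1) per channel
y = x * w[:, :, None, None]                  # rescale each channel by its weight
print(y.shape)                               # torch.Size([1, 64, 28, 28])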

Replace swish with h-swish

The swish activation, swish(x) = x · σ(x), improves accuracy, but the sigmoid is comparatively expensive to compute and to quantize on mobile hardware. MobileNetV3 therefore uses the "hard" approximations:

h-sigmoid(x) = ReLU6(x + 3) / 6
h-swish(x) = x · ReLU6(x + 3) / 6
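A quick sanity check (a small sketch, not part of the original post) that the written-out formula matches PyTorch's built-in Hardswish:

import torch
import torch.nn.functional as F

x = torch.randn(5)
manual = x * F.relu6(x + 3.0) / 6.0            # h-swish from the formula above
print(torch.allclose(manual, F.hardswish(x)))  # True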

(2) Parameters found with NAS (Neural Architecture Search)

The overall architecture is found with platform-aware NAS, and each layer is then fine-tuned with the NetAdapt algorithm.

(3) Redesigned time-consuming layers

(a) Reduce the number of kernels in the first convolution layer (from 32 to 16).
The authors report that with 16 kernels the accuracy is unchanged while computation drops, saving about 2 ms of inference time.
(b) Streamline the Last Stage

The final part of the NAS-searched network is the Original Last Stage. The authors found it comparatively time-consuming and streamlined it into the Efficient Last Stage, which keeps accuracy unchanged while saving about 7 ms of inference time.

2. MobileNetV3-Large Network Structure

(Table: MobileNetV3-Large specification from the paper, with columns Input / Operator / exp size / #out / SE / NL / s.)
Column 1 (Input) is the shape of the feature map entering each layer.
Column 2 (Operator) is the block the feature map passes through next; most of the feature extraction goes through bneck blocks, and NBN means no batch normalization is used.
Columns 3 and 4 (exp size, #out) are the expanded channel count inside the bneck's inverted residual and the number of output channels of the bneck, respectively.
Column 5 (SE) marks whether the attention module is used in that layer.
Column 6 (NL) is the type of nonlinearity: HS is h-swish, RE is ReLU.
Column 7 (s) is the stride used by the block.

Note: in the first bneck (input 112×112×16), the input channel count equals exp size, so no expansion is needed and the 1×1 expansion convolution is omitted from that inverted residual block.

3. MobileNetV3-Small Network Structure

(Table: MobileNetV3-Small specification from the paper, with the same columns as above.)
Compared with MobileNetV3-Large, both the number of bneck blocks and the channel counts are reduced.

4. Building MobileNetV3 in PyTorch

File structure:

MobileNetv2
  ├── model_v2.py:           MobileNetV2 model definition
  ├── model_v3.py:           MobileNetV3 model definition
  ├── train.py:              training script
  └── predict.py:            image prediction script

(1) model_v3.py

Define the basic convolution block (Conv2d + BatchNorm + activation). The imports below cover every snippet in model_v3.py:

from functools import partial
from typing import Callable, List, Optional

import torch
from torch import nn, Tensor
from torch.nn import functional as F


class ConvBNActivation(nn.Sequential):
    def __init__(self,
                 in_planes: int,
                 out_planes: int,
                 kernel_size: int = 3,
                 stride: int = 1,
                 groups: int = 1,
                 norm_layer: Optional[Callable[..., nn.Module]] = None,
                 activation_layer: Optional[Callable[..., nn.Module]] = None):
        padding = (kernel_size - 1) // 2
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        if activation_layer is None:
            activation_layer = nn.ReLU6
        super(ConvBNActivation, self).__init__(nn.Conv2d(in_channels=in_planes,
                                                         out_channels=out_planes,
                                                         kernel_size=kernel_size,
                                                         stride=stride,
                                                         padding=padding,
                                                         groups=groups,
                                                         bias=False),
                                               norm_layer(out_planes),
                                               activation_layer(inplace=True))
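The snippets below also rely on the _make_divisible helper, which rounds a channel count to the nearest multiple of 8 for hardware efficiency. It is the standard helper from the MobileNet reference implementations, reproduced here so the code is self-contained:

def _make_divisible(ch, divisor=8, min_ch=None):
    """Round ch to the nearest multiple of divisor, staying within 10% of the original value."""
    if min_ch is None:
        min_ch = divisor
    new_ch = max(min_ch, int(ch + divisor / 2) // divisor * divisor)
    # make sure that rounding down does not reduce the channels by more than 10%
    if new_ch < 0.9 * ch:
        new_ch += divisor
    return new_ch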

Define the attention (squeeze-and-excitation) module:

class SqueezeExcitation(nn.Module):
    def __init__(self, input_c: int, squeeze_factor: int = 4):
        super(SqueezeExcitation, self).__init__()
        squeeze_c = _make_divisible(input_c // squeeze_factor, 8)
        self.fc1 = nn.Conv2d(input_c, squeeze_c, 1)
        self.fc2 = nn.Conv2d(squeeze_c, input_c, 1)

    def forward(self, x: Tensor) -> Tensor:
        scale = F.adaptive_avg_pool2d(x, output_size=(1, 1))
        scale = self.fc1(scale)
        scale = F.relu(scale, inplace=True)
        scale = self.fc2(scale)
        scale = F.hardsigmoid(scale, inplace=True)
        return scale * x
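A quick shape check (illustrative): SqueezeExcitation returns a tensor of the same shape as its input, with every channel rescaled by its learned weight:

se = SqueezeExcitation(64)          # squeeze_c = _make_divisible(64 // 4, 8) = 16
x = torch.randn(1, 64, 28, 28)
print(se(x).shape)                  # torch.Size([1, 64, 28, 28])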

Define the per-block configuration of the network:

class InvertedResidualConfig:
    def __init__(self,
                 input_c: int,
                 kernel: int,
                 expanded_c: int,
                 out_c: int,
                 use_se: bool,
                 activation: str,
                 stride: int,
                 width_multi: float):
        self.input_c = self.adjust_channels(input_c, width_multi)
        self.kernel = kernel
        self.expanded_c = self.adjust_channels(expanded_c, width_multi)
        self.out_c = self.adjust_channels(out_c, width_multi)
        self.use_se = use_se
        self.use_hs = activation == "HS"
        self.stride = stride

    @staticmethod
    def adjust_channels(channels: int, width_multi: float):
        # scale a channel count by the width multiplier, rounded to a multiple of 8
        return _make_divisible(channels * width_multi, 8)

The MobileNetV3 inverted residual block:

class InvertedResidual(nn.Module):
    def __init__(self,
                 cnf: InvertedResidualConfig,
                 norm_layer: Callable[..., nn.Module]):
        super(InvertedResidual, self).__init__()

        if cnf.stride not in [1, 2]:
            raise ValueError("illegal stride value.")

        self.use_res_connect = (cnf.stride == 1 and cnf.input_c == cnf.out_c)

        layers: List[nn.Module] = []
        activation_layer = nn.Hardswish if cnf.use_hs else nn.ReLU

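        # 1x1 pointwise "expand" convolution; skipped when exp size equals the
        # input channel count (e.g. the first bneck of the Large model)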
        if cnf.expanded_c != cnf.input_c:
            layers.append(ConvBNActivation(cnf.input_c,
                                           cnf.expanded_c,
                                           kernel_size=1,
                                           norm_layer=norm_layer,
                                           activation_layer=activation_layer))

        layers.append(ConvBNActivation(cnf.expanded_c,
                                       cnf.expanded_c,
                                       kernel_size=cnf.kernel,
                                       stride=cnf.stride,
                                       groups=cnf.expanded_c,
                                       norm_layer=norm_layer,
                                       activation_layer=activation_layer))

        if cnf.use_se:
            layers.append(SqueezeExcitation(cnf.expanded_c))

        layers.append(ConvBNActivation(cnf.expanded_c,
                                       cnf.out_c,
                                       kernel_size=1,
                                       norm_layer=norm_layer,
                                       activation_layer=nn.Identity))

        self.block = nn.Sequential(*layers)
        self.out_channels = cnf.out_c
        self.is_strided = cnf.stride > 1

    def forward(self, x: Tensor) -> Tensor:
        result = self.block(x)
        if self.use_res_connect:
            result += x

        return result
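A quick check using the values of the first bneck of the Large model (illustrative; width_multi=1.0 reproduces the paper's channel counts): stride is 1 and input channels equal output channels, so the shortcut branch is used and the output shape matches the input:

cnf = InvertedResidualConfig(16, 3, 16, 16, False, "RE", 1, width_multi=1.0)
block = InvertedResidual(cnf, norm_layer=nn.BatchNorm2d)
x = torch.randn(1, 16, 112, 112)
print(block(x).shape)               # torch.Size([1, 16, 112, 112])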

Define the MobileNetV3 network (the configuration list passed in decides whether it is the Large or the Small variant):

class MobileNetV3(nn.Module):
    def __init__(self,
                 inverted_residual_setting: List[InvertedResidualConfig],
                 last_channel: int,
                 num_classes: int = 1000,
                 block: Optional[Callable[..., nn.Module]] = None,
                 norm_layer: Optional[Callable[..., nn.Module]] = None):
        super(MobileNetV3, self).__init__()

        if not inverted_residual_setting:
            raise ValueError("The inverted_residual_setting should not be empty.")
        elif not (isinstance(inverted_residual_setting, List) and
                  all([isinstance(s, InvertedResidualConfig) for s in inverted_residual_setting])):
            raise TypeError("The inverted_residual_setting should be List[InvertedResidualConfig]")

        if block is None:
            block = InvertedResidual

        if norm_layer is None:
            norm_layer = partial(nn.BatchNorm2d, eps=0.001, momentum=0.01)

        layers: List[nn.Module] = []

        firstconv_output_c = inverted_residual_setting[0].input_c
        layers.append(ConvBNActivation(3,
                                       firstconv_output_c,
                                       kernel_size=3,
                                       stride=2,
                                       norm_layer=norm_layer,
                                       activation_layer=nn.Hardswish))

        for cnf in inverted_residual_setting:
            layers.append(block(cnf, norm_layer))

        lastconv_input_c = inverted_residual_setting[-1].out_c
        lastconv_output_c = 6 * lastconv_input_c
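        # for the Large model: 160 * 6 = 960, matching the last conv layer in the table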
        layers.append(ConvBNActivation(lastconv_input_c,
                                       lastconv_output_c,
                                       kernel_size=1,
                                       norm_layer=norm_layer,
                                       activation_layer=nn.Hardswish))
        self.features = nn.Sequential(*layers)
        self.avgpool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Sequential(nn.Linear(lastconv_output_c, last_channel),
                                        nn.Hardswish(inplace=True),
                                        nn.Dropout(p=0.2, inplace=True),
                                        nn.Linear(last_channel, num_classes))

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode="fan_out")
                if m.bias is not None:
                    nn.init.zeros_(m.bias)
            elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):
                nn.init.ones_(m.weight)
                nn.init.zeros_(m.bias)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.zeros_(m.bias)

    def _forward_impl(self, x: Tensor) -> Tensor:
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.classifier(x)

        return x

    def forward(self, x: Tensor) -> Tensor:
        return self._forward_impl(x)

Define the MobileNetV3-Large configuration (MobileNetV3-Small follows the same pattern with fewer bnecks and smaller channel counts):

def mobilenet_v3_large(num_classes: int = 1000, reduced_tail: bool = False) -> MobileNetV3:
    width_multi = 1.0
    bneck_conf = partial(InvertedResidualConfig, width_multi=width_multi)
    adjust_channels = partial(InvertedResidualConfig.adjust_channels, width_multi=width_multi)
    # reduced_tail halves the channels of the last stage (useful for detection/segmentation backbones)
    reduce_divider = 2 if reduced_tail else 1

    inverted_residual_setting = [
        # input_c, kernel, expanded_c, out_c, use_se, activation, stride
        bneck_conf(16, 3, 16, 16, False, "RE", 1),
        bneck_conf(16, 3, 64, 24, False, "RE", 2),
        bneck_conf(24, 3, 72, 24, False, "RE", 1),
        bneck_conf(24, 5, 72, 40, True, "RE", 2),
        bneck_conf(40, 5, 120, 40, True, "RE", 1),
        bneck_conf(40, 5, 120, 40, True, "RE", 1),
        bneck_conf(40, 3, 240, 80, False, "HS", 2),
        bneck_conf(80, 3, 200, 80, False, "HS", 1),
        bneck_conf(80, 3, 184, 80, False, "HS", 1),
        bneck_conf(80, 3, 184, 80, False, "HS", 1),
        bneck_conf(80, 3, 480, 112, True, "HS", 1),
        bneck_conf(112, 3, 672, 112, True, "HS", 1),
        bneck_conf(112, 5, 672, 160 // reduce_divider, True, "HS", 2),
        bneck_conf(160 // reduce_divider, 5, 960 // reduce_divider, 160 // reduce_divider, True, "HS", 1),
        bneck_conf(160 // reduce_divider, 5, 960 // reduce_divider, 160 // reduce_divider, True, "HS", 1),
    ]
    last_channel = adjust_channels(1280 // reduce_divider)

    return MobileNetV3(inverted_residual_setting=inverted_residual_setting,
                       last_channel=last_channel,
                       num_classes=num_classes)
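A quick smoke test of the factory function (illustrative):

net = mobilenet_v3_large(num_classes=5)
x = torch.randn(1, 3, 224, 224)
print(net(x).shape)                 # torch.Size([1, 5])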

Download the MobileNetV3 pretrained weights:

https://download.pytorch.org/models/mobilenet_v3_large-8738ca79.pth

After downloading, rename the file to mobilenet_v3_large.pth and place it in the project folder.

Dataset

The flower classification dataset is used; see the earlier post on building AlexNet in PyTorch and training it on the flower classification dataset.

(2) train.py

To use the v3 model, make the following changes.

Change the import:

from model_v2 import MobileNetV2

becomes:

from model_v3 import mobilenet_v3_large

Change the network instantiation and the pretrained-weight path:

    model_weight_path = "./mobilenet_v3_large.pth"
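A sketch of the surrounding instantiation and weight-loading code (following the usual transfer-learning pattern of this tutorial series; the downloaded weights contain a 1000-class classifier, so entries whose shapes do not match the 5-class model are filtered out before loading):

    net = mobilenet_v3_large(num_classes=5)
    pre_weights = torch.load(model_weight_path, map_location='cpu')
    # keep only the entries whose shapes match the new model (drops the 1000-class classifier)
    pre_dict = {k: v for k, v in pre_weights.items()
                if k in net.state_dict() and net.state_dict()[k].numel() == v.numel()}
    net.load_state_dict(pre_dict, strict=False)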

Change the save path:

    save_path = './MobileNetV3.pth'

Freeze the backbone (feature-extractor) weights so that only the classifier is trained; omit this loop if you are not using the pretrained weights:

    for param in net.features.parameters():
        param.requires_grad = False
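With the backbone frozen, only the parameters that still require gradients are passed to the optimizer (a sketch; the Adam learning rate here is illustrative):

    params = [p for p in net.parameters() if p.requires_grad]
    optimizer = torch.optim.Adam(params, lr=0.0001)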

Training results

(Figure: training-run output, not reproduced here.)

(3) predict.py

Likewise, update the import, the model instantiation, and the weight-loading code.

Original: https://blog.csdn.net/STATEABC/article/details/123781280
Author: STATEABC
Title: MobileNetV3 Explained: Building the Model in PyTorch and Training with Transfer Learning
