NIN (Network In Network)

Paper: Network In Network (Min Lin et al., ICLR 2014).

The linear filter used in a conventional CNN is a generalized linear model (GLM). Using such filters for feature extraction implicitly assumes that the latent features are linearly separable, yet practical problems are often not linearly separable. A CNN compensates by stacking more convolutional filters to build higher-level representations. The authors argue that, instead of only adding convolution layers as before, the convolution layer itself can be redesigned so that the network extracts better features within each receptive field.

mlpconv

A maxout unit can fit any convex function, and hence any activation function (implicitly assuming activations are convex). NIN goes further: its micro-network can approximate not only any convex function but any function, because it is essentially a small fully connected neural network.
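A quick numerical illustration of the convexity claim (illustrative only, not from the paper): a maxout unit is the pointwise max of several affine functions, i.e. the upper envelope of lines, so it can approximate any convex function such as f(x) = x²:

```python
import numpy as np

# A maxout unit computes max_i (w_i * x + b_i): the upper envelope of
# affine functions, which is always convex. With enough pieces it can
# approximate any convex function, e.g. f(x) = x^2.
xs = np.linspace(-1.0, 1.0, 201)

# Tangent lines of x^2 at anchor points t: y = 2*t*x - t^2.
anchors = np.linspace(-1.0, 1.0, 9)
pieces = np.stack([2 * t * xs - t ** 2 for t in anchors])  # shape (9, 201)

maxout = pieces.max(axis=0)           # pointwise max over the pieces
err = np.abs(maxout - xs ** 2).max()  # gap between envelope and x^2
```

The error shrinks as more affine pieces are added; with anchors spaced h apart, the worst-case gap of the tangent envelope is (h/2)².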

NIN uses a multilayer perceptron because the MLP structure is compatible with CNNs: it can be trained with backpropagation, and it is itself a deep model, consistent with the idea of feature re-use. A network layer built from such an MLP is called an mlpconv layer. An MLP can fit functions of arbitrary form, linear or nonlinear.

The difference between a linear convolution layer and an mlpconv layer is shown in the figure:

[Figure: linear convolution layer vs. mlpconv layer]

mlpconv keeps ReLU; the activation function is not replaced. What changes is the convolution itself: instead of an element-wise product, each receptive field is processed by a nonlinear MLP with ReLU. The goal is to introduce more nonlinearity.
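As an illustrative sketch (not the paper's exact layer sizes), an mlpconv layer can be written as an ordinary spatial convolution followed by two 1x1 "MLP" layers, with ReLU after every stage:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0)

def conv2d_valid(x, w, b):
    """Naive 'valid' convolution: x is H x W x C_in, w is k x k x C_in x C_out."""
    H, W, _ = x.shape
    k, _, _, C_out = w.shape
    out = np.empty((H - k + 1, W - k + 1, C_out))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = x[i:i + k, j:j + k, :]                    # one receptive field
            out[i, j] = np.tensordot(patch, w, axes=3) + b    # linear filter
    return out

def mlpconv(x, w_conv, b_conv, w1, b1, w2, b2):
    """Spatial conv + two 1x1 'MLP' layers, ReLU after each stage."""
    h = relu(conv2d_valid(x, w_conv, b_conv))
    h = relu(h @ w1 + b1)   # 1x1 conv == per-pixel fully connected layer
    h = relu(h @ w2 + b2)
    return h

x = rng.standard_normal((8, 8, 3))                  # toy RGB input
w_conv = rng.standard_normal((3, 3, 3, 16)) * 0.1
w1 = rng.standard_normal((16, 16)) * 0.1
w2 = rng.standard_normal((16, 16)) * 0.1
b_conv = b1 = b2 = np.zeros(16)

y = mlpconv(x, w_conv, b_conv, w1, b1, w2, b2)      # shape (6, 6, 16)
```

Each spatial position of the conv output is passed through the same small MLP, which is exactly what stacking 1x1 convolutions achieves.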

The NIN structure is shown in the figure below:

[Figure: overall NIN architecture]

The first convolution kernel is 11x11x3x96, so the convolution output on one patch is a 1x1x96 feature map (a 96-dimensional vector). Another MLP layer is then attached, whose output is again 96-dimensional. This MLP layer is therefore equivalent to a 1x1 convolution layer, so the engineering implementation follows the usual convolution machinery and needs no extra work.
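To see why a per-pixel MLP layer is exactly a 1x1 convolution, here is a small numpy check (shapes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Feature map: H x W x C_in (e.g. the 96-channel output of the first conv).
H, W, C_in, C_out = 4, 4, 96, 96
x = rng.standard_normal((H, W, C_in))

# Weights of one "MLP layer" applied independently at every spatial position.
W1 = rng.standard_normal((C_in, C_out))
b1 = rng.standard_normal(C_out)

# View 1: a per-pixel fully connected layer (explicit loop over positions).
mlp_out = np.empty((H, W, C_out))
for i in range(H):
    for j in range(W):
        mlp_out[i, j] = np.maximum(x[i, j] @ W1 + b1, 0)  # ReLU

# View 2: a 1x1 convolution, i.e. one matrix multiply over all positions.
conv_out = np.maximum(x.reshape(-1, C_in) @ W1 + b1, 0).reshape(H, W, C_out)
```

The two views produce identical outputs, which is why frameworks implement mlpconv with stacked 1x1 convolutions.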

Global Average Pooling

A traditional CNN uses convolution in the lower layers; for a classification task, the feature maps from the final convolution layer are vectorized and fed into fully connected layers, then classified by softmax regression. The convolutional layers thus act as a feature extractor bridged to a conventional classifier. The fully connected stage is prone to overfitting, which hurts the generalization of the whole network, so some regularization method is usually needed.

In a traditional CNN it is hard to interpret how the category-level error from the last fully connected layer is propagated back to the preceding convolution layers. Global average pooling is easier to interpret. Moreover, fully connected layers overfit easily and typically rely on regularization methods such as dropout.

The idea of global average pooling is simple: make the final convolution layer produce exactly as many feature maps as the classification task has classes. The mean value of each feature map is taken as the confidence of the corresponding class, playing the role of the feature vector an FC layer would output, and is then passed through softmax. Its advantages:

  1. Far fewer parameters, which reduces overfitting (applied to AlexNet, the model shrinks from 230 MB to 29 MB);
  2. It fits the convolutional structure better, mapping feature maps directly to category information;
  3. The sum-and-average operation aggregates spatial information, making the network more robust to spatial transformations of the input (an FC layer attached to a conv layer flattens the features in order, which may destroy their positional information);
  4. An FC layer's input size must be fixed, which constrains the network's input image size; global average pooling removes this constraint.
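The classifier head described above can be sketched in a few lines of numpy (shapes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Final conv layer outputs one feature map per class: H x W x num_classes.
H, W, num_classes = 6, 6, 10
feature_maps = rng.standard_normal((H, W, num_classes))

# Global average pooling: average each map into a single confidence value.
logits = feature_maps.mean(axis=(0, 1))  # shape: (num_classes,)

# Softmax over the pooled vector gives the class distribution.
probs = np.exp(logits - logits.max())
probs /= probs.sum()
```

Note that the pooling works for any H x W, which is why the fixed-input-size constraint of FC layers disappears.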

The difference between an FC layer and global average pooling is shown below:

[Figure: fully connected layer vs. global average pooling]

Global average pooling can be used for tasks such as image classification and object detection.

Global average pooling is implemented with average pooling whose kernel_size equals the spatial size of the feature map. The Caffe prototxt definition is as follows:

layers {
  bottom: "cccp8"
  top: "pool4"
  name: "pool4"
  type: POOLING
  pooling_param {
    pool: AVE
    #kernel_size: 6
    #stride: 1
    # -- older Caffe requires explicit kernel_size & stride --
    global_pooling: true
  }
}

Caffe added support for global_pooling after this paper: specifying global_pooling: true in pooling_param is enough, with no kernel size needed and the pad and stride left at their defaults (pad = 0, stride = 1, otherwise an error is raised). The kernel size is taken automatically from the feature map, as in the code:

// Caffe PoolingLayer::Reshape: with global_pooling, the kernel
// covers the whole input feature map.
if (global_pooling_) {
    kernel_h_ = bottom[0]->height();
    kernel_w_ = bottom[0]->width();
}

The paper's Caffe model parameter definition is available as a gist.

The model structure diagrams were produced with an online tool for visualizing Caffe prototxt model structures.

Original: https://www.cnblogs.com/makefile/p/nin.html
Author: 康行天下
Title: NIN (Network In Network)
