Some Notes on Understanding torch.nn.AdaptiveAvgPool2d(), the Adaptive Average Pooling Function

July 21, 2023, 5:04 PM • Artificial Intelligence

Introduction to AdaptiveAvgPool2d()

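torch.nn.AdaptiveAvgPool2d() takes a single argument, output_size, which gives the height and width of the output feature map, either as a tuple (H, W) or as a single int for a square output. The number of channels is unchanged.
VGG (as implemented in torchvision) uses torch.nn.AdaptiveAvgPool2d((7,7)) at the boundary between its convolutional layers and its fully connected layers.
Here is the source code: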

class AdaptiveAvgPool2d(_AdaptiveAvgPoolNd):
    """Applies a 2D adaptive average pooling over an input signal composed of several input planes.

    The output is of size H x W, for any input size.

    The number of output features is equal to the number of input planes.

    Args:
        output_size: the target output size of the image of the form H x W.

                     Can be a tuple (H, W) or a single H for a square image H x H.

                     H and W can be either a ``int``, or ``None`` which means the size will
                     be the same as that of the input.

    Examples:
        >>> # target output size of 5x7
        >>> m = nn.AdaptiveAvgPool2d((5,7))
        >>> input = torch.randn(1, 64, 8, 9)
        >>> output = m(input)
        >>> # target output size of 7x7 (square)
        >>> m = nn.AdaptiveAvgPool2d(7)
        >>> input = torch.randn(1, 64, 10, 9)
        >>> output = m(input)
        >>> # target output size of 10x7
        >>> m = nn.AdaptiveAvgPool2d((None, 7))
        >>> input = torch.randn(1, 64, 10, 9)
        >>> output = m(input)
    """

    @weak_script_method
    def forward(self, input):
        return F.adaptive_avg_pool2d(input, self.output_size)
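
The forward method above simply hands the input to the functional API. As a quick check (a minimal sketch; the tensor shape is only illustrative), the module form and torch.nn.functional.adaptive_avg_pool2d should give the same result:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 64, 8, 9)  # (batch, channels, height, width)

module_out = nn.AdaptiveAvgPool2d((5, 7))(x)       # module form
functional_out = F.adaptive_avg_pool2d(x, (5, 7))  # functional form

print(module_out.shape)
print(torch.equal(module_out, functional_out))
```

The module form is convenient inside nn.Sequential, while the functional form is handy inside a custom forward method.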

AdaptiveAvgPool2d()
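Applies 2D adaptive average pooling over an input signal composed of several input planes.
For any input size, the output has spatial size H x W; the number of output features equals the number of input planes (that is, the channel count).
Here, output_size is the target output size, in the form H x W.
AdaptiveAvgPool2d((H,W)) produces an output of height H and width W.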
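
To make "adaptive" more concrete, here is a pure-Python sketch of how the pooling windows can be derived from the input and output sizes. It assumes the commonly described rule that output index i averages the input positions from floor(i * in / out) up to, but not including, ceil((i + 1) * in / out). The helper names adaptive_windows and adaptive_avg_pool_2d are hypothetical, and this is an illustration of the idea rather than PyTorch's actual implementation.

```python
def adaptive_windows(in_size, out_size):
    """Return the [start, end) input range averaged into each output index."""
    windows = []
    for i in range(out_size):
        start = (i * in_size) // out_size                     # floor(i * in / out)
        end = ((i + 1) * in_size + out_size - 1) // out_size  # ceil((i + 1) * in / out)
        windows.append((start, end))
    return windows


def adaptive_avg_pool_2d(plane, out_h, out_w):
    """Adaptive average pooling of one 2D plane given as a list of rows."""
    rows = adaptive_windows(len(plane), out_h)
    cols = adaptive_windows(len(plane[0]), out_w)
    output = []
    for r0, r1 in rows:
        out_row = []
        for c0, c1 in cols:
            values = [plane[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            out_row.append(sum(values) / len(values))
        output.append(out_row)
    return output


# Windows may overlap when the sizes do not divide evenly.
print(adaptive_windows(5, 3))

plane = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
print(adaptive_avg_pool_2d(plane, 2, 2))
```

The point of the sketch is that the caller only fixes the output size; the window size and stride follow from the input size, which is why the same layer works for any input resolution.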

# target output size of 5x7
import torch
import torch.nn as nn
m = nn.AdaptiveAvgPool2d((5,7))
input = torch.randn(1, 64, 8, 9)
output = m(input)
print(output.shape)
# Output: torch.Size([1, 64, 5, 7])

If only a single value is passed, AdaptiveAvgPool2d(H) is equivalent to AdaptiveAvgPool2d((H,H)): the output has height and width both equal to H.

# target output size of 7x7 (square)
import torch
import torch.nn as nn
m = nn.AdaptiveAvgPool2d(7)
input = torch.randn(1, 64, 10, 9)
output = m(input)
print(output.shape)
# Output: torch.Size([1, 64, 7, 7])
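
As a small sanity check of the int-versus-tuple equivalence (a sketch with an arbitrary random input), both forms can be applied to the same tensor and compared:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 64, 10, 9)

square_int = nn.AdaptiveAvgPool2d(7)(x)         # single int
square_tuple = nn.AdaptiveAvgPool2d((7, 7))(x)  # explicit (H, W) tuple

print(square_int.shape)
print(torch.equal(square_int, square_tuple))
```

Note that in Python (7) is just the integer 7, not a one-element tuple, so writing nn.AdaptiveAvgPool2d((7)) is the same as nn.AdaptiveAvgPool2d(7).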

If H or W is None, that dimension keeps the same size as in the input.

# target output size of 10x7
import torch
import torch.nn as nn
m = nn.AdaptiveAvgPool2d((None, 7))
input = torch.randn(1, 64, 10, 9)
output = m(input)
print(output.shape)
# Output: torch.Size([1, 64, 10, 7])
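
To see that None really follows the input rather than fixing a size, the same layer can be applied to inputs of different heights. This is a minimal sketch; the heights 10 and 23 are arbitrary choices.

```python
import torch
import torch.nn as nn

pool = nn.AdaptiveAvgPool2d((None, 7))  # keep the input height, force the width to 7

shapes = {}
for height in (10, 23):
    x = torch.randn(1, 64, height, 9)
    shapes[height] = tuple(pool(x).shape)
    print(height, shapes[height])
```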

Of course, the output dimensions H and W can also be larger than the input dimensions, but this usually does not work well.

# target output size of 80x60
import torch
import torch.nn as nn
m = nn.AdaptiveAvgPool2d((80, 60))
input = torch.randn(1, 64, 10, 9)
output = m(input)
print(output.shape)
# Output: torch.Size([1, 64, 80, 60])
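
One way to see why enlarging the output adds little is to pool a tiny feature map up to a larger size and look at the values. The sketch below uses a hand-written 2x2 input (an arbitrary choice for readability) and checks that every output value already exists in the input.

```python
import torch
import torch.nn as nn

# A tiny 2x2 feature map so the effect is easy to read.
x = torch.tensor([[[[1.0, 2.0],
                    [3.0, 4.0]]]])  # shape (1, 1, 2, 2)

upsampled = nn.AdaptiveAvgPool2d((4, 4))(x)
print(upsampled[0, 0])

# Every output value already exists in the input: nothing new is created.
input_values = set(x.flatten().tolist())
output_values = set(upsampled.flatten().tolist())
print(output_values <= input_values)
```

In this case the layer only repeats existing values, so the larger output carries no extra information while making the following layers more expensive.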

My Own Take

When should AdaptiveAvgPool2d() be used?

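I think that when building a model, AdaptiveAvgPool2d() generally sits where the convolutional layers meet the fully connected layers, so that the size of the input to the Linear layer is fixed. The figure below shows how AdaptiveAvgPool2d() is used in VGG.

[Figure: AdaptiveAvgPool2d() as used in VGG]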
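
The placement described here can be sketched with a toy model. TinyNet below is a hypothetical example, not the real VGG; its layer sizes are assumptions chosen to keep it small. It puts AdaptiveAvgPool2d() between the convolutional part and the Linear layer, so the Linear layer's input size no longer depends on the image resolution.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical toy model; not the real VGG."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        )
        # Whatever spatial size comes out of `features`, force it to 4x4.
        self.avgpool = nn.AdaptiveAvgPool2d((4, 4))
        self.classifier = nn.Linear(16 * 4 * 4, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)  # (N, 16 * 4 * 4)
        return self.classifier(x)


model = TinyNet()
output_shapes = {}
for size in (32, 48, 64):
    out = model(torch.randn(2, 3, size, size))
    output_shapes[size] = tuple(out.shape)
    print(size, output_shapes[size])
```

Without the adaptive pooling layer, each of these input sizes would produce a different flattened length and the Linear layer would raise a shape error for all but one of them.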

How should the parameters of AdaptiveAvgPool2d() be chosen?

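The choice of H and W in AdaptiveAvgPool2d() depends on the initial size (height and width) of the image and on the number of pooling layers, in other words on the height and width of the feature map after the convolution and pooling stages. In my experiments, results were better when H and W were smaller than the height and width of the layer's input.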

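Take training on CIFAR-10 as an example. The input images are 32×32×3 (height × width × channels). After three convolutional layers (64 channels each) and three pooling layers (2×2 by default, each halving the height and width), the feature map is 4×4×64. Before feeding it into the fully connected layers, it is best to apply AdaptiveAvgPool2d() so that the fully connected layer always receives an input of a known size (because if we change the number of pooling layers, the height and width change too).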
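If we still used AdaptiveAvgPool2d((7,7)) at this point, the results would not be very good, since the target is larger than the 4×4 feature map (7…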
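
The CIFAR-10 arithmetic can be checked with a short sketch. The 3x3 kernels with padding 1 are an assumption (chosen so that only the pooling layers change the height and width), and the (2, 2) target is just one illustrative choice of an output smaller than the 4×4 feature map.

```python
import torch
import torch.nn as nn

# Three conv + 2x2 max-pool stages, each conv producing 64 channels.
# Kernel size 3 with padding 1 is an assumption; it keeps H and W unchanged.
features = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
)

x = torch.randn(1, 3, 32, 32)  # one CIFAR-10-sized image
feature_map = features(x)      # 32 -> 16 -> 8 -> 4
print(tuple(feature_map.shape))

pooled = nn.AdaptiveAvgPool2d((2, 2))(feature_map)  # target smaller than 4x4
flat = torch.flatten(pooled, 1)                     # what the Linear layer would see
print(tuple(flat.shape))
```

With this setup the Linear layer would take 64 * 2 * 2 input features, and that number stays the same even if a pooling stage is added or removed.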

Original: https://blog.csdn.net/weixin_45928096/article/details/122506640
Author: 来包番茄沙司
Title: Some Notes on Understanding torch.nn.AdaptiveAvgPool2d(), the Adaptive Average Pooling Function (对于torch.nn.AdaptiveAvgPool2d()自适应平均池化函数的一些理解)



