Understanding the DeepHyperX Code: HamidaEtAl

June 3, 2023, 11:33 PM • Artificial Intelligence

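The code reproduces the paper "3-D Deep Learning Approach for Remote Sensing Image Classification".

First, a short review of some background.

1. Local connectivity and parameter sharing (both reduce the number of parameters)

Local connectivity: this relies on the fact that nearby pixels in an image are correlated. It preserves the local structure of the image, reduces the number of weights in the network, speeds up learning, and to some extent lowers the risk of overfitting.

Parameter sharing: the difference between sharing and not sharing parameters is illustrated below.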

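Consider a 3×3 convolution kernel extracting features from a single-channel input (channel = 1): the same kernel is shared across every position. With k channels, the total number of parameters is 3×3×k. Note that parameters are not shared across channels.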

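Now suppose parameters are not shared. The kernel then has different parameters at every position of the matrix, so the number of kernel parameters matches the size of the pixel matrix. With k channels, the parameter count is width×height×k, which is clearly impractical for large images.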
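The two counts above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the function names and the 224×224 image size are assumptions that do not appear in the original post, and biases are ignored.

```python
def shared_params(kernel_h, kernel_w, channels):
    """Weights when one kernel per channel is reused at every position."""
    return kernel_h * kernel_w * channels


def unshared_params(width, height, channels):
    """Weights when every pixel position has its own parameter, as in the text."""
    return width * height * channels


k = 3  # number of channels (illustrative value)
shared = shared_params(3, 3, k)
unshared = unshared_params(224, 224, k)  # 224x224 is an assumed image size
print(shared, unshared)
```

The gap grows with image size, while the shared count depends only on the kernel size and the number of channels.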

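2. Multi-channel input and output

First, the difference between a filter and a kernel: a kernel is a component of a filter, and each filter produces exactly one feature map.

Multi-channel input, single-channel output: there is one filter made up of three kernels, and it produces one feature map (one filter corresponds to one feature map).

As a concrete illustration, take an 8×8×3 RGB image as input and use 4 filters, which gives 4 output channels, that is, 4 feature maps. Each filter has 3 kernels.

3. 2D convolution code example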
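To make the filter/kernel relationship concrete, here is a naive pure-Python sketch of a multi-channel convolution. It is not how PyTorch implements it. The 3×3 kernel size, the all-ones values, stride 1, no padding, and no bias are all assumptions chosen for illustration, and `conv2d_multichannel` is a hypothetical helper name.

```python
def conv2d_multichannel(image, filters):
    """Naive valid cross-correlation (what deep-learning libraries call convolution).

    image:   [C][H][W] nested lists
    filters: [F][C][kh][kw] nested lists, one kernel per input channel
    returns: [F][H-kh+1][W-kw+1], one feature map per filter
    """
    channels, height, width = len(image), len(image[0]), len(image[0][0])
    kh, kw = len(filters[0][0]), len(filters[0][0][0])
    out_h, out_w = height - kh + 1, width - kw + 1
    feature_maps = []
    for flt in filters:
        fmap = [[0.0] * out_w for _ in range(out_h)]
        for i in range(out_h):
            for j in range(out_w):
                total = 0.0
                # Each kernel handles one input channel; the results are summed
                # into a single value, so a filter yields a single feature map.
                for c in range(channels):
                    for di in range(kh):
                        for dj in range(kw):
                            total += image[c][i + di][j + dj] * flt[c][di][dj]
                fmap[i][j] = total
        feature_maps.append(fmap)
    return feature_maps


# 8x8 RGB image of ones; 4 filters, each holding three 3x3 kernels of ones
image = [[[1.0] * 8 for _ in range(8)] for _ in range(3)]
filters = [[[[1.0] * 3 for _ in range(3)] for _ in range(3)] for _ in range(4)]
maps = conv2d_multichannel(image, filters)
print(len(maps), len(maps[0]), len(maps[0][0]), maps[0][0][0])
```

With these inputs there are 4 feature maps of size 6×6, and every entry is 27.0 (3 channels × 9 kernel weights).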

import torch
import torch.nn as nn

class A(nn.Module):
    def __init__(self):
        super(A, self).__init__()
        # Three 2D convolutional layers; each has 2 input and 2 output channels
        # and a kernel size of (3, 3).
        # torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1,
        # padding=0, dilation=1, groups=1, bias=True,
        # padding_mode='zeros', device=None, dtype=None)
        self.conv1 = nn.Conv2d(2, 2, kernel_size=(3, 3))
        self.conv2 = nn.Conv2d(2, 2, kernel_size=(3, 3))
        self.conv3 = nn.Conv2d(2, 2, kernel_size=(3, 3))

a = A()
print(list(a.parameters()))

Output:

[Parameter containing:
tensor([[[[ 0.0213, -0.2054,  0.0985],
          [ 0.1254,  0.1202, -0.1115],
          [-0.0280,  0.0256,  0.1975]],

         [[-0.2118, -0.1683,  0.0056],
          [-0.0554,  0.2255, -0.1548],
          [ 0.1747,  0.0449,  0.1606]]],


        [[[-0.1328, -0.1700, -0.1048],
          [ 0.0911, -0.2230,  0.0685],
          [ 0.0886,  0.1765, -0.1879]],

         [[-0.0581,  0.1266, -0.1030],
          [-0.0170,  0.0387, -0.0641],
          [-0.0127, -0.2099, -0.2213]]]], requires_grad=True), Parameter containing:
tensor([ 0.0877, -0.1969], requires_grad=True), Parameter containing:
tensor([[[[-0.0266,  0.1511, -0.0034],
          [-0.1070, -0.1734, -0.1017],
          [ 0.0053,  0.1358, -0.0542]],

         [[-0.0436,  0.1587, -0.0375],
          [ 0.0125, -0.0431, -0.0877],
          [-0.0766,  0.0405, -0.1306]]],


        [[[-0.1449, -0.0315, -0.0236],
          [ 0.0118,  0.2230, -0.2137],
          [-0.1108, -0.1178,  0.0027]],

         [[ 0.2184,  0.1964, -0.0959],
          [-0.0385, -0.0523,  0.2135],
          [-0.0387,  0.1951, -0.1546]]]], requires_grad=True), Parameter containing:
tensor([-0.0942, -0.2029], requires_grad=True), Parameter containing:
tensor([[[[ 8.9598e-02, -3.3190e-02,  1.0606e-01],
          [-2.2397e-02,  2.0944e-01, -6.8180e-02],
          [-1.7312e-01,  2.2318e-01,  1.9368e-01]],

         [[ 7.0529e-02, -2.0741e-01, -1.2648e-01],
          [-1.7503e-01,  1.7972e-01, -1.0417e-01],
          [-1.9124e-01, -4.2022e-02,  1.4635e-01]]],


        [[[-8.0857e-02,  8.5098e-03,  7.0629e-02],
          [ 1.6926e-01, -1.7654e-02, -9.3033e-02],
          [-2.9836e-02, -2.2935e-01,  1.1450e-01]],

         [[ 1.3848e-01, -5.7713e-02,  1.5293e-04],
          [-6.3998e-03, -1.0745e-01,  2.8835e-03],
          [-1.7894e-01,  2.2133e-01,  5.2435e-02]]]], requires_grad=True), Parameter containing:
tensor([-0.1026,  0.0207], requires_grad=True)]

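The code above defines a 2D convolutional network with three convolutional layers. Each layer has 2 input channels and 2 output channels (that is, 2 filters), and each filter contains two 3×3 kernels (one per input channel). The length-2 vectors in the output are the biases, one per layer.

4. 3D convolution: illustration and example code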
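The printed tensors can be cross-checked by counting parameters by hand. The sketch below assumes the default `groups=1` and uses a hypothetical helper name, `conv2d_param_count`. It only reproduces the arithmetic behind a weight of shape (out_channels, in_channels, kH, kW) plus one bias per output channel.

```python
def conv2d_param_count(in_channels, out_channels, kernel_size, bias=True):
    """Number of learnable values in a Conv2d layer with groups=1."""
    kh, kw = kernel_size
    weights = out_channels * in_channels * kh * kw  # one kernel per (filter, input channel)
    return weights + (out_channels if bias else 0)  # one bias per filter


per_layer = conv2d_param_count(2, 2, (3, 3))
total = 3 * per_layer  # class A has three identical layers
print(per_layer, total)
```

Each layer holds 2 × 2 × 3 × 3 = 36 weights plus 2 biases, which matches the four 3×3 blocks and the length-2 vector printed for each layer.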

import torch.nn as nn

class B(nn.Module):
    def __init__(self):
        super(B, self).__init__()
        self.conv1 = nn.Conv3d(
            1,  # number of channels in the input, C_in
            3,  # number of channels produced by the convolution, C_out
            kernel_size=2,  # kernel size; effectively (2,2,2), where the first dimension is the number of frames (depth) the kernel spans
            stride=(1,1,1),  # convolution stride, (D,H,W)
            padding=(0,0,0),  # layers of zeros added to each side of the input, (D,H,W)
            bias=False)
        self.conv2 = nn.Conv3d(1, 3, kernel_size=(2, 2, 2), stride=(1, 1, 1), padding=(0, 0, 0), bias=False)

b = B()
print(list(b.parameters()))

Output:

[Parameter containing:
tensor([[[[[-0.0895, -0.1651],
           [-0.1319,  0.2510]],

          [[-0.1616, -0.2614],
           [ 0.0383,  0.2656]]]],



        [[[[-0.0809, -0.1821],
           [-0.0624, -0.1401]],

          [[-0.3170,  0.1499],
           [-0.3449,  0.2639]]]],



        [[[[ 0.2779, -0.0731],
           [ 0.0439, -0.0353]],

          [[ 0.2853, -0.3177],
           [ 0.0559, -0.3290]]]]], requires_grad=True), Parameter containing:
tensor([[[[[-0.2529, -0.0537],
           [ 0.0955, -0.1513]],

          [[ 0.0310,  0.2558],
           [-0.1903, -0.1561]]]],



        [[[[-0.2919, -0.3142],
           [ 0.0190, -0.1089]],

          [[-0.0586,  0.1541],
           [-0.2402, -0.3339]]]],



        [[[[-0.3260, -0.0500],
           [ 0.1392,  0.1486]],

          [[ 0.1664,  0.1514],
           [ 0.0345, -0.1979]]]]], requires_grad=True)]

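Here the 3D input has 1 input channel and 3 output channels, so there are 3 filters, each containing a single (2,2,2) kernel. Because the two Conv3d layers are configured identically, each one holds three (2,2,2) tensors. For a hyperspectral dataset the number of input channels is also 1, because the whole hyperspectral image is a single 3D volume made up of height, width, and the spectral dimension.

Paper reproduction:

Network architecture from the original paper:

Reproduced network:

Input dimensions: Image has dimensions 610×340 and 103 channels

Network structure and parameters:

Reproduced code: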
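A small shape calculation may help show what "one input channel" means for hyperspectral data. The sketch assumes a single 5×5 patch with 103 bands (the values used in the reproduction later in this post) laid out as (N, C_in, D, H, W), and a stride-1, unpadded, undilated convolution like the ones in class B. The helper name is hypothetical.

```python
def conv3d_valid_output_shape(input_shape, out_channels, kernel_size):
    """Output shape of a stride-1, unpadded, undilated 3D convolution.

    input_shape is (N, C_in, D, H, W); kernel_size is (kD, kH, kW).
    """
    n, _, d, h, w = input_shape
    kd, kh, kw = kernel_size
    return (n, out_channels, d - kd + 1, h - kh + 1, w - kw + 1)


# The spectral bands become the depth axis D; the channel axis has size 1.
patch = (1, 1, 103, 5, 5)
weight_shape = (3, 1, 2, 2, 2)  # (C_out, C_in, kD, kH, kW), as in the printed parameters
out_shape = conv3d_valid_output_shape(patch, weight_shape[0], weight_shape[2:])
print(out_shape)
```

The kernel slides along the spectral axis as well as the two spatial axes, so the output keeps a (shortened) spectral dimension for each of the 3 output channels.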

    # Excerpt from get_model() in DeepHyperX's models.py
    elif name == "hamida":
        patch_size = kwargs.setdefault("patch_size", 5)
        center_pixel = True
        model = HamidaEtAl(n_bands, n_classes, patch_size=patch_size)
        lr = kwargs.setdefault("learning_rate", 0.01)
        optimizer = optim.SGD(model.parameters(), lr=lr, weight_decay=0.0005)
        kwargs.setdefault("batch_size", 100)
        criterion = nn.CrossEntropyLoss(weight=kwargs["weights"])

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn import init

class HamidaEtAl(nn.Module):
    """
    3-D Deep Learning Approach for Remote Sensing Image Classification
    Amina Ben Hamida, Alexandre Benoit, Patrick Lambert, Chokri Ben Amar
    IEEE TGRS, 2018
    https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8344565
    """

    @staticmethod
    def weight_init(m):  # weight initialization
        if isinstance(m, nn.Linear) or isinstance(m, nn.Conv3d):
            init.kaiming_normal_(m.weight)  # Kaiming (He) normal initialization
            init.zeros_(m.bias)

    def __init__(self, input_channels, n_classes, patch_size=5, dilation=1):
        super(HamidaEtAl, self).__init__()
        # The first layer is a (3,3,3) kernel sized Conv characterized
        # by a stride equal to 1 and number of neurons equal to 20
        self.patch_size = patch_size  # spatial size of each extracted patch, (5, 5) by default
        self.input_channels = input_channels  # number of input channels (spectral bands)
        dilation = (dilation, 1, 1)

        if patch_size == 3:
            self.conv1 = nn.Conv3d(
                1, 20, (3, 3, 3), stride=(1, 1, 1), dilation=dilation, padding=1
            )
        else:
            self.conv1 = nn.Conv3d(
                1, 20, (3, 3, 3), stride=(1, 1, 1), dilation=dilation, padding=0
            )
        # Next pooling is applied using a layer identical to the previous one
        # with the difference of a 1D kernel size (1,1,3) and a larger stride
        # equal to 2 in order to reduce the spectral dimension
        # Why replace the pooling layer with a convolutional layer:
        # downsampling with a stride-2 convolution performs about as well as
        # 2x2 max pooling, and compared with a stride-1 convolution followed by
        # pooling it also saves convolution work and one pooling layer.
        self.pool1 = nn.Conv3d(
            20, 20, (3, 1, 1), dilation=dilation, stride=(2, 1, 1), padding=(1, 0, 0)
        )
        # Then, a duplicate of the first and second layers is created with
        # 35 hidden neurons per layer.
        self.conv2 = nn.Conv3d(
            20, 35, (3, 3, 3), dilation=dilation, stride=(1, 1, 1), padding=(1, 0, 0)
        )
        self.pool2 = nn.Conv3d(
            35, 35, (3, 1, 1), dilation=dilation, stride=(2, 1, 1), padding=(1, 0, 0)
        )
        # Finally, the 1D spatial dimension is progressively reduced
        # thanks to the use of two Conv layers, 35 neurons each,
        # with respective kernel sizes of (1,1,3) and (1,1,2) and strides
        # respectively equal to (1,1,1) and (1,1,2)
        self.conv3 = nn.Conv3d(
            35, 35, (3, 1, 1), dilation=dilation, stride=(1, 1, 1), padding=(1, 0, 0)
        )
        self.conv4 = nn.Conv3d(
            35, 35, (2, 1, 1), dilation=dilation, stride=(2, 1, 1), padding=(1, 0, 0)
        )

        # self.dropout = nn.Dropout(p=0.5)

        self.features_size = self._get_final_flattened_size()
        # The architecture ends with a fully connected layer where the number
        # of neurons is equal to the number of input classes.
        self.fc = nn.Linear(self.features_size, n_classes)

        self.apply(self.weight_init)

    def _get_final_flattened_size(self):
        # Compute the flattened feature size that feeds the fc layer by passing
        # a dummy all-zero tensor through the layers, so that it does not have
        # to be derived by hand.
        with torch.no_grad():
            x = torch.zeros(
                (1, 1, self.input_channels, self.patch_size, self.patch_size)
            )
            x = self.pool1(self.conv1(x))
            x = self.pool2(self.conv2(x))
            x = self.conv3(x)
            x = self.conv4(x)
            _, t, c, w, h = x.size()
        return t * c * w * h

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = self.pool1(x)
        x = F.relu(self.conv2(x))
        x = self.pool2(x)
        x = F.relu(self.conv3(x))
        x = F.relu(self.conv4(x))
        x = x.view(-1, self.features_size)  # reshaping the tensor here amounts to flattening it
        # x = self.dropout(x)
        x = self.fc(x)
        return x
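For readers who do want to derive the flattened size by hand, the following pure-Python sketch traces the (D, H, W) shape through the six Conv3d layers of the listing above. It uses the standard output-size formula for convolutions, out = floor((in + 2·padding − dilation·(kernel − 1) − 1) / stride) + 1. The helper names `conv_out` and `hamida_feature_size` are hypothetical, and the layer table is transcribed from the listing by hand, so treat it as a cross-check and not as part of DeepHyperX.

```python
def conv_out(size, kernel, stride=1, padding=0, dilation=1):
    """Output length of a convolution along one dimension."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def hamida_feature_size(n_bands, patch_size=5, dilation=1):
    """Trace (D, H, W) through the six Conv3d layers of the listing above."""
    p1 = 1 if patch_size == 3 else 0  # conv1 is padded only when patch_size == 3
    # (name, out_channels, kernel, stride, padding); tuples are ordered (D, H, W)
    layers = [
        ("conv1", 20, (3, 3, 3), (1, 1, 1), (p1, p1, p1)),
        ("pool1", 20, (3, 1, 1), (2, 1, 1), (1, 0, 0)),
        ("conv2", 35, (3, 3, 3), (1, 1, 1), (1, 0, 0)),
        ("pool2", 35, (3, 1, 1), (2, 1, 1), (1, 0, 0)),
        ("conv3", 35, (3, 1, 1), (1, 1, 1), (1, 0, 0)),
        ("conv4", 35, (2, 1, 1), (2, 1, 1), (1, 0, 0)),
    ]
    dilations = (dilation, 1, 1)  # dilation acts on the spectral axis only
    shape = (n_bands, patch_size, patch_size)
    trace = []
    for name, out_channels, kernel, stride, padding in layers:
        shape = tuple(
            conv_out(s, k, st, p, d)
            for s, k, st, p, d in zip(shape, kernel, stride, padding, dilations)
        )
        trace.append((name, out_channels, shape))
    depth, height, width = shape
    return out_channels * depth * height * width, trace


features_size, trace = hamida_feature_size(n_bands=103, patch_size=5)
for name, channels, shape in trace:
    print(name, channels, shape)
print("features_size =", features_size)
```

For 103 bands and a 5×5 patch the spatial size collapses to 1×1 after conv2, the spectral axis shrinks from 103 to 14, and the flattened size is 35 × 14 × 1 × 1 = 490.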

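For an explanation of the dilation parameter, see: https://blog.csdn.net/qimo601/article/details/112624091

Results:

Reading the code shows that it does not exactly match the architecture in the original paper, so I tried changing the parameters of one layer to bring it closer to the original network: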

self.conv3 = nn.Conv3d(
    35, 35, (3, 1, 1), dilation=dilation, stride=(1, 1, 1), padding=(1, 0, 0)
)

changed to:

self.conv3 = nn.Conv3d(
    35, 35, (3, 3, 3), dilation=dilation, stride=(1, 1, 1), padding=(1, 0, 0)
)

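However, the final result was worse than the one above. Presumably the DeepHyperX implementation slightly refines the architecture proposed in the original paper.

References: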
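One thing worth checking with this change is whether a 3×3 spatial kernel still fits at conv3. The sketch below assumes the other layers are exactly as in the class listing: conv1 and conv2 each shrink the spatial size by 2 (conv1 is padded only when patch_size is 3), and the pool layers leave it unchanged. The helper names are hypothetical, and the post does not state which patch size was used for this experiment.

```python
def spatial_size_before_conv3(patch_size):
    """Spatial (H = W) size entering conv3 for the HamidaEtAl layers shown earlier."""
    after_conv1 = patch_size if patch_size == 3 else patch_size - 2  # 3x3 kernel, padded only for patch 3
    after_conv2 = after_conv1 - 2  # 3x3 spatial kernel with no spatial padding
    return after_conv2  # pool1 and pool2 use 1x1 spatial kernels


def modified_conv3_fits(patch_size, spatial_kernel=3):
    """True if a spatial_kernel-wide conv3 with zero spatial padding fits the input."""
    return spatial_size_before_conv3(patch_size) >= spatial_kernel


for patch in (5, 7):
    print(patch, spatial_size_before_conv3(patch), modified_conv3_fits(patch))
```

If this arithmetic holds, the default patch_size of 5 leaves a 1×1 spatial map before conv3, so a (3, 3, 3) kernel with padding (1, 0, 0) would not fit and PyTorch would reject it. A patch size of at least 7 would be needed for the modified layer to run.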

https://github.com/nshaud/DeepHyperX

https://blog.csdn.net/weixin_38481963/article/details/109906338

https://blog.csdn.net/abbcdc/article/details/123332063

https://wenku.baidu.com/view/fa899796f221dd36a32d7375a417866fb84ac087.html

Original: https://www.cnblogs.com/AllFever/p/16700359.html
Author: AllFever
Title: DeepHyperX代码理解-HamidaEtAl
