Notes: PyTorch Geometric: a very detailed walkthrough of the GAT code | source node | target node | source_to_target

June 19, 2023, 5:31 PM • Artificial Intelligence • 76 views


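Sharing what I have learned, and likes are appreciated QAQ. My ability is limited, so if you spot mistakes, corrections are welcome.

I did not want to read the source, yet I wanted to understand how the torch-geometric library implements GAT on top of message passing. None of the blog posts I found were satisfying, and the official documentation did not fully click for me either (probably my own comprehension falling short), hence this source code walkthrough.

Preface

What is GAT? Graph Attention Networks; see other people's articles for the details.
What is PyTorch Geometric? A widely used library for implementing graph neural networks. This post covers its PyTorch implementation of GAT in detail; see the official documentation, torch-geometric GAT.
What is message passing? MessagePassing is the base class that torch geometric provides to make building graph neural networks easier; the GAT implementation inherits from it.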

The official Torch Geometric GAT implementation

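The official documentation defines the layer as

$$\mathbf{x}_i' = \alpha_{i,i}\Theta\mathbf{x}_i + \sum_{j \in \mathcal{N}(i)} \alpha_{i,j}\Theta\mathbf{x}_j$$

$$\alpha_{i,j} = \frac{\exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top}[\Theta\mathbf{x}_i \,\Vert\, \Theta\mathbf{x}_j]\right)\right)}{\sum_{k \in \mathcal{N}(i) \cup \{i\}} \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top}[\Theta\mathbf{x}_i \,\Vert\, \Theta\mathbf{x}_k]\right)\right)}$$

Here $\Theta$ is the weight matrix and $\alpha_{ij}$ is the attention coefficient. On notation: i denotes the target node and j the source node. From the formula, or from the usual GAT diagram, it is easy to see that messages flow from the source node to the target node.
The official GATConv source code: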

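# Imports the listing below relies on. Module paths match older PyG 1.x
# releases and may differ in newer versions.
from typing import Optional, Tuple, Union

import torch
import torch.nn.functional as F
from torch import Tensor
from torch.nn import Linear, Parameter
from torch_sparse import SparseTensor, set_diag
from torch_geometric.nn.conv import MessagePassing
from torch_geometric.nn.inits import glorot, zeros
from torch_geometric.typing import Adj, OptPairTensor, OptTensor, Size
from torch_geometric.utils import add_self_loops, remove_self_loops, softmax


class GATConv(MessagePassing):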
    def __init__(self, in_channels: Union[int, Tuple[int, int]],
                 out_channels: int, heads: int = 1, concat: bool = True,
                 negative_slope: float = 0.2, dropout: float = 0.,
                 add_self_loops: bool = True, bias: bool = True, **kwargs):
        kwargs.setdefault('aggr', 'add')
        super(GATConv, self).__init__(node_dim=0, **kwargs)

        self.in_channels = in_channels
        self.out_channels = out_channels
        self.heads = heads
        self.concat = concat
        self.negative_slope = negative_slope
        self.dropout = dropout
        self.add_self_loops = add_self_loops

        if isinstance(in_channels, int):
            self.lin_l = Linear(in_channels, heads * out_channels, bias=False)
            self.lin_r = self.lin_l
        else:
            self.lin_l = Linear(in_channels[0], heads * out_channels, False)
            self.lin_r = Linear(in_channels[1], heads * out_channels, False)

        self.att_l = Parameter(torch.Tensor(1, heads, out_channels))
        self.att_r = Parameter(torch.Tensor(1, heads, out_channels))

        if bias and concat:
            self.bias = Parameter(torch.Tensor(heads * out_channels))
        elif bias and not concat:
            self.bias = Parameter(torch.Tensor(out_channels))
        else:
            self.register_parameter('bias', None)

        self._alpha = None

        self.reset_parameters()

    def reset_parameters(self):
        glorot(self.lin_l.weight)
        glorot(self.lin_r.weight)
        glorot(self.att_l)
        glorot(self.att_r)
        zeros(self.bias)

    def forward(self, x: Union[Tensor, OptPairTensor], edge_index: Adj,
                size: Size = None, return_attention_weights=None):
        H, C = self.heads, self.out_channels

        x_l: OptTensor = None
        x_r: OptTensor = None
        alpha_l: OptTensor = None
        alpha_r: OptTensor = None
        if isinstance(x, Tensor):
            assert x.dim() == 2, 'Static graphs not supported in GATConv.'
            x_l = x_r = self.lin_l(x).view(-1, H, C)
            alpha_l = (x_l * self.att_l).sum(dim=-1)
            alpha_r = (x_r * self.att_r).sum(dim=-1)
        else:
            x_l, x_r = x[0], x[1]
            assert x[0].dim() == 2, 'Static graphs not supported in GATConv.'
            x_l = self.lin_l(x_l).view(-1, H, C)
            alpha_l = (x_l * self.att_l).sum(dim=-1)
            if x_r is not None:
                x_r = self.lin_r(x_r).view(-1, H, C)
                alpha_r = (x_r * self.att_r).sum(dim=-1)

        assert x_l is not None
        assert alpha_l is not None

        if self.add_self_loops:
            if isinstance(edge_index, Tensor):
                num_nodes = x_l.size(0)
                if x_r is not None:
                    num_nodes = min(num_nodes, x_r.size(0))
                if size is not None:
                    num_nodes = min(size[0], size[1])
                edge_index, _ = remove_self_loops(edge_index)
                edge_index, _ = add_self_loops(edge_index, num_nodes=num_nodes)
            elif isinstance(edge_index, SparseTensor):
                edge_index = set_diag(edge_index)

        out = self.propagate(edge_index, x=(x_l, x_r),
                             alpha=(alpha_l, alpha_r), size=size)

        alpha = self._alpha
        self._alpha = None

        if self.concat:
            out = out.view(-1, self.heads * self.out_channels)
        else:
            out = out.mean(dim=1)

        if self.bias is not None:
            out += self.bias

        if isinstance(return_attention_weights, bool):
            assert alpha is not None
            if isinstance(edge_index, Tensor):
                return out, (edge_index, alpha)
            elif isinstance(edge_index, SparseTensor):
                return out, edge_index.set_value(alpha, layout='coo')
        else:
            return out

    def message(self, x_j: Tensor, alpha_j: Tensor, alpha_i: OptTensor,
                index: Tensor, ptr: OptTensor,
                size_i: Optional[int]) -> Tensor:
        alpha = alpha_j if alpha_i is None else alpha_j + alpha_i
        alpha = F.leaky_relu(alpha, self.negative_slope)
        alpha = softmax(alpha, index, ptr, size_i)
        self._alpha = alpha
        alpha = F.dropout(alpha, p=self.dropout, training=self.training)
        return x_j * alpha.unsqueeze(-1)

    def __repr__(self):
        return '{}({}, {}, heads={})'.format(self.__class__.__name__,
                                             self.in_channels,
                                             self.out_channels, self.heads)

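Source code walkthrough

Input graph

To make the source easier to follow, we build a small input graph with three nodes labeled 0, 1 and 2, each with a two-dimensional feature vector.

The code to build the graph: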

import torch
from torch_geometric.nn import GATConv
from torch_geometric.data import Data
from torch_geometric.utils import to_undirected
x=torch.tensor([[1.,2],[2,3],[1,3]])
edge_index=torch.LongTensor([[0,0],[1,2]])
edge_index = to_undirected(edge_index)
graph = Data(x=x,edge_index=edge_index)
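
To see what to_undirected does to this edge list, here is a minimal plain-Python sketch of the idea: add the reverse of every edge, drop duplicates, and sort. The helper name to_undirected_sketch is hypothetical, and this only illustrates the behavior on the toy graph; it is not the library's implementation.

```python
def to_undirected_sketch(edge_index):
    """Hypothetical stand-in for to_undirected on a [sources, targets] pair of lists."""
    src, dst = edge_index
    # Union of the original edges and their reversed copies, without duplicates
    edges = set(zip(src, dst)) | set(zip(dst, src))
    # Sort by source, then by target, to get a deterministic order
    ordered = sorted(edges)
    return [[s for s, _ in ordered], [t for _, t in ordered]]


undirected = to_undirected_sketch([[0, 0], [1, 2]])
print(undirected)
```

The two directed edges 0→1 and 0→2 become four edges, which is the edge_index the layer actually receives.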

The __init__ part

# Store the constructor arguments on the layer
self.in_channels = in_channels
self.out_channels = out_channels
self.heads = heads
self.concat = concat
self.negative_slope = negative_slope
self.dropout = dropout
self.add_self_loops = add_self_loops
if isinstance(in_channels, int):
    # Source and target features have the same size: share one linear map
    self.lin_l = Linear(in_channels, heads * out_channels, bias=False)
    self.lin_r = self.lin_l
else:
    # Different sizes for source (l) and target (r) features: two linear maps
    self.lin_l = Linear(in_channels[0], heads * out_channels, False)
    self.lin_r = Linear(in_channels[1], heads * out_channels, False)
# Attention vectors for the source (l) and target (r) side, one per head
self.att_l = Parameter(torch.Tensor(1, heads, out_channels))
self.att_r = Parameter(torch.Tensor(1, heads, out_channels))

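This part is simple: it stores the hyperparameters, creates the linear maps lin_l and lin_r (the same module when in_channels is a single int), and creates the attention vectors att_l and att_r. Note that MessagePassing has an optional flow argument, which can be source_to_target or target_to_source. GAT clearly uses the former, which is also the default, so nothing is changed.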
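
As a rough illustration of what flow decides, the sketch below picks which row of edge_index plays the role of j (the sender) and which plays i (the receiver). The function name split_by_flow is hypothetical; it mirrors the convention described here rather than reproducing library code.

```python
def split_by_flow(edge_index, flow="source_to_target"):
    """Return (j_row, i_row): messages always travel from j to i."""
    if flow == "source_to_target":
        # Row 0 holds the senders (j), row 1 the receivers (i)
        return edge_index[0], edge_index[1]
    if flow == "target_to_source":
        # Roles are swapped: row 1 sends, row 0 receives
        return edge_index[1], edge_index[0]
    raise ValueError(f"unknown flow: {flow}")


j_row, i_row = split_by_flow([[0, 0], [1, 2]])
print(j_row, i_row)
```

With the default flow, node 0 sends to nodes 1 and 2, which matches reading each column of edge_index as a source → target pair.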

The forward part

def forward(self, x: Union[Tensor, OptPairTensor], edge_index: Adj,
                size: Size = None, return_attention_weights=None):

        H, C = self.heads, self.out_channels

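The inputs are the feature matrix x and edge_index.
Here they are:

x = [
[1,2],
[2,3],
[1,3]
]
edge_index = [
[0,0,1,2],
[1,2,0,0]
]

Note that edge_index differs from the one we wrote by hand: to_undirected added the reverse edges when the graph was built.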

  x_l: OptTensor = None
  x_r: OptTensor = None
  alpha_l: OptTensor = None
  alpha_r: OptTensor = None
  if isinstance(x, Tensor):
      assert x.dim() == 2, 'Static graphs not supported in GATConv.'
      x_l = x_r = self.lin_l(x).view(-1, H, C)
      alpha_l = (x_l * self.att_l).sum(dim=-1)
      alpha_r = (x_r * self.att_r).sum(dim=-1)
  else:
      x_l, x_r = x[0], x[1]
      assert x[0].dim() == 2, 'Static graphs not supported in GATConv.'
      x_l = self.lin_l(x_l).view(-1, H, C)
      alpha_l = (x_l * self.att_l).sum(dim=-1)
      if x_r is not None:
          x_r = self.lin_r(x_r).view(-1, H, C)
          alpha_r = (x_r * self.att_r).sum(dim=-1)

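x_l and x_r hold the node features after left-multiplication by $\Theta$. To stress it once more (it matters later): l corresponds to the source node and r to the target node; i denotes the target node and j the source node.
alpha_l is the dot product of x_l with self.att_l, i.e. $a_l^{\top}\Theta_l x$; alpha_r is the analogous quantity for x_r.
Suppose the map from two dimensions to one simply adds the two features (that is, left-multiplying by $\Theta$ is a sum) and that $a^{\top}$ multiplies by 0.5. Then x_l, x_r, alpha_l and alpha_r are:

x_l = x_r = [
[3],
[5],
[4]
]
alpha_l = alpha_r = [
[1.5],
[2.5],
[2]
]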
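
The toy numbers can be reproduced in a few lines of plain Python. This is only a sketch under the assumptions stated above: theta and att are made-up stand-ins for the learned lin_l weight and the att_l / att_r vectors, with one head and one output channel.

```python
x = [[1.0, 2.0], [2.0, 3.0], [1.0, 3.0]]

theta = [1.0, 1.0]  # assumed Θ: sums the two input features
att = 0.5           # assumed a^T: scales by 0.5 (same value for att_l and att_r)

# x_l[n] = Θ · x[n], a single number per node because heads = out_channels = 1
x_l = [sum(w * v for w, v in zip(theta, row)) for row in x]
# alpha_l[n] = a^T · x_l[n]
alpha_l = [att * h for h in x_l]

# in_channels is a single int here, so the r side shares everything with l
x_r, alpha_r = x_l, alpha_l

print(x_l)
print(alpha_l)
```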

if self.add_self_loops:
     if isinstance(edge_index, Tensor):
         num_nodes = x_l.size(0)
         if x_r is not None:
             num_nodes = min(num_nodes, x_r.size(0))
         if size is not None:
             num_nodes = min(size[0], size[1])
         edge_index, _ = remove_self_loops(edge_index)
         edge_index, _ = add_self_loops(edge_index, num_nodes=num_nodes)
     elif isinstance(edge_index, SparseTensor):
         edge_index = set_diag(edge_index)

Next, self-loops are added to edge_index, which becomes:

edge_index = [
[0,0,1,2,0,1,2],
[1,2,0,0,0,1,2]
]
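
A plain-Python sketch of the two-step idea (drop any existing self-loops, then append one loop per node) is shown below. The helper with_self_loops is hypothetical and only mimics the effect of calling remove_self_loops followed by add_self_loops on the toy graph.

```python
def with_self_loops(edge_index, num_nodes):
    """Hypothetical helper: remove existing self-loops, then add one per node."""
    src, dst = edge_index
    # Keep only edges whose endpoints differ
    kept = [(s, t) for s, t in zip(src, dst) if s != t]
    # Append exactly one self-loop (n, n) for every node
    kept += [(n, n) for n in range(num_nodes)]
    return [[s for s, _ in kept], [t for _, t in kept]]


looped = with_self_loops([[0, 0, 1, 2], [1, 2, 0, 0]], num_nodes=3)
print(looped)
```

Removing before adding guarantees that a node never ends up with two self-loops, which would otherwise count its own feature twice in the softmax.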

 out = self.propagate(edge_index, x=(x_l, x_r),
                      alpha=(alpha_l, alpha_r), size=size)

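This calls the propagate method of MessagePassing, a wrapper that in turn calls message, aggregate and update. Under source_to_target, message produces the information each source node sends out; aggregate collects, for each target node, what its source nodes sent (typically max, add (the default) and so on; GAT uses add); update refreshes the node representation. So the key to implementing GAT is how message is built.
Note that the arguments passed to propagate are forwarded to message and aggregate. Here x is passed as a tuple, e.g. (x_l, x_r): the first element supplies the source-node information and the second the target-node information. Inside message, a parameter with the suffix _j receives the source-side values gathered per edge, and one with the suffix _i receives the target-side values.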
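
The gather, message, aggregate sequence can be sketched in plain Python as follows. propagate_sketch and identity_message are hypothetical names; the sketch assumes add aggregation, an update step that returns its input unchanged, and the toy values x_l = [3, 5, 4] from the running example.

```python
def propagate_sketch(edge_index, x, message):
    """Gather per-edge inputs, call message, then sum messages per target node."""
    x_src, x_dst = x
    src, dst = edge_index
    x_j = [x_src[j] for j in src]  # source-side value for every edge
    x_i = [x_dst[i] for i in dst]  # target-side value for every edge
    msgs = message(x_j, x_i)
    out = [0.0] * len(x_dst)
    for i, m in zip(dst, msgs):    # 'add' aggregation onto each target
        out[i] += m
    return out                     # update step assumed to be the identity


def identity_message(x_j, x_i):
    # Simplest possible message: send the source feature unchanged
    return x_j


edge_index = [[0, 0, 1, 2, 0, 1, 2], [1, 2, 0, 0, 0, 1, 2]]
h = [3.0, 5.0, 4.0]
summed = propagate_sketch(edge_index, (h, h), identity_message)
print(summed)
```

With the identity message every node simply sums the features of its in-neighbors (itself included); GAT differs only in that message first scales each x_j by its attention coefficient.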

Overriding the message method

def message(self, x_j: Tensor, alpha_j: Tensor, alpha_i: OptTensor,
                index: Tensor, ptr: OptTensor,
                size_i: Optional[int]) -> Tensor:
        alpha = alpha_j if alpha_i is None else alpha_j + alpha_i
        alpha = F.leaky_relu(alpha, self.negative_slope)
        alpha = softmax(alpha, index, ptr, size_i)
        self._alpha = alpha
        alpha = F.dropout(alpha, p=self.dropout, training=self.training)
        return x_j * alpha.unsqueeze(-1)

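x_j and alpha_j carry the source-node information, index holds the label of the target node each source node is connected to, and ptr defaults to None and is ignored here. Too abstract? A numerical example helps.
At this point we have:

edge_index = [
[0,0,1,2,0,1,2],
[1,2,0,0,0,1,2]
]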

The variables passed into message are:

index=[1,2,0,0,0,1,2]
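
The remaining arguments follow from the same gather. Here is a small plain-Python sketch, assuming the toy values x_l = [3, 5, 4] and alpha_l = alpha_r = [1.5, 2.5, 2] from the earlier assumptions ($\Theta$ sums, $a^{\top}$ halves):

```python
edge_index = [[0, 0, 1, 2, 0, 1, 2], [1, 2, 0, 0, 0, 1, 2]]
x_l = [3.0, 5.0, 4.0]
alpha_l = alpha_r = [1.5, 2.5, 2.0]

src, dst = edge_index
x_j = [x_l[j] for j in src]          # source feature for every edge
alpha_j = [alpha_l[j] for j in src]  # source-side attention term
alpha_i = [alpha_r[i] for i in dst]  # target-side attention term
index = dst                          # target node of every edge

# First line of message(): per-edge sum of the two terms
alpha = [a_j + a_i for a_j, a_i in zip(alpha_j, alpha_i)]

print(x_j)
print(alpha_j)
print(alpha_i)
print(alpha)
```

Every list has one entry per edge (seven here), in the same order as the columns of edge_index.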

That makes things clear. What remains is to explain how the softmax is carried out:

alpha = softmax(alpha, index, ptr, size_i)

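Here alpha is the sum of alpha_i and alpha_j (all values are positive, so the leaky_relu applied in between leaves them unchanged):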

alpha=[[4],[3.5],[4],[3.5],[3],[5],[4]]
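
Why a plain sum is enough: a sketch of the reasoning, assuming the attention vector $\mathbf{a}$ in the GAT formula is split into a target half $\mathbf{a}_r$ and a source half $\mathbf{a}_l$ (these symbols are introduced here for illustration and correspond to att_r and att_l in the code).

```latex
\mathbf{a}^{\top}\,[\Theta\mathbf{x}_i \,\Vert\, \Theta\mathbf{x}_j]
  = \mathbf{a}_r^{\top}\Theta\mathbf{x}_i + \mathbf{a}_l^{\top}\Theta\mathbf{x}_j
  = \texttt{alpha\_i} + \texttt{alpha\_j}
```

For the first edge (source 0, target 1) this gives 2.5 + 1.5 = 4, the first entry of alpha. Computing the two halves once per node and adding them per edge avoids building the concatenated vector for every edge.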

The softmax function first takes exp of every entry of alpha, giving exp_alpha:

exp_alpha=[exp(4),exp(3.5),exp(4),exp(3.5),exp(3),exp(5),exp(4)]# inner brackets omitted for brevity
index = [1,2,0,0,0,1,2]

Finally, softmax combines exp_alpha and index to produce the output out:

out=[
exp(4)/(exp(4)+exp(5)),
exp(3.5)/(exp(4)+exp(3.5)),
exp(4)/(exp(3)+exp(4)+exp(3.5)),
exp(3.5)/(exp(3)+exp(4)+exp(3.5)),
exp(3)/(exp(3)+exp(4)+exp(3.5)),
exp(5)/(exp(4)+exp(5)),
exp(4)/(exp(4)+exp(3.5)),
]
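
The grouping behind this list can be sketched in plain Python. grouped_softmax is a hypothetical helper that follows the description here literally; as far as I know the library version additionally subtracts each group's maximum before exp for numerical stability, which does not change the result mathematically.

```python
import math


def grouped_softmax(alpha, index):
    """Softmax over entries that share the same value in index."""
    exp_alpha = [math.exp(a) for a in alpha]
    # Sum of exp values per target node
    totals = {}
    for i, e in zip(index, exp_alpha):
        totals[i] = totals.get(i, 0.0) + e
    # Each entry divided by the total of its own group
    return [e / totals[i] for i, e in zip(index, exp_alpha)]


alpha = [4.0, 3.5, 4.0, 3.5, 3.0, 5.0, 4.0]
index = [1, 2, 0, 0, 0, 1, 2]
out = grouped_softmax(alpha, index)
print(out)
```

The coefficients of all edges pointing at the same target node sum to 1, which is exactly the normalization over $\mathcal{N}(i) \cup \{i\}$ in the GAT formula.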

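Having got this far, I struggled to put into words how index and exp_alpha produce out, so look for the pattern in the formulas above. It is easy to spot: positions whose entries in index are equal form one group, and each exp_alpha entry is divided by the sum of exp_alpha over its group.
This out is the attention coefficient, and with that the GAT walkthrough is complete.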
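
To tie the pieces together, the sketch below carries the toy example through to the layer output: message returns x_j scaled by the attention coefficient, and add aggregation sums these per target node. It assumes the made-up toy parameters from above, one head, dropout 0, and a bias of zero; the variable names are illustrative only.

```python
import math

edge_index = [[0, 0, 1, 2, 0, 1, 2], [1, 2, 0, 0, 0, 1, 2]]
x_l = [3.0, 5.0, 4.0]                        # features after the assumed Θ
alpha = [4.0, 3.5, 4.0, 3.5, 3.0, 5.0, 4.0]  # alpha_i + alpha_j per edge

src, dst = edge_index

# Grouped softmax over edges that share a target node
exp_alpha = [math.exp(a) for a in alpha]
totals = [0.0] * len(x_l)
for i, e in zip(dst, exp_alpha):
    totals[i] += e
attention = [e / totals[i] for i, e in zip(dst, exp_alpha)]

# message: x_j * attention; aggregate: add onto each target node
layer_out = [0.0] * len(x_l)
for j, i, a in zip(src, dst, attention):
    layer_out[i] += a * x_l[j]

print(layer_out)
```

Because the coefficients for each target sum to 1, every output value is a weighted average of the features of that node's in-neighbors (itself included).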

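Summary

Every part should be easy to follow except message. The post gives example data for it, along with its inputs and outputs, so a careful look should make it clear.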

Original: https://blog.csdn.net/weixin_44839047/article/details/115724958
Author: Deno_V
Title: Notes: PyTorch Geometric: a very detailed walkthrough of the GAT code | source node | target node | source_to_target

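Original articles are protected by copyright. When reposting, please cite the source: https://www.johngo689.com/639892/

Reposted articles are protected by the original author's copyright. When reposting, please credit the original author.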
