[Paper Notes] Knowledge-Driven Encode, Retrieve, Paraphrase for Medical Image Report Generation (AAAI 2019)

May 28, 2023, 12:39 AM • Artificial Intelligence • 88 reads

Paper: https://arxiv.org/pdf/1903.10122.pdf

Abstract

The Knowledge-driven Encode, Retrieve, Paraphrase (KERP) approach
decomposes medical report generation into explicit medical abnormality graph learning and subsequent natural language modeling

visual features --(Encode module)--> an abnormality graph --(Retrieve module)--> sequences of templates --(Paraphrase module)--> sequences of words

generates structured and robust reports supported by accurate abnormality prediction
produces explainable attentive regions, which are crucial for interpretative diagnosis

GTR (Graph Transformer)

the core of KERP
dynamically transforms high-level semantics between graph-structured data of multiple domains, such as knowledge graphs, images, and sequences

GTR as a module

one step chains intra-graph message passing and inter-graph message passing:
conduct message passing within the target graph
conduct message passing from one or multiple source graphs into the target graph
multiple such steps are stacked into one module
the stacked steps progressively convert target graph features into high-level semantics
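
The step structure above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the paper's implementation: it uses unparameterized scaled dot-product attention (no learned projections), a residual update, and source and target features of the same dimension. The names `attend`, `add_messages`, `gtr_step`, and `gtr_module` are hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(receivers, senders):
    # For each receiver node, compute an attention-weighted sum of sender features.
    d = len(receivers[0])
    messages = []
    for r in receivers:
        scores = [sum(a * b for a, b in zip(r, s)) / math.sqrt(d) for s in senders]
        weights = softmax(scores)
        messages.append([sum(w * s[j] for w, s in zip(weights, senders))
                         for j in range(d)])
    return messages

def add_messages(nodes, messages):
    # Residual update: new feature = old feature + incoming message (an assumption).
    return [[x + m for x, m in zip(node, msg)] for node, msg in zip(nodes, messages)]

def gtr_step(target, sources):
    # 1) message passing within the target graph
    target = add_messages(target, attend(target, target))
    # 2) message passing from each source graph into the target graph
    for source in sources:
        target = add_messages(target, attend(target, source))
    return target

def gtr_module(target, sources, num_steps=2):
    # Stack several steps; num_steps is an illustrative default.
    for _ in range(num_steps):
        target = gtr_step(target, sources)
    return target

# Example: two target nodes receive messages from a single-node source graph.
target_nodes = [[1.0, 0.0], [0.0, 1.0]]
source_nodes = [[1.0, 1.0]]
updated_nodes = gtr_module(target_nodes, [source_nodes])
```

Because every step keeps the feature dimension unchanged, the same step can be stacked any number of times, which is what makes the "multiple steps in one module" formulation convenient.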

GTR across multiple domains

$\text{GTR}_\text{i2g}$: image features -> graph features
$\text{GTR}_\text{g2s}$: input: graph; output: sequence
$\text{GTR}_\text{g2g}$: one graph -> another graph (e.g. abnormality graph -> disease graph)
$\text{GTR}_\text{gs2s}$: input: graph and sequence; output: sequence

GTR for sequential input/output

positional encoding gives GTR access to relative and absolute position information
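
As an illustration, the sketch below computes the sinusoidal positional encoding known from the Transformer, which encodes absolute positions while keeping the encoding of a shifted position a linear function of the original one. Whether KERP uses exactly this scheme is an assumption here; `positional_encoding`, `num_positions`, and `d_model` are illustrative names.

```python
import math

def positional_encoding(num_positions, d_model):
    # Even feature indices use sine, odd indices use cosine,
    # with wavelengths growing geometrically across the feature dimension.
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

# One row per sequence position; each row is added to that position's features.
encoding_table = positional_encoding(num_positions=4, d_model=6)
```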

KERP

Encode module

transforms visual features into a structured abnormality graph by incorporating prior medical knowledge
each node represents a possible clinical abnormality

updated node features:
$\text{h}_u = \text{GTR}_\text{i2g}(\text{X})$
$\text{u} = \text{sigmoid}(\text{W}_u \text{h}_u)$

$\text{W}_u$: linear projection that transforms each latent feature in $\text{h}_u$ into a 1-d probability
$\text{h}_u = (\text{h}_{u_1}; \text{h}_{u_2}; \dots; \text{h}_{u_N}) \in \mathbb{R}^{N \times d}$: the set of latent features of the nodes, where $d$ is the feature dimension
$\text{u} = (u_1, u_2, \dots, u_N)$, $y_i \in \{0, 1\}$, $i \in \{1, \dots, N\}$: binary labels for the abnormality nodes
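
The prediction head in the second equation can be sketched as follows: a minimal illustration that takes already computed node features and applies one linear projection plus a sigmoid per node. The function name `abnormality_probabilities` and the toy numbers are assumptions for illustration only.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def abnormality_probabilities(h_u, w_u):
    # h_u: N node features of dimension d; w_u: projection weights of dimension d.
    # Each node gets an independent probability of the abnormality being present.
    return [sigmoid(sum(w * x for w, x in zip(w_u, node))) for node in h_u]

# Toy example with N = 2 nodes and d = 2 features.
node_features = [[0.0, 0.0], [2.0, 0.0]]
projection = [0.5, -0.25]
probabilities = abnormality_probabilities(node_features, projection)
```

Since each node is scored independently, several abnormalities can be predicted as present at the same time.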

Retrieve module

retrieves text templates based on the detected abnormalities

obtain template sequence:
$\text{h}_t = \text{GTR}_\text{g2s}(\text{h}_u)$
$\text{t} = \text{argmax}\, \text{Softmax}(\text{W}_t \text{h}_t)$

$\text{W}_t$: linear projection that transforms the latent feature into the template embedding
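
The template selection can be sketched as below, assuming one latent feature vector per sentence slot and one weight row per candidate template. Because softmax is monotonic, taking the argmax of the logits gives the same index as taking the argmax of the softmax output. The name `retrieve_templates` and the toy values are hypothetical.

```python
def retrieve_templates(h_t, w_t):
    # h_t: one latent feature vector per sentence slot.
    # w_t: one weight row per candidate template.
    selected = []
    for slot in h_t:
        logits = [sum(w * x for w, x in zip(row, slot)) for row in w_t]
        # argmax over templates; softmax would not change the winning index
        selected.append(max(range(len(logits)), key=logits.__getitem__))
    return selected

# Toy example: 2 sentence slots, 3 candidate templates, feature dimension 2.
slot_features = [[1.0, 0.0], [0.0, 1.0]]
template_weights = [[0.2, 0.9], [0.8, 0.1], [0.5, 0.5]]
template_ids = retrieve_templates(slot_features, template_weights)
```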

Paraphrase module

refines templates with enriched details and possibly new case-specific findings
by modifying information in the templates that is not accurate for the specific case
converts templates into more natural and dynamic expressions
by robust language modeling of the same content

$\text{h}_w = \text{GTR}_\text{gs2s}(\text{h}_u, \text{t})$
$R = \text{argmax}\, \text{Softmax}(\text{W}_w f(\text{h}_w))$

$f$: the operation of reshaping $\text{h}_w$ from $\mathbb{R}^{N_s \times N_w \times d}$ to $\mathbb{R}^{(N_s \cdot N_w) \times d}$
$\text{W}_w$: linear projection that transforms the latent feature into the word embedding
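
The reshape $f$ simply lays the words of all sentences end to end, so that the report becomes one flat word sequence. A minimal sketch with nested lists, where `flatten_sentences` is a hypothetical name and the toy features just record their own (sentence, word) position:

```python
def flatten_sentences(h_w):
    # h_w: N_s sentences, each with N_w word features of dimension d.
    # Returns N_s * N_w word features in sentence-major order.
    return [word for sentence in h_w for word in sentence]

# Toy tensor with N_s = 2 sentences, N_w = 3 words, d = 2 features.
num_sentences, num_words = 2, 3
word_features = [[[s, w] for w in range(num_words)] for s in range(num_sentences)]
flat_features = flatten_sentences(word_features)

# Word w of sentence s ends up at row s * N_w + w.
row_of_last_word = 1 * num_words + 2
```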

Disease classification

multi-label disease classification:
$\text{h}_z = \text{GTR}_\text{g2g}(\text{h}_u)$
$\text{z} = \text{sigmoid}(\text{W}_z \text{h}_z)$
$\text{W}_z$: linear projection that transforms the disease node features into 1-d probabilities

Learning

During paraphrasing, the retrieved templates $\text{t}$, rather than the latent features $\text{h}_t$, are used for rewriting. Selecting the templates with maximum predicted probability (an argmax) breaks differentiable back-propagation through the whole encode-retrieve-paraphrase pipeline.

To work around this, first train the Paraphrase module with ground-truth templates,
then with sampled templates generated by the Retrieve module
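
The two-stage schedule can be sketched as a simple switch on the training epoch. This is only an illustration: the notes do not say when the switch happens, so `switch_epoch` is a hypothetical parameter, and `templates_for_epoch` and `retrieve_fn` are illustrative names.

```python
def templates_for_epoch(epoch, switch_epoch, ground_truth_templates, retrieve_fn):
    # Stage 1: feed the Paraphrase module the ground-truth templates.
    if epoch < switch_epoch:
        return list(ground_truth_templates)
    # Stage 2: feed it the templates produced by the Retrieve module,
    # which matches what it will see at inference time.
    return retrieve_fn()

# Toy example: template ids for a two-sentence report.
ground_truth = [3, 1]
early_input = templates_for_epoch(0, 5, ground_truth, lambda: [2, 2])
late_input = templates_for_epoch(5, 5, ground_truth, lambda: [2, 2])
```

In both stages the templates enter the Paraphrase module as fixed inputs, so no gradient needs to flow through the non-differentiable argmax.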

Results

Conclusion

accurate attributes prediction
dynamic medical knowledge graph
explainable location reference

Original: https://blog.csdn.net/Kqp12_27/article/details/124615783
Author: 快去皮
Title: [Paper Notes] Knowledge-Driven Encode, Retrieve, Paraphrase for Medical Image Report Generation (AAAI 2019)
