Knowledge Graph Study Notes 4

June 10, 2023, 8:00 AM • Artificial Intelligence

Today's topic is building a question-answering system with ChatterBot.
For ChatterBot installation issues, see this blog post.

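The program structure is shown in the figure:

medical.json is a medical data package crawled from the web. Running build_medicalgraph builds a medical knowledge graph from it on the neo4j server.
First start the neo4j service and clear any previous graph with the following statement: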

MATCH(n)
DETACH DELETE n

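Run build_medicalgraph. At this point the program reports an error:

Open neo4j.conf, search for dbms.security.auth_enabled=false, and remove the leading # to uncomment that line. Then restart the neo4j service and run build_medicalgraph again; it should now work.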
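
The manual edit described above can also be scripted. The following is a minimal sketch, not part of the original project: the helper name `uncomment_setting` and the sample configuration text are made up for illustration. Note that this setting turns authentication off, so it is best kept to a local development server.

```python
def uncomment_setting(conf_text, key):
    """Return conf_text with the first commented-out `key=...` line uncommented."""
    out = []
    done = False
    for line in conf_text.splitlines():
        stripped = line.lstrip()
        if not done and stripped.startswith("#"):
            # Drop the leading '#' characters and see whether a setting follows.
            candidate = stripped.lstrip("#").lstrip()
            if candidate.split("=", 1)[0].strip() == key:
                line = candidate
                done = True
        out.append(line)
    return "\n".join(out)


# Hypothetical excerpt of a neo4j.conf file, used only to demonstrate the helper.
sample = (
    "# Whether requests to Neo4j are authenticated.\n"
    "#dbms.security.auth_enabled=false\n"
    "dbms.memory.heap.initial_size=512m"
)
print(uncomment_setting(sample, "dbms.security.auth_enabled"))
```

Ordinary comment lines are left alone because the text after their # does not start with the requested key.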

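Here is a brief walkthrough of build_medicalgraph.

It defines a MedicalGraph class and creates an instance of it, then calls create_graphnodes() to create the nodes and create_graphrels() to create the relationships.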

if __name__ == '__main__':
    handler = MedicalGraph()
    handler.create_graphnodes()
    handler.create_graphrels()
    # handler.export_data()

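Now let's look inside the MedicalGraph class.
The constructor defines the path of the JSON file to read and a connection to the local neo4j database; lines 5-8 of the snippet hold the default connection settings: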

    def __init__(self):
        cur_dir = os.path.dirname(os.path.abspath(__file__))
        self.data_path = os.path.join(cur_dir, 'data/medical.json')
        self.g = Graph(
            host="127.0.0.1",  # IP address of the server hosting neo4j; ifconfig will show it
            http_port=7474,  # port the neo4j server listens on
            user="neo4j",  # database user name; neo4j unless it has been changed
            password="123456")
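
The connection values in the constructor above are hard-coded. As a hedged sketch, they could instead be read from environment variables and fall back to the same defaults. The function name `neo4j_settings` and the `NEO4J_*` variable names are assumptions made for this example; they are not defined by neo4j or py2neo.

```python
import os


def neo4j_settings(env=None):
    """Build the connection keyword arguments, falling back to the defaults above."""
    env = os.environ if env is None else env
    return {
        "host": env.get("NEO4J_HOST", "127.0.0.1"),
        "http_port": int(env.get("NEO4J_HTTP_PORT", "7474")),
        "user": env.get("NEO4J_USER", "neo4j"),
        "password": env.get("NEO4J_PASSWORD", "123456"),
    }


# With no overrides the result matches the hard-coded constructor values.
print(neo4j_settings({}))
```

The resulting dictionary uses the same keyword names as the constructor, so it could be passed along as `Graph(**neo4j_settings())`, provided the installed py2neo version accepts those keywords.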

The code below builds the nodes of the knowledge graph. It only creates drug, food, check, and disease-information nodes; other node types are added the same way:

    '''Create the entity node types (schema) of the knowledge graph'''
    def create_graphnodes(self):
        Drugs, Foods, Checks, disease_infos = self.read_nodes()
        self.create_diseases_nodes(disease_infos)
        self.create_node('Drug', Drugs)
        # print(len(Drugs))
        self.create_node('Food', Foods)
        # print(len(Foods))
        self.create_node('Check', Checks)
        # print(len(Checks))
        return

    '''Create the central disease nodes of the knowledge graph'''
    def create_diseases_nodes(self, disease_infos):
        count = 0
        for disease_dict in disease_infos:
            node = Node("Disease", name=disease_dict['name'], desc=disease_dict['desc'],
                        prevent=disease_dict['prevent'], cause=disease_dict['cause'])
            self.g.create(node)
            count += 1
            # print(count)
        return

    '''Create nodes'''
    def create_node(self, label, nodes):
        count = 0
        for node_name in nodes:
            node = Node(label, name=node_name)
            self.g.create(node)
            count += 1
            # print(count, len(nodes))
        return

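The structure of each disease entry in the JSON file is shown in the figure. Based on that structure, the file-reading code looks roughly as follows; it reads the contents of the JSON file into the corresponding node and relationship lists: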

    '''Read the data file'''
    def read_nodes(self):
        # Node categories
        drugs = []  # drugs (not populated in this excerpt)
        foods = []  # foods
        diseases = []  # diseases
        disease_infos = []  # one attribute dict per disease

        # Relationships between entity nodes
        rels_noteat = []  # disease - foods to avoid
        rels_doeat = []  # disease - suitable foods
        rels_recommandeat = []  # disease - recommended recipes

        count = 0
        # The file holds one JSON object per line
        with open(self.data_path, encoding='utf-8') as f:
            for data in f:
                disease_dict = {}
                count += 1
                # print(count)
                data_json = json.loads(data)
                disease = data_json['name']
                disease_dict['name'] = disease
                diseases.append(disease)

                if 'not_eat' in data_json:
                    not_eat = data_json['not_eat']
                    for _not in not_eat:
                        rels_noteat.append([disease, _not])

                    foods += not_eat
                    do_eat = data_json['do_eat']
                    for _do in do_eat:
                        rels_doeat.append([disease, _do])

                    foods += do_eat
                    recommand_eat = data_json['recommand_eat']
                    for _recommand in recommand_eat:
                        rels_recommandeat.append([disease, _recommand])

                    foods += recommand_eat

                disease_infos.append(disease_dict)
        return set(drugs), set(foods), set(diseases), disease_infos
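
To make the reading step easier to try without the full class or a database, here is a self-contained sketch of the same idea. It assumes one JSON object per line and the field names name, not_eat, do_eat, and recommand_eat, as in the code above. The function name `read_food_relations` and the two sample records are invented for illustration.

```python
import json


def read_food_relations(lines):
    """Collect disease names, food names, and disease-food edges from JSON lines."""
    diseases = set()
    foods = set()
    rels = {"not_eat": [], "do_eat": [], "recommand_eat": []}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        disease = record["name"]
        diseases.add(disease)
        for key, edges in rels.items():
            # A missing field simply contributes no edges.
            for food in record.get(key, []):
                edges.append([disease, food])
                foods.add(food)
    return diseases, foods, rels


# Hypothetical records standing in for lines of medical.json.
sample_lines = [
    json.dumps({"name": "disease A", "not_eat": ["food 1"], "do_eat": ["food 2", "food 3"]}),
    json.dumps({"name": "disease B"}),
]
diseases, foods, rels = read_food_relations(sample_lines)
print(sorted(diseases), sorted(foods), rels["do_eat"])
```

Using `record.get(key, [])` for each field separately means a record that has one of the three fields but not the others is still handled.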

Next we create the relationship edges between entities. In the call that passes 'no_eat', for example, Disease is the start node, Food is the end node, and the relationship between them is no_eat:

    '''Create the relationship edges between entities'''
    def create_graphrels(self):
        Foods, Diseases, rels_noteat, rels_doeat, rels_recommandeat = self.read_nodes()
        self.create_relationship('Disease', 'Food', rels_recommandeat, 'recommand_eat', '推荐食谱')  # "recommended recipes"
        self.create_relationship('Disease', 'Food', rels_noteat, 'no_eat', '忌吃')  # "foods to avoid"
        self.create_relationship('Disease', 'Food', rels_doeat, 'do_eat', '宜吃')  # "suitable foods"

The create_relationship function, which creates the edges between entities, is structured as follows:

    '''Create relationship edges between entities'''
    def create_relationship(self, start_node, end_node, edges, rel_type, rel_name):
        count = 0
        # Deduplicate the edges
        set_edges = []
        for edge in edges:
            set_edges.append('###'.join(edge))
        total = len(set(set_edges))
        for edge in set(set_edges):
            edge = edge.split('###')
            p = edge[0]
            q = edge[1]
            query = "match(p:%s),(q:%s) where p.name='%s' and q.name='%s' create (p)-[rel:%s{name:'%s'}]->(q)" % (
                start_node, end_node, p, q, rel_type, rel_name)
            try:
                self.g.run(query)
                count += 1
                # print(rel_type, count, total)
            except Exception as e:
                print(e)
        return
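
The deduplication step above joins each pair with '###' and splits it again afterwards. A possible alternative, shown here only as a sketch, keeps the pairs as tuples in a set, which also avoids trouble if a name ever contains the separator. The helper name `dedupe_edges` and the sample edges are made up for this example.

```python
def dedupe_edges(edges):
    """Drop duplicate [start, end] pairs while keeping first-seen order."""
    seen = set()
    unique = []
    for start, end in edges:
        pair = (start, end)
        if pair not in seen:
            seen.add(pair)
            unique.append(pair)
    return unique


# Hypothetical edge list with one repeated pair.
edges = [["disease A", "food 1"], ["disease A", "food 1"], ["disease B", "food 1"]]
print(dedupe_edges(edges))
```

Unlike iterating over a set of strings, this version also preserves the original order of the edges.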

Running the program above loads the whole JSON file into the neo4j server. Because the JSON file is fairly large, this process is very slow.
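
One likely reason for the slowness is that every edge is sent as its own query. A common way to reduce the number of round trips is to send edges in batches with Cypher's UNWIND clause. The sketch below only builds the batches and the query text, so it runs without a database. The helper names and the sample edges are assumptions for illustration, and the `$rows` parameter syntax assumes a reasonably recent Neo4j version.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def batch_rel_query(start_label, end_label, rel_type):
    """Build one Cypher statement that creates a whole batch of edges."""
    # Labels and relationship types cannot be passed as Cypher parameters,
    # so they are formatted in; the names travel as $rows and $rel_name.
    return (
        "UNWIND $rows AS row "
        "MATCH (p:%s {name: row.start}), (q:%s {name: row.end}) "
        "CREATE (p)-[:%s {name: $rel_name}]->(q)"
    ) % (start_label, end_label, rel_type)


# Hypothetical edges, grouped into batches of two.
edges = [("disease A", "food 1"), ("disease A", "food 2"), ("disease B", "food 1")]
batches = [
    [{"start": s, "end": e} for s, e in chunk]
    for chunk in chunked(edges, 2)
]
query = batch_rel_query("Disease", "Food", "no_eat")
print(query)
print(batches)
# With a live connection each batch could then be sent in a single call,
# e.g. graph.run(query, rows=batch, rel_name=...); this call is assumed
# from py2neo's Graph.run and is not exercised in this sketch.
```

The batch size is a tuning knob; whether batching helps in a given setup would need to be measured.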

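After that, you can write another file to implement the question-answering system. The main code is below; it defines a class that parses the incoming question and then retrieves the answer: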

    def __init__(self):
        self.classifier = QuestionClassifier()
        self.parser = QuestionPaser()
        self.searcher = AnswerSearcher()

    def chat_main(self, question):
        # Default greeting: "Hello, I am the medical assistant 小医; I hope I can help you."
        answer = '您好，我是医药智能助理“小医”，希望可以帮到您。'
        res_classify = self.classifier.classify(question)
        if not res_classify:
            return answer
        res_sql = self.parser.parser_main(res_classify)
        final_answers = self.searcher.search_main(res_sql)
        if not final_answers:
            return answer
        else:
            return '\n'.join(final_answers)
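
The control flow of chat_main can be tried in isolation by replacing the three components with stand-ins. In the sketch below, the class name `MedicalChat` and the three stub classes are hypothetical; the real QuestionClassifier, QuestionPaser, and AnswerSearcher are not shown in this post, so their data formats here are invented and only the method names match the code above.

```python
class MedicalChat:
    """Classify, parse, then search, returning a fallback when a step yields nothing."""

    def __init__(self, classifier, parser, searcher, fallback):
        self.classifier = classifier
        self.parser = parser
        self.searcher = searcher
        self.fallback = fallback

    def chat_main(self, question):
        res_classify = self.classifier.classify(question)
        if not res_classify:
            return self.fallback
        res_sql = self.parser.parser_main(res_classify)
        final_answers = self.searcher.search_main(res_sql)
        if not final_answers:
            return self.fallback
        return "\n".join(final_answers)


class StubClassifier:
    def classify(self, question):
        # Pretend only questions mentioning "eat" are understood.
        return {"topic": "diet"} if "eat" in question else {}


class StubParser:
    def parser_main(self, res_classify):
        return ["query for " + res_classify["topic"]]


class StubSearcher:
    def search_main(self, queries):
        return ["answer to " + q for q in queries]


bot = MedicalChat(StubClassifier(), StubParser(), StubSearcher(),
                  fallback="Sorry, I did not understand.")
print(bot.chat_main("what should I not eat?"))
print(bot.chat_main("hello"))
```

Passing the components in through the constructor keeps the pipeline testable without a running neo4j server.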

Original: https://blog.csdn.net/m0_60976461/article/details/119729628
Author: 落霞孤雾
Title: Knowledge Graph Study Notes 4 (知识图谱学习笔记 4)
