【Original】Survey of Papers on Mining Human Customer Service Conversation Logs

  • Goal: extract question-answer pairs from human customer service logs and load them into the chatbot knowledge base

  • QA matching: takes the question as the starting point, i.e., assuming the question has already been identified, find its answer in the context (mainly the preceding turns) and form a question-answer pair

  • 【ECAI-2020】Matching Questions and Answers in Dialogues from Online Forums:https://arxiv.org/pdf/2005.09276.pdf
  • 【LREC-2020】Cross-sentence Pre-trained Model for Interactive QA matching:https://www.aclweb.org/anthology/2020.lrec-1.666/
  • conversation structure modeling (dialogue structure analysis / conversation structure discovery / conversation disentanglement / discourse parsing): takes the answer as the starting point, i.e., assuming the answer has already been identified, find the question it responds to (the reply-to relation) in the context (mainly the preceding turns) and form a question-answer pair; most current research targets multi-party dialogue
    two-party dialogue
  • 【AAAI-2017】Discovering Conversational Dependencies between Messages in Dialogs
  • 【AAAI-2019】Learning to Align Question and Answer Utterances in Customer Service Conversation with Recurrent Pointer Networks:https://ojs.aaai.org//index.php/AAAI/article/view/3778
    multi-party dialogue
  • 【NAACL-2018】Learning to Disentangle Interleaved Conversational Threads with a Siamese Hierarchical Network and Similarity Ranking:multi-party, SHCNN, CISIR
  • 【AAAI-2019】A Deep Sequential Model for Discourse Parsing on Multi-Party Dialogues:multi-party
  • 【NAACL-2019】Context-Aware Conversation Thread Detection in Multi-Party Chat:multi-party
  • 【AAAI-2020】Who Did They Respond to? Conversation Structure Modeling Using Masked Hierarchical Transformer:https://ojs.aaai.org/index.php/AAAI/article/view/6524
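Once a model has predicted the reply-to relation described above, turning it into QA pairs is mechanical. The sketch below illustrates this last extraction step; the dialogue format, role labels, and the customer-question → agent-answer filter are illustrative assumptions, not from any specific paper.

```python
# Sketch: given predicted reply-to links, read QA pairs off the structure.

def extract_qa_pairs(turns, reply_to):
    """turns: list of (speaker_role, text); reply_to: dict mapping a
    turn index to the index of the turn it responds to (or None)."""
    pairs = []
    for i, parent in reply_to.items():
        if parent is None:
            continue  # e.g. greetings respond to nothing
        p_role, p_text = turns[parent]
        c_role, c_text = turns[i]
        # keep only customer-question -> agent-answer links
        if p_role == "customer" and c_role == "agent":
            pairs.append((p_text, c_text))
    return pairs

turns = [
    ("customer", "How do I reset my password?"),                 # 0
    ("agent",    "Hi, how can I help?"),                          # 1
    ("customer", "Also, is there a mobile app?"),                 # 2
    ("agent",    "Click 'Forgot password' on the login page."),   # 3
    ("agent",    "Yes, on both iOS and Android."),                # 4
]
reply_to = {1: None, 3: 0, 4: 2}  # turn 3 answers turn 0, etc.
print(extract_qa_pairs(turns, reply_to))
# -> two (question, answer) pairs, for turns (0, 3) and (2, 4)
```

This also shows why the multi-party papers matter: interleaved threads (turns 1-2 cutting between question 0 and its answer 3) must be disentangled before the pairs can be read off.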

  • Related papers (QA pair generation and refinement)
    【ACL-2020】Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs:https://www.aclweb.org/anthology/2020.acl-main.20/
    【ACL-2020】Harvesting and Refining Question-Answer Pairs for Unsupervised QA:https://www.aclweb.org/anthology/2020.acl-main.600/
    【EMNLP-2020】QADiscourse – Discourse Relations as QA Pairs: Representation, Crowdsourcing and Baselines:https://www.aclweb.org/anthology/2020.emnlp-main.224/
    【ACL-2020】A Smart System to Generate and Validate Question Answer Pairs for COVID-19 Literature:https://www.aclweb.org/anthology/2020.sdp-1.4.pdf
    【AAAI-2020】On the Generation of Medical Question-Answer Pairs:https://ojs.aaai.org//index.php/AAAI/article/view/6410
    【ACL-2016】QA-It: Classifying Non-Referential It for Question Answer Pairs:https://www.aclweb.org/anthology/P16-3020
    【NAACL-2016】Watson Discovery Advisor: Question-answering in an industrial setting:https://www.aclweb.org/anthology/W16-0101.pdf
    【ACL-2019】Synthetic QA Corpora Generation with Roundtrip Consistency:https://www.aclweb.org/anthology/P19-1620.pdf
    【EMNLP-2017】Question Generation for Question Answering:https://www.aclweb.org/anthology/D17-1090.pdf
    【ACL-2018】Neural Models for Key Phrase Extraction and Question Generation:https://www.aclweb.org/anthology/W18-2609.pdf

Abstract (【ECAI-2020】Matching Questions and Answers in Dialogues from Online Forums). Matching question-answer relations between two turns in conversations is not only the first step in analyzing dialogue structures, but is also valuable for training dialogue systems. This paper presents a QA matching model that considers both distance information and dialogue history through two simultaneous attention mechanisms, called mutual attention. Given scores computed by the trained model between each non-question turn and its candidate questions, a greedy matching strategy is used for final predictions. Because existing dialogue datasets such as the Ubuntu dataset are not suitable for the QA matching task, we further create a dataset with 1,000 labeled dialogues and demonstrate that our proposed model outperforms the state-of-the-art and other strong baselines, particularly for matching long-distance QA pairs.
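The greedy matching strategy mentioned in the abstract can be sketched as follows. The scores, the threshold, and the one-to-one constraint here are illustrative assumptions standing in for the trained model's output; the paper's exact decoding details may differ.

```python
# Sketch: greedy decoding over model scores for (turn, question) pairs.

def greedy_match(scores, threshold=0.5):
    """scores[a][q] = matching score between non-question turn a and
    candidate question q. Returns a list of (turn_idx, question_idx),
    picking the highest-scoring remaining pair at each step."""
    candidates = [
        (s, a, q)
        for a, row in enumerate(scores)
        for q, s in enumerate(row)
        if s >= threshold          # below threshold -> leave unmatched
    ]
    candidates.sort(reverse=True)  # best score first
    used_a, used_q, matches = set(), set(), []
    for s, a, q in candidates:
        if a not in used_a and q not in used_q:
            matches.append((a, q))
            used_a.add(a)
            used_q.add(q)
    return sorted(matches)

scores = [
    [0.9, 0.2],  # turn 0: strong match with question 0
    [0.4, 0.8],  # turn 1: strong match with question 1
    [0.3, 0.1],  # turn 2: below threshold, stays unmatched
]
print(greedy_match(scores))  # -> [(0, 0), (1, 1)]
```

Greedy decoding is locally optimal per step; an optimal bipartite assignment over the same scores could differ when high scores conflict.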

Abstract (【AAAI-2019】Learning to Align Question and Answer Utterances in Customer Service Conversation with Recurrent Pointer Networks). Customers ask questions, and customer service staff answer them; this is the basic service pattern of customer service (CS). A CS session is a typical multi-round conversation, yet there are no explicit corresponding relations among conversational utterances. This paper focuses on obtaining explicit alignments of question and answer utterances in CS. This is not only an important task of dialogue analysis, but also yields valuable training data for learning dialogue systems. In this work, we propose end-to-end models for aligning question (Q) and answer (A) utterances in CS conversations with recurrent pointer networks (RPN). On the one hand, RPN-based alignment models are able to model the conversational contexts and the mutual influence of different Q-A alignments. On the other hand, they are able to address the issue of empty and multiple alignments for some utterances in a unified manner. We construct a dataset from an in-house online CS. The experimental results demonstrate that the proposed models are effective at learning the alignments of question and answer utterances.
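The "empty and multiple alignments in a unified manner" point can be made concrete with a pointer-style decode: each answer utterance points into the question list, with an extra null slot meaning "answers no question", and nothing prevents two answers from pointing at the same question. The distributions below are made up for illustration; the actual RPN architecture produces them recurrently.

```python
# Sketch: decoding pointer distributions with a dedicated null slot.

NULL = 0  # slot 0 = "this utterance answers no question" (empty alignment)

def decode_alignments(pointer_probs):
    """pointer_probs[a] = distribution over [NULL, q0, q1, ...] for
    answer utterance a. Returns {answer_idx: question_idx or None};
    several answers may point at one question (multiple alignment)."""
    alignments = {}
    for a, dist in enumerate(pointer_probs):
        best = max(range(len(dist)), key=dist.__getitem__)  # argmax
        alignments[a] = None if best == NULL else best - 1
    return alignments

pointer_probs = [
    [0.1, 0.7, 0.2],  # answer 0 -> question 0
    [0.8, 0.1, 0.1],  # answer 1 -> null (e.g. a greeting)
    [0.2, 0.6, 0.2],  # answer 2 -> question 0 again
]
print(decode_alignments(pointer_probs))  # -> {0: 0, 1: None, 2: 0}
```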

Abstract (【AAAI-2020】Who Did They Respond to? Conversation Structure Modeling Using Masked Hierarchical Transformer). Conversation structure is useful both for understanding the nature of conversation dynamics and for providing features for many downstream applications such as summarization of conversations. In this work, we define the problem of conversation structure modeling as identifying the parent utterance(s) to which each utterance in the conversation responds. Previous work usually took a pair of utterances to decide whether one utterance is the parent of the other. We believe the entire ancestral history is a very important information source for making accurate predictions. Therefore, we design a novel masking mechanism to guide the ancestor flow, and leverage the transformer model to aggregate all ancestors to predict parent utterances. Our experiments are performed on the Reddit dataset (Zhang, Culbertson, and Paritosh 2017) and the Ubuntu IRC dataset (Kummerfeld et al. 2019). In addition, we also report experiments on a new, larger corpus from the Reddit platform and release this dataset. We show that the proposed model, which takes into account the ancestral history of the conversation, significantly outperforms several strong baselines, including the BERT model, on all datasets.
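The masking idea in the abstract boils down to an attention mask that lets each utterance attend only to its ancestors in the reply tree (plus itself), so the transformer aggregates the full ancestral history rather than isolated pairs. A minimal sketch of building such a mask, with an illustrative reply tree:

```python
# Sketch: build a boolean attention mask from parent links so that
# utterance i can only attend to itself and its ancestors.

def ancestor_mask(parents):
    """parents[i] = index of the parent utterance of i, or None for
    the root. Returns mask where mask[i][j] is True iff j is i itself
    or an ancestor of i."""
    n = len(parents)
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        j = i
        while j is not None:       # walk up to the root
            mask[i][j] = True
            j = parents[j]
    return mask

# utterance 0 is the root; 1 and 2 reply to 0; 3 replies to 1
parents = [None, 0, 0, 1]
for row in ancestor_mask(parents):
    print([int(v) for v in row])
# the row for utterance 3 is [1, 1, 0, 1]: it sees 0, 1, and itself,
# but not its sibling thread 2
```

In a real implementation this mask would be converted to additive -inf biases on the attention logits; the sketch only shows which positions the masking mechanism keeps visible.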

Abstract (【ACL-2020】Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs). One of the most crucial challenges in question answering (QA) is the scarcity of labeled data, since it is costly to obtain question-answer (QA) pairs for a target text domain with human annotation. An alternative approach is to use automatically generated QA pairs from either the problem context or from large amounts of unstructured text (e.g. Wikipedia). In this work, we propose a hierarchical conditional variational autoencoder (HCVAE) for generating QA pairs given unstructured texts as contexts, while maximizing the mutual information between generated QA pairs to ensure their consistency. We validate our Information-Maximizing Hierarchical Conditional Variational AutoEncoder (InfoHCVAE) on several benchmark datasets by evaluating the performance of the QA model (BERT-base) trained using only the generated QA pairs (QA-based evaluation) or using both the generated and human-labeled pairs (semi-supervised learning), against state-of-the-art baseline models. The results show that our model obtains impressive performance gains over all baselines on both tasks, using only a fraction of data for training.

Original: https://blog.csdn.net/u010567574/article/details/122731564
Author: suvedo
Title: 【Original】Survey of Papers on Mining Human Customer Service Conversation Logs
