【Original】Survey of Papers on Mining Human Customer Service Conversation Logs

  • Goal: extract question-answer pairs from human customer service logs and load them into the chatbot knowledge base

  • QA matching: takes the question as the starting point, i.e., assuming the question has already been identified, find its answer in the context (mainly the preceding turns) and form a question-answer pair

  • 【ECAI-2020】Matching Questions and Answers in Dialogues from Online Forums:https://arxiv.org/pdf/2005.09276.pdf
  • 【LREC-2020】Cross-sentence Pre-trained Model for Interactive QA matching:https://www.aclweb.org/anthology/2020.lrec-1.666/
  • conversation structure modeling (dialogue structure analysis / conversation structure discovery / conversation disentanglement / discourse parsing): takes the answer as the starting point, i.e., assuming the answer has already been identified, find the question it responds to (the reply-to relation) in the context (mainly the preceding turns) and form a question-answer pair; most current research targets multi-party dialogue
    two-party dialogue
  • 【AAAI-2017】Discovering Conversational Dependencies between Messages in Dialogs
  • 【AAAI-2019】Learning to Align Question and Answer Utterances in Customer Service Conversation with Recurrent Pointer Networks:https://ojs.aaai.org//index.php/AAAI/article/view/3778
    multi-party dialogue
  • 【NAACL-2018】Learning to Disentangle Interleaved Conversational Threads with a Siamese Hierarchical Network and Similarity Ranking:multi-party, SHCNN, CISIR
  • 【AAAI-2019】A Deep Sequential Model for Discourse Parsing on Multi-Party Dialogues:multi-party
  • 【NAACL-2019】Context-Aware Conversation Thread Detection in Multi-Party Chat:multi-party
  • 【AAAI-2020】Who Did They Respond to? Conversation Structure Modeling Using Masked Hierarchical Transformer:https://ojs.aaai.org/index.php/AAAI/article/view/6524
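Once a model has predicted the reply-to relation described above, turning it into QA pairs is mechanical. The sketch below illustrates this last extraction step; the dialogue format, role labels, and the customer-question → agent-answer filter are illustrative assumptions, not from any specific paper.

```python
# Sketch: given predicted reply-to links, read QA pairs off the structure.

def extract_qa_pairs(turns, reply_to):
    """turns: list of (speaker_role, text); reply_to: dict mapping a
    turn index to the index of the turn it responds to (or None)."""
    pairs = []
    for i, parent in reply_to.items():
        if parent is None:
            continue  # e.g. greetings respond to nothing
        p_role, p_text = turns[parent]
        c_role, c_text = turns[i]
        # keep only customer-question -> agent-answer links
        if p_role == "customer" and c_role == "agent":
            pairs.append((p_text, c_text))
    return pairs

turns = [
    ("customer", "How do I reset my password?"),                 # 0
    ("agent",    "Hi, how can I help?"),                          # 1
    ("customer", "Also, is there a mobile app?"),                 # 2
    ("agent",    "Click 'Forgot password' on the login page."),   # 3
    ("agent",    "Yes, on both iOS and Android."),                # 4
]
reply_to = {1: None, 3: 0, 4: 2}  # turn 3 answers turn 0, etc.
print(extract_qa_pairs(turns, reply_to))
# -> two (question, answer) pairs, for turns (0, 3) and (2, 4)
```

This also shows why the multi-party papers matter: interleaved threads (turns 1-2 cutting between question 0 and its answer 3) must be disentangled before the pairs can be read off.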

  • Related papers (QA pair generation and refinement)
    【ACL-2020】Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs:https://www.aclweb.org/anthology/2020.acl-main.20/
    【ACL-2020】Harvesting and Refining Question-Answer Pairs for Unsupervised QA:https://www.aclweb.org/anthology/2020.acl-main.600/
    【EMNLP-2020】QADiscourse – Discourse Relations as QA Pairs: Representation, Crowdsourcing and Baselines:https://www.aclweb.org/anthology/2020.emnlp-main.224/
    【ACL-2020】A Smart System to Generate and Validate Question Answer Pairs for COVID-19 Literature:https://www.aclweb.org/anthology/2020.sdp-1.4.pdf
    【AAAI-2020】On the Generation of Medical Question-Answer Pairs:https://ojs.aaai.org//index.php/AAAI/article/view/6410
    【ACL-2016】QA-It: Classifying Non-Referential It for Question Answer Pairs:https://www.aclweb.org/anthology/P16-3020
    【NAACL-2016】Watson Discovery Advisor: Question-answering in an industrial setting:https://www.aclweb.org/anthology/W16-0101.pdf
    【ACL-2019】Synthetic QA Corpora Generation with Roundtrip Consistency:https://www.aclweb.org/anthology/P19-1620.pdf
    【EMNLP-2017】Question Generation for Question Answering:https://www.aclweb.org/anthology/D17-1090.pdf
    【ACL-2018】Neural Models for Key Phrase Extraction and Question Generation:https://www.aclweb.org/anthology/W18-2609.pdf

Abstract (【ECAI-2020】Matching Questions and Answers in Dialogues from Online Forums). Matching question-answer relations between two turns in conversations is not only the first step in analyzing dialogue structures, but is also valuable for training dialogue systems. This paper presents a QA matching model that considers both distance information and dialogue history through two simultaneous attention mechanisms, called mutual attention. Given scores computed by the trained model between each non-question turn and its candidate questions, a greedy matching strategy is used for final predictions. Because existing dialogue datasets such as the Ubuntu dataset are not suitable for the QA matching task, we further create a dataset with 1,000 labeled dialogues and demonstrate that our proposed model outperforms the state-of-the-art and other strong baselines, particularly for matching long-distance QA pairs.
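The greedy matching strategy mentioned in the abstract can be sketched as follows. The scores, the threshold, and the one-to-one constraint here are illustrative assumptions standing in for the trained model's output; the paper's exact decoding details may differ.

```python
# Sketch: greedy decoding over model scores for (turn, question) pairs.

def greedy_match(scores, threshold=0.5):
    """scores[a][q] = matching score between non-question turn a and
    candidate question q. Returns a list of (turn_idx, question_idx),
    picking the highest-scoring remaining pair at each step."""
    candidates = [
        (s, a, q)
        for a, row in enumerate(scores)
        for q, s in enumerate(row)
        if s >= threshold          # below threshold -> leave unmatched
    ]
    candidates.sort(reverse=True)  # best score first
    used_a, used_q, matches = set(), set(), []
    for s, a, q in candidates:
        if a not in used_a and q not in used_q:
            matches.append((a, q))
            used_a.add(a)
            used_q.add(q)
    return sorted(matches)

scores = [
    [0.9, 0.2],  # turn 0: strong match with question 0
    [0.4, 0.8],  # turn 1: strong match with question 1
    [0.3, 0.1],  # turn 2: below threshold, stays unmatched
]
print(greedy_match(scores))  # -> [(0, 0), (1, 1)]
```

Greedy decoding is locally optimal per step; an optimal bipartite assignment over the same scores could differ when high scores conflict.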

Abstract (【AAAI-2019】Learning to Align Question and Answer Utterances in Customer Service Conversation with Recurrent Pointer Networks). Customers ask questions, and customer service staff answer them; this is the basic service pattern of customer service (CS). A CS session is a typical multi-round conversation, yet there are no explicit corresponding relations among conversational utterances. This paper focuses on obtaining explicit alignments of question and answer utterances in CS. This is not only an important task of dialogue analysis, but also yields valuable training data for learning dialogue systems. In this work, we propose end-to-end models for aligning question (Q) and answer (A) utterances in CS conversations with recurrent pointer networks (RPN). On the one hand, RPN-based alignment models are able to model the conversational contexts and the mutual influence of different Q-A alignments. On the other hand, they are able to address the issue of empty and multiple alignments for some utterances in a unified manner. We construct a dataset from an in-house online CS. The experimental results demonstrate that the proposed models are effective at learning the alignments of question and answer utterances.
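The "empty and multiple alignments in a unified manner" point can be made concrete with a pointer-style decode: each answer utterance points into the question list, with an extra null slot meaning "answers no question", and nothing prevents two answers from pointing at the same question. The distributions below are made up for illustration; the actual RPN architecture produces them recurrently.

```python
# Sketch: decoding pointer distributions with a dedicated null slot.

NULL = 0  # slot 0 = "this utterance answers no question" (empty alignment)

def decode_alignments(pointer_probs):
    """pointer_probs[a] = distribution over [NULL, q0, q1, ...] for
    answer utterance a. Returns {answer_idx: question_idx or None};
    several answers may point at one question (multiple alignment)."""
    alignments = {}
    for a, dist in enumerate(pointer_probs):
        best = max(range(len(dist)), key=dist.__getitem__)  # argmax
        alignments[a] = None if best == NULL else best - 1
    return alignments

pointer_probs = [
    [0.1, 0.7, 0.2],  # answer 0 -> question 0
    [0.8, 0.1, 0.1],  # answer 1 -> null (e.g. a greeting)
    [0.2, 0.6, 0.2],  # answer 2 -> question 0 again
]
print(decode_alignments(pointer_probs))  # -> {0: 0, 1: None, 2: 0}
```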

Abstract (【AAAI-2020】Who Did They Respond to? Conversation Structure Modeling Using Masked Hierarchical Transformer). Conversation structure is useful both for understanding the nature of conversation dynamics and for providing features for many downstream applications such as summarization of conversations. In this work, we define the problem of conversation structure modeling as identifying the parent utterance(s) to which each utterance in the conversation responds. Previous work usually took a pair of utterances to decide whether one utterance is the parent of the other. We believe the entire ancestral history is a very important information source for making accurate predictions. Therefore, we design a novel masking mechanism to guide the ancestor flow, and leverage the transformer model to aggregate all ancestors to predict parent utterances. Our experiments are performed on the Reddit dataset (Zhang, Culbertson, and Paritosh 2017) and the Ubuntu IRC dataset (Kummerfeld et al. 2019). In addition, we also report experiments on a new, larger corpus from the Reddit platform and release this dataset. We show that the proposed model, which takes into account the ancestral history of the conversation, significantly outperforms several strong baselines, including the BERT model, on all datasets.
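The masking idea in the abstract boils down to an attention mask that lets each utterance attend only to its ancestors in the reply tree (plus itself), so the transformer aggregates the full ancestral history rather than isolated pairs. A minimal sketch of building such a mask, with an illustrative reply tree:

```python
# Sketch: build a boolean attention mask from parent links so that
# utterance i can only attend to itself and its ancestors.

def ancestor_mask(parents):
    """parents[i] = index of the parent utterance of i, or None for
    the root. Returns mask where mask[i][j] is True iff j is i itself
    or an ancestor of i."""
    n = len(parents)
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        j = i
        while j is not None:       # walk up to the root
            mask[i][j] = True
            j = parents[j]
    return mask

# utterance 0 is the root; 1 and 2 reply to 0; 3 replies to 1
parents = [None, 0, 0, 1]
for row in ancestor_mask(parents):
    print([int(v) for v in row])
# the row for utterance 3 is [1, 1, 0, 1]: it sees 0, 1, and itself,
# but not its sibling thread 2
```

In a real implementation this mask would be converted to additive -inf biases on the attention logits; the sketch only shows which positions the masking mechanism keeps visible.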

Abstract (【ACL-2020】Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs). One of the most crucial challenges in question answering (QA) is the scarcity of labeled data, since it is costly to obtain question-answer (QA) pairs for a target text domain with human annotation. An alternative approach is to use automatically generated QA pairs from either the problem context or from large amounts of unstructured text (e.g. Wikipedia). In this work, we propose a hierarchical conditional variational autoencoder (HCVAE) for generating QA pairs given unstructured texts as contexts, while maximizing the mutual information between generated QA pairs to ensure their consistency. We validate our Information-Maximizing Hierarchical Conditional Variational AutoEncoder (InfoHCVAE) on several benchmark datasets by evaluating the performance of the QA model (BERT-base) trained using only the generated QA pairs (QA-based evaluation) or using both the generated and human-labeled pairs (semi-supervised learning), against state-of-the-art baseline models. The results show that our model obtains impressive performance gains over all baselines on both tasks, using only a fraction of data for training.

Original: https://blog.csdn.net/u010567574/article/details/122731564
Author: suvedo
Title: 【Original】Survey of Papers on Mining Human Customer Service Conversation Logs
