A Method for Enhanced Long-Text Speech Recognition Based on Mental Health Interview Information

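General-purpose speech recognition models currently achieve low recognition rates on doctor-patient interviews in the mental health field. Existing recording setups generally expect the speaker to be in a quiet environment, close to the microphone, and speaking slowly and clearly. In practice, general-purpose ASR is not accurate enough in real medical interview settings: when tested in real doctor-patient scenes, recognition in the consultation room turned out to be very poor. The doctor and patient are too far apart, so audio pickup is poor, and existing general-purpose ASR systems have not been rigorously tested on doctor-patient speech. Besides interference from several people talking, background noise and the patient's emotional fluctuations often degrade the signal. Interviews also mix in specialized mental health terminology; in particular, when a patient describes symptoms using depression-specific vocabulary, the words are easily misrecognized. In short, ASR is a mature, commercially viable AI technology for general scenarios, but in doctor-patient interviews it still works only when both parties cooperate with the system under specific conditions.

In mental health diagnosis in recent years, clinicians may face a very large number of patient consultations almost every working day. During an interview the doctor usually concentrates on following the patient's account and the progress of the consultation. Once it ends, the medical record summary typically has to be compiled by the doctor from the interview. Sometimes a nurse must go through the speech of everyone present to organize and edit it, or even confirm details with the patient. Compiling medical records therefore costs both labor and time.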

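At present, medical records are usually written up by hand by medical staff, and AI speech recognition is not widely used; at most, a machine recognizes the doctor's and patient's speech and converts it into a text transcript. Such a machine only converts speech to text, however, and cannot understand or organize the medical content. In particular, current speech-to-text systems cannot be customized for the keywords, speaking habits, and speech models of the medical and mental health fields, so their recognition accuracy is poor.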

Medical records are an important means of recording interview points and of archiving and transmitting case information. With the arrival of the information age, more and more medical institutions value highly accurate full-text medical records. The traditional approach is to compile the doctor's notes from the consultation, which depends on the concentration of the doctor or nurse. Because the notes are taken while the diagnosis is under way, points are missed whenever attention lapses. As technology developed, products such as work-board recorders appeared to help record doctor-patient interviews; key points are then refined manually by playing back the recording, which addressed the low efficiency of record-keeping.

To overcome the shortcomings of the prior art described above, we designed a method for enhanced long-text speech recognition based on mental health interview information. Its main feature is that it comprises the following steps:

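(1) Receive the interview speech signal, preprocess it, and output feature data;

(2) Build a CTC acoustic model that converts the speech information into basic phoneme information;

(3) Decode the basic phoneme information into Chinese text using a language model and a pronunciation dictionary, obtaining the depression interview text.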
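Step (2) relies on a CTC acoustic model, whose per-frame outputs have to be collapsed into a phoneme sequence before step (3) can run. The source does not say how this is done, so the following is only a minimal sketch of the standard greedy CTC collapse (take the best symbol per frame, merge repeats, drop blanks). The blank index, the symbol table, and the function name are assumptions made for illustration.

```python
from typing import List, Sequence

BLANK = 0  # assumption: index 0 is the CTC blank symbol


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      symbols: Sequence[str]) -> List[str]:
    """Collapse per-frame scores into a symbol sequence (greedy CTC rule)."""
    # Pick the highest-scoring symbol index for every frame.
    best_path = [max(range(len(frame)), key=frame.__getitem__)
                 for frame in frame_scores]
    decoded: List[str] = []
    previous = BLANK
    for index in best_path:
        # Emit a symbol only when it is not blank and differs from the
        # previous frame; a blank between two equal symbols keeps both.
        if index != BLANK and index != previous:
            decoded.append(symbols[index])
        previous = index
    return decoded


if __name__ == "__main__":
    # Hypothetical symbol table and frame scores, for illustration only.
    table = ["<blank>", "y", "i", "u"]
    frames = [
        [0.1, 0.7, 0.1, 0.1],
        [0.1, 0.6, 0.2, 0.1],
        [0.8, 0.1, 0.05, 0.05],
        [0.1, 0.1, 0.7, 0.1],
    ]
    print(ctc_greedy_decode(frames, table))
```

A production decoder would more likely use beam search combined with the language model; the greedy rule is shown only because it makes the collapse step easy to follow.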

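Preferably, step (1) specifically comprises the following steps:

(1.1) Receive the interview speech signal;

(1.2) Process the speech signal;

(1.3) Process the feature information in the speech signal and output feature data.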
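The source does not specify which processing steps (1.2) and (1.3) use. As a hedged illustration, the sketch below assumes a common front end: pre-emphasis, splitting into overlapping frames, Hamming windowing, and one log-energy value per frame. The coefficient 0.97 and the frame sizes (400 and 160 samples, i.e. 25 ms and 10 ms at an assumed 16 kHz sampling rate) are conventional defaults, not values taken from the original.

```python
import math
from typing import List, Sequence


def pre_emphasize(samples: Sequence[float], coeff: float = 0.97) -> List[float]:
    """Boost high frequencies: y[n] = x[n] - coeff * x[n-1]."""
    if not samples:
        return []
    return [samples[0]] + [samples[i] - coeff * samples[i - 1]
                           for i in range(1, len(samples))]


def split_frames(samples: Sequence[float], frame_len: int, hop: int) -> List[List[float]]:
    """Cut the signal into overlapping frames; a trailing partial frame is dropped."""
    return [list(samples[start:start + frame_len])
            for start in range(0, len(samples) - frame_len + 1, hop)]


def hamming(frame_len: int) -> List[float]:
    """Hamming window coefficients for one frame."""
    if frame_len == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
            for n in range(frame_len)]


def log_energy_features(samples: Sequence[float],
                        frame_len: int = 400, hop: int = 160) -> List[float]:
    """Return one log-energy feature per windowed frame."""
    window = hamming(frame_len)
    features = []
    for frame in split_frames(pre_emphasize(samples), frame_len, hop):
        energy = sum((s * w) ** 2 for s, w in zip(frame, window))
        features.append(math.log(energy + 1e-10))  # small offset avoids log(0)
    return features
```

A real acoustic model would normally consume richer features such as filterbank energies or MFCCs; log energy is used here only to keep the framing and windowing logic visible.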

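Preferably, step (3) specifically comprises the following steps:

(3.1) Build a Transformer language model for medical data using a depression hot-word lexicon;

(3.2) Decode the basic phoneme information into Chinese text using the language model and the pronunciation dictionary.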
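To illustrate how a pronunciation dictionary and a language model can work together in step (3.2), here is a deliberately simplified sketch. It is not the Transformer language model named in step (3.1): a plain table of per-word log-probabilities stands in for it, and a dynamic-programming search picks the word sequence with the best total score. The lexicon entries, the probabilities, and all function names are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def decode_phonemes(phonemes: Sequence[str],
                    lexicon: Dict[str, Tuple[str, ...]],
                    log_prob: Dict[str, float],
                    unknown_log_prob: float = -20.0) -> List[str]:
    """Find the best-scoring word sequence whose pronunciations spell `phonemes`."""
    n = len(phonemes)
    # best[i] holds (score, words) for the best segmentation of phonemes[:i].
    best: List[Optional[Tuple[float, List[str]]]] = [None] * (n + 1)
    best[0] = (0.0, [])
    for end in range(1, n + 1):
        for word, pron in lexicon.items():
            start = end - len(pron)
            if start < 0 or best[start] is None:
                continue
            if tuple(phonemes[start:end]) != tuple(pron):
                continue
            score = best[start][0] + log_prob.get(word, unknown_log_prob)
            if best[end] is None or score > best[end][0]:
                best[end] = (score, best[start][1] + [word])
    return best[n][1] if best[n] is not None else []


if __name__ == "__main__":
    # Hypothetical homophones: the hot-word boost decides between them.
    lexicon = {
        "抑郁": ("yi4", "yu4"),
        "异域": ("yi4", "yu4"),
        "症": ("zheng4",),
        "正": ("zheng4",),
    }
    log_prob = {
        "抑郁": math.log(0.05),
        "异域": math.log(0.001),
        "症": math.log(0.02),
        "正": math.log(0.01),
    }
    print("".join(decode_phonemes(["yi4", "yu4", "zheng4"], lexicon, log_prob)))
```

The point of the example is that homophones share one pronunciation, so only the language model score can separate them; raising the probability of domain hot words is what steers the decoder toward the clinical term.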

Preferably, the method further includes a step of tuning the language model, specifically:

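(4) Adjust the occurrence probability of words according to the depression interview text, and update the depression hot-word lexicon.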

The intelligent depression diagnosis and case system is preloaded with commonly used mental illness texts and vocabulary, together with the occurrence probability of each word.
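The source does not describe how the probabilities in step (4) are adjusted. One simple possibility, sketched here under that caveat, is to keep a count per hot word, add the occurrences found in newly recognized interview text, and renormalize with add-one smoothing so that newly added terms never get zero probability. The function names and sample data are hypothetical.

```python
from typing import Dict, Iterable


def update_hot_word_counts(counts: Dict[str, int],
                           transcripts: Iterable[str],
                           new_terms: Iterable[str] = ()) -> Dict[str, int]:
    """Return a new count table with occurrences from `transcripts` added."""
    updated = dict(counts)
    for term in new_terms:
        updated.setdefault(term, 0)  # register new hot words with a zero count
    for text in transcripts:
        for term in updated:
            updated[term] += text.count(term)
    return updated


def to_probabilities(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert counts to probabilities using add-one smoothing."""
    total = sum(counts.values()) + len(counts)
    return {term: (count + 1) / total for term, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical preset lexicon and interview text, for illustration only.
    preset = {"失眠": 3, "情绪低落": 1}
    interviews = ["最近总是失眠,情绪低落,没有食欲", "失眠越来越严重"]
    counts = update_hot_word_counts(preset, interviews, new_terms=["食欲"])
    print(to_probabilities(counts))
```

Substring counting is a shortcut that works for this illustration; a real system would more likely count terms after word segmentation of the transcript.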

With this method for enhanced long-text speech recognition based on mental health interview information, common depression-related sentence patterns and words can be recognized, and word occurrence probabilities are adjusted according to the mental illness texts provided. Symptom words are recognized accurately, with a recognition accuracy above 90%.
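The source does not state how the recognition accuracy was measured. For Chinese ASR a common choice is character accuracy, defined as one minus the character error rate, where the error rate is the edit distance between the reference and the recognized text divided by the reference length. The sketch below assumes that definition; the sample sentences are made up.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance counted in characters."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, 1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, 1):
            current.append(min(
                previous[j] + 1,                            # deletion
                current[j - 1] + 1,                         # insertion
                previous[j - 1] + (ref_char != hyp_char),   # substitution or match
            ))
        previous = current
    return previous[-1]


def char_accuracy(reference: str, hypothesis: str) -> float:
    """Character accuracy = 1 - (edit distance / reference length)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 1.0 - edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Made-up reference and recognizer output with one wrong character.
    print(char_accuracy("患者情绪低落伴失眠", "患者情绪滴落伴失眠"))
```

Under this definition, a figure above 90% would mean fewer than one character error per ten reference characters.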

Original: https://blog.csdn.net/zhangruhuan501/article/details/123087403
Author: zhangruhuan501
Title: 一种基于精神卫生访谈信息实现长文本语音识别增强处理的方法

