Getui Science Comic: Decoding the Intelligent Speech Recognition System in 《女心理师》 (Psychologist)

Many recent hit dramas deal with psychological counseling. In 《女心理师》, one data-intelligence application caught Mr.Tech's eye.


▲ Image source: Youku web drama 《女心理师》

The psychological assistance center where the heroine works has not only an eye-catching data visualization wall but also a highly advanced intelligent speech recognition system. The system transcribes what both parties say on a call into text in real time, and it also raises alerts based on the content of the call, helping counselors make judgments and decisions so they can provide better support.

How does this kind of “black tech” actually work? Mr.Tech walks through it below, with pictures and text.

Step 1: Use the Fourier transform to turn the sound signal into wave data

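As is well known, a computer cannot run calculations or machine learning training directly on a sound file. Algorithm engineers first have to process the file, turning audio in formats such as MP3 or MP4 into the kind of mathematical problem that computers are good at solving.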

As we learned in high school physics, sound is a wave, and frequency and amplitude are the two key properties that describe it. By plotting the amplitude and frequency of a sound over a given period of time, we obtain a sonogram.

[Figure: sonogram of a sound signal]

What comes to mind when you look at the sonogram above? Doesn't it look a lot like the graph of the sine function from math class?

Bingo!

The first step in intelligent speech recognition is to choose the right kind of function to describe each sound wave, and then hand the resulting data to the machine for learning and computation.

Once you understand this, your right foot is already through the door of intelligent speech recognition!

But sound waves are complicated. A real sound wave is a superposition of sine waves with different frequencies and intensities, so when we visualize a sound, what we see is only the combined result. To understand the signal better, algorithm engineers also need to decompose the sonogram into its components.

As shown in the figure below:

[Figure: a composite sound wave decomposed into sine waves]

This process of decomposing (transforming) a sound signal is the “Fourier transform” that comes up so often in machine learning.

Fourier transform (FT)

The Fourier transform is an algorithm used very widely in signal processing and machine learning. It analyzes a digital signal by expressing it in terms of its frequency components, which makes subsequent data processing easier.
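To make the decomposition concrete, here is a minimal Python sketch. It builds a signal by adding two sine waves and then recovers their frequencies and amplitudes with a naive discrete Fourier transform written using only the standard library. The sample rate, the two components (5 Hz and 12 Hz), and the function names are illustrative assumptions; a real pipeline would normally use an optimized FFT routine.

```python
import cmath
import math


def synthesize(components, sample_rate, n_samples):
    """Sum sine waves given as (frequency_hz, amplitude) pairs."""
    return [
        sum(amp * math.sin(2 * math.pi * freq * t / sample_rate)
            for freq, amp in components)
        for t in range(n_samples)
    ]


def dft_magnitudes(signal):
    """Naive discrete Fourier transform; returns one amplitude per frequency bin."""
    n = len(signal)
    magnitudes = []
    for k in range(n // 2):
        coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        # For k > 0 this scaling makes a sine of amplitude A show up as about A.
        magnitudes.append(2 * abs(coeff) / n)
    return magnitudes


SAMPLE_RATE = 64   # samples per second (illustrative assumption)
N_SAMPLES = 64     # one second of signal, so bin k corresponds to k Hz

mixed = synthesize([(5, 1.0), (12, 0.5)], SAMPLE_RATE, N_SAMPLES)
spectrum = dft_magnitudes(mixed)

# Keep only the bins that carry noticeable energy.
peaks = [(k, round(m, 2)) for k, m in enumerate(spectrum) if m > 0.1]
print(peaks)  # prints [(5, 1.0), (12, 0.5)]
```

The combined waveform looks irregular in the time domain, yet the transform separates it cleanly into the two sine waves it was built from.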


Step 2: Recognize basic features of the sound

Once the sound file has been parsed into a sonogram, the machine can make simple identifications and judgments about the sound, such as the speaker's sex and age.

Male and female voices differ noticeably. Generally speaking, male voices tend to be fuller, with higher amplitude and lower frequency, while female voices tend to be sharper and finer, with higher frequency and lower amplitude. Some unusual sounds, such as a cry for help or the groans of someone who is ill, can also be described in terms of frequency and amplitude.
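As a rough illustration of describing a voice by frequency and amplitude, the sketch below measures both for two synthetic tones. The zero-crossing frequency estimate, the 165 Hz threshold, and all function names are assumptions made for this example. Real systems work on recorded speech with far more robust pitch and loudness features, and a single threshold is not a reliable way to tell speakers apart.

```python
import math


def sine(freq_hz, amplitude, sample_rate=8000, seconds=1.0):
    """Generate a pure tone as a stand-in for a recorded voice."""
    n = int(sample_rate * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n)]


def rms_amplitude(signal):
    """Root-mean-square amplitude, a simple loudness measure."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def estimate_frequency(signal, sample_rate):
    """Estimate the dominant frequency by counting upward zero crossings."""
    crossings = sum(
        1 for prev, cur in zip(signal, signal[1:]) if prev < 0 <= cur
    )
    return crossings / (len(signal) / sample_rate)


def describe_voice(signal, sample_rate, pitch_threshold_hz=165.0):
    """Summarize a signal; the threshold is an illustrative assumption."""
    freq = estimate_frequency(signal, sample_rate)
    return {
        "frequency_hz": freq,
        "rms": rms_amplitude(signal),
        "pitch_band": "lower" if freq < pitch_threshold_hz else "higher",
    }


low = describe_voice(sine(120, 0.8), 8000)
high = describe_voice(sine(220, 0.4), 8000)
print(low["pitch_band"], high["pitch_band"])  # prints: lower higher
```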


Step 3: Convert speech into text for machine learning

The intelligent speech recognition system in 《女心理师》 can also transcribe speech into text in real time and raise alerts based on what is said.

So how does a computer understand human language?

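“Artificial intelligence” cannot do without the “artificial” part, that is, human work. At its core, intelligent speech recognition matches sound waveform features to specific text, one by one. This requires building a library of speech samples in advance, having people annotate those samples, and then extracting the correspondence between waveform features and text for the machine to learn. After a large amount of training, the computer acquires the ability to convert speech into text.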
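The idea of matching waveform features against human-labeled samples can be sketched with a toy nearest-neighbor lookup. Everything here is hypothetical: the three-number feature vectors, the labels, and the function names are made up for illustration, and production speech recognizers rely on trained acoustic and language models, not a lookup like this.

```python
import math

# Hypothetical labeled samples: (feature vector, transcript).
# Real systems extract much richer features from many short audio frames.
LABELED_SAMPLES = [
    ((0.9, 0.1, 0.2), "hello"),
    ((0.2, 0.8, 0.1), "goodbye"),
    ((0.1, 0.2, 0.9), "help"),
]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(features, samples=LABELED_SAMPLES):
    """Return the transcript of the closest labeled sample."""
    _, text = min(samples, key=lambda sample: distance(features, sample[0]))
    return text


print(recognize((0.15, 0.25, 0.8)))  # prints: help
```

The more annotated samples the library holds, the more likely an unseen input lands close to a correctly labeled one, which is why the manual labeling stage matters so much.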


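At this stage, however, the transcribed text is still gibberish to the computer. It cannot understand the meaning the words carry, let alone pick up on the emotion the speaker is trying to express.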


So the computer also has to be taught to “understand” human speech and be given a certain level of domain expertise, so that it can perform sentiment analysis and reason about the text on its own. That is what enables intelligent alerts and better support for human judgment.

How is that achieved? Let's look at Step 4.

Step 4: Run sentiment analysis on the text

We know that sentences are made up of words. These include stop words (function words such as “of” and “and”), positive words (cheap, clean, beautiful, good value), negative words (dead, dirty, bad), degree words (okay, very, passable, so-so, especially), interrogative words (whether or not, unexpectedly, after all, simply, on the contrary, why), negation words (such as 不 and 莫, “not” and “do not”), and so on. To understand the sentiment and attitude of a sentence, we need to analyze the category of each word in it. So the text transcribed in the previous stage is first segmented into words, and the sentiment tendencies of the individual words are then combined to obtain the overall sentiment of the sentence.
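The segmentation and categorization described above can be sketched as follows. The article does not say which segmentation method is used, so this example substitutes forward maximum matching, a simple dictionary-based approach. The tiny vocabulary and its category labels are assumptions for illustration only.

```python
# Toy vocabulary with word categories; both are illustrative assumptions.
WORD_CATEGORIES = {
    "难道": "negation",   # treated as a negation word in the article's examples
    "不": "negation",
    "好": "positive",
    "差": "negative",
    "非常": "degree",
    "这样": "other",
    "吗": "stop",
}


def segment(text, vocabulary, max_len=4):
    """Forward maximum matching: at each position take the longest known word."""
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Fall back to a single character when nothing longer matches.
            if size == 1 or piece in vocabulary:
                tokens.append(piece)
                i += size
                break
    return tokens


def tag(tokens, categories=WORD_CATEGORIES):
    """Attach a category to every token, defaulting to 'other'."""
    return [(tok, categories.get(tok, "other")) for tok in tokens]


tokens = segment("难道这样不好吗", WORD_CATEGORIES)
print(tag(tokens))
```

Once every token carries a category, the per-word tendencies can be combined into a sentence-level score.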


For a computer to perform sentiment analysis on text, the task again has to be turned into a mathematical problem.

Computers typically use a mathematical expression of the following kind to compute the sentiment of a sentence:

[Figure: sentiment scoring formula]

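Take the sentence “难道非得让我说差么?” (roughly, “Do you really have to make me say it's bad?”). The interrogative word “难道” usually combines with a negation word to form a double negative, and people sometimes treat “难道” as a negation word in its own right. Here both “难道” and “非” count as negation words, so the overall score of the sentence is (-1)^2 × 1 × (-1) = -1, which means the sentence expresses a negative attitude.

As another example, in “难道这样不好吗?” (“Isn't this good?”), both “难道” and “不” are negation words, so the score is (-1)^2 × 1 × 1 = 1, and the sentiment of the sentence is positive.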
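The two worked examples can be reproduced with a short scoring function. The original formula image is not available here, so the form used below, score = (-1)^(number of negation words) × degree weight × polarity, is an assumption inferred from those two examples. The word lists and the weight for “非常” are likewise hypothetical.

```python
# Illustrative lexicons; the entries and weights are assumptions.
NEGATION_WORDS = {"难道", "非", "不"}
POLARITY = {"差": -1, "好": 1}
DEGREE_WEIGHTS = {"非常": 2}  # hypothetical weight for "very"


def sentence_score(tokens):
    """Score = (-1)^negations * degree * polarity (assumed form)."""
    negations = 0
    degree = 1
    polarity = 0
    for tok in tokens:
        if tok in NEGATION_WORDS:
            negations += 1
        elif tok in DEGREE_WEIGHTS:
            degree *= DEGREE_WEIGHTS[tok]
        elif tok in POLARITY:
            polarity += POLARITY[tok]
    return (-1) ** negations * degree * polarity


# Pre-segmented versions of the two sentences from the text.
first = ["难道", "非", "得", "让", "我", "说", "差", "么"]
second = ["难道", "这样", "不", "好", "吗"]

print(sentence_score(first))   # prints -1
print(sentence_score(second))  # prints 1
```

An even number of negation words leaves the polarity of the sentiment word unchanged, while an odd number flips it, which is why both examples end up with the sign of “差” and “好” respectively.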

In general, whether a word counts as positive or negative depends on the field; different domains have different standards. For example, “loud” is a very positive word in the audio equipment industry, but in the home appliance industry, saying that a washing machine is “loud” expresses a negative attitude. Algorithm engineers therefore need to build a specialized lexicon for their own industry or domain.

With this in mind, it is easy to see where the “cleverness” of the intelligent speech recognition system in 《女心理师》 comes from. In psychological counseling and crisis intervention, “跳楼” (jumping off a building) and “自杀” (suicide) are negative words, and the sentence “不要来找我” (“don't come looking for me”) contains the negation word “不”. So when the system judges the content of the call to be strongly negative, it automatically pops up the corresponding alert.
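A domain-specific lexicon combined with a simple alert rule might look like the sketch below. The domain names, the numeric weights, and the alert threshold are all assumptions chosen for illustration; the drama does not reveal how its system actually scores a call.

```python
# Hypothetical per-domain lexicons: the same word can carry opposite polarity.
DOMAIN_LEXICONS = {
    "audio": {"loud": 1},
    "home_appliance": {"loud": -1},
    "counseling": {"suicide": -2, "jump off a building": -2},
}


def domain_score(phrases, domain):
    """Sum the polarity of every phrase found in the domain's lexicon."""
    lexicon = DOMAIN_LEXICONS[domain]
    return sum(lexicon.get(phrase, 0) for phrase in phrases)


def should_alert(phrases, domain="counseling", threshold=-2):
    """Raise an alert when the score is at or below the (assumed) threshold."""
    return domain_score(phrases, domain) <= threshold


print(domain_score(["loud"], "audio"))           # prints 1
print(domain_score(["loud"], "home_appliance"))  # prints -1
print(should_alert(["suicide"]))                 # prints True
```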


▲ Image source: Youku web drama 《女心理师》

If you have read this far, congratulations: your left foot is now through the door of intelligent speech recognition too!

Step 5: Build an industry knowledge graph

As you can see, the alert pop-up in the image above also carries a hint, “professional support needed”, which shows a high degree of intelligence. In real life, the intelligent customer service systems built by many companies in e-commerce, online healthcare, and other industries have also reached a very high level of intelligence. They not only understand text but can also reason and make associations on their own, offer relevant professional suggestions, and assist decision-making.

This level of “brainpower” is achieved with knowledge graph technology.


A knowledge graph is essentially a semantic network that captures the relationships between entities. It is widely used in search engines, text mining, and other fields. For example, when a user searches for “fruit”, the search engine also suggests related queries such as “types of fruit”, “nutritional value of fruit”, and “nearest fruit store”. Behind these suggestions is a knowledge graph for the “fruit” domain.

The “intelligence” of the customer service systems we use in real life comes from continuously learning the knowledge graph of a specific industry. For example, the intelligent customer service of an e-commerce platform relies on knowledge graphs of products, orders, logistics, and so on. When a user asks about a product, the system queries the relevant graph to provide product details, order status, shipping status, historical price trends, product uses, and other related information and suggestions.
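A knowledge graph can be sketched as a set of (subject, relation, object) triples that a program walks to collect related facts. The class below is a minimal illustration using only the standard library. The entity names, relation names, and the two-hop limit are hypothetical, and real deployments typically sit on a dedicated graph database.

```python
from collections import defaultdict


class KnowledgeGraph:
    """A tiny in-memory graph of (subject, relation, object) triples."""

    def __init__(self):
        self._edges = defaultdict(list)

    def add(self, subject, relation, obj):
        self._edges[subject].append((relation, obj))

    def related(self, subject, relation=None):
        """Objects linked to a subject, optionally filtered by relation."""
        return [obj for rel, obj in self._edges.get(subject, [])
                if relation is None or rel == relation]

    def facts_within(self, start, max_hops=2):
        """Breadth-first walk that collects triples up to max_hops away."""
        seen = {start}
        frontier = [start]
        facts = []
        for _ in range(max_hops):
            next_frontier = []
            for node in frontier:
                for rel, obj in self._edges.get(node, []):
                    facts.append((node, rel, obj))
                    if obj not in seen:
                        seen.add(obj)
                        next_frontier.append(obj)
            frontier = next_frontier
        return facts


# Hypothetical e-commerce facts.
graph = KnowledgeGraph()
graph.add("order-1001", "contains", "washing machine")
graph.add("order-1001", "shipped_via", "parcel-A1")
graph.add("parcel-A1", "status", "in transit")
graph.add("washing machine", "used_for", "laundry")

for fact in graph.facts_within("order-1001"):
    print(fact)
```

Starting from a single order, the walk reaches the product, the parcel, and the parcel's shipping status, which is the kind of association a customer service bot needs when answering “where is my order?”.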

After reading this article, you should have a solid understanding of the principles behind intelligent speech recognition, knowledge graphs, and related technologies. Technology is not only a productive force; it can also bring warmth. Applied in the right way, big data and artificial intelligence can create enormous positive value across industries.

Getui's graph mining practice

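As a data intelligence company, Getui also has extensive hands-on experience with knowledge graphs and graph mining. For example, in its big-data work against the epidemic, Getui built and mined a trillion-scale graph to support applications such as assessing the epidemic situation and analyzing transmission paths. For details, see >> 2021WAIC | 每日互动CTO叶新江:万亿级图下的数据智能 (“Data Intelligence on Trillion-Scale Graphs”, a talk by 每日互动 CTO 叶新江)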

Original: https://blog.csdn.net/Androilly/article/details/122199531
Author: 个推技术
Title: Getui Science Comic: Decoding the Intelligent Speech Recognition System in 《女心理师》 (Psychologist)
