Speech Recognition with Deep Learning: Audio Basics and the Spectrogram

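Audio Basics

The Three Elements of Sound

1. Pitch

Pitch is the ear's perception of how high or low a sound is. It depends mainly on the frequency of the sound wave: the higher the frequency, the higher the pitch. A small drum and a large drum sound different when struck. The small drum vibrates faster and sounds crisp, that is, higher in pitch; the large drum vibrates more slowly and sounds deep, that is, lower in pitch. In general, voice pitch ranks children > women > men. The human ear can hear frequencies from about 20 Hz to 20,000 Hz.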
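To make the link between frequency and pitch concrete, here is a minimal sketch that synthesizes two pure tones and estimates their frequencies by counting upward zero crossings. The sample rate, the 220 Hz and 880 Hz test tones, and the helper names `sine_tone` and `estimate_frequency` are assumptions chosen for illustration; zero-crossing counting only works well for clean, single-frequency signals.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this illustration)

def sine_tone(freq_hz, duration_s=1.0, amplitude=1.0):
    """Return the samples of a pure sine tone at freq_hz."""
    n = int(SAMPLE_RATE * duration_s)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]

def estimate_frequency(samples):
    """Estimate frequency in Hz by counting negative-to-non-negative crossings."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    duration_s = len(samples) / SAMPLE_RATE
    return crossings / duration_s

low_tone = sine_tone(220)   # lower frequency, heard as a lower pitch
high_tone = sine_tone(880)  # higher frequency, heard as a higher pitch

print(estimate_frequency(low_tone), estimate_frequency(high_tone))
```

The estimates land within a cycle or so of the true frequencies, and the tone with the higher frequency is the one a listener would describe as higher in pitch.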

2. Loudness

Volume is also called loudness: the ear's subjective perception of how strong a sound is. Loudness is related to the amplitude of the sound wave. In general, the larger the amplitude, the louder the sound. When we strike a drum hard, the drumhead vibrates with a large amplitude and the sound is loud; when we strike it gently, the drumhead vibrates with a small amplitude and the sound is weak.

Perceived loudness also depends on frequency: sound waves of the same intensity but different frequencies are perceived by the human ear as having different loudness.
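The relationship between amplitude and level can be sketched numerically. The example below computes the root-mean-square (RMS) amplitude of two sine tones and expresses it in decibels relative to an arbitrary reference of 1.0. The helper names and the 440 Hz tone are assumptions for illustration. Note that this measures physical signal level only; it does not model the frequency-dependent perception described above.

```python
import math

def sine(amplitude, freq_hz=440.0, sample_rate=8000, duration_s=1.0):
    """Return the samples of a sine tone with the given peak amplitude."""
    n = int(sample_rate * duration_s)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]

def rms(samples):
    """Root-mean-square amplitude of the signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def level_db(samples, reference=1.0):
    """Signal level in decibels relative to `reference` (an arbitrary choice here)."""
    return 20 * math.log10(rms(samples) / reference)

quiet = sine(0.5)  # gentle strike: small amplitude
loud = sine(1.0)   # hard strike: twice the amplitude

print(round(level_db(loud) - level_db(quiet), 2))
```

Doubling the amplitude raises the level by about 6 dB, since 20·log10(2) ≈ 6.02.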

3. Timbre

Timbre is also known as tone quality. It is what distinguishes two sounds that have the same loudness and the same pitch; put another way, it is the ear's combined response to sound waves of various frequencies and intensities. Timbre is related to the shape of the sound wave's vibration waveform, or equivalently to the spectral structure of the sound.

A tuning fork produces a single-frequency sound wave whose waveform is a sine wave. Most sounds we hear in nature, however, have very complex waveforms made up of a fundamental and a series of harmonics. The number and strength of the harmonics determine the timbre. When different sound sources produce the same pitch, their fundamental components are the same, but the number of harmonics and the amplitude of each harmonic differ, so the timbre differs.
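The idea that timbre comes from the harmonics can be sketched by building two tones with the same 200 Hz fundamental but different harmonic content, then measuring the amplitude at each harmonic with a single-bin discrete Fourier transform. The sample rate, tone length, harmonic amplitudes, and the helper names `harmonic_tone` and `amplitude_at` are all assumptions for illustration.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed)
N = 800             # 0.1 s of audio, so each harmonic completes whole cycles

def harmonic_tone(fundamental_hz, harmonic_amplitudes):
    """Sum sinusoids at 1x, 2x, 3x, ... the fundamental with the given amplitudes."""
    return [
        sum(a * math.sin(2 * math.pi * fundamental_hz * (k + 1) * i / SAMPLE_RATE)
            for k, a in enumerate(harmonic_amplitudes))
        for i in range(N)
    ]

def amplitude_at(samples, freq_hz):
    """Amplitude of the component at freq_hz, via a single-bin DFT."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / SAMPLE_RATE)
             for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
             for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n

pure = harmonic_tone(200, [1.0])             # tuning-fork-like: fundamental only
rich = harmonic_tone(200, [1.0, 0.5, 0.25])  # same pitch, added harmonics

for name, tone in (("pure", pure), ("rich", rich)):
    print(name, [round(amplitude_at(tone, f), 3) for f in (200, 400, 600)])
```

Both tones share the same fundamental component, so they have the same pitch, but only the second has energy at 400 Hz and 600 Hz. That difference in harmonic content is what the text calls timbre.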

Spectrogram

What Is a Spectrogram?

Sound is a vibration. It forms a wave that travels through air, water, or solids.

This vibration can vary in two ways:

  • In frequency, that is, how fast the vibration oscillates. We perceive this as pitch.
  • In amplitude, that is, how much energy the vibration carries. We perceive this as volume.

A spectrogram presents sound data as a two-dimensional image, as in the figure below:

[Figure: example spectrogram]

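This is a spectrogram. It consists of the following parts:

  • Horizontal axis (time): the position along the horizontal axis is the time within the recording.
  • Vertical axis (frequency): the position along the vertical axis is the frequency of the sound. Higher positions mean higher frequencies; positions near 0 mean lower frequencies.
  • Color (amplitude): the color encodes amplitude. Brighter means larger amplitude; darker means smaller amplitude.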
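The three parts listed above map directly onto a short-time Fourier transform: slice the signal into frames (time axis), take the magnitude of each frame's Fourier transform (frequency axis), and use the magnitudes as color values. The following is a minimal, unoptimized sketch using only the standard library. The sample rate, frame size, hop size, Hann window, and the test signal (500 Hz followed by 1500 Hz) are assumptions for illustration; in practice a library routine built on the FFT would be used instead.

```python
import cmath
import math

SAMPLE_RATE = 4000  # samples per second (assumed)
FRAME = 64          # samples per frame; frequency bins are 62.5 Hz apart
HOP = 64            # step between frames (no overlap, to keep the sketch small)

def spectrogram(samples):
    """Return spec[t][k]: magnitude of frequency bin k in time frame t."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (FRAME - 1))
              for i in range(FRAME)]  # Hann window to reduce spectral leakage
    spec = []
    for start in range(0, len(samples) - FRAME + 1, HOP):
        chunk = [samples[start + i] * window[i] for i in range(FRAME)]
        mags = []
        for k in range(FRAME // 2 + 1):  # bins from 0 Hz up to SAMPLE_RATE / 2
            acc = sum(chunk[i] * cmath.exp(-2j * math.pi * k * i / FRAME)
                      for i in range(FRAME))
            mags.append(abs(acc))
        spec.append(mags)
    return spec

def bin_to_hz(k):
    """Center frequency of bin k (vertical-axis value)."""
    return k * SAMPLE_RATE / FRAME

def peak_bin(mags):
    """Index of the brightest (largest-magnitude) bin in one frame."""
    return max(range(len(mags)), key=mags.__getitem__)

def two_tone_signal():
    """1024 samples of 500 Hz followed by 1024 samples of 1500 Hz."""
    first = [math.sin(2 * math.pi * 500 * i / SAMPLE_RATE) for i in range(1024)]
    second = [math.sin(2 * math.pi * 1500 * i / SAMPLE_RATE) for i in range(1024, 2048)]
    return first + second

spec = spectrogram(two_tone_signal())
for t in (0, len(spec) - 1):
    print("frame", t, "brightest at", bin_to_hz(peak_bin(spec[t])), "Hz")
```

Plotted as an image with time on the horizontal axis, this array would show a bright horizontal band at 500 Hz in the first half and a bright band higher up, at 1500 Hz, in the second half.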

Spectrogram Examples

Google provides a web page that makes it easy to generate spectrograms. If you are interested, try it out: https://musiclab.chromeexperiments.com/Spectrogram/

1 Birdsong

[Figure: spectrogram of birdsong]

The birdsong sits at high frequencies, but because of how it was recorded, its amplitude (loudness) is low.

2 Harp

[Figure: spectrogram of a harp]

The harp's frequencies are much lower than the bird's. The bottom of the image is relatively red, which indicates that the sound is loudest at those low frequencies.

3 Human Voice

[Figure: spectrogram of a spoken sentence]

This is a sentence I spoke at random. The pitch of the human voice is relatively low, and you can see that there is a slight pause between words.

4 Whistle

[Figure: spectrogram of a whistle]

This is a short whistle I made with my mouth. The pitch of a whistle is relatively high.

Those are just a few examples. If you are interested, open the page and play with it yourself; it is quite fun.

References

Audio basics: https://www.jianshu.com/p/f56114df9c0b

What is a Spectrogram?:https://www.youtube.com/watch?v=sIckmJkH2Oc

Google Spectrogram:https://musiclab.chromeexperiments.com/Spectrogram/

Original: https://blog.csdn.net/zhaohongfei_358/article/details/118282214
Author: iioSnail
Title: 深度学习之语音识别-音频基础知识、声谱图(Spectrogram) (Speech Recognition with Deep Learning: Audio Basics and the Spectrogram)
