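MATLAB Speech Signal Analysis and Synthesis (2nd Edition): Chapter 2, Time-Domain and Frequency-Domain Characteristics of Speech Signals and Short-Time Analysis Techniques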

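MATLAB Speech Signal Analysis and Synthesis (2nd Edition) (《MATLAB语音信号分析与合成（第二版）》) is a painstaking work by Song Zhiyong (宋知用) of the Institute of Acoustics, Chinese Academy of Sciences, distilled from decades of experience. For readers who are interested in speech signal processing and hope to do some research in speech analysis, processing, and synthesis, it is a good place to start.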

My graduate advisor works mainly on speech signal processing. My own thesis during graduate school was on digital image processing, but speech and images are closely related fields. I coasted through my advisor's graduate lectures on the wavelet transform, yet in the later course project on speech signal processing and its engineering applications I did pick up a little about speech. I took first place in my group in the final test, and my advisor handed out a meal allowance of 300 US dollars as encouragement.

Now that I am picking up speech recognition again, I am starting with Mr. Song's book as a review of my own. The focus is on the source code of each chapter, and this post covers the four simulation examples of Chapter 2. Without further ado, let's start!

1. Setting up the data and function paths

Add the data and function folders to the MATLAB path, save the path, and start the simulation.
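
A minimal sketch of this step, assuming the book's function library and the sample recordings were unpacked into two folders under the current directory. The folder names `basic_tbx` and `data` are hypothetical placeholders, so substitute whatever your own copy uses.

```matlab
% Hypothetical folder layout; adjust both names to your own installation.
toolboxDir = fullfile(pwd, 'basic_tbx');   % assumed location of enframe.m, frame2time.m, ...
dataDir    = fullfile(pwd, 'data');        % assumed location of bluesky3.wav

% Add only the folders that actually exist, so a wrong guess is reported
% instead of silently producing a warning later.
dirs = {toolboxDir, dataDir};
for k = 1:numel(dirs)
    if exist(dirs{k}, 'dir')
        addpath(dirs{k});
    else
        fprintf('Folder not found, skipped: %s\n', dirs{k});
    end
end

% savepath;   % uncomment to keep the path across sessions

% Check which helper functions used in this chapter are now visible.
needed = {'enframe', 'frame2time', 'SpecColorMap', 'pwelch_2'};
found  = cellfun(@(f) exist(f, 'file') == 2, needed);
```

If any entry of `found` is false, the corresponding listing below will stop with an undefined-function error, which usually means the function folder is not on the path yet.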

2. MATLAB simulation 1: short-time energy of a speech signal

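%
% pr2_3_1
clear all; clc; close all;

filedir=[];                % set the path
filename='bluesky3.wav';   % set the file name
fle=[filedir filename];    % build the full path and file name
[x,Fs]=audioread(fle);     % read the audio data (wavread in older MATLAB releases)

wlen=200; inc=80;          % frame length and frame shift
win=hanning(wlen);         % Hanning window
N=length(x);               % signal length
X=enframe(x,win,inc)';     % split into frames, one frame per column
fn=size(X,2);              % number of frames
time=(0:N-1)/Fs;           % time axis of the signal
En=zeros(1,fn);            % preallocate the short-time energy
for i=1 : fn
    u=X(:,i);              % take one frame
    u2=u.*u;               % square the samples
    En(i)=sum(u2);         % sum over the frame
end
subplot 211; plot(time,x,'k'); % plot the waveform
title('Speech waveform');
ylabel('Amplitude'); xlabel(['Time/s' 10 '(a)']);
frameTime=frame2time(fn,wlen,inc,Fs);   % time corresponding to each frame
subplot 212; plot(frameTime,En,'k')     % plot the short-time energy
title('Short-time energy');
ylabel('Amplitude'); xlabel(['Time/s' 10 '(b)']);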
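
The loop above amounts to summing squared samples over each windowed frame. As a rough illustration that runs without `bluesky3.wav` or the book's toolbox, the sketch below frames a synthetic signal by hand and computes the same quantity in one vectorized step. The index-based framing and the window formula are my own stand-ins for `enframe` and `hanning`; I have not checked that they match those functions sample for sample.

```matlab
Fs = 8000;                               % assumed sampling rate for the synthetic signal
t  = (0:Fs-1)'/Fs;                       % one second of sample times
x  = [zeros(Fs/2,1); sin(2*pi*200*t(1:Fs/2))];   % half a second of silence, then a 200 Hz tone

wlen = 200; inc = 80;                    % same frame length and frame shift as above
n    = (1:wlen)';
win  = 0.5*(1 - cos(2*pi*n/(wlen+1)));   % Hann-type window without zero end points

fn  = floor((length(x)-wlen)/inc) + 1;   % number of complete frames
idx = bsxfun(@plus, (1:wlen)', (0:fn-1)*inc);    % wlen-by-fn matrix of sample indices
X   = x(idx) .* repmat(win, 1, fn);      % one windowed frame per column

En = sum(X.^2, 1);                       % short-time energy of every frame
frameTime = ((0:fn-1)*inc + wlen/2)/Fs;  % frame-centre times in seconds
```

Frames that lie entirely in the silent half have zero energy, while frames inside the tone have a roughly constant positive energy, which is the behaviour the energy plot relies on to separate speech from silence.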



3. MATLAB simulation 2: short-time average zero-crossing rate of a speech signal
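%
% pr2_3_2
clear all; clc; close all;

filedir=[];                       % set the path
filename='bluesky3.wav';          % set the file name
fle=[filedir filename];           % build the full path and file name
[xx,Fs]=audioread(fle);           % read the audio data (wavread in older MATLAB releases)
x=xx-mean(xx);                    % remove the DC component
wlen=200; inc=80;                 % frame length and frame shift
win=hanning(wlen);                % window function
N=length(x);                      % data length
X=enframe(x,win,inc)';            % split into frames, one frame per column
fn=size(X,2);                     % number of frames
zcr1=zeros(1,fn);                 % initialize
for i=1:fn
    z=X(:,i);                     % take one frame
    for j=1:(wlen-1)              % look for zero crossings inside the frame
        if z(j)*z(j+1)<0          % adjacent samples with opposite signs
            zcr1(i)=zcr1(i)+1;    % zero crossing found, count it
        end
    end
end
time=(0:N-1)/Fs;                  % time axis of the signal
frameTime=frame2time(fn,wlen,inc,Fs);  % time corresponding to each frame
% plot
subplot 211; plot(time,x,'k'); grid;
title('Speech waveform');
ylabel('Amplitude'); xlabel(['Time/s' 10 '(a)']);
subplot 212; plot(frameTime,zcr1,'k'); grid;
title('Short-time average zero-crossing rate');
ylabel('Amplitude'); xlabel(['Time/s' 10 '(b)']);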
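
The inner loop counts sign changes between neighbouring samples. A quick sanity check, sketched here on a synthetic sine rather than on speech, is that a pure tone of frequency f crosses zero about 2f times per second, so a frame of `wlen` samples should contain roughly 2*f*wlen/Fs crossings. The tone frequency, sampling rate, and phase offset are assumed values chosen for illustration.

```matlab
Fs = 8000; f0 = 500;                     % assumed sampling rate and tone frequency
wlen = 200;                              % frame length used above
n = (0:wlen-1)';
z = sin(2*pi*f0*n/Fs + 0.3);             % one frame; the phase offset avoids samples exactly at zero

% Same test as the loop: a crossing occurs where adjacent samples have opposite signs.
zc = sum(z(1:end-1) .* z(2:end) < 0);

expected = 2*f0*wlen/Fs;                 % roughly 25 crossings for these values
```

The count can be off by one from the estimate because a finite frame rarely holds a whole number of half periods. On real speech, high counts tend to indicate unvoiced or noisy segments and low counts voiced ones.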



4. MATLAB simulation 3: spectrogram of a speech signal
%
% pr2_4_1
clear all; clc; close all;

filedir=[];                         % set the path
filename='bluesky3.wav';            % set the file name
fle=[filedir filename];             % build the full path and file name
[x,Fs]=audioread(fle);              % read the audio data (wavread in older MATLAB releases)
wlen=200; inc=80; win=hanning(wlen);% frame length, frame shift and window function
N=length(x); time=(0:N-1)/Fs;       % time axis of the signal
y=enframe(x,win,inc)';              % split into frames, one frame per column
fn=size(y,2);                       % number of frames
frameTime=(((1:fn)-1)*inc+wlen/2)/Fs; % time corresponding to each frame
W2=wlen/2+1; n2=1:W2;
freq=(n2-1)*Fs/wlen;                % frequency axis after the FFT
Y=fft(y);                           % short-time Fourier transform (FFT of each column)
clf                                 % reset the figure
%=====================================================%
% Plot the STFT result (spectrogram)
%=====================================================%
set(gcf,'Position',[20 100 600 500]);
axes('Position',[0.1 0.1 0.85 0.5]);
imagesc(frameTime,freq,abs(Y(n2,:))); % draw the magnitude of Y as an image
axis xy; ylabel('Frequency/Hz');xlabel('Time/s');
title('Spectrogram');
m = 64;
LightYellow = [0.6 0.6 0.6];
MidRed = [0 0 0];
Black = [0.5 0.7 1];
Colors = [LightYellow; MidRed; Black];
colormap(SpecColorMap(m,Colors));

%=====================================================%
% Plot the Speech Waveform
%=====================================================%
axes('Position',[0.07 0.72 0.9 0.22]);
plot(time,x,'k');
xlim([0 max(time)]);
xlabel('Time/s'); ylabel('Amplitude');
title('Speech signal waveform');
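
To see how the frequency axis `freq=(n2-1)*Fs/wlen` maps FFT bins to hertz, the sketch below takes the FFT of a single windowed frame of a synthetic 1000 Hz tone and reads the peak location off that axis. The tone, the sampling rate, and the hand-written window are assumed values for illustration, not taken from `bluesky3.wav`.

```matlab
Fs = 8000; f0 = 1000;                    % assumed sampling rate and tone frequency
wlen = 200;                              % frame length used above
n = (0:wlen-1)';
win = 0.5*(1 - cos(2*pi*(1:wlen)'/(wlen+1)));   % Hann-type window, stand-in for hanning
frame = win .* sin(2*pi*f0*n/Fs);        % one windowed frame

W2 = wlen/2 + 1; n2 = 1:W2;              % keep the non-negative frequencies only
freq = (n2-1)*Fs/wlen;                   % bin spacing is Fs/wlen = 40 Hz
Y = fft(frame);
[~, k] = max(abs(Y(n2)));                % index of the strongest bin
peakHz = freq(k);                        % frequency of that bin in Hz
```

With a 200-sample frame at 8000 Hz the resolution is 40 Hz per bin, so a tone at a multiple of 40 Hz peaks exactly on a bin. A longer frame would give finer frequency resolution at the cost of coarser time resolution.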



5. MATLAB simulation 4: short-time power spectral density function of a speech signal
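%
% pr2_4_2
clear all; clc; close all;

filedir=[];                                    % set the path
filename='bluesky3.wav';                       % set the file name
fle=[filedir filename];                        % build the full path and file name
[wavin0,fs]=audioread(fle);                    % read the audio data (wavread in older MATLAB releases)
nwind=240; noverlap=160; inc=nwind-noverlap;   % frame length 240, overlap 160, frame shift 80
w_nwind=hanning(200); w_noverlap=195;          % segment length 200, segment overlap 195
nfft=200;                                      % FFT length 200
% compute the short-time power spectral density of each frame with pwelch_2
[Pxx] = pwelch_2(wavin0, nwind, noverlap, w_nwind, w_noverlap, nfft);
frameNum=size(Pxx,2);                          % number of frames (one column per frame)
frameTime=frame2time(frameNum,nwind,inc,fs);   % time corresponding to each frame
freq=(0:nfft/2)*fs/nfft;                       % frequency axis
% plot
imagesc(frameTime,freq,Pxx); axis xy
ylabel('Frequency/Hz');
xlabel('Time/s');
title('Short-time power spectral density function')
m = 256; LightYellow = [0.6 0.6 0.6];
MidRed = [0 0 0]; Black = [0.5 0.7 1];
Colors = [LightYellow; MidRed; Black];
colormap(SpecColorMap(m,Colors));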
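
`pwelch_2` belongs to the book's toolbox and its internals are not shown here. As a rough sketch of the idea its parameters suggest, the code below averages periodograms of overlapping 200-sample segments inside a single 240-sample frame. With a segment overlap of 195 the hop is 5 samples, which gives 9 segments per frame. The normalization is a common one-sided choice that I am assuming; it may differ from what `pwelch_2` returns.

```matlab
fs = 8000;                               % assumed sampling rate
nwind = 240;                             % frame length
segLen = 200; segOverlap = 195;          % segment length and overlap inside a frame
nfft = 200;                              % FFT length
hop = segLen - segOverlap;               % 5 samples between segment starts
nSeg = floor((nwind - segLen)/hop) + 1;  % number of segments in one frame

frame = sin(2*pi*1000*(0:nwind-1)'/fs);  % synthetic frame: 1000 Hz tone
w = 0.5*(1 - cos(2*pi*(1:segLen)'/(segLen+1)));   % Hann-type window, stand-in for hanning
U = sum(w.^2);                           % window power, used for normalization

P = zeros(nfft/2+1, 1);
for s = 1:nSeg
    seg = frame((s-1)*hop + (1:segLen)') .* w;    % windowed segment
    S = fft(seg, nfft);
    P = P + abs(S(1:nfft/2+1)).^2 / (fs*U);       % periodogram of this segment
end
P = P / nSeg;                            % average over the segments
P(2:end-1) = 2*P(2:end-1);               % one-sided: double all bins except DC and Nyquist

freq = (0:nfft/2)'*fs/nfft;              % frequency axis
[~, k] = max(P);                         % strongest bin; freq(k) is its frequency
```

Averaging several heavily overlapping segments smooths the estimate within each frame, which is why the resulting image tends to look less grainy than a plain spectrogram of the same signal.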



Short-time stationarity is a very important property of speech signals. Short-time energy, the short-time average zero-crossing rate, the spectrogram, and the short-time power spectral density function are the foundations of speech signal analysis and are worth understanding thoroughly. If this chapter interests you or you want to study it in full, read Chapter 2 of the book. Later posts will discuss and supplement some of these points based on my own understanding. You are welcome to learn and exchange ideas together.

Original: https://blog.csdn.net/sinat_34897952/article/details/124029488
Author: mozun2020
Title: 《MATLAB语音信号分析与合成（第二版）》：第2章 语音信号的时域、频域特性和短时分析技术

