[Speech Recognition] MFCC+VQ Speaker Recognition System with a MATLAB GUI [MATLAB source code included, Issue 1153]

⛄ I. Introduction to MFCC+VQ

1 Introduction
In a highly interconnected society, establishing a person's identity is essential. Traditional authentication methods such as keys, certificates, and passwords no longer meet society's needs. Biometric authentication offers a more convenient and reliable alternative and has attracted considerable attention from academia and industry in China and abroad. Speaker recognition is the technology of automatically identifying a speaker from the speech parameters in the waveform that reflect the speaker's physiological and behavioral characteristics.

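Speaker recognition is one form of biometric recognition. By function, speaker recognition systems divide into speaker identification systems and speaker verification systems. By the material they operate on, they divide into text-dependent and text-independent systems. Depending on whether the speaker to be recognized belongs to the enrolled set, recognition is either open-set or closed-set. The key problems are choosing the feature parameters and building the recognition model. Commonly used feature parameters include LPC, LPCC, and MFCC; commonly used recognition models include DTW, VQ, and HMM.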
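The distinctions above (identification versus verification, closed-set versus open-set) can be illustrated with a small sketch. It assumes each enrolled speaker already has a distance-like score against the test utterance, lower meaning a better match. The scores, the threshold value, and all variable names are hypothetical and chosen only for illustration.

```matlab
% Hypothetical scores of one test utterance against 3 enrolled speakers
% (lower = better match). Values are made up for illustration.
D = [0.82 0.31 0.77];
rejectThreshold = 0.5;          % assumed acceptance threshold

% Closed-set identification: the speaker is assumed to be enrolled,
% so the best-scoring speaker is always returned.
[Dmin, closedSetId] = min(D);

% Open-set identification: additionally reject the utterance when even
% the best score is too poor (0 is used here to mean "unknown speaker").
if Dmin <= rejectThreshold
    openSetId = closedSetId;
else
    openSetId = 0;
end

% Verification: accept or reject a single claimed identity.
claimedId = 3;
verified = D(claimedId) <= rejectThreshold;
```

Identification compares against every enrolled speaker, whereas verification only checks the claimed one, which is why its cost does not grow with the number of enrolled speakers.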

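2 Speaker Recognition Process and System Framework
As shown in Figure 1, building and using a speaker recognition system involves two parts: training (or enrollment) and recognition.

Figure 1 Block diagram of the speaker recognition system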
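3 Feature Extraction in Speaker Recognition
3.1 Mel-Frequency Cepstral Coefficients (MFCC)
Cepstral features are among the most effective features for characterizing a speaker's individuality and for speaker recognition [4]. Experiments show that MFCC outperforms other cepstral coefficients in most cases. The extraction and computation proceed as follows:
(1) The original speech signal S(n) is pre-emphasized, split into frames, and windowed, giving the time-domain signal x(n) of each frame. A discrete Fourier transform (DFT) then yields the discrete spectrum X(k). The DFT of the speech signal is:
$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi n k / N}, \quad 0 \le k \le N-1$$
where x(n) is the input speech signal and N is the number of DFT points.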
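(2) The discrete spectrum X(k) is passed through a bank of Mel-frequency filters to obtain the Mel spectrum, and taking the log energy gives the log spectrum S(m).
(3) The log energy output by each filter is:
$$S(m) = \ln\left( \sum_{k=0}^{N-1} |X(k)|^2 \, H_m(k) \right), \quad 0 \le m < M$$
where H_m(k) is the frequency response of the m-th Mel filter and M is the number of filters.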
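(4) A discrete cosine transform (DCT) gives the MFCC coefficients:
$$C(n) = \sum_{m=0}^{M-1} S(m) \cos\left( \frac{\pi n (m + 0.5)}{M} \right), \quad 0 \le n < L$$
where M is the number of Mel filters and L is the number of cepstral coefficients retained.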
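For example, for the speech sample "说话人识别" ("speaker recognition") recorded at a sampling rate of 8000 Hz with 8-bit precision, using 24 filters and keeping the first 16 coefficients (C0 to C15), the extracted MFCC parameters are shown in Figure 2. The x-axis is the frame index, the y-axis is the cepstral coefficient index, and the z-axis is the cepstral value. Figure 2(a) includes the C0 term; Figure 2(b) omits it.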
Figure 2 MFCC parameters
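The figure shows that the first MFCC coefficient, C0, carries a large energy. Recognition systems therefore usually treat it as an energy coefficient rather than as one of the cepstral coefficients.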
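The four extraction steps can be sketched in plain MATLAB without any toolbox. This is only an illustrative sketch, not the code used by the GUI: the test signal is synthetic, the pre-emphasis coefficient 0.97, the Hamming window, the triangular filter shape, and the Mel formula 2595*log10(1 + f/700) are common choices assumed here, and all variable names are made up. The sampling rate, filter count, and coefficient count follow the example above.

```matlab
% Parameters taken from the example in the text; frame sizes are assumed.
Fs       = 8000;   % sampling rate in Hz
frameLen = 256;    % samples per frame, also the DFT size N
frameInc = 100;    % frame shift in samples
numFilt  = 24;     % number of Mel filters M
numCep   = 16;     % coefficients kept: C0..C15

% Synthetic stand-in for a recorded utterance (assumption).
t = (0:Fs-1)' / Fs;
s = 0.6*sin(2*pi*300*t) + 0.3*sin(2*pi*1200*t);

% Step (1): pre-emphasis, framing, Hamming window, DFT.
s = filter([1 -0.97], 1, s);
numFrames = floor((length(s) - frameLen) / frameInc) + 1;
idx = bsxfun(@plus, (1:frameLen)', (0:numFrames-1) * frameInc);
win = 0.54 - 0.46*cos(2*pi*(0:frameLen-1)' / (frameLen-1));
frames = bsxfun(@times, s(idx), win);     % frameLen x numFrames
X = fft(frames);                          % DFT of every column
P = abs(X(1:frameLen/2+1, :)).^2;         % power at bins 0..N/2

% Steps (2)-(3): triangular Mel filter bank and log energies.
% The filters are zero above Fs/2, so only bins 0..N/2 are needed.
hz2mel = @(f) 2595 * log10(1 + f/700);
mel2hz = @(mel) 700 * (10.^(mel/2595) - 1);
edgesHz = mel2hz(linspace(0, hz2mel(Fs/2), numFilt + 2));
binHz = (0:frameLen/2) * Fs / frameLen;
H = zeros(numFilt, frameLen/2 + 1);
for f = 1:numFilt
    rising  = (binHz - edgesHz(f))   / (edgesHz(f+1) - edgesHz(f));
    falling = (edgesHz(f+2) - binHz) / (edgesHz(f+2) - edgesHz(f+1));
    H(f, :) = max(0, min(rising, falling));
end
S = log(H * P + eps);                     % numFilt x numFrames

% Step (4): DCT, C(n) = sum_m S(m) cos(pi*n*(m+0.5)/M).
n = (0:numCep-1)';
m = 0:numFilt-1;
D = cos(pi * n * (m + 0.5) / numFilt);    % numCep x numFilt
C = D * S;                                % row 1 is C0

% Drop C0 when it is treated as an energy term rather than a cepstrum.
mfccNoC0 = C(2:end, :);
```

Each column of `C` holds the coefficients of one frame. Because the cosine term equals 1 for n = 0, the C0 row is simply the sum of the log filter-bank energies, which is why it behaves like an energy measure.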

3.2 Combining Different Feature Parameters
The main parameters that characterize a speaker are the pitch period, cepstral coefficients, formant frequencies and bandwidths, pitch contour, and so on. None of these parameters can be said to characterize a speaker effectively and reliably on its own. To represent a speaker more effectively, several feature parameters are therefore usually combined. The combination works best when the combined parameters are only weakly correlated, because they then reflect different characteristics of the speech signal.

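(1) Combining pitch features with cepstral features: cepstral features describe the vocal tract and pitch describes the vocal cords, so together they reflect the speaker's characteristics well.
(2) Using cepstral coefficients and delta cepstral coefficients to describe the vocal tract, and pitch and delta pitch to describe the excitation source.
(3) Combining cepstral coefficients with the corresponding delta cepstral parameters, and so on.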
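As an illustration of the last combination, the sketch below appends delta (differential) cepstra to static cepstra. It assumes the common regression formula over a window of ±K frames with edge frames repeated as padding; the source does not state which delta definition it uses, and the input matrix and all names here are made up.

```matlab
% Stand-in static cepstra: 2 coefficients x 50 frames (assumption).
% Row 1 is a ramp with slope 0.5 so its delta is easy to check by eye.
C = [0.5*(0:49); sin(0.2*(0:49))];
K = 2;                                    % regression half-width (assumed)

[numCep, numFrames] = size(C);
% Repeat the first and last frame K times so every frame has neighbours.
Cpad = [repmat(C(:,1), 1, K), C, repmat(C(:,end), 1, K)];

% delta(t) = sum_{k=1..K} k*(c(t+k) - c(t-k)) / (2*sum_{k=1..K} k^2)
dC = zeros(numCep, numFrames);
for k = 1:K
    ahead  = Cpad(:, (1:numFrames) + K + k);
    behind = Cpad(:, (1:numFrames) + K - k);
    dC = dC + k * (ahead - behind);
end
dC = dC / (2 * sum((1:K).^2));

% Combined feature vectors: static coefficients stacked on their deltas.
features = [C; dC];
```

Away from the edges, the delta of the ramp row equals its slope, which is a quick sanity check that the regression weights are normalized correctly.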

4 The VQ Recognition Model
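For a system with N enrolled speakers, a codebook is built for each speaker. During training, the LBG algorithm clusters the feature vectors extracted from a speaker's training speech into that speaker's codebook. During recognition, the same feature extraction is applied to the test utterance to obtain the feature sequence X_1, …, X_T. Each of the N codebooks is then used to vector-quantize this sequence, and the average quantization distortion

$$D_i = \frac{1}{T} \sum_{t=1}^{T} \min_{1 \le j \le K} d\left(X_t, Y_j^i\right)$$

determines which codebook's distribution the sequence is closest to. Here Y_j^i is the j-th codeword of the i-th speaker's codebook, K is the number of codewords in a codebook, T is the length of the feature vector sequence (the total number of frames in the test utterance), and d(X_t, Y_j^i) is the Euclidean distance. The recognition result is the speaker i with the smallest D_i.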
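The scoring rule just described can be sketched as follows. The codebooks and the test sequence are tiny made-up examples stored one vector per row, and the variable names are illustrative; in a real system the rows would be MFCC vectors and the codebooks would come from LBG training.

```matlab
% Hypothetical codebooks: one K-by-dim matrix per enrolled speaker.
codebooks = { [0 0; 1 1], [5 5; 6 6] };
% Hypothetical test sequence: T-by-dim, one feature vector per frame.
X = [0.1 -0.1; 0.9 1.2; 0.2 0.1];

numSpk = numel(codebooks);
D = zeros(1, numSpk);
for i = 1:numSpk
    Y = codebooks{i};
    % Pairwise squared distances, T x K, via |x|^2 + |y|^2 - 2*x*y'.
    d2 = bsxfun(@plus, sum(X.^2, 2), sum(Y.^2, 2)') - 2 * (X * Y');
    d  = sqrt(max(d2, 0));          % Euclidean distance, guarded against rounding
    % Average over frames of the distance to the nearest codeword.
    D(i) = mean(min(d, [], 2));
end

% The identified speaker is the one with the smallest average distortion.
[~, bestSpeaker] = min(D);
```

Here every frame lies close to a codeword of the first codebook, so the first speaker is selected.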
Applying vector quantization involves two main problems:
(1) designing a good codebook;
(2) quantizing unknown vectors.
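For the first problem, a minimal sketch of LBG codebook design is given below. It is not the training routine of this system (that code is not part of the excerpt): the data are synthetic, the splitting factor and stopping tolerance are assumed values, squared Euclidean distance is used as the distortion, and the doubling scheme requires the codebook size to be a power of two.

```matlab
% Synthetic training vectors, one per row (assumption): two 2-D clusters.
rng(1);
X = [randn(200, 2); bsxfun(@plus, randn(200, 2), [6 6])];

K        = 4;       % target codebook size, must be a power of 2 here
epsSplit = 0.01;    % splitting perturbation (assumed)
tol      = 1e-3;    % relative distortion improvement to stop (assumed)

Y = mean(X, 1);                 % start with a single codeword: the global centroid
while size(Y, 1) < K
    % Split every codeword into two slightly perturbed copies.
    Y = [Y * (1 + epsSplit); Y * (1 - epsSplit)];
    prevDist = inf;
    while true
        % Nearest-codeword assignment using squared Euclidean distance.
        d2 = bsxfun(@plus, sum(X.^2, 2), sum(Y.^2, 2)') - 2 * (X * Y');
        [dmin, nearest] = min(d2, [], 2);
        dist = mean(dmin);      % average distortion of this partition
        % Centroid update; an empty cell keeps its old codeword.
        for j = 1:size(Y, 1)
            members = X(nearest == j, :);
            if ~isempty(members)
                Y(j, :) = mean(members, 1);
            end
        end
        if (prevDist - dist) / dist < tol
            break;
        end
        prevDist = dist;
    end
end
```

After the loop, `Y` holds K codewords and `dist` the average distortion of the last assignment. The second problem, quantizing an unknown vector, is the nearest-codeword search performed inside the inner loop.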

⛄ II. Partial Source Code

function varargout = untitled(varargin)
% UNTITLED M-file for untitled.fig
%      UNTITLED, by itself, creates a new UNTITLED or raises the existing
%      singleton*.
%
%      H = UNTITLED returns the handle to a new UNTITLED or the handle to
%      the existing singleton*.
%
%      UNTITLED('CALLBACK',hObject,eventData,handles,...) calls the local
%      function named CALLBACK in UNTITLED.M with the given input arguments.
%
%      UNTITLED('Property','Value',...) creates a new UNTITLED or raises the
%      existing singleton*.  Starting from the left, property value pairs are
%      applied to the GUI before untitled_OpeningFunction gets called.  An
%      unrecognized property name or invalid value makes property application
%      stop.  All inputs are passed to untitled_OpeningFcn via varargin.
%
%      *See GUI Options on GUIDE's Tools menu.  Choose "GUI allows only one
%      instance to run (singleton)".
%
% See also: GUIDE, GUIDATA, GUIHANDLES

% Edit the above text to modify the response to help untitled

% Last Modified by GUIDE v2.5 21-May-2021 13:54:38

% Begin initialization code - DO NOT EDIT
gui_Singleton = 1;
gui_State = struct('gui_Name',       mfilename, ...
                   'gui_Singleton',  gui_Singleton, ...
                   'gui_OpeningFcn', @untitled_OpeningFcn, ...
                   'gui_OutputFcn',  @untitled_OutputFcn, ...
                   'gui_LayoutFcn',  [] , ...
                   'gui_Callback',   []);
if nargin && ischar(varargin{1})
    gui_State.gui_Callback = str2func(varargin{1});
end

if nargout
    [varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
else
    gui_mainfcn(gui_State, varargin{:});
end
% End initialization code - DO NOT EDIT

% --- Executes just before untitled is made visible.
function untitled_OpeningFcn(hObject, eventdata, handles, varargin)
% This function has no output args, see OutputFcn.
% hObject    handle to figure
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
% varargin   command line arguments to untitled (see VARARGIN)

% Choose default command line output for untitled
handles.output = hObject;

% Update handles structure
guidata(hObject, handles);

% This sets up the initial plot - only do when we are invisible
% so window can get raised using untitled.
if strcmp(get(hObject,'Visible'),'off')
    plot(sin(1:0.01:25));
end

axes(handles.axes1);cla;plot(rand(5));
axes(handles.axes3);cla;plot(rand(5));

% UIWAIT makes untitled wait for user response (see UIRESUME)
% uiwait(handles.figure1);

% --- Outputs from this function are returned to the command line.
function varargout = untitled_OutputFcn(hObject, eventdata, handles)
% varargout  cell array for returning output args (see VARARGOUT);
% hObject    handle to figure
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)

% Get default command line output from handles structure
varargout{1} = handles.output;

% --- Executes on button press in pushbutton1.
function pushbutton1_Callback(hObject, eventdata, handles)
% hObject    handle to pushbutton1 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
axes(handles.axes1);
cla;
%plot(rand(5));

% --------------------------------------------------------------------
function FileMenu_Callback(hObject, eventdata, handles)
% hObject    handle to FileMenu (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)

% --------------------------------------------------------------------
function OpenMenuItem_Callback(hObject, eventdata, handles)
% hObject    handle to OpenMenuItem (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
file = uigetfile('*.fig');
if ~isequal(file, 0)
    open(file);
end

% --------------------------------------------------------------------
function PrintMenuItem_Callback(hObject, eventdata, handles)
% hObject    handle to PrintMenuItem (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
printdlg(handles.figure1)

% --------------------------------------------------------------------
function CloseMenuItem_Callback(hObject, eventdata, handles)
% hObject    handle to CloseMenuItem (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
selection = questdlg(['Close ' get(handles.figure1,'Name') '?'],...
                     ['Close ' get(handles.figure1,'Name') '...'],...
                     'Yes','No','Yes');
if strcmp(selection,'No')
    return;
end

delete(handles.figure1)

% --- Executes during object creation, after setting all properties.
function popupmenu1_CreateFcn(hObject, eventdata, handles)
% hObject    handle to popupmenu3 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    empty - handles not created until after all CreateFcns called

% Hint: popupmenu controls usually have a white background on Windows.
%       See ISPC and COMPUTER.
if ispc
    set(hObject,'BackgroundColor','white');
else
    set(hObject,'BackgroundColor',get(0,'defaultUicontrolBackgroundColor'));
end

set(hObject, 'String', {'plot(rand(5))', 'plot(sin(1:0.01:25))', 'comet(cos(1:.01:10))', 'bar(1:10)', 'plot(membrane)', 'surf(peaks)'});

% --- Executes on selection change in popupmenu3.
function popupmenu1_Callback(hObject, eventdata, handles)
% hObject    handle to popupmenu3 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)

% Hints: contents = get(hObject,'String') returns popupmenu3 contents as cell array
%        contents{get(hObject,'Value')} returns selected item from popupmenu3

% --- Executes on button press in pushbutton4.
function pushbutton4_Callback(hObject, eventdata, handles)
% hObject    handle to pushbutton4 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)

n = get(handles.popupmenu2, 'Value');   % index used in the output file name

second   = 1;     % recording duration in seconds
framelnc = 100;   % frame shift in samples (12.5 ms at Fs = 8000 Hz)
framelen = 256;   % frame length in samples (32 ms at Fs = 8000 Hz)
Fs = 8000;        % sampling rate in Hz
pause(2);

message = {'Recording started!'};
msgbox(message);
% Record `second` seconds of mono audio (16-bit depth assumed).
recorder = audiorecorder(Fs, 16, 1);
recordblocking(recorder, second);
x = getaudiodata(recorder, 'double');
message = {'Recording finished!'};
msgbox(message);
pause(1);
sound(x, Fs);     % play the recording back

% Endpoint detection; vad2 is a helper that is not part of this excerpt.
[x1,x2,amp,zcr] = vad2(x,framelen,framelnc);

% Waveform with the detected start and end points (frame index -> sample).
axes(handles.axes1);
cla;
%subplot(3,1,1)
plot(x)
axis([1 length(x) -1 1])
line([x1*framelnc x1*framelnc],[-1 1],'color','red');
line([x2*framelnc x2*framelnc],[-1 1],'color','red');
ylabel('Normalized original signal')
text(x1*framelnc,0.5,'Start point \rightarrow',...
    'HorizontalAlignment','right')
text(x2*framelnc,0.5,'\leftarrow End point',...
    'HorizontalAlignment','left')

% Short-time energy and zero-crossing rate per frame.
axes(handles.axes3);
cla;
plot(amp,'b');
hold on;
plot(zcr,'y');
length(amp)       % printed to the command window
length(zcr)
pmax = max(max(amp),max(zcr));
pmin = min(min(amp),min(zcr));
axis([1 length(amp) 0 pmax])
line([x1 x1],[pmin,pmax],'color','red');
line([x2 x2],[pmin,pmax],'color','red');
ylabel('Short-time energy (blue), zero-crossing rate (yellow)')
text(x1,pmax/2,'Start point \rightarrow',...
    'HorizontalAlignment','right')
text(x2,pmax/2,'\leftarrow End point',...
    'HorizontalAlignment','left')

% Save the recording as mytrain/s<n>.wav.
file = fullfile('mytrain', sprintf('s%d.wav', n));
audiowrite(file, x, Fs);

% --- Executes on button press in pushbutton5.
function pushbutton5_Callback(hObject, eventdata, handles)
% hObject    handle to pushbutton5 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
%code=train1('mytrain',4);
Fm = 100;   % frame shift in samples
Fn = 256;   % frame length in samples (20.5 ms at 12.5 kHz; 32 ms at the 8 kHz used for recording)
k = 16;     % number of centroids required
n = 8;
traindir = 'mytrain';
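The recording callback in the listing calls a helper, `vad2`, that is not included in the excerpt. Judging only from how its outputs are plotted, `amp` and `zcr` are per-frame short-time energy and zero-crossing rate, and `x1`, `x2` are the start and end frame indices. The sketch below is a hypothetical stand-in that computes those quantities on a synthetic signal with a single energy threshold; the real `vad2` may well use a different rule (for example, dual thresholds), and the threshold value here is an assumption.

```matlab
Fs = 8000; frameLen = 256; frameInc = 100;   % same framing as the callback

% Synthetic recording (assumption): silence, a 200 Hz tone, silence.
t = (0:3999)' / Fs;
x = [zeros(2000, 1); 0.5*sin(2*pi*200*t); zeros(2000, 1)];

% Split into overlapping frames, one frame per column.
numFrames = floor((length(x) - frameLen) / frameInc) + 1;
idx = bsxfun(@plus, (1:frameLen)', (0:numFrames-1) * frameInc);
frames = x(idx);

% Short-time energy: sum of squared samples in each frame.
amp = sum(frames.^2, 1);

% Zero-crossing count: sign flips between adjacent samples in each frame.
zcr = sum(abs(diff(sign(frames), 1, 1)) == 2, 1);

% Crude endpoint detection: first and last frame whose energy exceeds
% a fraction of the maximum (threshold factor assumed).
thr = 0.1 * max(amp);
active = find(amp > thr);
x1 = active(1);      % start frame index
x2 = active(end);    % end frame index
```

Multiplying `x1` and `x2` by the frame shift converts the frame indices to approximate sample positions, which is what the callback does when it draws the red endpoint lines on the waveform.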

⛄ III. Results


⛄ IV. MATLAB Version and References

1 MATLAB version
2014a

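2 References
[1] Han Jiqing, Zhang Lei, Zheng Tieran. 语音信号处理 (Speech Signal Processing), 3rd ed. [M]. Tsinghua University Press, 2019.

[2] Liu Ruobian. 深度学习:语音识别技术实践 (Deep Learning: Speech Recognition Technology in Practice) [M]. Tsinghua University Press, 2019.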

3 Note
Parts of this section are excerpted from the Internet and are for reference only. In case of infringement, please contact the author for removal.

Original: https://blog.csdn.net/TIQCmatlab/article/details/119045794
Author: 海神之光
Title: [Speech Recognition] MFCC+VQ Speaker Recognition System with a MATLAB GUI [MATLAB source code included, Issue 1153]

