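✅ About the author: a Matlab simulation developer who loves research and works on character and technique in equal measure. Message privately for Matlab project collaboration.
🍎 Homepage: Matlab科研工作室
🍊 Personal motto: study things to gain knowledge (格物致知).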
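⛄ Introduction
Speech recognition is an interdisciplinary field closely tied to phonetics, linguistics, digital signal processing, pattern recognition, optimization theory, and computer science, and it has both theoretical value and practical significance. After decades of development it has made great progress, but several problems remain: recognition and understanding of natural language is still at an early stage; speech carries a large amount of information and is hard to store; speech is inherently ambiguous; the acoustic properties of a single letter, character, or word depend on context, which changes stress, pitch, volume, and speaking rate; and environmental noise and other interference severely affect recognition and lower the recognition rate. Solving these problems is therefore the focus of further development in speech recognition.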
⛄ Partial code
function [x,mn,mx]=melbankm(p,n,fs,fl,fh,w)
%MELBANKM determine matrix for a mel-spaced filterbank [X,MN,MX]=MELBANKM(P,N,FS,FL,FH,W)
%
% Inputs:  p   number of filters in filterbank
%          n   length of fft
%          fs  sample rate in Hz
%          fl  low end of the lowest filter as a fraction of fs (default = 0)
%          fh  high end of highest filter as a fraction of fs (default = 0.5)
%          w   any sensible combination of the following:
%                't'  triangular shaped filters in mel domain (default)
%                'n'  hanning shaped filters in mel domain
%                'm'  hamming shaped filters in mel domain
%
%                'z'  highest and lowest filters taper down to zero (default)
%                'y'  lowest filter remains at 1 down to 0 frequency and
%                     highest filter remains at 1 up to nyquist frequency
%
%              If 'ty' or 'ny' is specified, the total power in the fft is preserved.
%
% Outputs: x   a sparse matrix containing the filterbank amplitudes
%              If x is the only output argument then size(x)=[p,1+floor(n/2)]
%              otherwise size(x)=[p,mx-mn+1]
%          mn  the lowest fft bin with a non-zero coefficient
%          mx  the highest fft bin with a non-zero coefficient
%
% Usage (x covers all bins up to nyquist):
%          f=fft(s);
%          x=melbankm(p,n,fs);
%          n2=1+floor(n/2);
%          z=log(x*abs(f(1:n2)).^2);
%          c=dct(z); c(1)=[];
%
% Usage (x covers only the non-zero bins na..nb):
%          f=fft(s);
%          [x,na,nb]=melbankm(p,n,fs);
%          z=log(x*(f(na:nb).*conj(f(na:nb))));
%          c=dct(z); c(1)=[];
%
% To plot filterbanks e.g. plot(melbankm(20,256,8000)')
%
% Copyright (C) Mike Brookes 1997
% Version: $Id: melbankm.m,v 1.3 2005/02/21 15:22:13 dmb Exp $
%
% VOICEBOX is a MATLAB toolbox for speech processing.
% Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% This program is free software; you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation; either version 2 of the License, or
% (at your option) any later version.
%
% This program is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
% GNU General Public License for more details.
%
% You can obtain a copy of the GNU General Public License from
% ftp://prep.ai.mit.edu/pub/gnu/COPYING-2.0 or by writing to
% Free Software Foundation, Inc.,675 Mass Ave, Cambridge, MA 02139, USA.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
if nargin < 6
  w='tz';
  if nargin < 5
    fh=0.5;
    if nargin < 4
      fl=0;
    end
  end
end
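% The mel scale is proportional to log(1+f/700); with every frequency written
% as a fraction of fs, the warped axis becomes log(f0+f).
f0=700/fs;                      % 700 Hz as a fraction of fs
fn2=floor(n/2);                 % number of the nyquist bin (DC is bin 0)
lr=log((f0+fh)/(f0+fl))/(p+1);  % spacing of filter centres on the log axis
% convert to fft bin numbers with 0 for DC term
% bl = [low edge, centre of filter 1, centre of filter p, high edge]
bl=n*((f0+fl)*exp([0 1 p p+1]*lr)-f0);
b2=ceil(bl(2));                 % first bin at or above the centre of filter 1
b3=floor(bl(3));                % last bin at or below the centre of filter p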
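if any(w=='y')
  % 'y': end filters stay at 1 down to DC and up to nyquist
  pf=log((f0+(b2:b3)/n)/(f0+fl))/lr;  % fractional filter number of bins b2..b3
  fp=floor(pf);
  r=[ones(1,b2) fp fp+1 p*ones(1,fn2-b3)];  % row (filter) indices
  c=[1:b3+1 b2+1:fn2+1];                    % column (fft bin) indices
  v=2*[0.5 ones(1,b2-1) 1-pf+fp pf-fp ones(1,fn2-b3-1) 0.5];
  mn=1;
  mx=fn2+1;
else
  % 'z': end filters taper down to zero
  b1=floor(bl(1))+1;            % first bin above the low edge
  b4=min(fn2,ceil(bl(4)))-1;    % last bin below the high edge
  pf=log((f0+(b1:b4)/n)/(f0+fl))/lr;  % fractional filter number of bins b1..b4
  fp=floor(pf);
  pm=pf-fp;                     % position between two adjacent filter centres
  k2=b2-b1+1;
  k3=b3-b1+1;
  k4=b4-b1+1;
  r=[fp(k2:k4) 1+fp(1:k3)];     % falling side of filter fp, rising side of filter fp+1
  c=[k2:k4 1:k3];
  v=2*[1-pm(k2:k4) pm(1:k3)];
  mn=b1+1;                      % convert 0-based bin numbers to MATLAB indices
  mx=b4+1;
end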
if any(w=='n')
  v=1-cos(v*pi/2);
elseif any(w=='m')
  v=1-0.92/1.08*cos(v*pi/2);
end
if nargout > 1
  x=sparse(r,c,v);
else
  x=sparse(r,c+mn-1,v,p,1+fn2);
end
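
The usage notes in the function header can be turned into a short script. The sketch below is an illustration rather than part of the original listing: the sample rate, FFT length, filter count, and the 440 Hz test tone are assumed values, and the explicit DCT-II matrix is an unnormalized stand-in for the `dct` call mentioned in the header, so that the example needs no toolbox. It assumes `melbankm.m` is on the MATLAB path.

```matlab
% Assumed illustrative parameters (not taken from the original post)
fs = 8000;            % sample rate in Hz
n  = 256;             % FFT length
p  = 20;              % number of mel filters

% Hypothetical test frame: a 440 Hz tone shaped by a Hamming window
k = (0:n-1)';
s = sin(2*pi*440*k/fs) .* (0.54 - 0.46*cos(2*pi*k/(n-1)));

f = fft(s);                      % complex spectrum (column vector)
[x,mn,mx] = melbankm(p,n,fs);    % filterbank covering bins mn..mx only
pw = abs(f(mn:mx)).^2;           % power in each retained bin
z = log(x*pw);                   % log energy per mel filter, p-by-1

% Unnormalized DCT-II of z, written out as a stand-in for dct(z)
q = (0:p-1)';
D = cos(pi*q*(2*q'+1)/(2*p));    % p-by-p DCT-II basis
c = D*z;
c(1) = [];                       % drop the zeroth coefficient, as in the header
```

Requesting three outputs keeps `x` at size `[p, mx-mn+1]`, so only the bins with non-zero coefficients take part in the product. The scaling of `c` differs from that of a normalized DCT, which matters only if the coefficients are compared with values produced by another tool.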
⛄ Results