💥1 Overview
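Recognizing forged speech is an important research area in intelligent speech technology and an applied direction at the intersection of information security, phonetics, and artificial intelligence. Telecom fraud has become frequent among new types of crime, so a method that automatically and reliably tells genuine speech from fake speech is urgently needed. This work combines the proposed Max Dense Convolution Neural Network (MDCN) with a spectral attention module (Spec-Attention Block) to recognize noisy-environment sounds accurately. It compares how these sounds behave in fundamental frequency, intensity, spectrogram, and other acoustic properties, analyzes the differences, and draws general conclusions that explain why forged noisy-environment sounds can be recognized acoustically, which gives automatic recognition a theoretical basis.

The study makes three contributions. First, it designs RMSA, an acoustic feature that characterizes the dispersion of sound intensity. RMSA quantifies the difference in the rate of intensity change between noisy-environment sounds and real sounds; fused with the FFV and SNS features, it forms the high-dimensional input to the recognition model. Second, it designs the MDCN model. The dense blocks of the densely connected convolutional network use the max feature map function, which keeps the dense connections and reduces information forgetting while strengthening the useful information learned by the convolutional neurons, giving a model well suited to classification. Third, it designs an attention module named Spec-Attention Block. The module splits the narrowband spectrogram according to the harmonic shape of speech and the spectral distribution of individual phonemes, then attends selectively to the finely segmented result along both the spatial and the channel dimension. The model therefore focuses on the harmonic positions and spectral breadth that separate forged from genuine speech, which strengthens its perception of acoustic properties and further improves recognition.

Modern sensing applications demand simplicity, versatility, and high performance. Insects evidently achieve a limited but satisfactory interaction with their environment using minimal resources. The experiment is motivated by the hypothesis that similar control principles can be applied to artificial control systems. Building on earlier experience, theoretical and practical studies were carried out to determine the best procedure for recognizing short, plain, noise-like sounds under real conditions. The result is an optimal hybrid procedure, built entirely from heuristic algorithms, for recognizing noise-like environmental sounds. The work reports successful recognition of adverse sounds using the spectrum as the feature vector.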
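The overview closes by naming the spectrum as the feature vector. As a rough illustration of that idea only, the sketch below turns one windowed frame of a signal into a normalized magnitude-spectrum vector. The sampling rate, FFT length, test tone, and variable names (`fs`, `nfft`, `feat`, `peakHz`) are all assumptions made for the example; the source does not state the actual preprocessing.

```matlab
% Hypothetical sketch: magnitude spectrum of one frame as a feature vector.
% All parameters below are assumed for illustration, not taken from the study.
fs   = 8000;                        % assumed sampling rate in Hz
nfft = 512;                         % assumed frame / FFT length
t    = (0:nfft-1)' / fs;            % time axis for one frame (column vector)
x    = sin(2*pi*440*t) + 0.5*sin(2*pi*1200*t);  % synthetic stand-in for a recording

% Hann window written out by hand so that no toolbox is required.
w = 0.5 - 0.5*cos(2*pi*(0:nfft-1)' / (nfft-1));

X    = abs(fft(x .* w));            % magnitude spectrum of the windowed frame
feat = X(1:nfft/2+1);               % keep the non-negative frequencies only
feat = feat / max(feat);            % scale so the largest bin equals 1

[~, k] = max(feat);                 % strongest bin (1-based index)
peakHz = (k-1) * fs / nfft;         % convert bin index to frequency in Hz
```

Here `feat` has `nfft/2+1` entries and could be fed to a classifier as one input vector; a real pipeline would likely average or stack several frames per sound.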
📚2 Results
The script takes a long time to run, so please be patient.
Partial code:
% Plot Hidden layers
clear;
Range = 8:29;
FitType1 = 'poly2';                 % second-order polynomial model for fit()
load('./ResultPoints');
Scale = 1;
Points = TrainingNumArray(Range);

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Confusion matrix transformation
% The indexing below reads ConfuctionMatrixSum as stacked 3x3 blocks,
% three rows per entry of Range.
% Identifier glossary: Cvrcak = cricket, Muva = fly, Zbunj = confusing sound,
% Ukupno = total, Predvidjeno = predicted, Ostvareno = realized.

% Predicted
% Cricket:
CvrcakCvrcak = ConfuctionMatrixSum(Range*3-2,1);   % turned out to be a cricket
CvrcakMuva   = ConfuctionMatrixSum(Range*3-1,1);   % turned out to be a fly
CvrcakZbunj  = ConfuctionMatrixSum(Range*3,1);     % turned out to be a confusing sound
% Fly:
MuvaCvrcak   = ConfuctionMatrixSum(Range*3-2,2);   % turned out to be a cricket
MuvaMuva     = ConfuctionMatrixSum(Range*3-1,2);   % turned out to be a fly
MuvaZbunj    = ConfuctionMatrixSum(Range*3,2);     % turned out to be a confusing sound
% Other:
ZbunjCvrcak  = ConfuctionMatrixSum(Range*3-2,3);   % turned out to be a cricket
ZbunjMuva    = ConfuctionMatrixSum(Range*3-1,3);   % turned out to be a fly
ZbunjZbunj   = ConfuctionMatrixSum(Range*3,3);     % turned out to be a confusing sound

% Predicted in total (column sums) and the matching success percentages
CvrcakUkupno = CvrcakCvrcak + CvrcakMuva + CvrcakZbunj;
MuvaUkupno   = MuvaCvrcak + MuvaMuva + MuvaZbunj;
ZbunjUkupno  = ZbunjCvrcak + ZbunjMuva + ZbunjZbunj;
CvrcakPredvidjeno = CvrcakCvrcak./CvrcakUkupno/Scale*100;
MuvaPredvidjeno   = MuvaMuva./MuvaUkupno/Scale*100;
ZbunjPredvidjeno  = ZbunjZbunj./ZbunjUkupno/Scale*100;

% Realized in total (row sums) and the matching success percentages
UkupnoCvrcak = CvrcakCvrcak + MuvaCvrcak + ZbunjCvrcak;
UkupnoMuva   = CvrcakMuva + MuvaMuva + ZbunjMuva;
UkupnoZbunj  = CvrcakZbunj + MuvaZbunj + ZbunjZbunj;
CvrcakOstvareno = CvrcakCvrcak./UkupnoCvrcak/Scale*100;
MuvaOstvareno   = MuvaMuva./UkupnoMuva/Scale*100;
ZbunjOstvareno  = ZbunjZbunj./UkupnoZbunj/Scale*100;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% Polynomial fits of the training and testing errors.
% fit() belongs to the Curve Fitting Toolbox and expects column vectors.
EtrainingFit = fit(Range',EtrainingSum(Range),FitType1);
EtestingFit  = fit(Range',EtestingSum(Range),FitType1);

% Figure 1: mean squared error versus number of training points
figure(1)
hold on
set(gca,'FontSize',14)
%xlim([-1 Length-StartPoint+1])
h1 = plot(EtrainingFit,'b',Range,EtrainingSum(Range),'b.');
h2 = plot(EtestingFit,'r',Range,EtestingSum(Range),'r.');
set(h1,'LineWidth',2);
set(h2,'LineWidth',2);
grid on
set(gcf,'PaperUnits','inches','PaperPosition',[0 0 9.4 6.6])
set(gcf,'Color',[1 1 1]);
set(gca,'XTick',[14 29]);
set(gca,'XTickLabel',{'100','1000'});
leg = legend('Training Recorded','Training Approximation','Testing Recorded','Testing Approximation');
set(leg,'Location','northeast');
xlim([Range(1)-1 Range(end)+1])
ylim([0.0 0.3])
xlabel('Number of Training Points');
ylabel('Mean Squared Error');
print(gcf,'-dtiff','-r1000','./Fig_8')

% Figure 2: per-class success rates versus number of training points
figure(2)
hold on
set(gca,'FontSize',14)
h3 = plot(Range,CvrcakPredvidjeno,'ro');
h4 = plot(Range,CvrcakOstvareno,'rx');
h5 = plot(Range,MuvaPredvidjeno,'bo');
h6 = plot(Range,MuvaOstvareno,'bx');
h7 = plot(Range,ZbunjPredvidjeno,'go');
h8 = plot(Range,ZbunjOstvareno,'gx');
set([h3 h4 h5 h6 h7 h8],'LineWidth',2);
set([h3 h5 h7],'MarkerSize',8);     % circles
set([h4 h6 h8],'MarkerSize',10);    % crosses
grid on
set(gcf,'Color',[1 1 1]);
set(gca,'XTick',[14 29]);
set(gca,'XTickLabel',{'100','1000'});
leg = legend('Cricket Recognized','Cricket Target','Fly Recognized','Fly Target','Confusing Recognized','Confusing Target');
set(leg,'Location','southeast')
set(gcf,'PaperUnits','inches','PaperPosition',[0 0 9.4 6.6])
xlim([Range(1)-1 Range(end)+1])
ylim([40 110])
xlabel('Number of Training Points');
ylabel('Success (%)');
print(gcf,'-dtiff','-r1000','./Fig_9')
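The confusion-matrix arithmetic in the script can be checked on a single 3x3 matrix. The sketch below uses made-up counts (not results from the study) and assumes the layout suggested by the script's comments: the column gives the predicted class, the row gives what the sound turned out to be, and the class order is cricket, fly, confusing. The names `C`, `recognizedPct`, and `targetPct` are hypothetical.

```matlab
% Toy confusion matrix with invented counts, used only to illustrate the
% two normalizations. Assumed layout: column = predicted class,
% row = what the sound turned out to be; order: cricket, fly, confusing.
C = [45  5  0;
      3 40 10;
      2  5 40];

% Diagonal over column totals, the same ratio as CvrcakPredvidjeno etc.
recognizedPct = diag(C)' ./ sum(C, 1) * 100;

% Diagonal over row totals, the same ratio as CvrcakOstvareno etc.
targetPct = diag(C)' ./ sum(C, 2)' * 100;

fprintf('Recognized (%%): %6.2f %6.2f %6.2f\n', recognizedPct);
fprintf('Target     (%%): %6.2f %6.2f %6.2f\n', targetPct);
```

With these counts every column sums to 50 while the rows sum to 50, 53, and 47, so the two percentages agree for cricket and differ for the other two classes. That difference is why the script plots both series per class.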
🎉3 References
[1] Miloš Simonović, Marko Kovandžić, Ivan Ćirić, Vlastimir Nikolić (2021). Acoustic Recognition of Noise-like Environmental Sounds by Using Artificial Neural Network.
[2] Zhang Mingxin, Zhang Dongbin, Ni Hong. Acoustic models and decoding strategies for noise-robust speech recognition [J]. 电声技术 (Audio Engineering), 2006(06): 40-43. DOI: 10.16311/j.audioe.2006.06.011.