Running the latest version of Paraformer as described in the official documentation raises an error

The code I ran:

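from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks  # imported as in the docs; not used below

# Build the speech recognition pipeline for the Paraformer-large (VAD + punctuation) model.
p = pipeline('auto-speech-recognition', 'damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch')

# Run recognition on a local .wav file and print the result.
rec_result = p(audio_in='zh_test.wav')

print(rec_result)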

Error message:

/opt/conda/lib/python3.7/site-packages/torchaudio/compliance/kaldi.py in _get_window(waveform, padded_window_size, window_size, window_shift, window_type, blackman_coeff, snip_edges, raw_energy, energy_floor, dither, remove_dc_offset, preemphasis_coefficient)
    175 
    176     # size (m, window_size)
--> 177     strided_input = _get_strided(waveform, window_size, window_shift, snip_edges)
    178 
    179     if dither != 0.0:

/opt/conda/lib/python3.7/site-packages/torchaudio/compliance/kaldi.py in _get_strided(waveform, window_size, window_shift, snip_edges)
     57         Tensor: 2D tensor of size (m, ``window_size``) where each row is a frame
     58     """
---> 59     assert waveform.dim() == 1
     60     num_samples = waveform.size(0)
     61     strides = (window_shift * waveform.stride(0), waveform.stride(0))

AssertionError: 

little_ant0 2023-02-10 10:35:43
1 answer
  • Solved: the .wav file has to be 16 kHz.

    2023-02-10 14:51:51
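
    For anyone hitting the same assertion, a quick way to check a file before passing it to the pipeline might look like the sketch below. It uses only Python's standard `wave` module, so it handles plain PCM .wav files only. The helper names `describe_wav` and `is_16k_mono` are made up for illustration, and the mono check is an assumption prompted by the `waveform.dim() == 1` assertion in the traceback; the answer above only mentions the 16 kHz requirement.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, num_channels, sample_width_bytes) of a PCM .wav file."""
    # wave.open raises wave.Error if the file is not an uncompressed PCM WAV.
    with wave.open(path, "rb") as wf:
        return wf.getframerate(), wf.getnchannels(), wf.getsampwidth()


def is_16k_mono(path):
    """True if the file is sampled at 16 kHz and has a single channel.

    The 16 kHz requirement comes from the answer in this thread; the
    single-channel requirement is an assumption, not confirmed by it.
    """
    rate, channels, _ = describe_wav(path)
    return rate == 16000 and channels == 1
```

    Calling `describe_wav('zh_test.wav')` on the file from the question would show its actual sample rate and channel count. If it is not 16 kHz, resampling it first (for example with `ffmpeg -i zh_test.wav -ar 16000 -ac 1 zh_test_16k.wav`, where the output file name is arbitrary) should be enough to try again.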