Share | Testing Speech Recognition in OpenCV 4.5.4 (with Detailed Steps)

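Overview

This article shows how to run (and verify) the speech recognition sample that ships with OpenCV 4.5.4, along with a few things to watch out for.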

Background

The DNN module in OpenCV 4.5.4 adds support for speech recognition. This article uses the Python version of the sample to verify it.

Usage

Location of the Python sample in the OpenCV package: OpenCV4.5.4_Release\opencv\sources\samples\dnn\speech_recognition.py

Steps:

[1] Download the speech recognition model:

https://drive.google.com/drive/folders/1wLtxyao4ItAg8tt4Sb63zt6qXzhcQoR6

Download the model file jasper_reshape.onnx, rename it to jasper.onnx, and put it in the same directory as the .py file.
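The rename step can also be scripted. The following is only a minimal sketch: it assumes the downloaded file is called jasper_reshape.onnx and sits in the same directory as speech_recognition.py, and the helper name prepare_model is made up for illustration (it is not part of the OpenCV sample).

```python
from pathlib import Path


def prepare_model(script_dir, downloaded="jasper_reshape.onnx", target="jasper.onnx"):
    """Rename the downloaded model to the file name used in this article.

    Hypothetical helper, not part of the OpenCV sample.
    Returns the path of the renamed model, or raises FileNotFoundError
    if neither the downloaded file nor the renamed file is present.
    """
    script_dir = Path(script_dir)
    src = script_dir / downloaded
    dst = script_dir / target
    if dst.exists():
        return dst  # already renamed, nothing to do
    if not src.exists():
        raise FileNotFoundError(f"expected {src}; download the model first")
    src.rename(dst)
    return dst


# Usage (run from the directory that holds speech_recognition.py):
#   prepare_model(".")
```

Calling it a second time is harmless, because an existing jasper.onnx is returned as-is.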

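[2] Download the test audio:

From the download folder above, get audio6.flac and audio10.flac. In initial testing the program did not accept MP3 audio, so MP3 files have to be converted to FLAC or WAV first. Other formats have not been tried yet.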
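Since MP3 input failed in this test, such files need converting before use. The sketch below builds an ffmpeg command for that. Note that ffmpeg is an assumption here (the test above does not say which converter was used) and has to be installed separately; the optional resampling and mono settings are likewise illustrative, not a documented requirement of the sample.

```python
import subprocess
from pathlib import Path


def build_convert_cmd(src, dst_ext=".wav", sample_rate=None, mono=False):
    """Build an ffmpeg command that converts src to WAV or FLAC.

    ffmpeg is an assumed external tool, not something the article used.
    The output file is written next to src with the new extension.
    """
    if dst_ext not in (".wav", ".flac"):
        raise ValueError("only .wav and .flac were tested in this article")
    dst = Path(src).with_suffix(dst_ext)
    cmd = ["ffmpeg", "-y", "-i", str(src)]  # -y: overwrite existing output
    if sample_rate is not None:
        cmd += ["-ar", str(sample_rate)]    # resample to the given rate
    if mono:
        cmd += ["-ac", "1"]                 # downmix to a single channel
    cmd.append(str(dst))
    return cmd


def convert(src, **kwargs):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_convert_cmd(src, **kwargs), check=True)


# Usage:
#   convert("clip.mp3")                      # writes clip.wav
#   convert("clip.mp3", dst_ext=".flac")     # writes clip.flac
```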

[3] Install the soundfile package:

Running pip install soundfile is all that is needed.
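Before running the sample, it can help to confirm that the required packages are importable. The snippet below is a small sketch using only the standard library; soundfile comes from the step above, while cv2 and numpy are listed on the assumption that the sample imports them as well.

```python
import importlib.util


def missing_packages(names=("cv2", "soundfile", "numpy")):
    """Return the top-level import names from `names` that cannot be found.

    The default list is an assumption about what the sample needs.
    """
    return [n for n in names if importlib.util.find_spec(n) is None]


# Usage:
#   missing = missing_packages()
#   if missing:
#       print("please install:", ", ".join(missing))
```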

[4] Run from the command line (cmd):

python speech_recognition.py --input_audio=./audio/audio6.flac
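To run the sample on several files in a row, the same command can be driven from a short script. This is just a sketch: the option is written here with two ASCII hyphens (--input_audio), the script is assumed to be in the current directory, and the helper names are made up for illustration.

```python
import subprocess
import sys


def build_run_cmd(audio_path, script="speech_recognition.py"):
    """Build the command line used in this article for one audio file."""
    return [sys.executable, script, f"--input_audio={audio_path}"]


def run_all(audio_paths):
    """Run the sample once per file; stops at the first failing run."""
    for path in audio_paths:
        subprocess.run(build_run_cmd(path), check=True)


# Usage:
#   run_all(["./audio/audio6.flac", "./audio/audio10.flac"])
```

Using sys.executable keeps the runs on the same Python interpreter (and therefore the same installed packages) as the calling script.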

audio6.flac (audio length 00:11)

Recognition result for audio6.flac:

Predicting...

Audio file 1/1
['an american instead of going in a leisure hour to dance merrily at some place of public resort as the fellows of his calling continued to do throughout the greater part of europe shuts himself up at home to drink']

audio10.flac (audio length 00:27)

Recognition result for audio10.flac:

Predicting...

Audio file 1/1
['she opened the door softly there sat missus wilson in the old rocking chair with one sick death like boy lying on her knee crying without let or pause but softly gently as fearing to disturb the troubled gasping child while behind her old alice let her fast dropping tears fall down on the dead body of the other twin which she was laying out on a board placed on a sort of sofa settee in the corner of the room']

The recognition results for both clips above are good. Note that this model does not support Chinese speech. Next, two other English audio clips were tried:

First clip: https://www.tingclass.net/show-5406-3632-1.html

python speech_recognition.py --input_audio=./audio/CET4.wav

Recognition result:

Predicting...

Audio file 1/1
['o hom m bell amo hn haha am o waa iha  me howa e al ru e  hi hera morbo ao ha yur you move fore hung mo by wholl hab your hu mo ah  miseur luuel u lonlur wole olla iwer home all  bou o how bu olur aa men he ul um aha ol a oh a he notn ol all hole ar rule sa mer peaile hall her orha ah be a hen hom all murn a bown lok ano gerl orhehan or holy mule i ea the lol and theyn whole mon wingle all form ']

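Well, this is far from what is actually said in the audio, and many of the "words" in the output are not recognizable at all.

Trying another clip: https://m.kekenet.com/Article/201504/369129.shtml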

python speech_recognition.py --input_audio=./audio/english.wav

Recognition result:

Predicting...

Audio file 1/1
[" shakish am am shut shash an shi hang ca iunkun usha y oru u warm room  wo o emon o  chjonnoe e  ah wo an o a hush e i've o ask rule ur o sqawe grewh ula u ho a o ah"]

The recognition result for this clip is still very poor.
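To put a number on how far off a result is, a common metric is word error rate (WER): the word-level edit distance between a reference transcript and the recognized text, divided by the number of reference words. The test above did not compute this, and reference transcripts for these two clips are not included here, so the sketch below is only an illustration of how such a comparison could be done.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate between a reference transcript and recognizer output.

    WER = (substitutions + deletions + insertions) / number of reference words.
    Both inputs are plain strings, split on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            cur.append(min(prev[j] + 1,          # deletion of a reference word
                           cur[j - 1] + 1,       # insertion of an extra word
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1] / len(ref)


# Usage (strings are illustrative only):
#   word_error_rate("an american instead of going", "an american instead of going")
#   word_error_rate("an american instead of going", "o hom m bell amo")
```

A WER of 0 means a perfect match; values at or above 1 mean the output has at least as many errors as the reference has words.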

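A preliminary analysis suggests that the audio the model was trained on differs considerably from the audio tested here, so getting good recognition results would probably require training a model yourself. The sample code speech_recognition.py also contains the download address for the pretrained model, so feel free to try it if you are interested. Any new developments on this topic will be shared in a later post.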
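One concrete difference that is easy to check is the audio format itself. The sketch below reads the sample rate, channel count, and duration of a WAV file with Python's standard wave module. The 16 kHz mono target is an assumption (Jasper models are commonly trained on 16 kHz LibriSpeech audio); the test above does not state what the model expects or what format CET4.wav and english.wav were in, so a mismatch is only one possible explanation.

```python
import wave


def wav_info(path):
    """Return (sample_rate_hz, channels, duration_seconds) of a PCM WAV file.

    The standard wave module only reads uncompressed PCM WAV files.
    """
    with wave.open(str(path), "rb") as wf:
        rate = wf.getframerate()
        channels = wf.getnchannels()
        duration = wf.getnframes() / float(rate)
    return rate, channels, duration


def differs_from_assumed_format(path, rate=16000, channels=1):
    """True if the file does not match the ASSUMED 16 kHz mono format."""
    r, c, _ = wav_info(path)
    return (r, c) != (rate, channels)


# Usage:
#   print(wav_info("./audio/CET4.wav"))
#   print(differs_from_assumed_format("./audio/CET4.wav"))
```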

Original: https://blog.csdn.net/stq054188/article/details/121981613
Author: Color Space
Title: 分享 | OpenCV4.5.4 语音识别使用测试(含详细步骤)
