- This document describes how to run the Espnet (v1) ASR demo and how to quantize its model.
- Test environment:

| Ubuntu | CUDA | GCC |
| --- | --- | --- |
| 21.04 | 11.6 | 11.2 |

Note: Please follow the original installation guide provided by Espnet. Only the points noted below need special attention.
Requirements

| sox | sndfile | ffmpeg | flac |
| --- | --- | --- | --- |
| installed | installed | not installed | not installed |

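The requirement status listed above can be re-checked on another machine with a short script. This is a minimal sketch, not part of the Espnet guide: `check_tools` is a helper name introduced here, and it assumes that `sox`, `ffmpeg` and `flac` are command-line programs on `PATH`, while `sndfile` is a shared library and is therefore looked up separately.

```python
import ctypes.util
import shutil


def check_tools(names):
    """Return {name: True/False} depending on whether each executable is on PATH."""
    return {name: shutil.which(name) is not None for name in names}


# Command-line tools from the requirements list
status = check_tools(["sox", "ffmpeg", "flac"])

# sndfile is a shared library (libsndfile), not an executable
status["sndfile"] = ctypes.util.find_library("sndfile") is not None

for name, found in status.items():
    print(f"{name}: {'installed' if found else 'not installed'}")
```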
Install Kaldi
- The Kaldi installation has two parts: 1. tools installation and 2. src installation. Make sure to install both, in that order.
- Once installed, many `.o` binary files can be found in directories such as `<kaldi-root>/{featbin,fgmmbin,fstbin,etc.}`
Install Espnet
- Kaldi should be linked into `<espnet>/tools` (check the guide).
- "Option A) Setup Anaconda environment" is chosen in this document, so a virtual environment `espnet` is created with `python==3.8`.
- Since the current CUDA version is 11.6, which is not compatible with PyTorch 1.10.1, `espnet` should be installed with `$ make TH_VERSION=1.10.1 CUDA_VERSION=11.3`, which specifies the PyTorch and CUDA versions.
- The custom tools in "[Optional] Custom tool installation" are not installed.
- Install chainer in the `espnet` conda environment with `pip install chainer==6.0.0` (`cupy` is not installed due to some errors).
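
To confirm which versions ended up in the `espnet` environment after these steps, a small check like the following can be run inside it. This is a minimal sketch: `installed_version` is a helper name introduced here, not an Espnet utility, and the package names are assumed to match their pip distribution names.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Packages mentioned in the installation notes above
for name in ("torch", "chainer", "cupy"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

With the setup described here, `torch` would be expected to report a 1.10.1 build and `cupy` to be reported as not installed.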
This demo decodes (transcribes) a `.wav` audio file into words.
Notes: some
To quantize the model from FP32 to INT8
Espnet provides dynamic quantization through the PyTorch API.
To enable dynamic quantization, add the following lines to the `espnet/utils/recog_wav.sh` file at lines 248-249:
--quantize-asr-model True \
--quantize-dtype "qint8" \
Now decoding can be performed as described in the previous section.
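
For reference, PyTorch's dynamic quantization can also be tried on its own, outside Espnet. The sketch below applies `torch.quantization.quantize_dynamic` to a small stand-in model. `TinyModel` and its layer sizes are hypothetical and are not part of Espnet, and the choice of `LSTM` and `Linear` as the layer types to quantize is an assumption made for illustration. It only shows what FP32-to-INT8 dynamic quantization with `qint8` looks like at the PyTorch level.

```python
import torch


class TinyModel(torch.nn.Module):
    """Hypothetical stand-in for an ASR model; not an Espnet class."""

    def __init__(self):
        super().__init__()
        self.lstm = torch.nn.LSTM(input_size=16, hidden_size=32, batch_first=True)
        self.out = torch.nn.Linear(32, 8)

    def forward(self, x):
        h, _ = self.lstm(x)
        return self.out(h)


model = TinyModel().eval()

# Swap the listed layer types for dynamically quantized (INT8-weight) versions.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.LSTM, torch.nn.Linear}, dtype=torch.qint8
)

# Run the same input through both models: (batch, frames, features)
x = torch.randn(1, 5, 16)
with torch.no_grad():
    y_fp32 = model(x)
    y_int8 = quantized(x)

print(type(quantized.out).__name__, tuple(y_int8.shape))
```

Only the weights of the listed layer types are stored as INT8. Activations are quantized on the fly at inference time, so no calibration data is needed.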
Original: https://blog.csdn.net/GLinttsd/article/details/123933717
Author: GLinttsd
Title: Espnet ASR Demo & Quantization Document