Espnet ASR Demo & Quantization Document

  • This document describes how to run the Espnet (v1) ASR demo and how to quantize its model
  • Test environment:

Ubuntu   CUDA   GCC
21.04    11.6   11.2

Note: Follow the original installation guide provided by Espnet. The notes below cover only the points that need extra attention.

Requirements

sox         sndfile     ffmpeg          flac
installed   installed   not installed   not installed
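The status table above can be reproduced on another machine with a short check. The following is a minimal sketch, not part of Espnet: the function name `check_requirements` is hypothetical, and it assumes that sox, ffmpeg and flac are command-line tools on the PATH while sndfile refers to the libsndfile shared library.

```python
import shutil
from ctypes.util import find_library

# Command-line tools from the requirements table.
CLI_TOOLS = ["sox", "ffmpeg", "flac"]
# Assumption: "sndfile" means the libsndfile shared library, not a command.
LIBRARIES = ["sndfile"]


def check_requirements(cli_tools=CLI_TOOLS, libraries=LIBRARIES):
    """Return a dict mapping each requirement to True (found) or False (missing)."""
    status = {}
    for tool in cli_tools:
        # shutil.which returns None when the executable is not on the PATH.
        status[tool] = shutil.which(tool) is not None
    for lib in libraries:
        # find_library returns None when the shared library cannot be located.
        status[lib] = find_library(lib) is not None
    return status


if __name__ == "__main__":
    for name, found in check_requirements().items():
        print(f"{name}: {'installed' if found else 'not installed'}")
```

A missing entry is not necessarily a problem: in the test environment above, ffmpeg and flac were not installed.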

Install Kaldi

  • The Kaldi installation has two parts: 1. tools installation 2. src installation. Make sure to install both, in that order
  • Once installed, many compiled .o files can be found in directories such as <kaldi-root>/{featbin,fgmmbin,fstbin,etc.}

Install Espnet

  • Kaldi should be linked into <espnet>/tools (check the guide)
  • Option A) Setup Anaconda environment is chosen in this document, so a virtual environment named espnet is created with python==3.8
  • The current CUDA version is 11.6, which is not compatible with pytorch 1.10.1, so espnet should be installed with $ make TH_VERSION=1.10.1 CUDA_VERSION=11.3, which specifies the pytorch and CUDA versions explicitly
  • The custom tools listed under [Optional] Custom tool installation are not installed
  • Install chainer in the espnet conda environment with pip install chainer==6.0.0 (cupy is not installed due to some errors)
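After the installation steps above, it can be useful to confirm that the espnet conda environment actually holds the versions named in this document. The following is a minimal sketch using only the Python standard library. The helper name `installed_version` is hypothetical, and the reported torch version string may carry a CUDA suffix depending on how it was built.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Versions this document expects inside the espnet conda environment.
    expected = {"torch": "1.10.1", "chainer": "6.0.0"}
    for package, wanted in expected.items():
        found = installed_version(package)
        print(f"{package}: expected {wanted}, found {found or 'not installed'}")
```

Run it with the espnet environment activated, otherwise it reports the packages of whichever Python interpreter is active.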

This demo decodes (transcribes) a .wav audio file into words

Notes: some

To quantize the model from FP32 to INT8

Espnet provides dynamic quantization through the PyTorch API.
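To give a feel for what converting FP32 values to INT8 involves, here is a minimal sketch of affine quantization arithmetic in plain Python. It is an illustration only, not Espnet's or PyTorch's implementation (in PyTorch, dynamic quantization is exposed as torch.quantization.quantize_dynamic). The function names and the choice of a single scale and zero point for the whole list are assumptions made for this example.

```python
QMIN, QMAX = -128, 127  # value range of a signed 8-bit integer (qint8)


def choose_qparams(values):
    """Pick a scale and zero point that map the value range onto [QMIN, QMAX]."""
    lo = min(min(values), 0.0)  # the range must contain 0 so that 0.0 is representable
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (QMAX - QMIN)
    if scale == 0.0:
        scale = 1.0  # all values are zero; any positive scale works
    zero_point = int(round(QMIN - lo / scale))
    zero_point = max(QMIN, min(QMAX, zero_point))
    return scale, zero_point


def quantize(values, scale, zero_point):
    """Map floats to int8 codes, clamping to the representable range."""
    return [max(QMIN, min(QMAX, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(codes, scale, zero_point):
    """Map int8 codes back to approximate float values."""
    return [(q - zero_point) * scale for q in codes]


if __name__ == "__main__":
    weights = [-1.0, 0.0, 0.5, 1.0]
    scale, zero_point = choose_qparams(weights)
    codes = quantize(weights, scale, zero_point)
    print("int8 codes:", codes)
    print("restored  :", dequantize(codes, scale, zero_point))
```

Each restored value differs from the original by at most about one quantization step (the scale), which is the precision traded for the smaller INT8 representation.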

To enable dynamic quantization, add the following lines to the espnet/utils/recog_wav.sh file at lines 248-249:

        --quantize-asr-model True \
        --quantize-dtype "qint8" \

Decoding can now be performed as described in the previous section.

Original: https://blog.csdn.net/GLinttsd/article/details/123933717
Author: GLinttsd
Title: Espnet ASR Demo & Quantization Document
