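Jinglianwen Technology (景联文科技): What Are the Application Scenarios for Speech Recognition Technology?

May 25, 2023, 5:44 AM • Artificial Intelligence • 64 views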

In recent years, industries worldwide have been affected by the COVID-19 pandemic, and more and more companies have committed to developing new technologies that support epidemic prevention and control. One example already on the market is the voice-controlled smart elevator, which combines speech recognition with the elevator control system so that passengers can ride without touching shared buttons, reducing the risk of contact transmission.

What is speech recognition technology?

The goal of speech recognition technology is to convert the lexical content of human speech into computer-readable input.

In principle, speech recognition lets a machine convert a speech signal into text through recognition, and then turn its understanding of that text into instructions. The aim is for the machine to "understand" what people are saying and respond accordingly.

At a high level, a speech recognition system consists of two parts: an acoustic recognition module and a language understanding module, which handle the mapping from speech to syllables and from syllables to words, respectively. In more detail, a continuous speech recognition system consists of four main components: feature extraction, an acoustic model, a language model, and a decoder.

Feature extraction removes the information in the speech signal that is useless for recognition, retains the key information that reflects the essential characteristics of the speech, processes it, and expresses it in a specific form for further processing.
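As a rough illustration of what "expressing the signal in a specific form" can look like, the sketch below cuts a signal into short overlapping frames and reduces each frame to two simple numbers: log energy and zero-crossing rate. The feature choice, the 0.97 pre-emphasis coefficient, and the frame sizes (400 samples with a hop of 160, i.e. 25 ms and 10 ms at an assumed 16 kHz sampling rate) are illustrative assumptions, not something the article specifies; production systems typically use richer features.

```python
import math

def pre_emphasis(samples, coeff=0.97):
    """Boost high frequencies: y[n] = x[n] - coeff * x[n-1]."""
    if not samples:
        return []
    return [samples[0]] + [
        samples[n] - coeff * samples[n - 1] for n in range(1, len(samples))
    ]

def split_frames(samples, frame_len, hop):
    """Cut the signal into overlapping frames; a trailing partial frame is dropped."""
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop
    return frames

def frame_features(frame):
    """Return (log energy, zero-crossing rate) for one frame."""
    energy = sum(x * x for x in frame)
    log_energy = math.log(energy + 1e-10)  # small offset avoids log(0) on silence
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    zcr = crossings / max(len(frame) - 1, 1)
    return log_energy, zcr

def extract_features(samples, frame_len=400, hop=160):
    """Turn raw samples into one (log energy, zero-crossing rate) pair per frame."""
    emphasized = pre_emphasis(samples)
    return [frame_features(f) for f in split_frames(emphasized, frame_len, hop)]

# Usage: 0.1 s of a 440 Hz tone at an assumed 16 kHz sampling rate.
sample_rate = 16000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(1600)]
features = extract_features(tone)
print(len(features), "frames")
```

Each frame becomes a small fixed-size vector, which is the kind of compact representation the later modeling stages consume instead of the raw waveform.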

The acoustic model can be understood as a model of the sound itself: it takes the speech input and outputs an acoustic representation.

A language model computes the probability that a sentence occurs. Put simply, it estimates how likely a sequence of words is to be a well-formed sentence.
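To make "the probability of a sentence" concrete, here is a minimal sketch of a bigram language model with add-one smoothing. The three-sentence training corpus, the function names, and the choice of a bigram model are all assumptions made for illustration; the article does not say which kind of language model is used.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count history words and word pairs, with sentence boundary markers."""
    contexts, bigrams = Counter(), Counter()
    vocab = {"</s>"}
    for sentence in sentences:
        words = sentence.split()
        vocab.update(words)
        tokens = ["<s>"] + words + ["</s>"]
        contexts.update(tokens[:-1])            # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, vocab

def sentence_log_prob(sentence, model):
    """Log probability of a sentence under the add-one smoothed bigram model."""
    contexts, bigrams, vocab = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        prob = (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))
        total += math.log(prob)
    return total

# Usage with a tiny, made-up training corpus.
model = train_bigram(["open the door", "close the door", "open the window"])
print(sentence_log_prob("open the door", model))
print(sentence_log_prob("door the open", model))
```

The word order seen in training scores higher than the scrambled one, which is the signal a recognizer uses to prefer plausible sentences over acoustically similar but unlikely ones.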

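The decoder is the component that carries out the recognition process itself: given the extracted features, it uses the acoustic model and the language model to search for the most likely text.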
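One common way to picture decoding is as a search for the best-scoring hidden sequence. The sketch below uses the Viterbi algorithm on a toy hidden Markov model; treating transition probabilities as a stand-in for the language model and emission probabilities as a stand-in for the acoustic model is a simplifying assumption, and the states, symbols, and numbers in `toy_model` are invented for illustration.

```python
import math

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_score, best_state_path) for the observation sequence."""
    # best[s] = (score of the best path ending in state s, that path)
    best = {
        s: (log_start[s] + log_emit[s][observations[0]], [s]) for s in states
    }
    for obs in observations[1:]:
        step = {}
        for s in states:
            prev_score, prev_path = max(
                ((best[p][0] + log_trans[p][s], best[p][1]) for p in states),
                key=lambda item: item[0],
            )
            step[s] = (prev_score + log_emit[s][obs], prev_path + [s])
        best = step
    return max(best.values(), key=lambda item: item[0])

def to_log(table):
    """Convert a dict (or dict of dicts) of probabilities to log probabilities."""
    return {
        k: to_log(v) if isinstance(v, dict) else math.log(v)
        for k, v in table.items()
    }

def toy_model():
    """A hypothetical two-state model over two quantized acoustic symbols."""
    states = ["s1", "s2"]
    start = {"s1": 0.6, "s2": 0.4}
    trans = {"s1": {"s1": 0.7, "s2": 0.3}, "s2": {"s1": 0.4, "s2": 0.6}}
    emit = {"s1": {"x": 0.9, "y": 0.1}, "s2": {"x": 0.2, "y": 0.8}}
    return states, to_log(start), to_log(trans), to_log(emit)

# Usage: decode a short sequence of observed symbols.
score, path = viterbi(["x", "x", "y"], *toy_model())
print(path, math.exp(score))
```

Working in log probabilities keeps the products of many small numbers from underflowing, which matters once sequences are longer than a few frames.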

In essence, speech recognition is a pattern recognition process: the unknown speech pattern is compared with known reference patterns, and the best-matching reference pattern is taken as the recognition result.
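The comparison against reference patterns can be sketched as template matching. The example below uses dynamic time warping (DTW) as the distance, so that utterances spoken at different speeds can still match; the one-dimensional feature sequences and the two templates are made-up toy data, and real systems would compare sequences of feature vectors instead.

```python
def dtw_distance(a, b):
    """DTW distance between two 1-D sequences, using absolute difference."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i items of a with the first j of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

def recognize(unknown, templates):
    """Return the label of the reference pattern closest to the unknown one."""
    return min(templates, key=lambda label: dtw_distance(unknown, templates[label]))

# Usage: a slower rendition of the rising pattern still matches "up".
templates = {"up": [1, 2, 3, 4], "down": [4, 3, 2, 1]}
print(recognize([1, 1, 2, 3, 3, 4], templates))
```

The table-filling step costs time proportional to the product of the two sequence lengths, which is why this style of matching is simple to implement but becomes expensive as vocabularies and utterances grow.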

Application scenarios for speech recognition technology

Voice input

Intelligent voice input is built on real-time speech recognition. It removes the obstacles posed by rare characters and pinyin, saves users input time, and improves the input experience.

Voice search

Speech recognition can be applied to voice search: the user speaks the query directly, in scenarios such as mobile search, web search, and in-car search. This frees up people's hands and makes searching more efficient.

Voice commands

Speech recognition can be used for voice commands: without any manual operation, the user issues commands to a device or application by voice to control how it runs. This suits major scenarios such as video websites and smart hardware.
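Once speech has been turned into text, issuing a command comes down to mapping that text onto an action. The sketch below shows one simple way to do this with phrase matching; the command table, the action names, and the `parse_command` function are hypothetical, and a real product would more likely use a dedicated intent-recognition component.

```python
import re

# Hypothetical mapping from spoken phrases to player actions.
COMMANDS = {
    "play": "PLAY",
    "pause": "PAUSE",
    "volume up": "VOLUME_UP",
    "volume down": "VOLUME_DOWN",
}

def parse_command(transcript):
    """Return the action for the first matching phrase, or None if nothing matches."""
    # Lowercase, replace punctuation with spaces, and collapse repeated whitespace.
    text = re.sub(r"[^a-z0-9 ]", " ", transcript.lower())
    text = " " + " ".join(text.split()) + " "
    # Try longer phrases first, and match whole words only so that
    # "display" does not trigger "play".
    for phrase in sorted(COMMANDS, key=len, reverse=True):
        if f" {phrase} " in text:
            return COMMANDS[phrase]
    return None

# Usage on text as it might come back from a recognizer.
print(parse_command("Please PAUSE the video."))
print(parse_command("turn the volume up"))
```

Returning `None` for unrecognized input lets the caller decide whether to ask the user to repeat the command rather than guessing.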

Social chat

Speech recognition can be applied to social chat: dictating a message and having it converted to text makes input faster. When it is inconvenient or impossible to play a voice message, the message can be converted to text and read instead. This covers a range of chat scenarios and makes things more convenient for users.

Games and entertainment

Speech recognition can be used in games and entertainment. During a game, the player's hands may not be free to type. Voice input converts speech into text, so users can see the chat content at a glance while they play, which meets their varied chat needs.

Subtitle generation

Speech recognition can be used for subtitle generation: it converts the speech in live and recorded video into text, so subtitles can be produced easily.
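As a small illustration of the last step, the sketch below formats timed transcript segments as SubRip (SRT) subtitle text. It assumes the recognizer has already produced `(start_seconds, end_seconds, text)` tuples; that input shape and the function names are assumptions made for this example.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def to_srt(segments):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        )
    # Blocks are separated by a blank line.
    return "\n".join(blocks)

# Usage with two made-up segments.
print(to_srt([(0.0, 1.5, "Hello, everyone."), (1.5, 4.0, "Welcome to the stream.")]))
```

Working in integer milliseconds avoids rounding drift when the timestamps are split into hours, minutes, and seconds.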

Meeting minutes

Speech recognition can be used to produce meeting minutes. Audio from meetings, court hearings, interviews, and similar settings is converted into text, and real-time recognition makes the text available promptly. This effectively reduces the cost of manual record-keeping and improves efficiency.

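The importance of data annotation for speech recognition technology

Among the classical speech recognition methods, algorithms based on dynamic time warping (DTW) are computationally expensive but technically simple and achieve high recognition accuracy. They were long a mainstream approach, although template matching of this kind is best suited to small-vocabulary tasks rather than large-vocabulary continuous recognition. Methods based on vector quantization (VQ), a non-parametric model, need little training data, little training and recognition time, and little working memory, and they work well in isolated-word recognition systems. Finally, methods based on the hidden Markov model (HMM), a parametric model, are mainly used in large-vocabulary recognition systems. They require more training data, longer training and recognition time, and more storage. Continuous HMMs are generally more expensive to compute than discrete HMMs, but they achieve a higher recognition rate.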
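The vector quantization approach can be sketched as follows: each word has its own codebook of representative feature vectors, and an utterance is assigned to the word whose codebook reproduces its feature vectors with the least average distortion. The two-dimensional feature vectors and the hand-written codebooks below are invented for illustration; codebook training (for example with a clustering algorithm) is left out.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))

def average_distortion(vectors, codebook):
    """Mean distance from each feature vector to its nearest codeword."""
    total = sum(min(squared_distance(v, c) for c in codebook) for v in vectors)
    return total / len(vectors)

def recognize_word(vectors, codebooks):
    """Pick the word whose codebook quantizes the utterance with least distortion."""
    return min(codebooks, key=lambda word: average_distortion(vectors, codebooks[word]))

# Usage with hypothetical, hand-written codebooks for two isolated words.
codebooks = {
    "yes": [(0.0, 0.0), (1.0, 1.0)],
    "no": [(5.0, 5.0), (6.0, 4.0)],
}
utterance = [(0.1, 0.0), (0.9, 1.1)]
print(recognize_word(utterance, codebooks))
```

Because recognition only needs nearest-codeword lookups, both the stored model and the per-utterance work stay small, which fits the low memory and time requirements described above.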

In recent years, applications of artificial intelligence have continued to develop. AI is realized mainly through machine learning, and deep learning in particular. In practice, most deep learning algorithms are trained in a supervised fashion, which makes them heavily dependent on the underlying training data. Speech recognition is one such AI technology: only with large volumes of high-quality data to improve the algorithm's accuracy can machine learning deliver its best results.

It is fair to say that data largely determines both the accuracy of the algorithm and how far speech recognition technology can be deployed in practice.

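Jinglianwen Technology provides a one-stop data solution for speech recognition technology

As a professional provider of foundational AI data services, Jinglianwen Technology has already collected datasets such as a 20-hour microphone-recorded RF noise dataset, a 1,000-speaker wake-word microphone speech dataset, and a 21,000-segment ASR speech transcription dataset. These can be supplied directly to algorithm vendors for algorithm research.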

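As a professional data collection and annotation company, Jinglianwen Technology also offers custom annotation services. It has built an advanced data annotation platform with mature annotation, review, and quality inspection processes. For speech projects it supports annotation methods such as speech segmentation, ASR speech transcription, speech emotion labeling, and voiceprint recognition annotation, providing data support for speech recognition technology.

In addition, Jinglianwen Technology has four annotation bases across China and a full-time annotation team of more than 900 people, making it the largest AI data service provider in the Yangtze River Delta region. We have a self-developed data annotation platform and a full range of annotation tools that can meet partners' data annotation needs across the board. The platform supports both on-premises deployment and SaaS, and clients can carry out online quality inspection and acceptance directly through the admin console. Jinglianwen Technology follows a dedicated-steward service model: every client is assigned a dedicated account manager and project manager, and projects are planned, started, and delivered ahead of schedule. For urgent requests we can also provide 24-hour overtime service, doing our best to give clients a high-quality, one-stop data solution.

Going forward, Jinglianwen Technology will continue to strengthen its AI infrastructure, keep building its enterprise-grade digital and intelligent capabilities, and continue to help AI applications reach real-world deployment.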

Original: https://blog.csdn.net/weixin_55551028/article/details/122502692
Author: Jinglianwen Technology (景联文科技)
Title: Jinglianwen Technology: What Are the Application Scenarios for Speech Recognition Technology?
