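Problem
iFLYTEK's long-form speech transcription only includes 5 free hours. If you want to try 50,000 minutes for free, see my other article.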
Recently I was watching some tutorials that had no subtitles, and I could not find matching subtitle files online either, which made them awkward to watch.
So I processed the video with Au (Adobe Audition) and extracted the audio track as a WAV file a little over 20 minutes long.
I then ran it through iFLYTEK's speech recognition API, which returns the text of each sentence together with its start and end time in the video.
Finally I wrote the result out in SRT format.
This way, subtitles can be added automatically to a video that has none.
That roughly covers my needs, so I am writing it down here.
Solution
Screenshots
Subtitle result in the video
The subtitles are added automatically by speech recognition.
Console output format
An SRT subtitle file is generated at the end.
How the SRT format works
As the screenshot shows, each entry starts with a sequence number, followed by the time range during which the subtitle is displayed, accurate to the millisecond, and then the text underneath, which can be Chinese, English, or both.
The sequence numbers normally increase in order. They do little for playback and mainly help translators keep track of entries while translating and reviewing, but the format requires them, so they cannot be left out.
For more detail, see this link, which is the reference I consulted: https://www.cnblogs.com/tocy/p/subtitle-format-srt.html
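To make the layout concrete, here is a minimal sketch that builds one SRT entry from a start and end time given in milliseconds. The helper names `ms_to_srt_time` and `srt_entry` are my own, chosen for illustration; they are not part of any library.

```python
def ms_to_srt_time(ms):
    """Format a millisecond offset as an SRT timestamp: HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def srt_entry(index, start_ms, end_ms, text):
    """Build one SRT entry: sequence number, time range, text, blank line."""
    time_range = f"{ms_to_srt_time(start_ms)} --> {ms_to_srt_time(end_ms)}"
    return f"{index}\n{time_range}\n{text}\n\n"


print(srt_entry(1, 0, 4950, "Hello, world"), end="")
```

The milliseconds field is always padded to three digits and the other fields to two, which is what players expect when they parse the time range.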
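iFLYTEK API wrapper for speech recognition
This can be copied and pasted as is. It is only a generic wrapper around the API, and the second script below calls it. The two .py files sit in the same folder.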
voice_get_text.py
import base64
import hashlib
import hmac
import json
import os
import time
import requests
lfasr_host = 'http://raasr.xfyun.cn/api'
api_prepare = '/prepare'
api_upload = '/upload'
api_merge = '/merge'
api_get_progress = '/getProgress'
api_get_result = '/getResult'
file_piece_sice = 10485760
lfasr_type = 0
has_participle = 'false'
has_seperate = 'true'
max_alternatives = 0
suid = ''
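class SliceIdGenerator:
    """Slice ID generator."""

    def __init__(self):
        # Starts one step before 'aaaaaaaaaa' ('`' precedes 'a' in ASCII),
        # so the first call to getNextSliceId() returns 'aaaaaaaaaa'.
        self.__ch = 'aaaaaaaaa`'

    def getNextSliceId(self):
        ch = self.__ch
        j = len(ch) - 1
        while j >= 0:
            cj = ch[j]
            if cj != 'z':
                ch = ch[:j] + chr(ord(cj) + 1) + ch[j + 1:]
                break
            else:
                # Carry over: reset this position to 'a' and move left
                ch = ch[:j] + 'a' + ch[j + 1:]
                j = j - 1
        self.__ch = ch
        return self.__ch

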
class RequestApi(object):
    def __init__(self, appid, secret_key, upload_file_path):
        self.appid = appid
        self.secret_key = secret_key
        self.upload_file_path = upload_file_path

    def gene_params(self, apiname, taskid=None, slice_id=None):
        """Build the form parameters for the given endpoint."""
        appid = self.appid
        secret_key = self.secret_key
        upload_file_path = self.upload_file_path
        ts = str(int(time.time()))
        # signa = base64(HMAC-SHA1(secret_key, md5_hex(appid + ts)))
        m2 = hashlib.md5()
        m2.update((appid + ts).encode('utf-8'))
        md5 = m2.hexdigest()
        md5 = bytes(md5, encoding='utf-8')
        signa = hmac.new(secret_key.encode('utf-8'), md5, hashlib.sha1).digest()
        signa = base64.b64encode(signa)
        signa = str(signa, 'utf-8')
        file_len = os.path.getsize(upload_file_path)
        file_name = os.path.basename(upload_file_path)
        param_dict = {}
        if apiname == api_prepare:
            # Number of slices of file_piece_sice bytes, rounded up
            slice_num = int(file_len / file_piece_sice) + (0 if (file_len % file_piece_sice == 0) else 1)
            param_dict['app_id'] = appid
            param_dict['signa'] = signa
            param_dict['ts'] = ts
            param_dict['file_len'] = str(file_len)
            param_dict['file_name'] = file_name
            param_dict['slice_num'] = str(slice_num)
        elif apiname == api_upload:
            param_dict['app_id'] = appid
            param_dict['signa'] = signa
            param_dict['ts'] = ts
            param_dict['task_id'] = taskid
            param_dict['slice_id'] = slice_id
        elif apiname == api_merge:
            param_dict['app_id'] = appid
            param_dict['signa'] = signa
            param_dict['ts'] = ts
            param_dict['task_id'] = taskid
            param_dict['file_name'] = file_name
        elif apiname == api_get_progress or apiname == api_get_result:
            param_dict['app_id'] = appid
            param_dict['signa'] = signa
            param_dict['ts'] = ts
            param_dict['task_id'] = taskid
        return param_dict

    def gene_request(self, apiname, data, files=None, headers=None):
        """POST to one endpoint and return the decoded JSON, or exit on failure."""
        response = requests.post(lfasr_host + apiname, data=data, files=files, headers=headers)
        result = json.loads(response.text)
        if result["ok"] == 0:
            print("{} success:".format(apiname) + str(result))
            return result
        else:
            print("{} error:".format(apiname) + str(result))
            # Stop with a non-zero exit code so the failure is visible
            raise SystemExit(1)

    def prepare_request(self):
        return self.gene_request(apiname=api_prepare,
                                 data=self.gene_params(api_prepare))

    def upload_request(self, taskid, upload_file_path):
        """Upload the file slice by slice."""
        file_object = open(upload_file_path, 'rb')
        try:
            index = 1
            sig = SliceIdGenerator()
            while True:
                content = file_object.read(file_piece_sice)
                if not content or len(content) == 0:
                    break
                files = {
                    "filename": self.gene_params(api_upload).get("slice_id"),
                    "content": content
                }
                response = self.gene_request(api_upload,
                                             data=self.gene_params(api_upload, taskid=taskid,
                                                                   slice_id=sig.getNextSliceId()),
                                             files=files)
                if response.get('ok') != 0:
                    print('upload slice fail, response: ' + str(response))
                    return False
                print('upload slice ' + str(index) + ' success')
                index += 1
        finally:
            print('file index:' + str(file_object.tell()))
            file_object.close()
        return True

    def merge_request(self, taskid):
        return self.gene_request(api_merge, data=self.gene_params(api_merge, taskid=taskid))

    def get_progress_request(self, taskid):
        return self.gene_request(api_get_progress, data=self.gene_params(api_get_progress, taskid=taskid))

    def get_result_request(self, taskid):
        return self.gene_request(api_get_result, data=self.gene_params(api_get_result, taskid=taskid))

    def all_api_request(self):
        """Run prepare, upload, merge, then poll until done and return the result."""
        pre_result = self.prepare_request()
        taskid = pre_result["data"]
        self.upload_request(taskid=taskid, upload_file_path=self.upload_file_path)
        self.merge_request(taskid=taskid)
        while True:
            progress = self.get_progress_request(taskid)
            progress_dic = progress
            if progress_dic['err_no'] != 0 and progress_dic['err_no'] != 26605:
                print('task error: ' + progress_dic['failed'])
                return
            else:
                data = progress_dic['data']
                task_status = json.loads(data)
                if task_status['status'] == 9:
                    print('task ' + taskid + ' finished')
                    break
                print('The task ' + taskid + ' is in processing, task status: ' + str(data))
            # Wait 20 seconds between progress checks
            time.sleep(20)
        result = self.get_result_request(taskid=taskid)
        return result
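The signing step inside `gene_params` can be isolated and checked without calling the API. The sketch below repeats the same steps the listing uses (MD5 of `appid + ts`, then HMAC-SHA1 keyed with the secret key, then Base64). The function name `make_signa` and the sample credentials are placeholders I made up for illustration.

```python
import base64
import hashlib
import hmac
import time


def make_signa(appid, secret_key, ts):
    """Return base64(HMAC-SHA1(secret_key, md5_hex(appid + ts)))."""
    md5_hex = hashlib.md5((appid + ts).encode("utf-8")).hexdigest()
    digest = hmac.new(secret_key.encode("utf-8"),
                      md5_hex.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("utf-8")


ts = str(int(time.time()))  # the timestamp is sent along with the signature
signa = make_signa("demo_appid", "demo_secret", ts)  # placeholder credentials
print(ts, signa)
```

Because the timestamp is part of the signed input, the signature changes from one second to the next, which is why the listing recomputes it for every request.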
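Processing the result into text
Fill in the app ID and key you obtained from iFLYTEK for the speech-to-text service. Running the script returns a very long dict string of recognition results, which you then convert to SRT format yourself. The script I wrote here already outputs SRT.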
video_to_txt.py
import datetime
import json

import voice_get_text


def get_format_time(time_long):
    """Convert a millisecond offset to an SRT timestamp, HH:MM:SS,mmm."""
    hours, rest = divmod(time_long, 3600 * 1000)
    minutes, rest = divmod(rest, 60 * 1000)
    seconds, millis = divmod(rest, 1000)
    return "%02d:%02d:%02d,%03d" % (hours, minutes, seconds, millis)


video_path = input("Audio file path: ").replace("\\", '/')
print("Processing, please wait...")
api = voice_get_text.RequestApi(appid="your app id", secret_key="your secret key",
                                upload_file_path=video_path)
myresult = api.all_api_request()
if myresult is None:
    # all_api_request() returns None when the task reports an error
    raise SystemExit("Recognition failed")

# myresult["data"] is expected to be a JSON string holding a list of sentence
# dicts with "bg", "ed" and "onebest". Parsing it with json keeps the first and
# last sentences and copes with commas or colons inside the recognized text.
segments = json.loads(myresult["data"])
myword = ""
for flag_num, get_dic in enumerate(segments, start=1):
    print(get_dic)
    bg = get_format_time(int(get_dic["bg"]))
    ed = get_format_time(int(get_dic["ed"]))
    real_word = get_dic["onebest"]
    # One SRT entry: sequence number, time range, text, then a blank line
    myword += str(flag_num) + "\n" + bg + " --> " + ed + "\n" + real_word + "\n\n"
print(myword)

nowTime_str = datetime.datetime.strftime(datetime.datetime.now(), '%Y-%m-%d %H-%M-%S')
path_file = r"C:\Users\Administrator.DESKTOP-KMH7HN6\Desktop\video_text\%s.srt" % (nowTime_str)
with open(path_file, 'w', encoding='utf-8') as f:
    f.write(myword)
print('Recognition finished; see the .srt file in the output directory')
input()
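The step that turns the API result into subtitle segments can be tried offline, without credentials. This sketch assumes the `data` field is a JSON string containing a list of objects with `bg` and `ed` (millisecond offsets, as strings) and `onebest` (the recognized text), which are the keys the script above reads. The sample payload and the name `parse_segments` are invented for illustration.

```python
import json


def parse_segments(data):
    """Turn the JSON string into a list of (start_ms, end_ms, text) tuples."""
    return [(int(item["bg"]), int(item["ed"]), item["onebest"])
            for item in json.loads(data)]


# Invented sample payload in the assumed shape
sample = json.dumps([
    {"bg": "0", "ed": "4950", "onebest": "Hello, everyone.", "speaker": "0"},
    {"bg": "4950", "ed": "9000", "onebest": "Today: subtitles, automatically.", "speaker": "0"},
])

for start_ms, end_ms, text in parse_segments(sample):
    print(start_ms, end_ms, text)
```

The second sample sentence contains both a comma and a colon. Splitting the raw string on those characters would cut the sentence apart, which is why `json.loads` is the safer way to read the result.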
Building a dict from two lists
This is a side snippet that can be ignored.
keys = ['a', 'b', 'c']
values = [1, 2, 3]
mydic = dict(zip(keys, values))
print(mydic)
Original: https://blog.csdn.net/lidashent/article/details/113987349
Author: lidashent
Title: python: automatically converting video audio into a timestamped SRT subtitle file via speech recognition