[BERT + BiLSTM + CRF] Named Entity Recognition, Part 2: Wrapping a Dataset and DataLoader for Batch Training

Last time I introduced a simple [BERT + BiLSTM + CRF] named-entity-recognition application that only ran on a single example. This post continues from there: the data is now wrapped in a Dataset and trained in batches. If you want the annotated dataset used in this project, send me a private message. The whole pipeline runs end to end without errors!

Project structure:

bert-base-chinese: the BERT model, vocab.txt and config.json
data: the annotated data
output: training logs and saved model files
dataSet.py: data preprocessing code
main.py: training and evaluation code

The code comes first; the explanation follows.

dataSet.py

from torch.utils.data import Dataset,DataLoader
from transformers import BertTokenizer
import torch
import warnings
import os
import json
import sys
import re
warnings.filterwarnings('ignore')

def collect_data(path, original_value, result_value, a, b, c, d, e, f):
    # Read one annotated JSON file and append its fields to the running lists.
    # a..f collect the six 'classify' sub-fields; a missing field becomes " ".
    with open(path, 'r', encoding='utf-8') as file:
        s = json.load(file)

        try:
            for i, k in enumerate(s):
                if k == 'originalValue':
                    original_value.append(s['originalValue'])
                if k == 'resultValue' and s['resultValue'] != '':
                    result_value.append(s['resultValue'])
                if k == 'classify':
                    classify_data = s[k]
                    a.append(classify_data.get('组织学分型', ' '))
                    b.append(classify_data.get('癌结节', ' '))
                    c.append(classify_data.get('两侧切缘是否有癌浸润', ' '))
                    d.append(classify_data.get('pCRM', ' '))
                    e.append(classify_data.get('脉管', ' '))
                    f.append(classify_data.get('神经', ' '))
        except Exception:
            print(f'Errors occurred at path : {path}, key : "{k}", with reasons : {sys.exc_info()}')
    return original_value, result_value, a, b, c, d, e, f
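
# For reference, each file under data/ is a JSON object shaped roughly like
# the sketch below. The keys are the ones collect_data reads; the values are
# invented placeholders:
#
# {
#   "originalValue": "...",
#   "resultValue": "...text annotated with [entity]/tag markers...",
#   "classify": {"组织学分型": "...", "癌结节": "...", "两侧切缘是否有癌浸润": "...",
#                "pCRM": "...", "脉管": "...", "神经": "..."}
# }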

def fun4Word(data):
    # Convert a list of annotated sentences into 'text //label' lines.
    # The per-sentence logic is identical to label_process (defined below),
    # so it is reused here instead of being duplicated.
    output = ''
    for i in data:
        word, label = label_process(i)
        if word != '':
            output += word + ' //' + label + '\n'
    return output

# Annotation markers as they appear in the data, mapped to the short names
# used in the B/M/E/W label scheme.
TAG2LABEL = {
    'aj_lcjl': 'ajl', 'aj_hzjl': 'ajh',
    'lbj_z': 'lbjz', 'lbj_y': 'lbjy', 'lbj_fz': 'lbjfz',
    'mlh1': 'mlh1', 'msh2': 'msh2', 'msh6': 'msh6',
    'pms2': 'pms2', 'ki67': 'ki67', 'p53': 'p53',
}

ENTITY_RE = re.compile(r'(\[[^\]]+\]/(?:' + '|'.join(TAG2LABEL) + r'))')

def bmew(name, length):
    # B/M/E tags for a multi-character entity, W for a single character.
    if length > 1:
        return f'B_{name} ' + (length - 2) * f'M_{name} ' + f'E_{name} '
    return f'W_{name} '

def label_process(data):
    # Split an annotated sentence into (plain text, per-character label string).
    word = ''
    label = ''
    for f in ENTITY_RE.split(data):
        m = re.fullmatch(r'\[([^\]]+)\]/(\w+)', f)
        if m is None or m.group(2) not in TAG2LABEL:
            # plain text between entities: every character is labelled O
            word += f
            label += len(f) * 'O '
            continue
        word_index, name = m.group(1), TAG2LABEL[m.group(2)]
        if name in ('ajl', 'ajh'):
            # distance measurements: a trailing 'cm' or 'c' unit is labelled O
            if 'cm' in word_index:
                entity_len, tail = len(word_index) - 2, 'O ' * 2
            elif 'c' in word_index:
                # (the original hand-expanded branch emitted one label too many here)
                entity_len, tail = len(word_index) - 1, 'O '
            else:
                entity_len, tail = len(word_index), ''
            label_index = bmew(name, entity_len) + tail
        else:
            label_index = bmew(name, len(word_index))
        word += word_index
        label += label_index
    return word, label
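
# Quick sanity check (the fragment is invented, but it uses the same
# [text]/tag markup as the real annotations):
#   word, label = label_process('肿瘤距上切缘[3cm]/aj_lcjl')
#   word  -> '肿瘤距上切缘3cm'
#   label -> 'O O O O O O W_ajl O O '   (the trailing 'cm' is labelled O)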

def my_collate(data):
    # Batch the tokenizer outputs and label lists into tensors. Each dataset
    # item is (BertTokenizer output, label id list); torch.tensor cannot stack
    # the tokenizer outputs directly, so only input_ids are stacked here for
    # the quick shape check in __main__ below.
    input_ids, labels = [], []
    for input, label in data:
        input_ids.append(input.input_ids.squeeze(0))
        labels.append(label)
    return torch.stack(input_ids), torch.tensor(labels)

class MyDataSet(Dataset):

    def __init__(self,max_length = 512):

        labels = ['B_lbjy', 'M_lbjy', 'E_lbjy', 'W_lbjy', 'B_lbjz', 'M_lbjz', 'E_lbjz', 'W_lbjz', 'B_lbjfz', 'M_lbjfz',
                  'E_lbjfz', 'W_lbjfz',
                  'B_ajl', 'M_ajl', 'E_ajl', 'W_ajl', 'B_ajh', 'M_ajh', 'E_ajh', 'W_ajh', 'B_mlh1', 'M_mlh1', 'E_mlh1',
                  'W_mlh1', 'B_msh2', 'M_msh2', 'E_msh2', 'W_msh2',
                  'B_msh6', 'M_msh6', 'E_msh6', 'W_msh6', 'B_pms2', 'M_pms2', 'E_pms2', 'W_pms2', 'B_ki67', 'M_ki67',
                  'E_ki67', 'W_ki67', 'B_p53', 'M_p53', 'E_p53', 'W_p53', 'O']
        self.tag_num = len(labels)
        original_value, result_value, a, b, c, d, e, f = [], [], [], [], [], [], [], []
        count = 0
        root = 'data'
        tokenizer = BertTokenizer.from_pretrained('bert-base-chinese')

        root_path = os.listdir(root)
        for path in root_path:
            father_path = os.path.join(root, path)
            child_paths = os.listdir(os.path.join(root, path))
            for child_path in child_paths:
                count += 1
                original_value, result_value, a, b, c, d, e, f = collect_data(os.path.join(father_path, child_path),
                                                                              original_value, result_value, a, b, c, d,
                                                                              e, f)
                # collect_data skips empty resultValue fields, so a shortfall
                # here means some file was missing one
                if len(result_value) != count: print(f'result_value null: count :{count}, path:{os.path.join(father_path, child_path)}')
        print(f'Data Collection Info : original_value : {len(original_value)} result_value : {len(result_value)} '
              f'a : {len(a)} b : {len(b)} c :{len(c)} d : {len(d)} e : {len(e)} f : {len(f)} final count : {count}')

        tokenized_data = []
        encoded_labels = []
        label2id = {k: v for v, k in enumerate(labels)}  # build the tag->id map once
        for i, sentence in enumerate(result_value):
            word, label = label_process(sentence)

            if len(word) > max_length:
                word = word[:max_length]

            # truncation=True guards against sequences that still exceed
            # max_length after wordpiece tokenization
            s = tokenizer.encode_plus(word, return_token_type_ids=True, return_attention_mask=True,
                                      return_tensors='pt', padding='max_length',
                                      truncation=True, max_length=max_length)
            tokenized_data.append(s)

            # pad / truncate the character-level label sequence to max_length
            label = label.strip().split(' ')
            if len(label) > max_length:
                label = label[:max_length]
            if len(label) < max_length:
                label += ['O'] * (max_length - len(label))

            encoded_label = [label2id[k] for k in label]
            encoded_labels.append(encoded_label)
            if s.input_ids.shape[1] > max_length or s.attention_mask.shape[1] > max_length or s.token_type_ids.shape[1] > max_length:
                print(f'len data:{s.input_ids.shape} {s.attention_mask.shape} {s.token_type_ids.shape} len label:{len(encoded_label)}')
        self.data = tokenized_data
        self.label = encoded_labels

    def __getitem__(self, index):
        return self.data[index],self.label[index]

    def __len__(self):
        return len(self.data)

if __name__ == '__main__':
    # quick smoke test: iterate the whole dataset once and print batch shapes
    dataset = MyDataSet()
    batch_count = 0
    data_loader = DataLoader(dataset=dataset, shuffle=False, batch_size=10, collate_fn=my_collate)
    for i, data in enumerate(data_loader):
        inputs, labels = data
        print(f'inputs_size:{inputs.shape}\t labels_size:{labels.shape}')
        batch_count += 1
    print(f'batch_count:{batch_count}')

main.py


'''
@author : sito
@date : 2022-02-25
@description:
Building a model (Bert+BiLSTM+CRF) to solve the problem of NER,
with a small amount of code, on top of transformers, torch and pytorch-crf.
'''
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
from dataSet import MyDataSet
from torch.utils.data import DataLoader
from transformers import BertModel
from torchcrf import CRF
import time
import warnings
import logging
import sys
warnings.filterwarnings('ignore')

logger = logging.getLogger('training log')
logger.setLevel(logging.INFO)

rf_handler = logging.StreamHandler(sys.stderr)
rf_handler.setLevel(logging.INFO)
rf_handler.setFormatter(logging.Formatter("%(asctime)s - %(name)s - %(message)s"))

f_handler = logging.FileHandler('output/training.log')
f_handler.setLevel(logging.INFO)
f_handler.setFormatter(logging.Formatter("%(asctime)s - %(levelname)s - %(filename)s[:%(lineno)d] - %(message)s"))
logger.addHandler(rf_handler)
logger.addHandler(f_handler)

def my_collate(data):
    # Unpack each (tokenizer output, label) pair and batch the three BERT
    # inputs and the labels into tensors on the training device.
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    input_ids, attention_mask, token_type_ids, labels = [], [], [], []
    for input, label in data:
        input_ids.append(input.input_ids.squeeze().tolist())
        attention_mask.append(input.attention_mask.squeeze().tolist())
        token_type_ids.append(input.token_type_ids.squeeze().tolist())
        labels.append(label)
    return {'input_ids': torch.tensor(input_ids).to(device), 'attention_mask': torch.tensor(attention_mask).to(device),
            'token_type_ids': torch.tensor(token_type_ids).to(device)}, torch.tensor(labels).to(device)

class Model(nn.Module):

    def __init__(self, tag_num):
        super().__init__()
        self.bert = BertModel.from_pretrained('bert-base-chinese')
        config = self.bert.config
        # BiLSTM on top of BERT; hidden_size//2 per direction keeps the
        # concatenated output at hidden_size.
        self.lstm = nn.LSTM(bidirectional=True, num_layers=2, input_size=config.hidden_size,
                            hidden_size=config.hidden_size // 2, batch_first=True)
        # batch_first=True so the CRF reads the (batch, seq, tags) emissions
        # the LSTM and fc produce; without it the CRF would treat the batch
        # dimension as time.
        self.crf = CRF(tag_num, batch_first=True)
        self.fc = nn.Linear(config.hidden_size, tag_num)

    def forward(self, x, y):
        # BERT is frozen here; only the BiLSTM, fc and CRF layers are trained.
        with torch.no_grad():
            bert_output = self.bert(input_ids=x['input_ids'], attention_mask=x['attention_mask'],
                                    token_type_ids=x['token_type_ids'])[0]
        lstm_output, _ = self.lstm(bert_output)
        fc_output = self.fc(lstm_output)
        # CRF.forward returns the log-likelihood, so negate it to get a loss
        loss = -self.crf(fc_output, y)
        tag = self.crf.decode(fc_output)
        return loss, tag
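
# For orientation, the tensor shapes through forward(), assuming a batch of 2,
# max_length 512 and the 45-tag label set (bert-base-chinese hidden_size is 768):
#   bert_output : (2, 512, 768)   last hidden state of bert-base-chinese
#   lstm_output : (2, 512, 768)   2 directions * (768 // 2) features
#   fc_output   : (2, 512, 45)    per-token emission scores for the CRF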

if __name__ == '__main__':

    epochs = 50
    max_length = 512
    batch_size = 64
    lr = 0.0001

    dataset = MyDataSet(max_length)
    tag_num = dataset.tag_num
    data_loader = DataLoader(dataset=dataset, shuffle=False, batch_size=batch_size, collate_fn=my_collate)

    logger.info(f'>>> Training Start!')
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    model = Model(tag_num).to(device)
    optimizer = optim.Adam(model.parameters(), lr=lr)
    scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50)
    for e in range(epochs):

        epoch_end_loss = 0
        model.train()
        for i, data in enumerate(data_loader):
            optimizer.zero_grad()
            inputs, labels = data
            # the model already returns -log_likelihood, so no abs() is needed
            loss, _ = model(inputs, labels)
            loss.backward()
            optimizer.step()
            epoch_end_loss = loss
            if i % 10 == 0:
                logger.info(f'>>> epoch {e} <<< step {i} : loss : {loss}')
        # step the cosine schedule once per epoch, so T_max=50 spans the 50 epochs
        scheduler.step()
        logger.info(f'{time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())} epoch {e} training loss : {epoch_end_loss}')

        if e % 10 == 0 and e != 0:
            model.eval()
            logger.info(f'{time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())} epoch {e} Start Evaluation!')
            step_end_accuracy = []
            with torch.no_grad():
                for i, data in enumerate(data_loader):
                    inputs, labels = data
                    _, tag = model(inputs, labels)
                    # with batch_first=True, decode already returns one
                    # max_length sequence per batch element
                    tag = np.array(tag)

                    for pre_y, real_y in zip(tag, labels):
                        assert pre_y.shape[0] == real_y.shape[0] == max_length, \
                            f'length not match pre_y.shape[0]:{pre_y.shape[0]} real_y.shape[0]:{real_y.shape[0]}  max_length:{max_length}'
                        # token-level accuracy over all positions, padding included
                        total = pre_y.shape[0]
                        real_y_numpy = real_y.cpu().numpy()
                        count = np.count_nonzero(pre_y == real_y_numpy)
                        step_end_accuracy.append(count / total)
            epoch_end_accuracy = np.mean(step_end_accuracy)
            logger.info(f'{time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())} epoch {e} evaluation accuracy : {epoch_end_accuracy}')

            torch.save(model.state_dict(), f'output/model_p_{epoch_end_accuracy}.pt')

Since the dataset is annotated, it is not convenient to post it publicly. Interested readers can send me a private message.

The training procedure is straightforward: 50 epochs in total, with an evaluation every 10 epochs and a model checkpoint saved each time. The optimizer is Adam, and the learning rate follows a cosine-annealing schedule: a larger learning rate helps at the start, while later on, once the model has absorbed more information, a smaller learning rate actually improves robustness and final performance. See the ALBERT paper for this and many other training tricks.
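
As a minimal standalone sketch (not part of main.py), this is how CosineAnnealingLR decays the learning rate when the scheduler is stepped once per epoch:

import torch
import torch.optim as optim

# stand-in parameter; in main.py this is model.parameters()
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = optim.Adam(params, lr=0.0001)
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50)

for epoch in range(50):
    optimizer.step()                              # the actual training step goes here
    scheduler.step()
    if epoch % 10 == 0:
        print(epoch, scheduler.get_last_lr())     # decays from 1e-4 toward 0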

Everyone is welcome to learn and exchange ideas together. I am interested in both computer vision and NLP, and I will keep posting useful articles from time to time, so feel free to follow me.

Finally, the training output.

(screenshot of the training log)
As you can see, by the 10th epoch the accuracy already reaches 0.96051. BERT really is impressive! (Note that this is token-level accuracy over all positions, padding included, so it reads high.)
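
For a stricter number, one option (a sketch, not in the original code) is to mask out the padded positions inside the evaluation loop, reusing the batch's attention_mask:

# inside the evaluation loop, after tag = np.array(tag)
mask = inputs['attention_mask'].cpu().numpy().astype(bool)        # (batch, 512)
correct = np.count_nonzero((tag == labels.cpu().numpy()) & mask)
masked_accuracy = correct / mask.sum()                            # padding excluded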

Original: https://blog.csdn.net/m0_37576959/article/details/123233758
Author: Sito_zz
Title: [BERT + BiLSTM + CRF] Named Entity Recognition, Part 2: Wrapping a Dataset and DataLoader for Batch Training
