Neural Network Language Model
1. Principles of the NNLM
1.1 Language model
- Let S denote a meaningful sentence composed of a sequence of words w_1, w_2, ..., w_n in a specific order, where n is the sentence length. Goal: compute P(S), the probability that S appears in the text (corpus). By the chain rule this factors as P(S) = P(w_1) P(w_2 | w_1) ... P(w_n | w_1, ..., w_{n-1}).
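The factorization above can be illustrated with a toy calculation (the probability values below are made up for demonstration and are not from any corpus):

```python
# The chain rule factors P(S) into conditional probabilities, which is
# exactly what a language model estimates:
# P(S) = P(w_1) * P(w_2 | w_1) * ... * P(w_n | w_1, ..., w_{n-1})
def sentence_probability(conditionals):
    """Multiply the P(w_i | w_1..w_{i-1}) terms to get P(S)."""
    p = 1.0
    for term in conditionals:
        p *= term
    return p

# P("i like dog") = P(i) * P(like | i) * P(dog | i, like), made-up values
p_s = sentence_probability([0.2, 0.1, 0.05])
print(p_s)  # roughly 0.001
```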
1.2 Neural network language model
- Start directly from the language model and turn the model-optimization process into the process of learning word-vector representations.
2. Network structure of the NNLM
2.1 Structure diagram
- The NNLM consists of an input layer, a projection layer, a hidden layer, and an output layer.
2.2 Computation process
- Given the preceding n-1 words, predict the probability of the n-th word.
2.3 Environment
python3.7
torch==1.8.0
2.4 Steps
Step 1: read the data

```python
import torch
import torch.nn as nn
import torch.optim as optim
import torch.utils.data as Data

def load_data():
    sentences = ['i like dog', 'i love coffee', 'i hate milk']
    word_list = " ".join(sentences).split()
    word_list = list(set(word_list))
    word_dict = {w: i for i, w in enumerate(word_list)}    # word -> index
    number_dict = {i: w for i, w in enumerate(word_list)}  # index -> word
    return word_dict, number_dict, sentences

word_dict, number_dict, sentences = load_data()
```
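As a quick sanity check, the vocabulary-building logic of load_data can be sketched standalone: the three sentences contain 7 unique words, and the two dicts invert each other.

```python
# Self-contained sketch mirroring load_data's vocabulary construction.
sentences = ['i like dog', 'i love coffee', 'i hate milk']
word_list = list(set(" ".join(sentences).split()))
word_dict = {w: i for i, w in enumerate(word_list)}    # word -> index
number_dict = {i: w for i, w in enumerate(word_list)}  # index -> word

print(len(word_dict))  # 7
print(all(number_dict[i] == w for w, i in word_dict.items()))  # True
```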
Step 2: implement the mini-batch builder

```python
def make_batch(sentences):
    input_batch = []
    target_batch = []
    for sen in sentences:
        word = sen.split()
        input = [word_dict[n] for n in word[:-1]]  # indices of the first n-1 words
        target = word_dict[word[-1]]               # index of the n-th word
        input_batch.append(input)
        target_batch.append(target)
    return input_batch, target_batch
```
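The structure make_batch produces can be checked with a self-contained sketch (the exact indices depend on set() ordering, so only the shapes are verified):

```python
# Each sentence yields one (context, target) pair: the first n-1 word
# indices as input and the last word's index as the label.
sentences = ['i like dog', 'i love coffee', 'i hate milk']
word_dict = {w: i for i, w in enumerate(set(" ".join(sentences).split()))}

input_batch, target_batch = [], []
for sen in sentences:
    words = sen.split()
    input_batch.append([word_dict[w] for w in words[:-1]])  # first n-1 words
    target_batch.append(word_dict[words[-1]])               # the n-th word

print([len(x) for x in input_batch])  # [2, 2, 2]
print(len(target_batch))              # 3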
Step 3: set hyperparameters and assemble the mini-batches

```python
dtype = torch.FloatTensor
n_class = len(word_dict)                # vocabulary size |V|
n_step = len(sentences[0].split()) - 1  # number of context words (n-1)
n_hidden = 2                            # hidden-layer size
m = 2                                   # word-vector dimension

input_batch, target_batch = make_batch(sentences)
input_batch = torch.LongTensor(input_batch)
target_batch = torch.LongTensor(target_batch)

dataset = Data.TensorDataset(input_batch, target_batch)
loader = Data.DataLoader(dataset=dataset, batch_size=16, shuffle=True)
```
Step 4: build the model

```python
class NNLM(nn.Module):
    def __init__(self):
        """
        C: word-embedding matrix of size |V| x m
        H: hidden-layer weight
        W: direct input-to-output weight
        d: hidden-layer bias
        U: output-layer weight
        b: output-layer bias
        1. Look up the n-1 input word indices in the embedding table and
           concatenate the vectors into a single (n-1)*m vector X.
        2. Feed X to the hidden layer: hidden = tanh(d + X * H),
           with shapes [3, 4] x [4, 2] in this example.
        3. The output layer has |V| nodes; node y_i gives the score that the
           next word is word i: y = b + X * W + hidden * U
        n_step: number of context words used to predict the next word (2 here)
        n_hidden: number of hidden-layer neurons
        m: word-vector dimension
        """
        super(NNLM, self).__init__()
        self.C = nn.Embedding(n_class, m)
        self.H = nn.Parameter(torch.randn(n_step * m, n_hidden).type(dtype))
        self.W = nn.Parameter(torch.randn(n_step * m, n_class).type(dtype))
        self.d = nn.Parameter(torch.randn(n_hidden).type(dtype))
        self.U = nn.Parameter(torch.randn(n_hidden, n_class).type(dtype))
        self.b = nn.Parameter(torch.randn(n_class).type(dtype))

    def forward(self, X):
        """
        X: [batch_size, n_step]
        """
        X = self.C(X)               # [batch_size, n_step, m]
        X = X.view(-1, n_step * m)  # [batch_size, n_step * m]
        hidden_out = torch.tanh(self.d + torch.mm(X, self.H))
        output = self.b + torch.mm(X, self.W) + torch.mm(hidden_out, self.U)
        return output
```
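The forward pass is just the Bengio-style formula y = b + XW + tanh(d + XH)U. A pure-NumPy sketch (an illustration with random weights, not the PyTorch model above) makes the shapes explicit:

```python
import numpy as np

# Dimensions matching the tutorial: |V| = 7, n-1 = 2 context words,
# embedding size m = 2, hidden size 2, batch of 3 sentences.
n_class, n_step, m, n_hidden, batch = 7, 2, 2, 2, 3
rng = np.random.default_rng(0)

C = rng.standard_normal((n_class, m))            # embedding table |V| x m
H = rng.standard_normal((n_step * m, n_hidden))  # hidden-layer weight
W = rng.standard_normal((n_step * m, n_class))   # direct input-to-output weight
U = rng.standard_normal((n_hidden, n_class))     # output-layer weight
d = rng.standard_normal(n_hidden)                # hidden-layer bias
b = rng.standard_normal(n_class)                 # output-layer bias

idx = rng.integers(0, n_class, size=(batch, n_step))  # word indices
X = C[idx].reshape(batch, n_step * m)                 # concat embeddings
hidden = np.tanh(d + X @ H)                           # [batch, n_hidden]
y = b + X @ W + hidden @ U                            # unnormalized scores
print(y.shape)  # (3, 7)
```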
Step 5: instantiate, train, and predict

```python
model = NNLM()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(5000):
    for batch_x, batch_y in loader:
        optimizer.zero_grad()
        output = model(batch_x)
        loss = criterion(output, batch_y)
        if (epoch + 1) % 1000 == 0:
            print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.6f}'.format(loss.item()))
        loss.backward()
        optimizer.step()

predict = model(input_batch).data.max(1, keepdim=True)[1]
print([sen.split()[:n_step] for sen in sentences], '->',
      [number_dict[n.item()] for n in predict.squeeze()])
```
2.5 Results
Original: https://blog.csdn.net/GUANGZHAN/article/details/121614095
Author: 智享AI
Title: 十二、神经网络语言模型