[TensorFlow] Cross-Entropy Loss and Weighted Cross-Entropy Loss

Preface

1. Basic Computation

When there are multiple classes, the cross-entropy loss function is usually used to measure how well the model performs, and it is also an important basis for adjusting the model parameters. The cross-entropy loss is defined as:

L = \frac{1}{N}\sum_{i} L_i = -\frac{1}{N}\sum_{i}\sum_{c=1}^{M} y_{ic}\log P_{ic}

where N is the number of samples, M the number of classes, y_{ic} is 1 if sample i belongs to class c and 0 otherwise, and P_{ic} is the predicted probability of class c for sample i.
Code:

import tensorflow as tf
import numpy as np

sess=tf.Session()

logits = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 10, 3],
                   [8, 2, 0],
                   [9, 6, 3]], dtype=np.float32)

labels = np.array([[0, 0, 1],
                   [0, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0],
                   [0, 1, 0]], dtype=np.float32)

# convert logits to per-class probabilities
softmax_out = tf.nn.softmax(logits)
print("softmax_out is::")
print(sess.run(softmax_out))

print("labels * tf.log(softmax_out) is::")
print(sess.run(labels * tf.log(softmax_out)))

print("cross_entropy1 is::")
# per-sample cross-entropy: -sum over classes of y_ic * log(P_ic)
cross_entropy1 = -tf.reduce_sum(labels * tf.log(softmax_out), axis=1)
print(sess.run(cross_entropy1))

Result:

softmax_out is::
[[9.0030573e-02 2.4472848e-01 6.6524094e-01]
 [9.0030573e-02 2.4472848e-01 6.6524094e-01]
 [4.7384717e-02 9.5174748e-01 8.6788135e-04]
 [9.9719369e-01 2.4717962e-03 3.3452120e-04]
 [9.5033026e-01 4.7314156e-02 2.3556333e-03]]
labels * tf.log(softmax_out) is::
[[-0.0000000e+00 -0.0000000e+00 -4.0760601e-01]
 [-0.0000000e+00 -0.0000000e+00 -4.0760601e-01]
 [-0.0000000e+00 -4.9455535e-02 -0.0000000e+00]
 [-2.8102510e-03 -0.0000000e+00 -0.0000000e+00]
 [-0.0000000e+00 -3.0509458e+00 -0.0000000e+00]]
cross_entropy1 is::
[4.0760601e-01 4.0760601e-01 4.9455535e-02 2.8102510e-03 3.0509458e+00]
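
As a quick sanity check, the same per-sample values can be reproduced with plain NumPy, without TensorFlow (a minimal sketch; the variable names are illustrative):

import numpy as np

logits = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 10, 3],
                   [8, 2, 0],
                   [9, 6, 3]], dtype=np.float32)
labels = np.array([[0, 0, 1],
                   [0, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0],
                   [0, 1, 0]], dtype=np.float32)

# softmax over the class axis
exp = np.exp(logits)
probs = exp / exp.sum(axis=1, keepdims=True)

# per-sample cross-entropy: -sum_c y_ic * log(P_ic)
print(-np.sum(labels * np.log(probs), axis=1))
# expected: roughly [0.4076 0.4076 0.0495 0.0028 3.0509]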

2. tf.nn.softmax_cross_entropy_with_logits and tf.nn.sparse_softmax_cross_entropy_with_logits

1. The two functions produce the same output; the difference lies in the format of the labels they take.
2. For sparse_softmax_cross_entropy_with_logits, labels has shape [batch_size] and each label is an integer in [0, num_classes-1], i.e. each sample's label is simply 0, 1, or 2.

3. For softmax_cross_entropy_with_logits, labels has shape [batch_size, num_classes], i.e. the one-hot form of the labels used by sparse_softmax_cross_entropy_with_logits.

Code:

import tensorflow as tf
import numpy as np

sess = tf.Session()

logits = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 10, 3],
                   [8, 2, 0],
                   [9, 6, 3]], dtype=np.float32)

print("cross_entropy2 is::")
cross_entropy2 = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels)
print(sess.run(cross_entropy2))

labels = np.array([[0, 0, 1],
                   [0, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0],
                   [0, 1, 0]], dtype=np.float32)

print("cross_entropy3 is::")
classes = tf.argmax(labels, axis=1)
cross_entropy3 = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=classes)
print(sess.run(cross_entropy3))

Result:

cross_entropy2 is::
[4.0760595e-01 4.0760595e-01 4.9455538e-02 2.8102214e-03 3.0509458e+00]
cross_entropy3 is::
[4.0760595e-01 4.0760595e-01 4.9455538e-02 2.8102214e-03 3.0509458e+00]
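
Note that cross_entropy2 and cross_entropy3 agree with cross_entropy1 from section 1 only up to the last few digits: the fused ops work directly on the logits (essentially in log-sum-exp form) instead of taking the log of an explicit softmax, which is numerically more stable. A minimal sketch of that direct computation, reusing logits, labels and sess from the snippet above:

# per-sample loss computed directly from the logits:
# -sum_c y_ic * (logit_ic - log(sum_k exp(logit_ik)))
log_softmax = logits - tf.reduce_logsumexp(logits, axis=1, keepdims=True)
print(sess.run(-tf.reduce_sum(labels * log_softmax, axis=1)))
# should be close to the per-sample values above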

3. tf.losses.softmax_cross_entropy and tf.losses.sparse_softmax_cross_entropy

1. These are mainly used to compute the loss over a batch of samples, and the loss value can be controlled through weights.
2. With the default weights=1, tf.losses.softmax_cross_entropy is equivalent to tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits).
3. When weights is a scalar w, it is equivalent to w * tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits).
4. When weights is a vector, each per-sample loss is multiplied by the corresponding sample weight before taking the mean.
5. Likewise, tf.losses.sparse_softmax_cross_entropy is equivalent to tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits), except that its labels are class indices rather than one-hot vectors (a sketch follows the results below).

Code:

import tensorflow as tf
import numpy as np
sess = tf.Session()

logits = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 10, 3],
                   [8, 2, 0],
                   [9, 6, 3]], dtype=np.float32)

labels = np.array([[0, 0, 1],
                   [0, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0],
                   [0, 1, 0]], dtype=np.float32)

cross1 = tf.nn.softmax_cross_entropy_with_logits(labels=labels,logits=logits)
cross2 = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits)
cross3 = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits, weights=0.3)

print("cross1 is::")
print (sess.run(cross1))
print("cross2 is::")
print (sess.run(cross2))
print("tf.reduce_mean(cross1) is::")
print (sess.run(tf.reduce_mean(cross1)))

print("cross3 is::")
print (sess.run(cross3))
print("0.3*tf.reduce_mean(cross1) is::")
print (sess.run(0.3*tf.reduce_mean(cross1)))

Result:

cross1 is::
[4.0760595e-01 4.0760595e-01 4.9455538e-02 2.8102214e-03 3.0509458e+00]
cross2 is::
0.7836847
tf.reduce_mean(cross1) is::
0.7836847
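
Point 5 above is not exercised by the snippet; a minimal sketch of the same equivalence with integer (non-one-hot) labels, reusing logits, labels and sess from above, would look like this:

# integer class indices, i.e. [2, 2, 1, 0, 1] for the one-hot labels above
sparse_labels = tf.argmax(labels, axis=1)

sparse1 = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=sparse_labels, logits=logits)
sparse2 = tf.losses.sparse_softmax_cross_entropy(labels=sparse_labels, logits=logits)

print(sess.run(tf.reduce_mean(sparse1)))  # expected ~0.7836847
print(sess.run(sparse2))                  # expected to match the line above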

4. Weighted Cross-Entropy Loss: Sample Weighting

tf.losses.softmax_cross_entropy can weight the loss: when each sample has its own weight, the loss can be weighted per sample.

import tensorflow as tf
import numpy as np
sess = tf.Session()

logits = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 10, 3],
                   [8, 2, 0],
                   [9, 6, 3]], dtype=np.float32)

labels = np.array([[0, 0, 1],
                   [0, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0],
                   [0, 1, 0]], dtype=np.float32)

# unweighted per-sample loss, used below for comparison
cross1 = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)
cross2 = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits)
cross3 = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits, weights=0.3)

print("cross2 is::")
print (sess.run(cross2))
print("cross3 is::")
print (sess.run(cross3))
print("0.3*tf.reduce_mean(cross1) is::")
print (sess.run(0.3*tf.reduce_mean(cross1)))

# per-sample weights: each sample's loss is multiplied by its weight before averaging
cross4 = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits, weights=[1, 2, 3, 4, 5])
print("cross4 is::", cross4)
print (sess.run(cross4))
print("sum", 4.0760595e-01 * 1 + 4.0760595e-01 * 2 + 4.9455538e-02 * 3 + 2.8102214e-03 * 4 + 3.0509458e+00 * 5)
print("average", (4.0760595e-01 * 1 + 4.0760595e-01 * 2 + 4.9455538e-02 * 3 + 2.8102214e-03 * 4 + 3.0509458e+00 * 5)/5.)

Result:

cross2 is::
0.7836847
cross3 is::
0.23510543
0.3*tf.reduce_mean(cross1) is::
0.23510541
cross4 is::
3.3274307
sum 16.6371543496
average 3.3274308699199997
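
One practical consequence (an assumption on my part, based on the default tf.losses reduction dividing by the number of non-zero weights) is that samples with weight 0 are dropped from both the numerator and the denominator. A minimal sketch, reusing logits, labels and sess from the snippet above:

# assumption: zero-weight samples are excluded from the averaging
cross5 = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits,
                                         weights=[1., 1., 0., 0., 1.])
print(sess.run(cross5))
# expected under that assumption:
# (0.40760595 + 0.40760595 + 3.0509458) / 3 ~= 1.2887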

5. Weighted Cross-Entropy Loss: Class Weighting

In practice, it may be that individual samples have no weights, but different classes do. For example, if mistakes on class 3 are especially undesirable, the loss weight of class 3 can be made larger than that of the other classes.

Classes we care about more can be given a higher weight (the higher the weight, the larger the loss contribution, and the better the model learns that class). For example:
If a class has few samples, it can be given a higher weight so that it is trained better.
If errors are not acceptable for a certain class, its data should be trained as well as possible, and its weight should be increased.


import tensorflow as tf
import numpy as np
sess = tf.Session()

logits = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 10, 3],
                   [8, 2, 0],
                   [9, 6, 3]], dtype=np.float32)

labels = np.array([[0, 0, 1],
                   [0, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0],
                   [0, 1, 0]], dtype=np.float32)

# one weight per class; the per-sample weight is the weight of that sample's true class
class_weights = tf.constant([[1.0, 1.5, 4.0]])
weights = tf.reduce_sum(class_weights * labels, axis=1)
print("weights is::", sess.run(weights))

cross1 = tf.nn.softmax_cross_entropy_with_logits(labels=labels,logits=logits)
print("cross1 is::")
print(sess.run(cross1))
weighted_losses = cross1 * weights
print("weighted_losses is::")
print(sess.run(weighted_losses))
print("tf.reduce_mean(weighted_losses) is::")
print(sess.run(tf.reduce_mean(weighted_losses)))

softmax_out=tf.nn.softmax(logits)
print("softmax_out is::")
print(sess.run(softmax_out))
print("labels * tf.log(softmax_out) is::")
print(sess.run(labels * tf.log(softmax_out)))

sum = 4.0760601e-01 * 4 + 4.0760601e-01 * 4 + 4.9455535e-02 * 1.5 + 2.8102510e-03 * 1 + 3.0509458e+00 * 1.5
print("sum/5 is::", sum/5)

Result:

weights is:: [4.  4.  1.5 1.  1.5]
cross1 is::
[4.0760595e-01 4.0760595e-01 4.9455538e-02 2.8102214e-03 3.0509458e+00]
weighted_losses is::
[1.6304238e+00 1.6304238e+00 7.4183308e-02 2.8102214e-03 4.5764189e+00]
tf.reduce_mean(weighted_losses) is::
1.582852
softmax_out is::
[[9.0030573e-02 2.4472848e-01 6.6524094e-01]
 [9.0030573e-02 2.4472848e-01 6.6524094e-01]
 [4.7384717e-02 9.5174748e-01 8.6788135e-04]
 [9.9719369e-01 2.4717962e-03 3.3452120e-04]
 [9.5033026e-01 4.7314156e-02 2.3556333e-03]]
labels * tf.log(softmax_out) is::
[[-0.0000000e+00 -0.0000000e+00 -4.0760601e-01]
 [-0.0000000e+00 -0.0000000e+00 -4.0760601e-01]
 [-0.0000000e+00 -4.9455535e-02 -0.0000000e+00]
 [-2.8102510e-03 -0.0000000e+00 -0.0000000e+00]
 [-0.0000000e+00 -3.0509458e+00 -0.0000000e+00]]
sum/5 is:: 1.5828520667
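
The same class weighting can also go through tf.losses.softmax_cross_entropy by passing the per-sample weights derived from class_weights. A minimal sketch, reusing the tensors above; assuming the default reduction averages over the five non-zero-weight samples, this should match tf.reduce_mean(weighted_losses):

# per-sample weight = weight of that sample's true class (computed above)
cross_weighted = tf.losses.softmax_cross_entropy(onehot_labels=labels,
                                                 logits=logits,
                                                 weights=weights)
print(sess.run(cross_weighted))   # expected ~1.582852 under the assumption above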

6. Weighted Cross-Entropy Loss: Class Transfer-Matrix Weighting

The previous section assigned a different weight to each class, but sometimes there is no fixed per-class weight. Instead, the cost depends on the kind of confusion: for example, confusing class 1 with class 2 may be more expensive than confusing class 1 with class 3. How can the loss be weighted in this case? For example, the prediction-vs-ground-truth weight matrix might be:

                    True class 1   True class 2   True class 3
Predicted class 1        1              4              2
Predicted class 2        5              1              1
Predicted class 3        4              3              1

Here w_{01} = 4 is the weight for predicting class 1 when the true class is 2, and w_{02} = 2 is the weight for predicting class 1 when the true class is 3.

Code:


import tensorflow as tf
import numpy as np
sess = tf.Session()

logits = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 10, 3],
                   [2, 8, 0],
                   [9, 6, 3]], dtype=np.float32)

labels = np.array([[0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0],
                   [0, 1, 0]], dtype=np.float32)
transfer_weights = tf.constant([[1.0, 4.0, 2.0],
                               [5.0, 1.0, 1.0],
                               [4.0, 3.0, 1.0]])

# mix each sample's logits with the transfer-weight matrix:
# weighted_logits[i][j] = sum_k logits[i][k] * transfer_weights[k][j]
weighted_logits = tf.matmul(logits, transfer_weights)
print("weighted_logits is::")
print(sess.run(weighted_logits))

softmax_out=tf.nn.softmax(weighted_logits)
print("softmax_out is::")
print(sess.run(softmax_out))
print("labels * tf.log(softmax_out) is::")
print(sess.run(labels * tf.log(softmax_out)))

cross1 = tf.nn.softmax_cross_entropy_with_logits(labels=labels,logits=weighted_logits)
cross2 = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=weighted_logits)
print ("cross1 is::", sess.run(cross1))
print ("cross2 is::", sess.run(cross2))

sum = 8.000336 + 34. + 22. + 0. + 0.6931472
print("average is::", sum/5.)

Result:

weighted_logits is::
[[23. 15.  7.]
 [53. 39. 19.]
 [69. 47. 27.]
 [42. 16. 12.]
 [51. 51. 27.]]
softmax_out is::
[[9.99664545e-01 3.35350080e-04 1.12497425e-07]
 [9.99999166e-01 8.31528041e-07 1.71390690e-15]
 [1.00000000e+00 2.78946810e-10 5.74952202e-19]
 [1.00000000e+00 5.10908893e-12 9.35762291e-14]
 [5.00000000e-01 5.00000000e-01 1.88756719e-11]]
labels * tf.log(softmax_out) is::
[[ -0.         -8.000336   -0.       ]
 [ -0.         -0.        -34.       ]
 [  0.        -22.         -0.       ]
 [  0.         -0.         -0.       ]
 [ -0.         -0.6931472  -0.       ]]
cross1 is:: [ 8.000336  34.        22.         0.         0.6931472]
cross2 is:: 12.938696
average is:: 12.93869664
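
To see what the matmul is doing, the first row of weighted_logits can be checked by hand: output column j mixes the sample's logits using column j of the transfer matrix. A minimal NumPy check:

# first sample has logits [1, 2, 3]
W = np.array([[1., 4., 2.],
              [5., 1., 1.],
              [4., 3., 1.]], dtype=np.float32)
print(np.array([1., 2., 3.], dtype=np.float32).dot(W))
# 1*1 + 2*5 + 3*4 = 23,  1*4 + 2*1 + 3*3 = 15,  1*2 + 2*1 + 3*1 = 7  ->  [23. 15.  7.]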

Original: https://blog.csdn.net/quiet_girl/article/details/119854970
Author: nana-li
Title: [TensorFlow] Cross-Entropy Loss and Weighted Cross-Entropy Loss
