[TensorFlow] Cross-Entropy Loss and Weighted Cross-Entropy Loss

Preface

1. Basic Computation

When there are multiple classes, the cross-entropy loss function is usually used to measure how well the model performs, and it is also an important basis for adjusting the model parameters. The cross-entropy loss is defined as:

L = \frac{1}{N}\sum_{i} L_i = -\frac{1}{N}\sum_{i}\sum_{c=1}^{M} y_{ic}\log P_{ic}

where N is the number of samples, M the number of classes, y_{ic} is 1 if sample i belongs to class c and 0 otherwise, and P_{ic} is the predicted probability of class c for sample i.
Code:

import tensorflow as tf
import numpy as np

sess=tf.Session()

logits = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 10, 3],
                   [8, 2, 0],
                   [9, 6, 3]], dtype=np.float32)

labels = np.array([[0, 0, 1],
                   [0, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0],
                   [0, 1, 0]], dtype=np.float32)

# convert logits to per-class probabilities
softmax_out = tf.nn.softmax(logits)
print("softmax_out is::")
print(sess.run(softmax_out))

print("labels * tf.log(softmax_out) is::")
print(sess.run(labels * tf.log(softmax_out)))

print("cross_entropy1 is::")
# per-sample cross-entropy: -sum over classes of y_ic * log(P_ic)
cross_entropy1 = -tf.reduce_sum(labels * tf.log(softmax_out), axis=1)
print(sess.run(cross_entropy1))

Result:

softmax_out is::
[[9.0030573e-02 2.4472848e-01 6.6524094e-01]
 [9.0030573e-02 2.4472848e-01 6.6524094e-01]
 [4.7384717e-02 9.5174748e-01 8.6788135e-04]
 [9.9719369e-01 2.4717962e-03 3.3452120e-04]
 [9.5033026e-01 4.7314156e-02 2.3556333e-03]]
labels * tf.log(softmax_out) is::
[[-0.0000000e+00 -0.0000000e+00 -4.0760601e-01]
 [-0.0000000e+00 -0.0000000e+00 -4.0760601e-01]
 [-0.0000000e+00 -4.9455535e-02 -0.0000000e+00]
 [-2.8102510e-03 -0.0000000e+00 -0.0000000e+00]
 [-0.0000000e+00 -3.0509458e+00 -0.0000000e+00]]
cross_entropy1 is::
[4.0760601e-01 4.0760601e-01 4.9455535e-02 2.8102510e-03 3.0509458e+00]
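
As a quick sanity check, the same per-sample values can be reproduced with plain NumPy, without TensorFlow (a minimal sketch; the variable names are illustrative):

import numpy as np

logits = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 10, 3],
                   [8, 2, 0],
                   [9, 6, 3]], dtype=np.float32)
labels = np.array([[0, 0, 1],
                   [0, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0],
                   [0, 1, 0]], dtype=np.float32)

# softmax over the class axis
exp = np.exp(logits)
probs = exp / exp.sum(axis=1, keepdims=True)

# per-sample cross-entropy: -sum_c y_ic * log(P_ic)
print(-np.sum(labels * np.log(probs), axis=1))
# expected: roughly [0.4076 0.4076 0.0495 0.0028 3.0509]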

2. tf.nn.softmax_cross_entropy_with_logits and tf.nn.sparse_softmax_cross_entropy_with_logits

1. The two functions produce the same output; the difference lies in the format of the labels they take.
2. For sparse_softmax_cross_entropy_with_logits, labels has shape [batch_size] and each label is an integer in [0, num_classes-1], i.e. each sample's label is simply 0, 1, or 2.

3. For softmax_cross_entropy_with_logits, labels has shape [batch_size, num_classes], i.e. the one-hot form of the labels used by sparse_softmax_cross_entropy_with_logits.

Code:

import tensorflow as tf
import numpy as np

sess = tf.Session()

logits = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 10, 3],
                   [8, 2, 0],
                   [9, 6, 3]], dtype=np.float32)

print("cross_entropy2 is::")
cross_entropy2 = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels)
print(sess.run(cross_entropy2))

labels = np.array([[0, 0, 1],
                   [0, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0],
                   [0, 1, 0]], dtype=np.float32)

print("cross_entropy3 is::")
classes = tf.argmax(labels, axis=1)
cross_entropy3 = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=classes)
print(sess.run(cross_entropy3))

Result:

cross_entropy2 is::
[4.0760595e-01 4.0760595e-01 4.9455538e-02 2.8102214e-03 3.0509458e+00]
cross_entropy3 is::
[4.0760595e-01 4.0760595e-01 4.9455538e-02 2.8102214e-03 3.0509458e+00]
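
Note that cross_entropy2 and cross_entropy3 agree with cross_entropy1 from section 1 only up to the last few digits: the fused ops work directly on the logits (essentially in log-sum-exp form) instead of taking the log of an explicit softmax, which is numerically more stable. A minimal sketch of that direct computation, reusing logits, labels and sess from the snippet above:

# per-sample loss computed directly from the logits:
# -sum_c y_ic * (logit_ic - log(sum_k exp(logit_ik)))
log_softmax = logits - tf.reduce_logsumexp(logits, axis=1, keepdims=True)
print(sess.run(-tf.reduce_sum(labels * log_softmax, axis=1)))
# should be close to the per-sample values above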

3. tf.losses.softmax_cross_entropy and tf.losses.sparse_softmax_cross_entropy

1. These are mainly used to compute the loss over a batch of samples, and the loss value can be controlled through weights.
2. With the default weights=1, tf.losses.softmax_cross_entropy is equivalent to tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits).
3. When weights is a scalar w, it is equivalent to w * tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits).
4. When weights is a vector, each per-sample loss is multiplied by the corresponding sample weight before taking the mean.
5. Likewise, tf.losses.sparse_softmax_cross_entropy is equivalent to tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits), except that its labels are class indices rather than one-hot vectors (a sketch follows the results below).

Code:

import tensorflow as tf
import numpy as np
sess = tf.Session()

logits = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 10, 3],
                   [8, 2, 0],
                   [9, 6, 3]], dtype=np.float32)

labels = np.array([[0, 0, 1],
                   [0, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0],
                   [0, 1, 0]], dtype=np.float32)

cross1 = tf.nn.softmax_cross_entropy_with_logits(labels=labels,logits=logits)
cross2 = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits)
cross3 = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits, weights=0.3)

print("cross1 is::")
print (sess.run(cross1))
print("cross2 is::")
print (sess.run(cross2))
print("tf.reduce_mean(cross1) is::")
print (sess.run(tf.reduce_mean(cross1)))

print("cross3 is::")
print (sess.run(cross3))
print("0.3*tf.reduce_mean(cross1) is::")
print (sess.run(0.3*tf.reduce_mean(cross1)))

Result:

cross1 is::
[4.0760595e-01 4.0760595e-01 4.9455538e-02 2.8102214e-03 3.0509458e+00]
cross2 is::
0.7836847
tf.reduce_mean(cross1) is::
0.7836847
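
Point 5 above is not exercised by the snippet; a minimal sketch of the same equivalence with integer (non-one-hot) labels, reusing logits, labels and sess from above, would look like this:

# integer class indices, i.e. [2, 2, 1, 0, 1] for the one-hot labels above
sparse_labels = tf.argmax(labels, axis=1)

sparse1 = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=sparse_labels, logits=logits)
sparse2 = tf.losses.sparse_softmax_cross_entropy(labels=sparse_labels, logits=logits)

print(sess.run(tf.reduce_mean(sparse1)))  # expected ~0.7836847
print(sess.run(sparse2))                  # expected to match the line above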

4. Weighted Cross-Entropy Loss: Sample Weighting

tf.losses.softmax_cross_entropy can weight the loss: when each sample has its own weight, the loss can be weighted per sample.

import tensorflow as tf
import numpy as np
sess = tf.Session()

logits = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 10, 3],
                   [8, 2, 0],
                   [9, 6, 3]], dtype=np.float32)

labels = np.array([[0, 0, 1],
                   [0, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0],
                   [0, 1, 0]], dtype=np.float32)

# unweighted per-sample loss, used below for comparison
cross1 = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)
cross2 = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits)
cross3 = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits, weights=0.3)

print("cross2 is::")
print (sess.run(cross2))
print("cross3 is::")
print (sess.run(cross3))
print("0.3*tf.reduce_mean(cross1) is::")
print (sess.run(0.3*tf.reduce_mean(cross1)))

# per-sample weights: each sample's loss is multiplied by its weight before averaging
cross4 = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits, weights=[1, 2, 3, 4, 5])
print("cross4 is::", cross4)
print (sess.run(cross4))
print("sum", 4.0760595e-01 * 1 + 4.0760595e-01 * 2 + 4.9455538e-02 * 3 + 2.8102214e-03 * 4 + 3.0509458e+00 * 5)
print("average", (4.0760595e-01 * 1 + 4.0760595e-01 * 2 + 4.9455538e-02 * 3 + 2.8102214e-03 * 4 + 3.0509458e+00 * 5)/5.)

Result:

cross2 is::
0.7836847
cross3 is::
0.23510543
0.3*tf.reduce_mean(cross1) is::
0.23510541
cross4 is::
3.3274307
sum 16.6371543496
average 3.3274308699199997
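
One practical consequence (an assumption on my part, based on the default tf.losses reduction dividing by the number of non-zero weights) is that samples with weight 0 are dropped from both the numerator and the denominator. A minimal sketch, reusing logits, labels and sess from the snippet above:

# assumption: zero-weight samples are excluded from the averaging
cross5 = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits,
                                         weights=[1., 1., 0., 0., 1.])
print(sess.run(cross5))
# expected under that assumption:
# (0.40760595 + 0.40760595 + 3.0509458) / 3 ~= 1.2887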

5. Weighted Cross-Entropy Loss: Class Weighting

In practice, it may be that individual samples have no weights, but different classes do. For example, if mistakes on class 3 are especially undesirable, the loss weight of class 3 can be made larger than that of the other classes.

Classes we care about more can be given a higher weight (the higher the weight, the larger the loss contribution, and the better the model learns that class). For example:
If a class has few samples, it can be given a higher weight so that it is trained better.
If errors are not acceptable for a certain class, its data should be trained as well as possible, and its weight should be increased.


import tensorflow as tf
import numpy as np
sess = tf.Session()

logits = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 10, 3],
                   [8, 2, 0],
                   [9, 6, 3]], dtype=np.float32)

labels = np.array([[0, 0, 1],
                   [0, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0],
                   [0, 1, 0]], dtype=np.float32)

# one weight per class; the per-sample weight is the weight of that sample's true class
class_weights = tf.constant([[1.0, 1.5, 4.0]])
weights = tf.reduce_sum(class_weights * labels, axis=1)
print("weights is::", sess.run(weights))

cross1 = tf.nn.softmax_cross_entropy_with_logits(labels=labels,logits=logits)
print("cross1 is::")
print(sess.run(cross1))
weighted_losses = cross1 * weights
print("weighted_losses is::")
print(sess.run(weighted_losses))
print("tf.reduce_mean(weighted_losses) is::")
print(sess.run(tf.reduce_mean(weighted_losses)))

softmax_out=tf.nn.softmax(logits)
print("softmax_out is::")
print(sess.run(softmax_out))
print("labels * tf.log(softmax_out) is::")
print(sess.run(labels * tf.log(softmax_out)))

sum = 4.0760601e-01 * 4 + 4.0760601e-01 * 4 + 4.9455535e-02 * 1.5 + 2.8102510e-03 * 1 + 3.0509458e+00 * 1.5
print("sum/5 is::", sum/5)

Result:

weights is:: [4.  4.  1.5 1.  1.5]
cross1 is::
[4.0760595e-01 4.0760595e-01 4.9455538e-02 2.8102214e-03 3.0509458e+00]
weighted_losses is::
[1.6304238e+00 1.6304238e+00 7.4183308e-02 2.8102214e-03 4.5764189e+00]
tf.reduce_mean(weighted_losses) is::
1.582852
softmax_out is::
[[9.0030573e-02 2.4472848e-01 6.6524094e-01]
 [9.0030573e-02 2.4472848e-01 6.6524094e-01]
 [4.7384717e-02 9.5174748e-01 8.6788135e-04]
 [9.9719369e-01 2.4717962e-03 3.3452120e-04]
 [9.5033026e-01 4.7314156e-02 2.3556333e-03]]
labels * tf.log(softmax_out) is::
[[-0.0000000e+00 -0.0000000e+00 -4.0760601e-01]
 [-0.0000000e+00 -0.0000000e+00 -4.0760601e-01]
 [-0.0000000e+00 -4.9455535e-02 -0.0000000e+00]
 [-2.8102510e-03 -0.0000000e+00 -0.0000000e+00]
 [-0.0000000e+00 -3.0509458e+00 -0.0000000e+00]]
sum/5 is:: 1.5828520667
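
The same class weighting can also go through tf.losses.softmax_cross_entropy by passing the per-sample weights derived from class_weights. A minimal sketch, reusing the tensors above; assuming the default reduction averages over the five non-zero-weight samples, this should match tf.reduce_mean(weighted_losses):

# per-sample weight = weight of that sample's true class (computed above)
cross_weighted = tf.losses.softmax_cross_entropy(onehot_labels=labels,
                                                 logits=logits,
                                                 weights=weights)
print(sess.run(cross_weighted))   # expected ~1.582852 under the assumption above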

6. Weighted Cross-Entropy Loss: Class Transfer-Matrix Weighting

The previous section assigned a different weight to each class, but sometimes there is no fixed per-class weight. Instead, the cost depends on the kind of confusion: for example, confusing class 1 with class 2 may be more expensive than confusing class 1 with class 3. How can the loss be weighted in this case? For example, the prediction-vs-ground-truth weight matrix might be:

                    True class 1   True class 2   True class 3
Predicted class 1        1              4              2
Predicted class 2        5              1              1
Predicted class 3        4              3              1

Here w_{01} = 4 is the weight for predicting class 1 when the true class is 2, and w_{02} = 2 is the weight for predicting class 1 when the true class is 3.

Code:


import tensorflow as tf
import numpy as np
sess = tf.Session()

logits = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 10, 3],
                   [2, 8, 0],
                   [9, 6, 3]], dtype=np.float32)

labels = np.array([[0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0],
                   [0, 1, 0]], dtype=np.float32)
transfer_weights = tf.constant([[1.0, 4.0, 2.0],
                               [5.0, 1.0, 1.0],
                               [4.0, 3.0, 1.0]])

# mix each sample's logits with the transfer-weight matrix:
# weighted_logits[i][j] = sum_k logits[i][k] * transfer_weights[k][j]
weighted_logits = tf.matmul(logits, transfer_weights)
print("weighted_logits is::")
print(sess.run(weighted_logits))

softmax_out=tf.nn.softmax(weighted_logits)
print("softmax_out is::")
print(sess.run(softmax_out))
print("labels * tf.log(softmax_out) is::")
print(sess.run(labels * tf.log(softmax_out)))

cross1 = tf.nn.softmax_cross_entropy_with_logits(labels=labels,logits=weighted_logits)
cross2 = tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=weighted_logits)
print ("cross1 is::", sess.run(cross1))
print ("cross2 is::", sess.run(cross2))

sum = 8.000336 + 34. + 22. + 0. + 0.6931472
print("average is::", sum/5.)

Result:

weighted_logits is::
[[23. 15.  7.]
 [53. 39. 19.]
 [69. 47. 27.]
 [42. 16. 12.]
 [51. 51. 27.]]
softmax_out is::
[[9.99664545e-01 3.35350080e-04 1.12497425e-07]
 [9.99999166e-01 8.31528041e-07 1.71390690e-15]
 [1.00000000e+00 2.78946810e-10 5.74952202e-19]
 [1.00000000e+00 5.10908893e-12 9.35762291e-14]
 [5.00000000e-01 5.00000000e-01 1.88756719e-11]]
labels * tf.log(softmax_out) is::
[[ -0.         -8.000336   -0.       ]
 [ -0.         -0.        -34.       ]
 [  0.        -22.         -0.       ]
 [  0.         -0.         -0.       ]
 [ -0.         -0.6931472  -0.       ]]
cross1 is:: [ 8.000336  34.        22.         0.         0.6931472]
cross2 is:: 12.938696
average is:: 12.93869664
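
To see what the matmul is doing, the first row of weighted_logits can be checked by hand: output column j mixes the sample's logits using column j of the transfer matrix. A minimal NumPy check:

# first sample has logits [1, 2, 3]
W = np.array([[1., 4., 2.],
              [5., 1., 1.],
              [4., 3., 1.]], dtype=np.float32)
print(np.array([1., 2., 3.], dtype=np.float32).dot(W))
# 1*1 + 2*5 + 3*4 = 23,  1*4 + 2*1 + 3*3 = 15,  1*2 + 2*1 + 3*1 = 7  ->  [23. 15.  7.]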

Original: https://blog.csdn.net/quiet_girl/article/details/119854970
Author: nana-li
Title: [TensorFlow] Cross-Entropy Loss and Weighted Cross-Entropy Loss
