PyTorch Deep Learning: Linear Algebra

A quantity that contains just one numerical value is called a scalar. A scalar is represented by a tensor with only one element:

import torch

x = torch.tensor(3.0)
y = torch.tensor(2.0)

x + y, x * y, x / y, x ** y
(tensor(5.), tensor(6.), tensor(1.5000), tensor(9.))

Think of a vector as a list of scalar values. We call these scalar values the elements (or components) of the vector. A vector is represented by a one-dimensional tensor. In general, a tensor can have arbitrary length, subject to the memory limits of the machine, and a subscript can be used to refer to any element of the vector:

$$\mathbf{x}=\left[\begin{array}{c}x_{1} \\ x_{2} \\ \vdots \\ x_{n}\end{array}\right]$$

x = torch.arange(4)
x, x[3]
(tensor([0, 1, 2, 3]), tensor(3))

Length, Dimensionality, and Shape

The length of a vector is often called the dimension of the vector. The length of a tensor can be obtained by calling Python's built-in len() function. When a vector is represented by a tensor (one with just a single axis), we can also access its length via the .shape attribute. The shape is a tuple that lists the length (dimensionality) along each axis of the tensor. For a tensor with just one axis, the shape has a single element:

len(x), x.shape
(4, torch.Size([4]))

Just as vectors generalize scalars from order zero to order one, matrices generalize vectors from order one to order two.

$$\mathbf{A}=\left[\begin{array}{cccc}a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn}\end{array}\right]$$

A = torch.arange(20).reshape(5, 4)
A, A[-1], A[2][3]
(tensor([[ 0,  1,  2,  3],
         [ 4,  5,  6,  7],
         [ 8,  9, 10, 11],
         [12, 13, 14, 15],
         [16, 17, 18, 19]]),
 tensor([16, 17, 18, 19]),
 tensor(11))

When we swap a matrix's rows and columns, the result is called the transpose of the matrix:

A.T
tensor([[ 0,  4,  8, 12, 16],
        [ 1,  5,  9, 13, 17],
        [ 2,  6, 10, 14, 18],
        [ 3,  7, 11, 15, 19]])

As a special type of square matrix, a symmetric matrix $\mathbf{A}$ is equal to its transpose: $\mathbf{A}=\mathbf{A}^{\top}$. Define a symmetric matrix:

B = torch.tensor([[1, 2, 3], [2, 0, 4], [3, 4, 5]])

B, B == B.T
(tensor([[1, 2, 3],
         [2, 0, 4],
         [3, 4, 5]]),
 tensor([[True, True, True],
         [True, True, True],
         [True, True, True]]))

Just as vectors generalize scalars and matrices generalize vectors, we can build data structures with even more axes. Tensors give us a general way to describe n-dimensional arrays with an arbitrary number of axes:

X = torch.arange(24).reshape(2, 3, 4)

X
tensor([[[ 0,  1,  2,  3],
         [ 4,  5,  6,  7],
         [ 8,  9, 10, 11]],

        [[12, 13, 14, 15],
         [16, 17, 18, 19],
         [20, 21, 22, 23]]])

Basic Properties of Tensor Arithmetic

Given any two tensors of the same shape, the result of any elementwise binary operation is a tensor of that same shape:

A = torch.arange(20, dtype=torch.float32).reshape(5, 4)
B = A.clone()  # allocate new memory and assign a copy of A to B
A, A + B
(tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.],
         [12., 13., 14., 15.],
         [16., 17., 18., 19.]]),
 tensor([[ 0.,  2.,  4.,  6.],
         [ 8., 10., 12., 14.],
         [16., 18., 20., 22.],
         [24., 26., 28., 30.],
         [32., 34., 36., 38.]]))
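
As a further sketch (not part of the original post), elementwise multiplication of two matrices, known as the Hadamard product, is another such binary operation and likewise preserves the shape:

A * B  # elementwise (Hadamard) product of the matrices defined above
tensor([[  0.,   1.,   4.,   9.],
        [ 16.,  25.,  36.,  49.],
        [ 64.,  81., 100., 121.],
        [144., 169., 196., 225.],
        [256., 289., 324., 361.]])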

Adding or multiplying a tensor by a scalar does not change the tensor's shape; each element of the tensor is added to or multiplied by the scalar:

a = 2
X = torch.arange(24).reshape(2, 3, 4)
a + X, (a * X).shape
(tensor([[[ 2,  3,  4,  5],
          [ 6,  7,  8,  9],
          [10, 11, 12, 13]],

         [[14, 15, 16, 17],
          [18, 19, 20, 21],
          [22, 23, 24, 25]]]),
 torch.Size([2, 3, 4]))

Compute the sum of a tensor's elements:

x = torch.arange(4, dtype=torch.float32)
x, x.sum()
(tensor([0., 1., 2., 3.]), tensor(6.))

By default, invoking the sum function reduces a tensor along all of its axes, producing a scalar. We can also specify the axis along which the tensor is reduced via summation. Taking a matrix as an example, to reduce the row dimension (axis 0) by summing the elements of all rows, we specify axis=0 when calling the function. Since the input matrix is reduced along axis 0 to produce the output vector, the size of input axis 0 disappears from the output shape:

A = torch.arange(20, dtype=torch.float32).reshape(5, 4)
A_sum_axis0 = A.sum(axis=0)
A, A_sum_axis0, A_sum_axis0.shape
(tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.],
         [12., 13., 14., 15.],
         [16., 17., 18., 19.]]),
 tensor([40., 45., 50., 55.]),
 torch.Size([4]))

Specifying axis=1 reduces the column dimension (axis 1) by summing the elements of all columns. Accordingly, the size of input axis 1 disappears from the output shape:

A = torch.arange(20, dtype=torch.float32).reshape(5, 4)
A_sum_axis1 = A.sum(axis=1)
A, A_sum_axis1, A_sum_axis1.shape
(tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.],
         [12., 13., 14., 15.],
         [16., 17., 18., 19.]]),
 tensor([ 6., 22., 38., 54., 70.]),
 torch.Size([5]))

Summing a matrix along both rows and columns is equivalent to summing all of its elements:

A.sum(axis=[0, 1])  # A.sum()
tensor(190.)

Calling the mean function computes the average:

A.mean(), A.sum() / A.numel(), A.mean(axis=0), A.sum(axis=0) / A.shape[0]
(tensor(9.5000),
 tensor(9.5000),
 tensor([ 8.,  9., 10., 11.]),
 tensor([ 8.,  9., 10., 11.]))

Sometimes it is useful to keep the number of axes unchanged when invoking a function to compute the sum or mean:

sum_A = A.sum(axis=1, keepdims=True)
sum_A, A / sum_A
(tensor([[ 6.],
         [22.],
         [38.],
         [54.],
         [70.]]),
 tensor([[0.0000, 0.1667, 0.3333, 0.5000],
         [0.1818, 0.2273, 0.2727, 0.3182],
         [0.2105, 0.2368, 0.2632, 0.2895],
         [0.2222, 0.2407, 0.2593, 0.2778],
         [0.2286, 0.2429, 0.2571, 0.2714]]))

Given two vectors $\mathbf{x}, \mathbf{y} \in \mathbb{R}^{d}$, their dot product $\mathbf{x}^{\top}\mathbf{y}$ (also written $\langle\mathbf{x}, \mathbf{y}\rangle$) is the sum of the products of the elements at matching positions: $\mathbf{x}^{\top}\mathbf{y}=\sum_{i=1}^{d} x_{i} y_{i}$:

y = torch.ones(4, dtype=torch.float32)
x, y, torch.dot(x, y)
(tensor([0., 1., 2., 3.]), tensor([1., 1., 1., 1.]), tensor(6.))
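
Equivalently (a minimal sketch using the x and y above), since the dot product is the sum of the elementwise products, we can compute it with an elementwise multiplication followed by a sum:

torch.sum(x * y)  # same result as torch.dot(x, y)
tensor(6.)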

Calling torch.mv(A, x) with a matrix A and a vector x performs the matrix-vector product. Note that the column dimension of A (its length along axis 1) must equal the dimension of x (its length). Writing A in terms of its row vectors:

$$\mathbf{A}=\left[\begin{array}{c}\mathbf{a}_{1}^{\top} \\ \mathbf{a}_{2}^{\top} \\ \vdots \\ \mathbf{a}_{m}^{\top}\end{array}\right]$$

$$\mathbf{A}\mathbf{x}=\left[\begin{array}{c}\mathbf{a}_{1}^{\top} \\ \mathbf{a}_{2}^{\top} \\ \vdots \\ \mathbf{a}_{m}^{\top}\end{array}\right]\mathbf{x}=\left[\begin{array}{c}\mathbf{a}_{1}^{\top}\mathbf{x} \\ \mathbf{a}_{2}^{\top}\mathbf{x} \\ \vdots \\ \mathbf{a}_{m}^{\top}\mathbf{x}\end{array}\right]$$

A.shape, x.shape, torch.mv(A, x)
(torch.Size([5, 4]), torch.Size([4]), tensor([ 14.,  38.,  62.,  86., 110.]))

Suppose we have two matrices $\mathbf{A} \in \mathbb{R}^{n \times k}$ and $\mathbf{B} \in \mathbb{R}^{k \times m}$:

$$\mathbf{A}=\left[\begin{array}{cccc}a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nk}\end{array}\right], \quad \mathbf{B}=\left[\begin{array}{cccc}b_{11} & b_{12} & \cdots & b_{1m} \\ b_{21} & b_{22} & \cdots & b_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ b_{k1} & b_{k2} & \cdots & b_{km}\end{array}\right]$$

The matrix product is then $\mathbf{C} \in \mathbb{R}^{n \times m}$:

$$\mathbf{C}=\mathbf{A}\mathbf{B}=\left[\begin{array}{c}\mathbf{a}_{1}^{\top} \\ \mathbf{a}_{2}^{\top} \\ \vdots \\ \mathbf{a}_{n}^{\top}\end{array}\right]\left[\begin{array}{llll}\mathbf{b}_{1} & \mathbf{b}_{2} & \cdots & \mathbf{b}_{m}\end{array}\right]=\left[\begin{array}{cccc}\mathbf{a}_{1}^{\top}\mathbf{b}_{1} & \mathbf{a}_{1}^{\top}\mathbf{b}_{2} & \cdots & \mathbf{a}_{1}^{\top}\mathbf{b}_{m} \\ \mathbf{a}_{2}^{\top}\mathbf{b}_{1} & \mathbf{a}_{2}^{\top}\mathbf{b}_{2} & \cdots & \mathbf{a}_{2}^{\top}\mathbf{b}_{m} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{a}_{n}^{\top}\mathbf{b}_{1} & \mathbf{a}_{n}^{\top}\mathbf{b}_{2} & \cdots & \mathbf{a}_{n}^{\top}\mathbf{b}_{m}\end{array}\right]$$

B = torch.ones(4, 3)
torch.mm(A,B)
tensor([[ 6.,  6.,  6.],
        [22., 22., 22.],
        [38., 38., 38.],
        [54., 54., 54.],
        [70., 70., 70.]])

The norm of a vector tells us how big the vector is. The notion of size here does not concern dimensionality, but rather the magnitude of the components.

In linear algebra, a vector norm is a function $f$ that maps a vector to a scalar. Given any vector $\mathbf{x}$, a vector norm satisfies the following properties (a quick numerical check follows the list):

  • First property: if we scale all the elements of a vector by a constant factor $\alpha$, its norm scales by the absolute value of the same factor: $f(\alpha \mathbf{x})=|\alpha| f(\mathbf{x})$.
  • Second property: the triangle inequality: $f(\mathbf{x}+\mathbf{y}) \leq f(\mathbf{x})+f(\mathbf{y})$.
  • Third property: the norm must be non-negative: $f(\mathbf{x}) \geq 0$.
  • Last property: the norm attains its minimum of 0 if and only if the vector consists entirely of zeros: $\forall i,[\mathbf{x}]_{i}=0 \Leftrightarrow f(\mathbf{x})=0$.
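
As a sanity check (a sketch, not from the original post), we can verify the scaling property and the triangle inequality numerically for the $L_{2}$ norm computed by torch.norm:

v = torch.tensor([3.0, -4.0])
w = torch.tensor([1.0, 2.0])
alpha = -2.0
# scaling: f(alpha * v) == |alpha| * f(v); triangle inequality: f(v + w) <= f(v) + f(w)
torch.norm(alpha * v), abs(alpha) * torch.norm(v), torch.norm(v + w) <= torch.norm(v) + torch.norm(w)
(tensor(10.), tensor(10.), tensor(True))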

The Euclidean distance is an $L_{2}$ norm:

$$\|\mathbf{x}\|_{2}=\sqrt{\sum_{i=1}^{n} x_{i}^{2}}$$

u = torch.tensor([3.0, -4.0])
torch.norm(u)
tensor(5.)

The $L_{1}$ norm is the sum of the absolute values of the vector's elements:

$$\|\mathbf{x}\|_{1}=\sum_{i=1}^{n}\left|x_{i}\right|$$

Compared with the $L_{2}$ norm, the $L_{1}$ norm is less affected by outliers.

torch.abs(u).sum()
tensor(7.)

Both the $L_{2}$ norm and the $L_{1}$ norm are special cases of the more general $L_{p}$ norm:

$$\|\mathbf{x}\|_{p}=\left(\sum_{i=1}^{n}\left|x_{i}\right|^{p}\right)^{1 / p}$$
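
For instance (a sketch, assuming torch.norm's numeric p argument), the $L_{3}$ norm of u can be computed either directly or from the formula above:

torch.norm(u, 3), (torch.abs(u) ** 3).sum() ** (1 / 3)  # both give (|3|^3 + |-4|^3)^(1/3)
(tensor(4.4979), tensor(4.4979))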

Analogous to the $L_{2}$ norm of a vector, the Frobenius norm of a matrix $\mathbf{X} \in \mathbb{R}^{m \times n}$ is the square root of the sum of the squares of its elements:

$$\|\mathbf{X}\|_{F}=\sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} x_{ij}^{2}}$$

The Frobenius norm satisfies all the properties of vector norms; it behaves like the $L_{2}$ norm of a matrix-shaped vector. Calling the following function computes the Frobenius norm of a matrix.

torch.norm(torch.ones(4, 9))
tensor(6.)

Norms and Objectives

In deep learning, we often try to solve optimization problems: maximize the probability assigned to the observed data, or minimize the distance between predictions and the true observations. Objectives, perhaps the most important ingredient of deep learning algorithms (besides the data), are usually expressed in terms of norms.
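
For example (a minimal sketch with hypothetical tensors y_hat and y), minimizing the distance between a prediction and an observation can be written as minimizing the $L_{2}$ norm of their difference:

y_hat = torch.tensor([2.5, 0.5, 1.0])  # hypothetical predictions
y = torch.tensor([3.0, -0.5, 2.0])     # hypothetical ground-truth observations
torch.norm(y_hat - y)  # L2 distance, a common building block of loss functions
tensor(1.5000)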
