# Deep Learning with PyTorch: Linear Algebra

A tensor that contains a single number is called a scalar. A scalar is represented by a tensor with only one element:

import torch

x = torch.tensor(3.0)
y = torch.tensor(2.0)

x + y,x * y,x / y,x ** y

(tensor(5.), tensor(6.), tensor(1.5000), tensor(9.))


A vector can be thought of as a list of scalar values; we call these scalar values the elements (or components) of the vector. Vectors are handled as one-dimensional tensors. In general, a tensor can have arbitrary length, subject to the machine's memory limits, and a subscript can be used to refer to any element of the vector.

$$\mathbf{A}=\left[\begin{array}{c}\mathbf{a}_{1}^{\top} \\ \mathbf{a}_{2}^{\top} \\ \vdots \\ \mathbf{a}_{m}^{\top}\end{array}\right]$$

x = torch.arange(4)
x,x[3]

(tensor([0, 1, 2, 3]), tensor(3))


## Length, Dimension, and Shape

The length of a vector is often called its dimension, and the length of a tensor can be accessed with Python's built-in `len()` function. When a vector is represented by a tensor (which has exactly one axis), its length can also be read from the `.shape` property. The shape lists the length (dimensionality) of the tensor along each axis; for a tensor with only one axis, the shape has a single element:

len(x),x.shape

(4, torch.Size([4]))


Just as vectors generalize scalars from order zero to order one, matrices generalize vectors from order one to order two.

$$\mathbf{A}=\left[\begin{array}{cccc}a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn}\end{array}\right]$$

A = torch.arange(20).reshape(5, 4)
A,A[-1],A[2][3]

(tensor([[ 0,  1,  2,  3],
[ 4,  5,  6,  7],
[ 8,  9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19]]),
tensor([16, 17, 18, 19]),
tensor(11))


When we exchange the rows and columns of a matrix, the result is called the transpose of the matrix:

A.T

tensor([[ 0,  4,  8, 12, 16],
[ 1,  5,  9, 13, 17],
[ 2,  6, 10, 14, 18],
[ 3,  7, 11, 15, 19]])


B = torch.tensor([[1,2,3],[2,0,4],[3,4,5]])

B,B == B.T

(tensor([[1, 2, 3],
[2, 0, 4],
[3, 4, 5]]),
tensor([[True, True, True],
[True, True, True],
[True, True, True]]))


Just as vectors generalize scalars and matrices generalize vectors, we can build data structures with even more axes. Tensors give us a general way to describe n-dimensional arrays with an arbitrary number of axes:

X = torch.arange(24).reshape(2,3,4)

X

tensor([[[ 0,  1,  2,  3],
[ 4,  5,  6,  7],
[ 8,  9, 10, 11]],

[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])


## Basic Properties of Tensor Arithmetic

Given any two tensors of the same shape, the result of any elementwise binary operation is a tensor of that same shape:

A = torch.arange(20,dtype=torch.float32).reshape(5,4)
B = A.clone()
A,A + B

(tensor([[ 0.,  1.,  2.,  3.],
[ 4.,  5.,  6.,  7.],
[ 8.,  9., 10., 11.],
[12., 13., 14., 15.],
[16., 17., 18., 19.]]),
tensor([[ 0.,  2.,  4.,  6.],
[ 8., 10., 12., 14.],
[16., 18., 20., 22.],
[24., 26., 28., 30.],
[32., 34., 36., 38.]]))


Multiplying by or adding a scalar does not change the shape of a tensor; the scalar is added to, or multiplied with, each element of the tensor:

a = 2
X = torch.arange(24).reshape(2,3,4)
a + X,(a * X).shape

(tensor([[[ 2,  3,  4,  5],
[ 6,  7,  8,  9],
[10, 11, 12, 13]],

[[14, 15, 16, 17],
[18, 19, 20, 21],
[22, 23, 24, 25]]]),
torch.Size([2, 3, 4]))


We can compute the sum of a tensor's elements:

x = torch.arange(4, dtype=torch.float32)
x, x.sum()

(tensor([0., 1., 2., 3.]), tensor(6.))


A = torch.arange(20,dtype=torch.float32).reshape(5,4)
A_sum_axis0 = A.sum(axis=0)
A,A_sum_axis0,A_sum_axis0.shape

(tensor([[ 0.,  1.,  2.,  3.],
[ 4.,  5.,  6.,  7.],
[ 8.,  9., 10., 11.],
[12., 13., 14., 15.],
[16., 17., 18., 19.]]),
tensor([40., 45., 50., 55.]),
torch.Size([4]))


A = torch.arange(20,dtype=torch.float32).reshape(5,4)
A_sum_axis1 = A.sum(axis=1)
A,A_sum_axis1,A_sum_axis1.shape

(tensor([[ 0.,  1.,  2.,  3.],
[ 4.,  5.,  6.,  7.],
[ 8.,  9., 10., 11.],
[12., 13., 14., 15.],
[16., 17., 18., 19.]]),
tensor([ 6., 22., 38., 54., 70.]),
torch.Size([5]))


Summing a matrix along both rows and columns is equivalent to summing all of its elements:

A.sum(axis=[0, 1])  # A.sum()

tensor(190.)


A.mean(),A.sum()/A.numel(),A.mean(axis=0),A.sum(axis=0)/A.shape[0]

(tensor(9.5000),
tensor(9.5000),
tensor([ 8.,  9., 10., 11.]),
tensor([ 8.,  9., 10., 11.]))


Sometimes it is useful to keep the number of axes unchanged when computing a sum or mean:

sum_A = A.sum(axis=1, keepdims=True)
sum_A,A/sum_A

(tensor([[ 6.],
[22.],
[38.],
[54.],
[70.]]),
tensor([[0.0000, 0.1667, 0.3333, 0.5000],
[0.1818, 0.2273, 0.2727, 0.3182],
[0.2105, 0.2368, 0.2632, 0.2895],
[0.2222, 0.2407, 0.2593, 0.2778],
[0.2286, 0.2429, 0.2571, 0.2714]]))
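Because `sum_A` keeps its axis (shape `(5, 1)`), the division `A / sum_A` broadcasts each row sum across its row. A quick sketch to confirm that every row of the result sums to 1:

```python
import torch

A = torch.arange(20, dtype=torch.float32).reshape(5, 4)
sum_A = A.sum(axis=1, keepdims=True)  # shape (5, 1), broadcastable against (5, 4)

normalized = A / sum_A               # divide each row by its own sum
row_totals = normalized.sum(axis=1)  # every entry should be 1
print(row_totals)
```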


y = torch.ones(4, dtype = torch.float32)
x, y, torch.dot(x, y)

(tensor([0., 1., 2., 3.]), tensor([1., 1., 1., 1.]), tensor(6.))
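Equivalently, the dot product can be computed as an elementwise multiplication followed by a sum:

```python
import torch

x = torch.arange(4, dtype=torch.float32)
y = torch.ones(4, dtype=torch.float32)

# elementwise product, then sum: the same scalar as torch.dot(x, y)
print(torch.sum(x * y))  # tensor(6.)
```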


Writing a matrix $\mathbf{A} \in \mathbb{R}^{m \times n}$ in terms of its row vectors,

$$\mathbf{A}=\left[\begin{array}{c}\mathbf{a}_{1}^{\top} \\ \mathbf{a}_{2}^{\top} \\ \vdots \\ \mathbf{a}_{m}^{\top}\end{array}\right]$$

$$\mathbf{A}=\left[\begin{array}{cccc}a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn}\end{array}\right]$$

the matrix-vector product $\mathbf{A}\mathbf{x}$ is a column vector whose $i$-th entry is the dot product $\mathbf{a}_{i}^{\top}\mathbf{x}$:

$$\mathbf{A}\mathbf{x}=\left[\begin{array}{c}\mathbf{a}_{1}^{\top} \\ \mathbf{a}_{2}^{\top} \\ \vdots \\ \mathbf{a}_{m}^{\top}\end{array}\right]\mathbf{x}=\left[\begin{array}{c}\mathbf{a}_{1}^{\top}\mathbf{x} \\ \mathbf{a}_{2}^{\top}\mathbf{x} \\ \vdots \\ \mathbf{a}_{m}^{\top}\mathbf{x}\end{array}\right]$$

A.shape,x.shape,torch.mv(A,x)

(torch.Size([5, 4]), torch.Size([4]), tensor([ 14.,  38.,  62.,  86., 110.]))
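As the equation above shows, each entry of $\mathbf{A}\mathbf{x}$ is the dot product of one row of $\mathbf{A}$ with $\mathbf{x}$; a small sketch verifying this against `torch.mv`:

```python
import torch

A = torch.arange(20, dtype=torch.float32).reshape(5, 4)
x = torch.arange(4, dtype=torch.float32)

# build the product row by row: entry i is dot(A[i], x)
manual = torch.stack([torch.dot(row, x) for row in A])
print(manual)
print(torch.equal(manual, torch.mv(A, x)))  # True
```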


Suppose we have two matrices $\mathbf{A} \in \mathbb{R}^{n \times k}$ and $\mathbf{B} \in \mathbb{R}^{k \times m}$:

$$\mathbf{A}=\left[\begin{array}{cccc}a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nk}\end{array}\right], \quad \mathbf{B}=\left[\begin{array}{cccc}b_{11} & b_{12} & \cdots & b_{1m} \\ b_{21} & b_{22} & \cdots & b_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ b_{k1} & b_{k2} & \cdots & b_{km}\end{array}\right]$$

$$\mathbf{C}=\mathbf{A}\mathbf{B}=\left[\begin{array}{c}\mathbf{a}_{1}^{\top} \\ \mathbf{a}_{2}^{\top} \\ \vdots \\ \mathbf{a}_{n}^{\top}\end{array}\right]\left[\begin{array}{cccc}\mathbf{b}_{1} & \mathbf{b}_{2} & \cdots & \mathbf{b}_{m}\end{array}\right]=\left[\begin{array}{cccc}\mathbf{a}_{1}^{\top}\mathbf{b}_{1} & \mathbf{a}_{1}^{\top}\mathbf{b}_{2} & \cdots & \mathbf{a}_{1}^{\top}\mathbf{b}_{m} \\ \mathbf{a}_{2}^{\top}\mathbf{b}_{1} & \mathbf{a}_{2}^{\top}\mathbf{b}_{2} & \cdots & \mathbf{a}_{2}^{\top}\mathbf{b}_{m} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{a}_{n}^{\top}\mathbf{b}_{1} & \mathbf{a}_{n}^{\top}\mathbf{b}_{2} & \cdots & \mathbf{a}_{n}^{\top}\mathbf{b}_{m}\end{array}\right]$$

B = torch.ones(4,3)
torch.mm(A,B)

tensor([[ 6.,  6.,  6.],
[22., 22., 22.],
[38., 38., 38.],
[54., 54., 54.],
[70., 70., 70.]])
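Each entry $c_{ij}$ of $\mathbf{C}=\mathbf{A}\mathbf{B}$ is the dot product of row $i$ of $\mathbf{A}$ with column $j$ of $\mathbf{B}$; a small check against `torch.mm`:

```python
import torch

A = torch.arange(20, dtype=torch.float32).reshape(5, 4)
B = torch.ones(4, 3)

# c_ij = dot(row i of A, column j of B)
manual = torch.stack([torch.stack([torch.dot(A[i], B[:, j]) for j in range(3)])
                      for i in range(5)])
print(manual)
```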


The norm of a vector tells us how "big" the vector is. The notion of size here does not involve the number of dimensions, but the magnitude of the components.

In linear algebra, a vector norm is a function $f$ that maps a vector to a scalar. Given any vector $\mathbf{x}$, a norm satisfies the following properties:

• First property: if we scale all elements of a vector by a constant factor $\alpha$, its norm scales by the absolute value of the same factor: $f(\alpha \mathbf{x})=|\alpha| f(\mathbf{x})$
• Second property: the triangle inequality: $f(\mathbf{x}+\mathbf{y}) \leq f(\mathbf{x})+f(\mathbf{y})$
• Third property: a norm must be non-negative: $f(\mathbf{x}) \geq 0$
• Last property: the norm attains its minimum, 0, if and only if the vector consists entirely of zeros: $\forall i,[\mathbf{x}]_{i}=0 \Leftrightarrow f(\mathbf{x})=0$

The $L_{2}$ norm of $\mathbf{x}$ is the square root of the sum of the squares of its elements:

$$\|\mathbf{x}\|_{2}=\sqrt{\sum_{i=1}^{n} x_{i}^{2}}$$

u = torch.tensor([3.0,-4.0])
torch.norm(u)

tensor(5.)
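The norm properties above can be spot-checked numerically for the $L_2$ norm (a sanity check on examples, not a proof):

```python
import torch

u = torch.tensor([3.0, -4.0])
v = torch.tensor([1.0, 2.0])
alpha = -2.5

# homogeneity: f(alpha * u) == |alpha| * f(u)
print(torch.norm(alpha * u), abs(alpha) * torch.norm(u))
# triangle inequality: f(u + v) <= f(u) + f(v)
print(bool(torch.norm(u + v) <= torch.norm(u) + torch.norm(v)))
# non-negative, and zero only for the all-zero vector
print(bool(torch.norm(u) >= 0), torch.norm(torch.zeros(2)))
```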


The $L_{1}$ norm is expressed as the sum of the absolute values of the vector's elements:

$$\|\mathbf{x}\|_{1}=\sum_{i=1}^{n}\left|x_{i}\right|$$

torch.abs(u).sum()

tensor(7.)


Both the $L_{2}$ norm and the $L_{1}$ norm are special cases of the more general $L_{p}$ norm:

$$\|\mathbf{x}\|_{p}=\left(\sum_{i=1}^{n}\left|x_{i}\right|^{p}\right)^{1/p}$$
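For other values of $p$, `torch.norm` accepts a `p` argument; comparing it with the $L_p$ formula written out by hand for $p=3$ (a sketch):

```python
import torch

u = torch.tensor([3.0, -4.0])
p = 3

manual = (u.abs() ** p).sum() ** (1 / p)  # the L_p formula, written out
print(manual, torch.norm(u, p=p))         # the two values agree
```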

The Frobenius norm of a matrix $\mathbf{X} \in \mathbb{R}^{m \times n}$ is defined analogously:

$$\|\mathbf{X}\|_{F}=\sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} x_{ij}^{2}}$$

The Frobenius norm satisfies all the properties of vector norms; it behaves like the $L_{2}$ norm of a matrix reshaped into a vector. Calling the following function computes the Frobenius norm of a matrix:

torch.norm(torch.ones(4,9))

tensor(6.)


## Norms and Objectives

In deep learning we often try to solve optimization problems: maximize the probability assigned to the observed data, or minimize the distance between predictions and ground-truth observations. These objectives, perhaps the most important ingredient of deep learning algorithms apart from the data, are usually expressed as norms.
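As a minimal illustration (a toy example, not from the original post): minimizing the squared $L_2$ norm of the prediction error by gradient descent recovers the weights of a noise-free linear model.

```python
import torch

torch.manual_seed(0)
X = torch.randn(100, 3)
true_w = torch.tensor([2.0, -1.0, 0.5])
y = X @ true_w                       # noise-free targets

w = torch.zeros(3, requires_grad=True)
for _ in range(200):
    # objective: the squared L2 norm of the prediction error, averaged
    loss = ((X @ w - y) ** 2).sum() / len(y)
    loss.backward()
    with torch.no_grad():
        w -= 0.1 * w.grad            # plain gradient descent step
        w.grad.zero_()

print(w)  # close to true_w
```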

Original: https://www.cnblogs.com/xiaojianliu/p/16152580.html
Author: 6小贱
Title: pytorch 深度学习之线性代数
