Implementing a TensorFlow-style deep learning framework, similarflow, with NumPy


This post is part of a series on implementing TensorFlow-style and PyTorch-style frameworks with NumPy. The idea is to follow the path numpy -> mmcv/lightning -> algorithm papers -> mmdet/mmclas…, strengthening a more global understanding of deep learning so that business problems can be decomposed and modeled more effectively.

1. Computational graphs

(Figure: an example computational graph)

With the chain rule we can compute partial derivatives node by node: during the backward pass, the gradient of the network's final output is propagated back through the graph by chained differentiation, and the network is then optimized with those gradients. A graph like the one in the figure above is the basic computational model of both TensorFlow and PyTorch. In short, a computational graph consists of nodes and edges: a node represents an operator (an op), and an edge represents a dependency between computations. A solid edge carries data, namely a tensor, while a dashed edge usually denotes a control dependency, i.e. an execution ordering constraint. Essentially, the computational graph is the program logic graph that TensorFlow builds in memory; it can be partitioned into blocks and run in parallel across several CPUs or GPUs, which is what enables parallel computation.
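
To make the chain rule concrete, here is a tiny hand-worked example on a three-node expression (purely illustrative, independent of any framework):

# forward pass: z = a * b + c, loss = z ** 2
a, b, c = 2.0, 3.0, 1.0
z = a * b + c            # z = 7.0
loss = z ** 2            # loss = 49.0

# backward pass via the chain rule
dloss_dz = 2 * z                 # 14.0
dloss_da = dloss_dz * b          # dz/da = b  -> 42.0
dloss_db = dloss_dz * a          # dz/db = a  -> 28.0
dloss_dc = dloss_dz * 1.0        # dz/dc = 1  -> 14.0
print(dloss_da, dloss_db, dloss_dc)   # 42.0 28.0 14.0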

TensorFlow has three kinds of computational graphs: static graphs, dynamic graphs, and AutoGraph. TF2 uses dynamic graphs by default: each op, as soon as it is used, is added to an implicit default graph and executed immediately to produce its result. Every time such a graph is built, the whole graph is released from memory once backpropagation finishes; in the example below, the second loss.backward() raises an error, and this is also how PyTorch computes. A dynamic graph does not separate graph definition from graph execution: definitions run immediately, which is called eager execution.

import torch

a = torch.tensor([3.0, 1.0], requires_grad=True)
b = a * a
loss = b.mean()

loss.backward() # works
loss.backward() # RuntimeError: the graph has already been freed

The older, static-graph approach has two steps: first define the computational graph, then execute it inside a Session. The code below shows the TF1 style of writing this; the simple similarflow implemented with NumPy in this post uses the same static-graph approach. Static graphs have an efficiency advantage over dynamic graphs: a dynamic graph requires many round trips between the Python process and TensorFlow's C++ runtime, whereas once a static graph is built it runs almost entirely in C++ inside the TF kernel, which is fast.

import tensorflow as tf

# TensorFlow 1.x style
# define the computational graph
g = tf.Graph()
with g.as_default():
    # placeholders are filled in when the session is run
    x = tf.placeholder(name='x', shape=[], dtype=tf.string)
    y = tf.placeholder(name='y', shape=[], dtype=tf.string)
    z = tf.string_join([x, y], name='join', separator=' ')

# execute the computational graph
with tf.Session(graph=g) as sess:
    print(sess.run(fetches=z, feed_dict={x: "hello", y: "world"}))

TensorFlow also offers AutoGraph: since dynamic graphs run relatively slowly, the @tf.function decorator can convert an ordinary Python function into static-graph construction code equivalent to the TF1 style.
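
A minimal sketch of AutoGraph in use (assuming TensorFlow 2.x; tf.strings.join is the TF2 counterpart of the string_join op above):

import tensorflow as tf

@tf.function  # traces this Python function into a static graph on first call
def strjoin(x, y):
    return tf.strings.join([x, y], separator=' ')

print(strjoin(tf.constant("hello"), tf.constant("world")))  # tf.Tensor(b'hello world', ...)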

2. similarflow

The overall architecture: there is a computational graph, a Graph object, which stores the operation nodes and the variables; a Session drives the graph. The core is backpropagation, implemented via the chain rule as the product of the derivatives of the loss with respect to each node, which means every operator needs a gradient method. On top of that there is a gradient-descent optimizer. With these basic pieces in place we can build a linear classifier, softmax, and a multi-layer perceptron.

Graph design: the core of a directed graph is its nodes; once the nodes are defined they are managed together inside one graph, and the forward pass is driven by the Session. The Graph itself is the computational graph, made up of nodes: Operation and Variable are node types, and Placeholder is the user's input:

class Graph(object):
    """Computational graph: container for operation, placeholder, variable and constant nodes."""

    def __init__(self):
        self.operations = []
        self.placeholders = []
        self.variables = []
        self.constants = []

    def __enter__(self):
        global _default_graph
        self.graph = _default_graph
        _default_graph = self
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        global _default_graph
        _default_graph = self.graph

    def as_default(self):
        return self

# A module-level default graph is assumed here, so that nodes created without an explicit
# `with Graph():` block (as in the examples further down) still have a graph to register with.
_default_graph = Graph()

class Operation(object):
    """接受一个或者更多输入节点进行简单计算
"""

    def __init__(self, *input_nodes):
        self.input_nodes = input_nodes
        self.output_nodes = []

        # add a reference to this node to each input node's output_nodes, so consumers can be found from their inputs
        for node in input_nodes:
            node.output_nodes.append(self)

        # register this node with the graph, which also makes it easy to reclaim resources later
        _default_graph.operations.append(self)

    def compute(self):
        """Compute this node's output from the values of its input nodes (overridden by each op)."""
        pass

    def __add__(self, other):
        from .operations import add
        return add(self, other)

    def __neg__(self):
        from .operations import negative
        return negative(self)

    def __sub__(self, other):
        from .operations import add,negative
        return add(self, negative(other))

    def __mul__(self, other):
        from .operations import matmul
        return matmul(self, other)

class Placeholder(object):
    """没有输入节点,节点数据是通过图建立好以后通过用户传入
"""

    def __init__(self):
        self.output_nodes = []

        _default_graph.placeholders.append(self)

    def __add__(self, other):
        from .operations import add
        return add(self, other)

    def __neg__(self):
        from .operations import negative
        return negative(self)

    def __sub__(self, other):
        from .operations import add, negative
        return add(self, negative(other))

    def __mul__(self, other):
        from .operations import matmul
        return matmul(self, other)

class Variable(object):
    """没有输入节点,节点数据在运算过程中是可变化的
"""

    def __init__(self, initial_value=None):
        self.value = initial_value
        self.output_nodes = []

        _default_graph.variables.append(self)

    def __add__(self, other):
        from .operations import add
        return add(self, other)

    def __neg__(self):
        from .operations import negative
        return negative(self)

    def __sub__(self, other):
        from .operations import add, negative
        return add(self, negative(other))

    def __mul__(self, other):
        from .operations import matmul
        return matmul(self, other)

class Constant(object):
    """没有输入节点,节点数据在运算过程中是不可变的
"""

    def __init__(self, value=None):
        self.value = value
        self.output_nodes = []

        _default_graph.constants.append(self)

    def __add__(self, other):
        from .operations import add
        return add(self, other)

    def __neg__(self):
        from .operations import negative
        return negative(self)

    def __sub__(self, other):
        from .operations import add, negative
        return add(self, negative(other))

    def __mul__(self, other):
        from .operations import matmul
        return matmul(self, other)

1. The operators are overloaded, so nodes can be combined directly with +, -, and * (note that * maps to matmul here); because of the mutual dependency between modules, the operations have to be imported inside each method.

2. Prefer NumPy operations over plain Python arithmetic.
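
A small usage sketch of the node classes above, assuming they are exposed through a package named similarflow (as in the examples later in this post):

import similarflow as sf   # assumed package name

with sf.Graph().as_default() as g:
    w = sf.Variable(2.0)
    x = sf.Placeholder()
    b = sf.Constant(1.0)
    y = w * x + b   # operator overloading: builds a matmul node and an add node

    # every node registered itself with the active default graph on construction
    print(len(g.operations), len(g.placeholders), len(g.variables), len(g.constants))
    # expected: 2 1 1 1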

Session (feedforward): a Session is needed to actually evaluate a graph that has been built. The nodes of a freshly built graph are only empty nodes with no computed values in them; the Session first does a recursive post-order traversal to collect every node the target operation depends on, then calls each node's compute method to obtain its value.

import numpy as np
# _default_graph is imported as well so the Session can pick up the current graph
# (module layout assumed from the graph code above)
from .graph import Operation, Placeholder, Variable, Constant, _default_graph

class Session(object):
    """Feedforward: runs the forward pass over a built graph."""

    def __init__(self):
        self.graph = _default_graph

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        return self.close()

    def close(self):
        all_nodes = (self.graph.operations + self.graph.variables +
                     self.graph.constants + self.graph.placeholders)
        for node in all_nodes:
            node.output = None

    def run(self, operation, feed_dict=None):
        """Compute the output value of the given node.
        :param operation: the node to evaluate
        :param feed_dict: maps Placeholder nodes to the values to feed in
        :return: the evaluated output of `operation`
        """
        nodes_postorder = traverse_postorder(operation)

        for node in nodes_postorder:
            if type(node) == Placeholder:
                node.output = feed_dict[node]
            elif (type(node) == Variable) or (type(node) == Constant):
                node.output = node.value
            else:  # Operation
                # gather the already-computed outputs of this node's inputs
                node.inputs = [input_node.output for input_node in node.input_nodes]
                # unpack them and call the operation's compute() to get the forward value
                node.output = node.compute(*node.inputs)

            if type(node.output) == list:
                node.output = np.array(node.output)
        return operation.output

def traverse_postorder(operation):
    """Recursively post-order-traverse the graph to collect every node the given
    operation depends on, so that inputs always appear before their consumers.
    :param operation:
    :return:
    """
    nodes_postorder = []

    def recurse(node):
        if isinstance(node, Operation):
            for input_node in node.input_nodes:
                recurse(input_node)
        nodes_postorder.append(node)

    recurse(operation)
    return nodes_postorder
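
With the Session in place, a minimal end-to-end forward pass looks like this (again assuming the hypothetical similarflow package layout):

import numpy as np
import similarflow as sf   # assumed package name

x = sf.Placeholder()
w = sf.Variable(np.array([[2.0]]))
b = sf.Constant(np.array([1.0]))
y = sf.add(sf.matmul(x, w), b)          # y = x @ w + b

with sf.Session() as sess:
    print(sess.run(y, feed_dict={x: np.array([[3.0]])}))   # [[7.]]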

Operations: each op only implements its forward method; the gradient computation for backpropagation lives in a separate file.

import numpy as np
from .graph import Operation

class matmul(Operation):
    def __init__(self, x, y):
        super(matmul, self).__init__(x, y)

    def compute(self, x_value, y_value):
        """ x_value,y_value是具体的值,而非节点中的类型,如果直接用self.input_nodes就是garph中的节点,
        init和compute都传参确实看起来很丑,==!
        :param x_value:
        :param y_value:
        :return:
"""
        return np.dot(x_value, y_value)

class add(Operation):
    def __init__(self, x, y):
        super(add, self).__init__(x, y)

    def compute(self, x_value, y_value):
        return np.add(x_value, y_value)

class negative(Operation):
    def __init__(self, x):
        super(negative, self).__init__(x)

    def compute(self, x_value):
        return -x_value

class multiply(Operation):
    def __init__(self, x, y):
        super(multiply, self).__init__(x, y)

    def compute(self, x_value, y_value):
        return np.multiply(x_value, y_value)

class sigmoid(Operation):
    def __init__(self, x):
        super(sigmoid, self).__init__(x)

    def compute(self, x_value):
        return 1 / (1 + np.exp(-x_value))

class softmax(Operation):
    def __init__(self, x):
        super(softmax, self).__init__(x)

    def compute(self, x_value):
        return np.exp(x_value) / np.sum(np.exp(x_value), axis=1)[:, None]

class log(Operation):
    def __init__(self, x):
        super(log, self).__init__(x)

    def compute(self, x_value):
        return np.log(x_value)

class square(Operation):
    def __init__(self, x):
        super(square, self).__init__(x)

    def compute(self, x_value):
        return np.square(x_value)

class reduce_sum(Operation):
    def __init__(self, A, axis=None):
        super(reduce_sum, self).__init__(A)
        self.axis = axis

    def compute(self, A_value):
        return np.sum(A_value, self.axis)
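
New ops follow the same pattern. For example, a hypothetical element-wise exp op (not part of the original code) appended to this same operations file, where Operation and np are already imported, would look like this; its gradient would still need to be registered in the gradients file below:

class exp(Operation):
    """Element-wise exponential (illustrative addition, same pattern as the ops above)."""

    def __init__(self, x):
        super(exp, self).__init__(x)

    def compute(self, x_value):
        return np.exp(x_value)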

gradients: gradient computation

Backpropagation: computing matrix gradients is an essential part of the backpropagation algorithm. In a network we are almost always differentiating a matrix with respect to another matrix. Rather than computing such a matrix-by-matrix derivative directly, we can exploit the fact that the loss is a scalar and reason about shapes: once the shape of each gradient is pinned down (it must match the shape of the tensor it is taken with respect to), the gradient formula is easy to work out.

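
A quick NumPy check of this shape-based reasoning for Z = XW with the scalar loss L = sum(Z) (illustrative, independent of the framework):

import numpy as np

X = np.random.randn(4, 3)     # (n, d)
W = np.random.randn(3, 2)     # (d, m)
Z = X @ W                     # (n, m)

grad_Z = np.ones_like(Z)      # dL/dZ for L = sum(Z), shape (n, m)
grad_X = grad_Z @ W.T         # dL/dX must have X's shape (n, d)
grad_W = X.T @ grad_Z         # dL/dW must have W's shape (d, m)
print(grad_X.shape, grad_W.shape)   # (4, 3) (3, 2)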

Among the op gradients, softmax deserves special attention. A registry decorator is used here, so that during backpropagation the gradient function can be looked up in a dict directly by the op's type.

import numpy as np
# the op classes must be importable here so that eval(op_type) below can resolve each
# registered name to its class (module path assumed from the operations code above)
from .operations import (add, matmul, sigmoid, softmax, log, multiply,
                         negative, square, reduce_sum)

_gradient_registry = {}

class RegisterGradient(object):
    def __init__(self, op_type):
        # the registry is keyed by the op class object itself, not by its name string
        self._op_type = eval(op_type)

    def __call__(self, f):
        _gradient_registry[self._op_type] = f
        return f

@RegisterGradient("add")
def _add_gradient(op, grad):
    """   求和矩阵求导,行相加,列相加
    :param op:
    :param grad:
    :return:
"""
    x, y = op.inputs[0], op.inputs[1]

    grad_wrt_x = grad
    while np.ndim(grad_wrt_x) > len(np.shape(x)):
        grad_wrt_x = np.sum(grad_wrt_x, axis=0)
    for axis, size in enumerate(np.shape(x)):
        if size == 1:
            grad_wrt_x = np.sum(grad_wrt_x, axis=axis, keepdims=True)

    grad_wrt_y = grad
    while np.ndim(grad_wrt_y) > len(np.shape(y)):
        grad_wrt_y = np.sum(grad_wrt_y, axis=0)
    for axis, size in enumerate(np.shape(y)):
        if size == 1:
            grad_wrt_y = np.sum(grad_wrt_y, axis=axis, keepdims=True)

    return [grad_wrt_x, grad_wrt_y]

@RegisterGradient("matmul")
def _matmul_gradient(op, grad):
    """ 求x的梯度:y的转置,求y的梯度:x的转置
    :param op:
    :param grad:
    :return:
"""
    x, y = op.inputs[0], op.inputs[1]
    return [np.dot(grad, np.transpose(y)), np.dot(np.transpose(x), grad)]

@RegisterGradient("sigmoid")
def _sigmoid_gradient(op, grad):
    sigmoid = op.output
    return grad * sigmoid * (1 - sigmoid)

@RegisterGradient("softmax")
def _softmax_gradient(op, grad):
    """ softmax 倒数
    https://stackoverflow.com/questions/40575841/numpy-calculate-the-derivative-of-the-softmax-function
    :param op:
    :param grad:
    :return:
"""
    softmax = op.output
    return (grad - np.reshape(np.sum(grad * softmax, 1), [-1, 1])) * softmax

@RegisterGradient("log")
def _log_gradient(op, grad):
    x = op.inputs[0]
    return grad / x

@RegisterGradient("multiply")
def _multiply_gradient(op, grad):
    x, y = op.inputs[0], op.inputs[1]
    return [grad * y, grad * x]

@RegisterGradient("negative")
def _negative_gradient(op, grad):
    return -grad

@RegisterGradient("square")
def _square_gradient(op, grad):
    x = op.inputs[0]
    return grad * np.multiply(2.0, x)

@RegisterGradient("reduce_sum")
def _reduce_sum_gradient(op, grad):
    x = op.inputs[0]

    output_shape = np.array(np.shape(x))
    output_shape[op.axis] = 1
    tile_scaling = np.shape(x) // output_shape
    grad = np.reshape(grad, output_shape)
    return np.tile(grad, tile_scaling)
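
Since the softmax derivative is the one worth double-checking, here is a standalone finite-difference check of the registered formula (pure NumPy, independent of the framework classes):

import numpy as np

def softmax(x):
    e = np.exp(x)
    return e / np.sum(e, axis=1, keepdims=True)

np.random.seed(0)
x = np.random.randn(3, 4)
grad_out = np.random.randn(3, 4)       # upstream gradient dL/d(softmax(x))

# analytic gradient, same formula as _softmax_gradient above
s = softmax(x)
grad_analytic = (grad_out - np.sum(grad_out * s, axis=1, keepdims=True)) * s

# numeric gradient of L = sum(grad_out * softmax(x)) w.r.t. x
eps = 1e-6
grad_numeric = np.zeros_like(x)
for i in range(x.shape[0]):
    for j in range(x.shape[1]):
        x_pos, x_neg = x.copy(), x.copy()
        x_pos[i, j] += eps
        x_neg[i, j] -= eps
        grad_numeric[i, j] = (np.sum(grad_out * softmax(x_pos)) -
                              np.sum(grad_out * softmax(x_neg))) / (2 * eps)

print(np.max(np.abs(grad_analytic - grad_numeric)))   # should be tiny (~1e-9)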

Backpropagation:

import numpy as np
from queue import Queue

from .graph import Operation, Variable
from .gradients import _gradient_registry

def compute_gradients(loss):
    """Given each node's local output-to-input gradients, search backwards from the loss
    node and backpropagate to every node that contributes to it. To obtain the gradients
    of the loss with respect to the other nodes, run a breadth-first search starting at
    the loss node and compute gradients while traversing, storing node -> gradient in a dict.

    A FIFO queue controls the traversal order and a set of visited nodes prevents revisiting;
    the gradient computed for each traversed node is stored in grad_table.
    :param loss:
    :return:
    """
    grad_table = {}  # gradient of the loss w.r.t. each node
    grad_table[loss] = 1

    visited = set()
    queue = Queue()
    visited.add(loss)
    queue.put(loss)

    while not queue.empty():
        node = queue.get()

        # for any node other than the loss, accumulate the gradients coming from its consumers
        if node != loss:
            grad_table[node] = 0

            for output_node in node.output_nodes:
                lossgrad_wrt_output_node_output = grad_table[output_node]

                output_node_op_type = output_node.__class__
                bprop = _gradient_registry[output_node_op_type]

                lossgrads_wrt_output_node_inputs = bprop(output_node, lossgrad_wrt_output_node_output)

                if len(output_node.input_nodes) == 1:
                    grad_table[node] += lossgrads_wrt_output_node_inputs
                else:
                    # output_node has several inputs: pick out the gradient for this node's
                    # position (contributions from all consumers are summed via += above)
                    node_index_in_output_node_inputs = output_node.input_nodes.index(node)
                    lossgrad_wrt_node = lossgrads_wrt_output_node_inputs[node_index_in_output_node_inputs]
                    grad_table[node] += lossgrad_wrt_node

        # enqueue this node's inputs so they get visited as well
        if hasattr(node, "input_nodes"):
            for input_node in node.input_nodes:
                if input_node not in visited:
                    visited.add(input_node)
                    queue.put(input_node)

    return grad_table

GradientDescentOptimizer: gradient-descent optimization. Having computed the gradients of the loss with respect to the other nodes, the point of those gradients is to optimize the parameters. A gradient-descent optimizer performs that update, using the negative gradient as the search direction in each iteration and stepping towards a local optimum with the configured learning rate:

# imports assumed: the node classes from the graph module and compute_gradients from
# the backpropagation code above (exact module paths are a guess and may differ)
from .graph import Operation, Variable, Constant
from .backpropagation import compute_gradients

class GradientDescentOptimizer(object):
    def __init__(self, learning_rate):
        self.learning_rate = learning_rate

    def minimize(self, loss):
        learning_rate = self.learning_rate

        class MinimizationOperation(Operation):
            def compute(self):
                grad_table = compute_gradients(loss)

                for node in grad_table:
                    if type(node) == Variable or type(node) == Constant:
                        grad = grad_table[node]
                        node.value -= learning_rate * grad

        return MinimizationOperation()

An example: fitting a linear model

import numpy as np
import matplotlib.pylab as plt
import similarflow as sf

input_x = np.linspace(-1, 1, 100)
input_y = input_x * 3 + np.random.randn(input_x.shape[0]) * 0.5

x = sf.Placeholder()
y = sf.Placeholder()
w = sf.Variable([[1.0]])
b = sf.Variable(0.0)

linear = sf.add(sf.matmul(x, w), b)
# equivalently, using the overloaded operators (remember that * maps to matmul here):
# linear = x * w + b

loss = sf.reduce_sum(sf.square(sf.add(linear, sf.negative(y))))
# equivalently: loss = sf.reduce_sum(sf.square(linear - y))

train_op = sf.train.GradientDescentOptimizer(learning_rate=0.005).minimize(loss)

# inputs are reshaped to column vectors so that matmul with the (1, 1) weight works
feed_dict = {x: np.reshape(input_x, (-1, 1)), y: np.reshape(input_y, (-1, 1))}

with sf.Session() as sess:
    for step in range(20):
        # 前向
        loss_value = sess.run(loss, feed_dict)
        mse = loss_value / len(input_x)
        print(f"step:{step},loss:{loss_value},mse:{mse}")
        # 反向传播
        sess.run(train_op, feed_dict)
    w_value = sess.run(w, feed_dict=feed_dict)
    b_value = sess.run(b, feed_dict=feed_dict)
    print(f"w:{w_value},b:{b_value}")

w_value = float(w_value)
max_x, min_x = np.max(input_x), np.min(input_x)
max_y, min_y = w_value * max_x + b_value, w_value * min_x + b_value

plt.plot([max_x, min_x], [max_y, min_y], color='r')
plt.scatter(input_x, input_y)
plt.show()

Perceptron:

import numpy as np
import similarflow as sf
import matplotlib.pyplot as plt

# Create red points centered at (-2, -2)
red_points = np.random.randn(50, 2) - 2 * np.ones((50, 2))

# Create blue points centered at (2, 2)
blue_points = np.random.randn(50, 2) + 2 * np.ones((50, 2))

X = sf.Placeholder()
y = sf.Placeholder()
W = sf.Variable(np.random.randn(2, 2))
b = sf.Variable(np.random.randn(2))

p = sf.softmax(sf.add(sf.matmul(X, W), b))

loss = sf.negative(sf.reduce_sum(sf.reduce_sum(sf.multiply(y, sf.log(p)), axis=1)))

train_op = sf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(loss)

feed_dict = {
    X: np.concatenate((blue_points, red_points)),
    y: [[1, 0]] * len(blue_points) + [[0, 1]] * len(red_points)
}

with sf.Session() as sess:
    for step in range(100):
        loss_value = sess.run(loss, feed_dict)
        if step % 10 == 0:
            print(f"step:{step},loss:{loss_value}")
        sess.run(train_op, feed_dict)

    # Print final result
    W_value = sess.run(W)
    print("Weight matrix:\n", W_value)
    b_value = sess.run(b)
    print("Bias:\n", b_value)

# Plot a line y = -x
x_axis = np.linspace(-4, 4, 100)
y_axis = -W_value[0][0] / W_value[1][0] * x_axis - b_value[0] / W_value[1][0]
plt.plot(x_axis, y_axis)

# Add the red and blue points
plt.scatter(red_points[:, 0], red_points[:, 1], color='red')
plt.scatter(blue_points[:, 0], blue_points[:, 1], color='blue')
plt.show()

Multi-layer perceptron:

import numpy as np
import similarflow as sf
import matplotlib.pyplot as plt

# Create two clusters of red points centered at (0, 0) and (1, 1), respectively.

red_points = np.concatenate((
    0.2 * np.random.randn(25, 2) + np.array([[0, 0]] * 25),
    0.2 * np.random.randn(25, 2) + np.array([[1, 1]] * 25)
))

# Create two clusters of blue points centered at (0, 1) and (1, 0), respectively.

blue_points = np.concatenate((
    0.2 * np.random.randn(25, 2) + np.array([[0, 1]] * 25),
    0.2 * np.random.randn(25, 2) + np.array([[1, 0]] * 25)
))

# Plot them
plt.scatter(red_points[:, 0], red_points[:, 1], color='red')
plt.scatter(blue_points[:, 0], blue_points[:, 1], color='blue')
plt.show()

X = sf.Placeholder()
y = sf.Placeholder()
W_hidden = sf.Variable(np.random.randn(2, 2))
b_hidden = sf.Variable(np.random.randn(2))
p_hidden = sf.sigmoid(sf.add(sf.matmul(X, W_hidden), b_hidden))

W_output = sf.Variable(np.random.randn(2, 2))
b_output = sf.Variable(np.random.rand(2))
p_output = sf.softmax(sf.add(sf.matmul(p_hidden, W_output), b_output))

loss = sf.negative(sf.reduce_sum(sf.reduce_sum(sf.multiply(y, sf.log(p_output)), axis=1)))

train_op = sf.train.GradientDescentOptimizer(learning_rate=0.03).minimize(loss)

feed_dict = {
    X: np.concatenate((blue_points, red_points)),
    y: [[1, 0]] * len(blue_points) + [[0, 1]] * len(red_points)
}

with sf.Session() as sess:
    for step in range(100):
        loss_value = sess.run(loss, feed_dict)
        if step % 10 == 0:
            print(f"step:{step},loss:{loss_value}")
        sess.run(train_op, feed_dict)

    # Print final result
    W_hidden_value = sess.run(W_hidden)
    print("Hidden layer weight matrix:\n", W_hidden_value)
    b_hidden_value = sess.run(b_hidden)
    print("Hidden layer bias:\n", b_hidden_value)
    W_output_value = sess.run(W_output)
    print("Output layer weight matrix:\n", W_output_value)
    b_output_value = sess.run(b_output)
    print("Output layer bias:\n", b_output_value)

# Visualize classification boundary
xs = np.linspace(-2, 2)
ys = np.linspace(-2, 2)
pred_classes = []
for x in xs:
    for y in ys:
        pred_class = sess.run(p_output, feed_dict={X: [[x, y]]})[0]
        pred_classes.append((x, y, pred_class.argmax()))
xs_p, ys_p = [], []
xs_n, ys_n = [], []
for x, y, c in pred_classes:
    if c == 0:
        xs_n.append(x)
        ys_n.append(y)
    else:
        xs_p.append(x)
        ys_p.append(y)
plt.plot(xs_p, ys_p, 'ro', xs_n, ys_n, 'bo')
plt.show()

Original: https://blog.csdn.net/u012193416/article/details/122958900
Author: Kun Li
Title: Implementing a TensorFlow-style deep learning framework, similarflow, with NumPy
