OpenCV+TensorFlow: Simple Traditional Vision Line Following for a Robot Car

This article is aimed at OpenCV beginners and readers with a basic understanding of computer vision. The material below is presented in order of increasing difficulty, and the complete code is attached at the end.

What Beginners Need to Know

What is traditional vision-based line following?

In my view, traditional line following works by recording the horizontal coordinates of the track and judging direction from them. For example:

[Figure: an example frame of a straight track]

This frame is judged to be "straight". But how can the machine tell?

We can convert the image into a matrix, record the coordinate of every black pixel, and take the mean, giving the average coordinate of the track.

source = 0  # sum of the black pixels' coordinates
m = 0  # number of black pixels
y = 143  # horizontal size of the frame
x = 80  # vertical size of the frame
for i in range(y):
    for j in range(x):
        if pred[j, i] == 0:
            m += 1
            source += j
source /= m  # average coordinate
source -= x / 2  # deviation from the center axis
if abs(source) <= 10:
    print("straight")
elif source > 0:
    print("turn left")
else:
    print("turn right")

In the code above, pred is the frame as a matrix. Whenever the average coordinate deviates from the central value x / 2 by no more than 10, the frame is judged as "straight".
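For comparison, the same decision can be written without the double loop, using NumPy to find all black pixels in one call. This is a sketch of my own, not the code running on the car; pred and the threshold of 10 are as above:

import numpy as np

def decide(pred, x=80, threshold=10):
    # same computation as the loop above, vectorized
    js, _ = np.nonzero(pred == 0)  # first-axis coordinates of all black pixels
    if js.size == 0:
        return "stop"  # no track visible at all
    offset = js.mean() - x / 2  # deviation from the center axis
    if abs(offset) <= threshold:
        return "straight"
    return "turn left" if offset > 0 else "turn right"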

[Figure: two example frames judged as left turns]

Likewise, based on the computed average coordinate, these two frames are judged as "turn left", and the machine identifies them correctly.

[Figure: an example frame judged as a right turn]

The same goes here: this frame is judged by the machine as a right turn.

The Idea Behind Line Following

The brief introduction above gives a good picture of how a machine performs simple track recognition. This is a traditional (non-learning) vision method.

Next, let's walk through the line-following pipeline.

Some readers may ask: the pictures above are pure black and white with a black track, which is unrealistic. So what do we do with real images?

Overall the pipeline is: capture the image (RGB) -> convert to grayscale -> convert to a binary image -> convert to a matrix.

import cv2 as cv
import numpy as np

THRESHOLD = 80  # binarization threshold, set to 80
ret, frame = capture.read()  # read one frame of the video (capture is a cv.VideoCapture)
gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)  # convert the frame to grayscale
retval, black_Write = cv.threshold(gray, THRESHOLD, 255, cv.THRESH_BINARY)  # binarize the grayscale image
data = np.array(black_Write)  # turn the result into a NumPy matrix

Converting to grayscale maps the three RGB channels down to a single 0~255 channel, where 0 is pure black and 255 is pure white.

Binarization requires a grayscale image first; a threshold then splits every pixel into one of two values, 0 or 255. THRESHOLD in the code above is that threshold, ranging from 0 to 255; I set it to 80.
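A tiny self-contained example of that split (the 2*3 array here is made up for illustration; values at or below the threshold become 0, values above it become 255):

import numpy as np
import cv2 as cv

gray = np.array([[30, 79, 80],
                 [81, 200, 255]], dtype=np.uint8)  # a fake grayscale patch
ret, binary = cv.threshold(gray, 80, 255, cv.THRESH_BINARY)
print(binary)
# [[  0   0   0]
#  [255 255 255]]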

Intermediate Knowledge

Matrix Processing: Shrinking the Matrix

For my first tests I used a ten-yuan second-hand driver-free USB camera with a 640*480 capture resolution. Traversal is usually done with a double for loop plus a per-pixel check, which is extremely time-consuming, so how the matrix is handled matters a great deal. Method one: shrink the matrix by cropping.

X_LEFT_CUT_NUM = 199
X_RIGHT_CUT_NUM = 439
Y_HIGH_CUT_NUM = 439
Y_LOW_CUT_NUM = 239
data_after = data[X_LEFT_CUT_NUM:X_RIGHT_CUT_NUM, Y_LOW_CUT_NUM:Y_HIGH_CUT_NUM]  # crop the image down to a 240*200 region

Here data is the matrix obtained from the binarization step above.

[Figure: the cropped region highlighted in orange on the full frame]

What the crop keeps is the orange region. For car tracking this is exactly the region that matters most: it is closest to the car and is where the car will go next. Compared with traversing the full 480*640 matrix, traversing this one takes far less time.
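The saving is easy to measure directly; here is a rough timing sketch of my own (the nested Python loop is the dominant cost, which foreshadows the frame-rate problem below):

import time
import numpy as np

full = np.random.randint(0, 2, (480, 640)).astype(np.uint8) * 255  # stand-in binary frame
crop = full[199:439, 239:439]  # the 240*200 region from above

for name, img in (("full", full), ("crop", crop)):
    t0 = time.time()
    m = 0
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            if img[i, j] == 0:
                m += 1
    print(name, img.shape, "%.3f s" % (time.time() - t0))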

[Figure: frame-rate readout while traversing the cropped matrix]

Still, it was painfully slow. My test platform is an NVIDIA Jetson Nano, and I was getting only 2 FPS, which is far from ideal for this board, so I looked into why. What makes the Nano stronger than a Raspberry Pi is its 128 CUDA cores, which greatly speed up image processing and streaming computation. But since no tensor operations were being used yet, everything still hinged on the CPU's speed.

[Figure: timestamped console output showing the slow loop]

Everything became clear once I added timestamps: traversing the matrix was simply too slow.
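The timestamp check is nothing more than a per-iteration delta; a minimal sketch of how the FPS numbers in this article are obtained (the full version appears in the complete code at the end):

import time

start_time = time.time()
for _ in range(100):  # stands in for the capture loop
    # ... read and process one frame here ...
    elapsed = time.time() - start_time
    if elapsed > 0:  # guard against division by zero
        print("FPS:", 1 / elapsed)
    start_time = time.time()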

Uniform Sampling

[Figure: illustration for uniform sampling]

For this problem, one remedy comes to mind: uniform sampling. We do not really need to examine every line when processing a frame. For example, instead of traversing all 200 lines, traversing a small sample of them is enough to capture the trend.

source = 0  # sum of the black pixels' coordinates
m = 0  # number of black pixels
y = int(200 / 20)  # number of sampled lines (one in every twenty)
x = 240  # length of each line
for i in range(y):
    for j in range(x):
        if pred[j, i * 20] == 0:
            m += 1
            source += j

Here I sampled one line in every twenty, and sure enough the frame rate went up. It reached a whole 8 FPS...
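For reference, NumPy slicing can pull out the sampled lines without any Python loop at all; a sketch with a stand-in matrix (my addition, not the code running on the car):

import numpy as np

pred = (np.random.randint(0, 2, (240, 200)) * 255).astype(np.uint8)  # stand-in frame
sampled = pred[:, ::20]  # every 20th line along the second axis, in one shot
js, _ = np.nonzero(sampled == 0)  # coordinates of the black pixels
if js.size:
    print(js.mean())  # average coordinate with no inner loops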

[Figure: screenshot; the window on the right is the processed image]

The window on the right is the image. But this still is not good enough for line following, and we have squeezed the matrix about as far as it will go, so we have to look for other approaches.

Pooling

As mentioned above, because no tensor operations had been used so far, the strengths of the NVIDIA Jetson Nano were going to waste, so it was worth thinking about how to lean on them. As it happens, I had taken a short online computer-vision course over the summer, and pooling came to mind.

[Figure: diagram of max pooling and average pooling]

Pooling comes in two flavors, max pooling and average pooling. As the figure shows, the core idea is to use a kernel to shrink the image, letting local features stand in for the whole. Max pooling runs faster but loses some detail; average pooling takes the mean of all values under the kernel rather than a single one, so it is slightly slower but preserves more detail.

Pooling is different from convolution, although a pooling kernel can be viewed as a convolution with equal weights. Pooling only shrinks the image and changes its dimensions. We will not go deeper here, since pooling is all we use; it is enough to know that it replaces the whole with local features, shrinking the picture while still representing it well.
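The equal-weights remark can be verified directly: average pooling gives the same result as a convolution whose kernel entries are all 1/9 (a small check of my own, not part of the car's pipeline):

import numpy as np
import tensorflow as tf

x = tf.random.uniform((1, 9, 9, 1))  # dummy NHWC image
pooled = tf.nn.avg_pool(x, ksize=3, strides=3, padding='VALID')
kernel = tf.ones((3, 3, 1, 1)) / 9.0  # equal weights that sum to 1
convd = tf.nn.conv2d(x, kernel, strides=3, padding='VALID')
print(np.allclose(pooled.numpy(), convd.numpy(), atol=1e-6))  # True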

import tensorflow as tf

STRIDE = 3  # pooling stride
data_after = tf.convert_to_tensor(data, tf.float32, name='data_after')
data_after = tf.expand_dims(data_after, axis=0)  # add a batch dimension
data_after = tf.expand_dims(data_after, axis=3)  # add a channel dimension
out = tf.nn.avg_pool(data_after, [1, STRIDE, STRIDE, 1], [1, STRIDE, STRIDE, 1], 'SAME')  # average pooling; the two lists are the window size and the stride

Here data is the binarized matrix from above, with shape [240, 200]. Pooling, however, needs extra dimensions, otherwise the kernel cannot be applied. After the two expand_dims calls the shape is [1, 240, 200, 1], and pooling can proceed.
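The shape bookkeeping can be checked on a dummy tensor (a quick sketch):

import tensorflow as tf

t = tf.zeros((240, 200))
t = tf.expand_dims(tf.expand_dims(t, 0), 3)  # -> (1, 240, 200, 1), NHWC layout
print(t.shape)
print(tf.squeeze(t).shape)  # back to (240, 200)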

The pooling op's first argument is the matrix to process.

The second argument is the pooling window: the first and last entries of the list must be 1, and the two middle numbers give the window size, here 3*3.

The third argument is the stride with which the window moves across the image to produce the pooled result. As above, the first and last entries are 1, meaning every batch element and channel is covered, and the two middle numbers are the horizontal and vertical step sizes.

The fourth argument is "VALID" or "SAME", i.e. whether the borders are padded. If you are new to this and unsure, just pick "SAME"; it makes the pooled size easy to compute.
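The resulting sizes follow ceil(H / stride) for "SAME" and floor((H - k) / stride) + 1 for "VALID", which a quick check confirms (assuming the 240*200 frame from above):

import tensorflow as tf

x = tf.zeros((1, 240, 200, 1))
same = tf.nn.avg_pool(x, ksize=3, strides=3, padding='SAME')
valid = tf.nn.avg_pool(x, ksize=3, strides=3, padding='VALID')
print(same.shape)   # (1, 80, 67, 1):  ceil(240/3), ceil(200/3)
print(valid.shape)  # (1, 80, 66, 1):  floor((240-3)/3)+1, floor((200-3)/3)+1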

With the operations above (a 3*3 window and stride 3), the pooled image shrinks by a factor of 3 along each axis. I also changed the sampling interval to one line in every ten, and at runtime I now reached 30+ FPS. (That is in fact the ceiling; even after I later tightened the sampling to one line in every five, it still held 24+ FPS.)

[Figure: frame-rate readout after pooling, 30+ FPS]

Algorithm Optimization (the Key Part)

After pooling, the 240*200 matrix becomes an 80*67 matrix, and since we sample only one line in every ten, the resulting value may no longer represent the current scene well. So the algorithm itself needs optimizing.

Pooling Twice

As mentioned earlier, the 80*67 matrix came from cropping plus pooling, so its field of view is very narrow and cannot be used to judge anything farther ahead. The most direct consequence is that the car cannot go fast: once its speed outruns what the previous frame saw, visual tracking becomes meaningless.

Hence pooling twice: replace the crop with a second pooling pass, and a much larger field of view is gained. At this point I also switched to the onboard camera. Below is how to open it; a CSI camera cannot be opened with the usual VideoCapture(index) call and needs the GStreamer configuration below.

def gstreamer_pipeline(
        capture_width=1280,
        capture_height=720,
        display_width=1280,
        display_height=720,
        framerate=60,
        flip_method=0,
):
    return (
            "nvarguscamerasrc ! "
            "video/x-raw(memory:NVMM), "
            "width=(int)%d, height=(int)%d, "
            "format=(string)NV12, framerate=(fraction)%d/1 ! "
            "nvvidconv flip-method=%d ! "
            "video/x-raw, width=(int)%d, height=(int)%d, format=(string)BGRx ! "
            "videoconvert ! "
            "video/x-raw, format=(string)BGR ! appsink"
            % (
                capture_width,
                capture_height,
                framerate,
                flip_method,
                display_width,
                display_height,
            )
    )

capture = cv.VideoCapture(gstreamer_pipeline(flip_method=0), cv.CAP_GSTREAMER)  # create a VideoCapture object
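If the pipeline string is wrong, the capture fails silently, so it is worth a defensive check (my addition, not in the original code):

if not capture.isOpened():
    raise RuntimeError("failed to open the CSI camera; check the GStreamer pipeline")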

The frame is now 1280*720; after two pooling passes with a 3*3 window and stride 3 it comes down to about 142*80, with uniform sampling of one line in every ten.
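The size arithmetic can be sanity-checked on a dummy tensor. Note that 'SAME' padding actually rounds up to 143 columns; the loop bound int(1280 / 3 / 3) = 142 used below simply stays one column inside the padded edge:

import tensorflow as tf

x = tf.zeros((1, 720, 1280, 1))  # one camera frame, NHWC layout
out = tf.nn.max_pool(x, 3, 3, 'SAME')  # first 3*3, stride-3 pooling
out = tf.nn.max_pool(out, 3, 3, 'SAME')  # second pooling pass
print(out.shape)  # (1, 80, 143, 1)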

[Figure: wide-field footage while the car is moving, around 30 FPS]

I was moving the camera at the time, and you can see the field of view is much larger, yet it still holds around 30 FPS in this fairly complex scene. (Later, with sampling at one line in every five, it stayed around 24 FPS.) At this point both the field of view and the fidelity of the picture are in good shape.

Some readers will ask: why pool twice rather than pool once with a larger kernel and stride? First, a pooling window outputs only one value, i.e. one feature, so every other feature under that window is swallowed; the picture obtained with a large kernel is never as finely detailed as one obtained by pooling twice. Second, AlexNet's championship win in the AI contest of its day showed that stacking several small kernels works better than a single large kernel.

Optimizing the Decision

[Figure: a "straight" frame; the blank area on the left needs no checking]

For "go straight", the blank area on the left side needs no checking at all; scanning it wastes time and serves no purpose.

The same is true for "turn right".

We can therefore use the command produced for the previous frame to choose the scan range for the current frame, which neatly solves the problem above.

Then a further thought: is the blank area on the right side of the track equally unnecessary to check?

Here I designed a simple rule: while scanning each row of the matrix, once the track has been found, if several consecutive white pixels follow, break out of the current row's loop immediately.

STRIDE = 3
CAPTURE_WIDTH = 1280  # capture width
CAPTURE_HEIGHT = 720  # capture height
order = ""  # the previous command
RANGE_NUM = 10  # allowed deviation for "forward"
while True:
    source = 0  # sum of the black pixels' coordinates
    black_point_num = 0  # number of black pixels
    x = int(CAPTURE_WIDTH / STRIDE / STRIDE)
    y = int(CAPTURE_HEIGHT / STRIDE / STRIDE)
    opt = 0
    if order == "forward":
        opt = int(x / 4)
    elif order == "turn right":
        opt = int(x / 3)
    else:
        opt = 0
    for i in range(y):
        row = 0  # black pixels in the current row
        row_white_point_num = 0  # consecutive white pixels in the current row
        for j in range(opt, x):
            if pred[i, j] == 0:
                row += 1
                row_white_point_num = 0
                source += j
            else:
                row_white_point_num += 1
                if row > 5 and row_white_point_num > 5:
                    break
        black_point_num += row
    source /= black_point_num  # average horizontal coordinate
    source -= x / 2
    if abs(source) <= RANGE_NUM:
        print("forward")
        order = "forward"
    elif source > 0:
        print("turn right")
        order = "turn right"
    else:
        print("turn left")
        order = "turn left"

To verify the gain from this, I disabled uniform sampling, so the test traverses the full 142*80 matrix.

[Figure: frame-rate readout without the decision optimization, about 6.5 FPS]

Without the decision optimization, the frame rate sits at about 6.5 FPS.

[Figure: frame-rate readouts with the decision optimization, about 12 FPS]

With the decision optimization, the frame rate is about 12 FPS: efficiency nearly doubled.

Complete Code

Before optimization

# Author : Tian.Z.L
# Date   : 2022/1/9  21:33
# File   : vision.PY
# IDE    : PyCharm
import time
import Adafruit_SSD1306
import cv2 as cv
import numpy as np
import tensorflow as tf

from PIL import Image

THRESHOLD = 100  # binarization threshold
STRIDE = 3
X_LEFT_CUT_NUM = 199
X_RIGHT_CUT_NUM = 439
Y_HIGH_CUT_NUM = 439
Y_LOW_CUT_NUM = 239
RANGE_NUM = 10

# Note: check which I2C bus is in use and set i2c_bus accordingly
OLED = Adafruit_SSD1306.SSD1306_128_64(rst=None, i2c_bus=1, gpio=1)
OLED.begin()  # initialize the screen
OLED.clear()  # and clear it
OLED.display()
def gstreamer_pipeline(
        capture_width=1280,
        capture_height=720,
        display_width=1280,
        display_height=720,
        framerate=60,
        flip_method=0,
):
    return (
            "nvarguscamerasrc ! "
            "video/x-raw(memory:NVMM), "
            "width=(int)%d, height=(int)%d, "
            "format=(string)NV12, framerate=(fraction)%d/1 ! "
            "nvvidconv flip-method=%d ! "
            "video/x-raw, width=(int)%d, height=(int)%d, format=(string)BGRx ! "
            "videoconvert ! "
            "video/x-raw, format=(string)BGR ! appsink"
            % (
                capture_width,
                capture_height,
                framerate,
                flip_method,
                display_width,
                display_height,
            )
    )

capture = cv.VideoCapture(gstreamer_pipeline(flip_method=0), cv.CAP_GSTREAMER)  # create a VideoCapture object
start_time = time.time()
while True:
    ret, frame = capture.read()  # read the video frame by frame
    gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)  # convert each frame to grayscale
    retval, black_Write = cv.threshold(gray, THRESHOLD, 255, cv.THRESH_BINARY)  # binarize the grayscale image
    data = np.array(black_Write)  # turn the frame into a NumPy matrix
    # data_after = data[X_LEFT_CUT_NUM:X_RIGHT_CUT_NUM, Y_LOW_CUT_NUM:Y_HIGH_CUT_NUM]  # crop the image down to 240*200
    data_after = tf.convert_to_tensor(data, tf.float32, name='data_after')
    data_after = tf.expand_dims(data_after, axis=0)  # add a batch dimension
    data_after = tf.expand_dims(data_after, axis=3)  # add a channel dimension
    # out = tf.nn.avg_pool(data_after, [1, STRIDE, STRIDE, 1], [1, STRIDE, STRIDE, 1], 'SAME')  # average pooling
    out = tf.nn.max_pool(data_after, [1, STRIDE, STRIDE, 1], [1, STRIDE, STRIDE, 1], 'SAME')  # max pooling; the lists give window size and stride
    out = tf.nn.max_pool(out, [1, STRIDE, STRIDE, 1], [1, STRIDE, STRIDE, 1], 'SAME')
    # print(out)
    out = tf.squeeze(out)  # drop all size-1 dimensions, leaving just the binary image
    pred = np.array(out, np.uint8)
    cv.imshow('123', pred)  # show the result
    source = 0  # sum of the black pixels' coordinates
    m = 0  # number of black pixels
    y = 143  # horizontal size of the frame
    x = 80  # vertical size of the frame
    for i in range(y):
        for j in range(x):
            if pred[j, i] == 0:
                m += 1
                source += j
    if cv.waitKey(1) & 0xFF == ord('q'):  # press q to stop
        break
    if (time.time() - start_time) != 0:  # print the frame rate in real time
        # print((time.time() - start_time))
        print("FPS: ", 1 / (time.time() - start_time))
    try:
        source /= m  # average coordinate
        source -= x / 2
        if abs(source) <= 10:
            print("forward")
        elif source > 0:
            print("turn left")
        else:
            print("turn right")
        start_time = time.time()
    except ZeroDivisionError:  # no black pixels found
        start_time = time.time()
        print("stop")
        continue
capture.release()  # release the capture and destroy the windows
cv.destroyAllWindows()

After optimization

# Author : Tian.Z.L
# Date   : 2022/1/11  17:15
# File   : visionByOptimization.PY
# IDE    : PyCharm
import time
import cv2 as cv
import numpy as np
import tensorflow as tf

THRESHOLD = 100  # binarization threshold
STRIDE = 3
CAPTURE_WIDTH = 1280  # capture width
CAPTURE_HEIGHT = 720  # capture height
FRAMERATE = 60  # capture frame rate
RANGE_NUM = 10
order = ""  # the previous command

# open the onboard (CSI) camera
def gstreamer_pipeline(
        capture_width=CAPTURE_WIDTH,
        capture_height=CAPTURE_HEIGHT,
        display_width=CAPTURE_WIDTH,
        display_height=CAPTURE_HEIGHT,
        framerate=FRAMERATE,
        flip_method=0,
):
    return (
            "nvarguscamerasrc ! "
            "video/x-raw(memory:NVMM), "
            "width=(int)%d, height=(int)%d, "
            "format=(string)NV12, framerate=(fraction)%d/1 ! "
            "nvvidconv flip-method=%d ! "
            "video/x-raw, width=(int)%d, height=(int)%d, format=(string)BGRx ! "
            "videoconvert ! "
            "video/x-raw, format=(string)BGR ! appsink"
            % (
                capture_width,
                capture_height,
                framerate,
                flip_method,
                display_width,
                display_height,
            )
    )

capture = cv.VideoCapture(gstreamer_pipeline(flip_method=0), cv.CAP_GSTREAMER)  # create a VideoCapture object
start_time = time.time()
while True:
    ret, frame = capture.read()  # read the video frame by frame
    gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)  # convert each frame to grayscale
    retval, black_Write = cv.threshold(gray, THRESHOLD, 255, cv.THRESH_BINARY)  # binarize the grayscale image
    data = np.array(black_Write)  # turn the frame into a NumPy matrix
    data_after = tf.convert_to_tensor(data, tf.float32, name='data_after')
    data_after = tf.expand_dims(data_after, axis=0)  # add a batch dimension
    data_after = tf.expand_dims(data_after, axis=3)  # add a channel dimension
    # out = tf.nn.avg_pool(data_after, [1, STRIDE, STRIDE, 1], [1, STRIDE, STRIDE, 1], 'SAME')  # average pooling
    out = tf.nn.max_pool(data_after, [1, STRIDE, STRIDE, 1], [1, STRIDE, STRIDE, 1], 'SAME')  # max pooling; the lists give window size and stride
    out = tf.nn.max_pool(out, [1, STRIDE, STRIDE, 1], [1, STRIDE, STRIDE, 1], 'SAME')
    # print(out)
    out = tf.squeeze(out)  # drop all size-1 dimensions, leaving just the binary image
    pred = np.array(out, np.uint8)
    cv.imshow('123', pred)  # show the result

    source = 0  # sum of the black pixels' coordinates
    black_point_num = 0  # number of black pixels
    x = int(CAPTURE_WIDTH / STRIDE / STRIDE)
    y = int(CAPTURE_HEIGHT / STRIDE / STRIDE)
    opt = 0
    if order == "forward":
        opt = int(x / 4)
    elif order == "turn right":
        opt = int(x / 3)
    else:
        opt = 0
    for i in range(y):
        row = 0  # black pixels in the current row
        row_white_point_num = 0  # consecutive white pixels in the current row
        for j in range(opt, x):
            if pred[i, j] == 0:
                row += 1
                row_white_point_num = 0
                source += j
            else:
                row_white_point_num += 1
                if row > 5 and row_white_point_num > 5:
                    break
        black_point_num += row
    if cv.waitKey(1) & 0xFF == ord('q'):  # press q to stop
        break
    if (time.time() - start_time) != 0:  # print the frame rate in real time
        # print((time.time() - start_time))
        print("FPS: ", 1 / (time.time() - start_time))
    try:
        source /= black_point_num  # average horizontal coordinate
        source -= x / 2
        if abs(source) <= RANGE_NUM:
            print("forward")
            order = "forward"
        elif source > 0:
            print("turn right")
            order = "turn right"
        else:
            print("turn left")
            order = "turn left"
        start_time = time.time()
    except ZeroDivisionError:  # no black pixels found
        start_time = time.time()
        print("stop")
        continue
capture.release()  # release the capture and destroy the windows
cv.destroyAllWindows()

Original: https://blog.csdn.net/qq_42500340/article/details/122447748
Author: 叫我田小霖啦
Title: OpenCV+TensorFlow简单的机器小车传统视觉寻迹
