Object Detection Tricks | [Trick 11] Scaling and Displaying Labels

July 12, 2023, 8:16 AM • Artificial Intelligence

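If you spot any errors, please point them out.

The earlier posts have covered most of the tricks and the processing pipeline. The two functions here only rescale boxes and draw them, so they are less a trick than a pair of utilities.

The two functions introduced below are:

scale_coords: maps the current label (box) coordinates back to the coordinates and size of the original image
draw_box: draws the predicted bounding boxes on the original image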

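Contents

1. The scale_coords function
2. The draw_box function

The scale_coords function
* Main idea:

Compute the ratio between the resized image and the image before resizing, and from that ratio work out how much padding was added on the left/right and top/bottom. Subtract the padding from the bounding boxes, then divide by the ratio, which gives the boxes in the coordinates of the original image. Finally, clip the boxes to the image bounds so they do not overflow, and return the rescaled predictions.

* Reference code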

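def scale_coords(img1_shape, coords, img0_shape, ratio_pad=None):
    """
    Map predicted box coordinates back to the scale of the original image.
    :param img1_shape: shape of the resized image, height first, then width
    :param coords: predicted boxes in xyxy format, shape (n, 4); modified in place
    :param img0_shape: shape of the image before resizing, height first, then width
    :param ratio_pad: the scaling ratio and the padding used during resizing
    :return: coords rescaled to img0_shape
    """

    if ratio_pad is None:
        # Recover gain and padding from the two shapes. max/max assumes the longer
        # side was the limiting one during resizing (true for a square target).
        gain = max(img1_shape) / max(img0_shape)  # gain = resized / original
        pad = (img1_shape[1] - img0_shape[1] * gain) / 2, (img1_shape[0] - img0_shape[0] * gain) / 2  # (pad_x, pad_y)
    else:
        gain = ratio_pad[0][0]
        pad = ratio_pad[1]

    coords[:, [0, 2]] -= pad[0]  # remove the horizontal padding from x1, x2
    coords[:, [1, 3]] -= pad[1]  # remove the vertical padding from y1, y2
    coords[:, :4] /= gain        # undo the scaling
    clip_coords(coords, img0_shape)
    return coords

def clip_coords(boxes, img_shape):
    """Clip xyxy boxes (a torch tensor) in place to the image bounds; img_shape is height first, then width."""

    boxes[:, 0].clamp_(0, img_shape[1])  # x1
    boxes[:, 1].clamp_(0, img_shape[0])  # y1
    boxes[:, 2].clamp_(0, img_shape[1])  # x2
    boxes[:, 3].clamp_(0, img_shape[0])  # y2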

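The key point is inferring the padding on each side from the ratio: the box values must have this padding offset subtracted before they correspond to positions on the actual original image.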
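As a quick check of the gain and padding arithmetic, here is a minimal NumPy sketch (not the author's code) that pushes one box through a letterbox-style resize and back. The helper names `letterbox_params`, `to_letterbox`, and `to_original` are made up for illustration, and the gain is written as the minimum of the two side ratios, which gives the same value as `max(img1_shape) / max(img0_shape)` for the square target used here.

```python
import numpy as np

def letterbox_params(img0_shape, img1_shape):
    """Return (gain, (pad_x, pad_y)) for fitting img0 (h, w) inside img1 (h, w)."""
    gain = min(img1_shape[0] / img0_shape[0], img1_shape[1] / img0_shape[1])
    pad_x = (img1_shape[1] - img0_shape[1] * gain) / 2
    pad_y = (img1_shape[0] - img0_shape[0] * gain) / 2
    return gain, (pad_x, pad_y)

def to_letterbox(box, gain, pad):
    """Forward direction: original-image xyxy box -> resized-and-padded image."""
    box = np.asarray(box, dtype=float) * gain
    box[[0, 2]] += pad[0]
    box[[1, 3]] += pad[1]
    return box

def to_original(box, gain, pad):
    """Inverse direction, the same steps scale_coords performs (without clipping)."""
    box = np.asarray(box, dtype=float).copy()
    box[[0, 2]] -= pad[0]
    box[[1, 3]] -= pad[1]
    return box / gain

img0_shape = (375, 500)   # original image, (h, w)
img1_shape = (640, 640)   # network input, (h, w)
gain, pad = letterbox_params(img0_shape, img1_shape)

box0 = np.array([100.0, 50.0, 300.0, 200.0])   # xyxy in the original image
box1 = to_letterbox(box0, gain, pad)           # xyxy in the 640x640 input
restored = to_original(box1, gain, pad)        # back in the original image

print(gain, pad)
print(box1)
print(restored)
```

Here the width is the limiting side, so all of the padding is vertical: the 375-pixel height scales to 480 and the remaining 160 rows are split evenly above and below the image.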

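The draw_box function

The detected bounding boxes have to be drawn on the original image from the predicted box coordinates; that is how the individual boxes in a detection result come about. Each class corresponds to one color, and the class name and confidence are displayed with each box.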

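A final round of filtering also happens here: any predicted box left after NMS whose confidence is still below the threshold of 0.1 is simply ignored and not displayed. There is also a small trick: two dictionaries are built and keyed by predicted box, one storing the strings to display and one storing the box color (the color is chosen by class).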
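To make the two dictionaries concrete, here is a small self-contained sketch with made-up data. The three-color list stands in for STANDARD_COLORS, and the class names, boxes, and scores are hypothetical.

```python
import collections

COLORS = ['AliceBlue', 'Chartreuse', 'Aqua']   # shortened stand-in for STANDARD_COLORS
category_index = {1: 'person', 2: 'dog'}       # hypothetical class index -> class name

boxes = [(10.0, 20.0, 110.0, 220.0), (30.0, 40.0, 90.0, 100.0), (5.0, 5.0, 50.0, 50.0)]
classes = [1, 2, 1]
scores = [0.75, 0.5, 0.05]
thresh = 0.1

box_to_display_str_map = collections.defaultdict(list)  # box tuple -> list of label strings
box_to_color_map = {}                                    # box tuple -> color name

for box, cls, score in zip(boxes, classes, scores):
    if score <= thresh:
        continue  # below the display threshold, skip this box
    name = category_index.get(cls, 'N/A')
    box_to_display_str_map[box].append('{}: {}%'.format(name, int(100 * score)))
    box_to_color_map[box] = COLORS[cls % len(COLORS)]

for box, color in box_to_color_map.items():
    print(box, color, box_to_display_str_map[box])
```

Using the box tuple as the key lets both dictionaries be looked up from the same loop variable when drawing.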

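All that remains is to iterate over every predicted box in the dictionary and draw it at the right position, using its coordinates and its text (class name: confidence). The core of it is: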


for box, color in box_to_color_map.items():

    # box is an (xmin, ymin, xmax, ymax) tuple in original-image pixels
    xmin, ymin, xmax, ymax = box
    (left, right, top, bottom) = (xmin * 1, xmax * 1, ymin * 1, ymax * 1)

    # draw the box outline as one closed polyline
    draw.line([(left, top), (left, bottom), (right, bottom), (right, top), (left, top)],
              width=line_thickness, fill=color)

    # draw the "class name: confidence" label for this box
    draw_text(draw, box_to_display_str_map, box, left, right, top, bottom, color)

The complete draw_box code follows, with comments:

import collections
from PIL import Image
import PIL.ImageDraw as ImageDraw
import PIL.ImageFont as ImageFont
import numpy as np

STANDARD_COLORS = [
    'AliceBlue', 'Chartreuse', 'Aqua', 'Aquamarine', 'Azure', 'Beige', 'Bisque',
    'BlanchedAlmond', 'BlueViolet', 'BurlyWood', 'CadetBlue', 'AntiqueWhite',
    'Chocolate', 'Coral', 'CornflowerBlue', 'Cornsilk', 'Crimson', 'Cyan',
    'DarkCyan', 'DarkGoldenRod', 'DarkGrey', 'DarkKhaki', 'DarkOrange',
    'DarkOrchid', 'DarkSalmon', 'DarkSeaGreen', 'DarkTurquoise', 'DarkViolet',
    'DeepPink', 'DeepSkyBlue', 'DodgerBlue', 'FireBrick', 'FloralWhite',
    'ForestGreen', 'Fuchsia', 'Gainsboro', 'GhostWhite', 'Gold', 'GoldenRod',
    'Salmon', 'Tan', 'HoneyDew', 'HotPink', 'IndianRed', 'Ivory', 'Khaki',
    'Lavender', 'LavenderBlush', 'LawnGreen', 'LemonChiffon', 'LightBlue',
    'LightCoral', 'LightCyan', 'LightGoldenRodYellow', 'LightGray', 'LightGrey',
    'LightGreen', 'LightPink', 'LightSalmon', 'LightSeaGreen', 'LightSkyBlue',
    'LightSlateGray', 'LightSlateGrey', 'LightSteelBlue', 'LightYellow', 'Lime',
    'LimeGreen', 'Linen', 'Magenta', 'MediumAquaMarine', 'MediumOrchid',
    'MediumPurple', 'MediumSeaGreen', 'MediumSlateBlue', 'MediumSpringGreen',
    'MediumTurquoise', 'MediumVioletRed', 'MintCream', 'MistyRose', 'Moccasin',
    'NavajoWhite', 'OldLace', 'Olive', 'OliveDrab', 'Orange', 'OrangeRed',
    'Orchid', 'PaleGoldenRod', 'PaleGreen', 'PaleTurquoise', 'PaleVioletRed',
    'PapayaWhip', 'PeachPuff', 'Peru', 'Pink', 'Plum', 'PowderBlue', 'Purple',
    'Red', 'RosyBrown', 'RoyalBlue', 'SaddleBrown', 'Green', 'SandyBrown',
    'SeaGreen', 'SeaShell', 'Sienna', 'Silver', 'SkyBlue', 'SlateBlue',
    'SlateGray', 'SlateGrey', 'Snow', 'SpringGreen', 'SteelBlue', 'GreenYellow',
    'Teal', 'Thistle', 'Tomato', 'Turquoise', 'Violet', 'Wheat', 'White',
    'WhiteSmoke', 'Yellow', 'YellowGreen'
]

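def filter_low_thresh(boxes, scores, classes, category_index, thresh, box_to_display_str_map, box_to_color_map):
    """
    1. Filter out boxes whose score is not above thresh.
    2. Build the display string and the color for each remaining box and store them
       in box_to_display_str_map and box_to_color_map.
    :param boxes: final predictions, shape (num_boxes, 4) as x1, y1, x2, y2 relative to the
                  original image, e.g. (7, 4); grouped by class and sorted by descending score
    :param scores: score of each predicted box, e.g. shape (7,); same ordering as boxes
    :param classes: class of each predicted box, e.g. shape (7,); same ordering as boxes
    :param category_index: class information (read from data/pascal_voc_classes.json)
    :param thresh: score threshold (default 0.1); boxes with lower scores are dropped
    :param box_to_display_str_map: stores the display strings of each box: tuple(box) -> list of strings
    :param box_to_color_map: stores the outline color of each box
    """

    for i in range(boxes.shape[0]):

        if scores[i] > thresh:

            # tuples are hashable, so the box itself can be the dictionary key
            box = tuple(boxes[i].tolist())

            if classes[i] in category_index.keys():
                class_name = category_index[classes[i]]
            else:
                class_name = 'N/A'

            # label text: "class name: confidence%"
            display_str = str(class_name)
            display_str = '{}: {}%'.format(display_str, int(100 * scores[i]))

            box_to_display_str_map[box].append(display_str)
            # the color is chosen by class index
            box_to_color_map[box] = STANDARD_COLORS[
                classes[i] % len(STANDARD_COLORS)]

        else:
            # continue rather than break: scores are sorted within each class,
            # so one low score does not mean every later box is low as well
            continue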

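def draw_text(draw, box_to_display_str_map, box, left, right, top, bottom, color):
    """
    Draw the display strings of one box, each on a filled background.
    :param draw: an object that can draw on the given image
    :param box_to_display_str_map: display strings of every box
    :param box: the current predicted box (xyxy)
    :param left: left edge of the box
    :param right: right edge of the box
    :param top: top edge of the box
    :param bottom: bottom edge of the box
    :param color: background color of the label (same as the box outline color)
    :return:
    """

    try:
        font = ImageFont.truetype('arial.ttf', 20)
    except IOError:
        font = ImageFont.load_default()

    # font.getsize() was removed in Pillow 10. getbbox() returns (left, top, right, bottom);
    # its right and bottom values correspond to the old (width, height).
    display_str_heights = [font.getbbox(ds)[3] for ds in box_to_display_str_map[box]]

    # total label height, including a 5% margin above and below each string
    total_display_str_height = (1 + 2 * 0.05) * sum(display_str_heights)

    # put the label above the box if there is room, otherwise below it
    if top > total_display_str_height:
        text_bottom = top
    else:
        text_bottom = bottom + total_display_str_height

    # draw the strings from the bottom one upwards
    for display_str in box_to_display_str_map[box][::-1]:

        _, _, text_width, text_height = font.getbbox(display_str)
        margin = np.ceil(0.05 * text_height)

        # filled background rectangle for the text
        draw.rectangle([(left, text_bottom - text_height - 2 * margin),
                        (left + text_width, text_bottom)], fill=color)

        draw.text((left + margin, text_bottom - text_height - margin),
                  display_str,
                  fill='black',
                  font=font)
        text_bottom -= text_height - 2 * margin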

def draw_box(image, boxes, classes, scores, category_index, thresh=0.1, line_thickness=3):
    """
    :param image: original image, RGB, HWC numpy array, e.g. (375, 500, 3);
                  img_o[:, :, ::-1] converts BGR to RGB
    :param boxes: final predictions, shape (num_boxes, 4) as x1, y1, x2, y2 relative to the
                  original image, e.g. (7, 4); sorted by descending score; numpy array
    :param classes: class of each predicted box, e.g. shape (7,); grouped by class and sorted by descending score; numpy array
    :param scores: score of each predicted box, e.g. shape (7,); grouped by class and sorted by descending score; numpy array
    :param category_index: class information (read from data/pascal_voc_classes.json)
    :param thresh: score threshold (default 0.1); boxes with lower scores are dropped
    :param line_thickness: thickness of the box outline
    :return: the image (PIL Image) with the boxes drawn on it
    """

    box_to_display_str_map = collections.defaultdict(list)
    box_to_color_map = collections.defaultdict(str)

    # fill both dictionaries, skipping low-score boxes
    filter_low_thresh(boxes, scores, classes, category_index, thresh, box_to_display_str_map, box_to_color_map)

    # PIL draws on Image objects, so convert numpy arrays first
    if isinstance(image, np.ndarray):
        image = Image.fromarray(image)
    draw = ImageDraw.Draw(image)

    for box, color in box_to_color_map.items():

        xmin, ymin, xmax, ymax = box
        (left, right, top, bottom) = (xmin * 1, xmax * 1, ymin * 1, ymax * 1)

        # box outline as one closed polyline
        draw.line([(left, top), (left, bottom), (right, bottom), (right, top), (left, top)],
                  width=line_thickness, fill=color)

        draw_text(draw, box_to_display_str_map, box, left, right, top, bottom, color)
    return image

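If the code is hard to follow, that is fine: pass in the data each parameter expects and it can be used directly. The inputs are the image itself, the bounding boxes, the classes, the confidences, and a dictionary of class index to class name pairs. The threshold and the line thickness can be set as you like.

So, overall, the two functions above can simply be treated as utilities.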
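One detail when preparing the class dictionary: filter_low_thresh looks classes up by their integer index, while keys loaded from JSON are always strings. The sketch below shows one way to build an index-to-name dictionary. The JSON layout (class name mapped to class index) is an assumption for illustration, since the actual contents of data/pascal_voc_classes.json are not shown in this post.

```python
import json

# Hypothetical file content, assumed to be laid out as class name -> class index.
raw = '{"aeroplane": 1, "bicycle": 2, "bird": 3}'
name_to_index = json.loads(raw)

# Invert it to class index (int) -> class name, the form draw_box expects.
category_index = {int(index): name for name, index in name_to_index.items()}

print(category_index)
```

If the file already maps index to name, the keys come back from `json.load` as strings, and converting them with `int(...)` in the same way gives integer keys.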

Original: https://blog.csdn.net/weixin_44751294/article/details/124365129
Author: Clichong
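Title: Object Detection Tricks | [Trick 11] Scaling and Displaying Labels

Original articles are protected by copyright. When reposting, please cite the source: https://www.johngo689.com/687273/

Reposted articles are protected by the original author's copyright. When reposting, please credit the original author!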
