【目标检测】数据增强：DOTA数据集

2023年6月17日上午8:48 • 人工智能 • 阅读 94

前言

之前对于xml格式的YOLO数据集，之前记录过如何用imgaug对其进行数据增强。不过DOTA数据集采用的是txt格式的旋转框标注，因此不能直接套用，只能另辟蹊径。

DOTA数据集简介

DOTA数据集全称：Dataset for Object deTection in Aerial images
DOTA数据集v1.0共收录2806张4000 × 4000的图片，总共包含188282个目标。

DOTA数据集论文介绍：https://arxiv.org/pdf/1711.10398.pdf
数据集官网：https://captain-whu.github.io/DOTA/dataset.html

DOTA数据集总共有3个版本

; DOTAV1.0

类别数目：15
类别名称：plane, ship, storage tank, baseball diamond, tennis court, basketball court, ground track field, harbor, bridge, large vehicle, small vehicle, helicopter, roundabout, soccer ball field , swimming pool

DOTAV1.5

类别数目：16
类别名称：plane, ship, storage tank, baseball diamond, tennis court, basketball court, ground track field, harbor, bridge, large vehicle, small vehicle, helicopter, roundabout, soccer ball field, swimming pool , container crane

DOTAV2.0

类别数目：18
类别名称：plane, ship, storage tank, baseball diamond, tennis court, basketball court, ground track field, harbor, bridge, large vehicle, small vehicle, helicopter, roundabout, soccer ball field, swimming pool, container crane, airport , helipad

DOTAV2.0版本，备份在我的GitHub上。
https://github.com/zstar1003/Dataset

本实验演示选择的是DOTA数据集中的一张图片，原图如下：

标签如下：

每个对象有10个数值，前8个代表一个矩形框四个角的坐标，第9个表示对象类别，第10个表示识别难易程度，0表示简单，1表示困难。

使用JDet中的 visualization.py将标签可视化，效果如下图所示：

注：由于可视化代码需要坐标点输入为整数，因此后续给输出对象坐标点进行了取整操作，这会损失一些精度。

; 数据增强及可视化

数据增强代码主要参考的是这篇博文：目标识别小样本数据扩增

调整亮度

这里通过 skimage.exposure.adjust_gamma来调整亮度：


def changeLight(img, inputtxt, outputiamge, outputtxt):

    flag = random.uniform(0.5, 1.5)
    label = round(flag, 2)
    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    outputiamge = os.path.join(outputiamge + "/" + filename + "_" + str(label) + ".png")
    outputtxt = os.path.join(outputtxt + "/" + filename + "_" + str(label) + extension)

    ima_gamma = exposure.adjust_gamma(img, 0.5)

    shutil.copyfile(inputtxt, outputtxt)
    cv.imwrite(outputiamge, ima_gamma)

添加高斯噪声

添加高斯噪声需要先将像素点进行归一化(/255）


def gasuss_noise(image, inputtxt, outputiamge, outputtxt, mean=0, var=0.01):
    '''
    mean : 均值
    var : 方差
    '''
    image = np.array(image / 255, dtype=float)
    noise = np.random.normal(mean, var ** 0.5, image.shape)
    out = image + noise

    if out.min() < 0:
        low_clip = -1.

    else:
        low_clip = 0.

    out = np.clip(out, low_clip, 1.0)
    out = np.uint8(out * 255)

    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    outputiamge = os.path.join(outputiamge + "/" + filename + "_gasunoise_" + str(mean) + "_" + str(var) + ".png")
    outputtxt = os.path.join(outputtxt + "/" + filename + "_gasunoise_" + str(mean) + "_" + str(var) + extension)

    shutil.copyfile(inputtxt, outputtxt)
    cv.imwrite(outputiamge, out)

调整对比度


def ContrastAlgorithm(rgb_img, inputtxt, outputiamge, outputtxt):
    img_shape = rgb_img.shape
    temp_imag = np.zeros(img_shape, dtype=float)
    for num in range(0, 3):

        in_image = rgb_img[:, :, num]

        Imax = np.max(in_image)
        Imin = np.min(in_image)

        Omin, Omax = 0, 255

        a = float(Omax - Omin) / (Imax - Imin)
        b = Omin - a * Imin

        out_image = a * in_image + b

        out_image = out_image.astype(np.uint8)
        temp_imag[:, :, num] = out_image
    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    outputiamge = os.path.join(outputiamge + "/" + filename + "_contrastAlgorithm" + ".png")
    outputtxt = os.path.join(outputtxt + "/" + filename + "_contrastAlgorithm" + extension)
    shutil.copyfile(inputtxt, outputtxt)
    cv.imwrite(outputiamge, temp_imag)

旋转

旋转本质上是利用仿射矩阵，这里预留了接口参数，可以自定义旋转角度


def rotate_img_bbox(img, inputtxt, temp_outputiamge, temp_outputtxt, angle, scale=1):
    nAgree = angle
    size = img.shape
    w = size[1]
    h = size[0]
    for numAngle in range(0, len(nAgree)):
        dRot = nAgree[numAngle] * np.pi / 180
        dSinRot = math.sin(dRot)
        dCosRot = math.cos(dRot)

        nw = (abs(np.sin(dRot) * h) + abs(np.cos(dRot) * w)) * scale
        nh = (abs(np.cos(dRot) * h) + abs(np.sin(dRot) * w)) * scale

        (filepath, tempfilename) = os.path.split(inputtxt)
        (filename, extension) = os.path.splitext(tempfilename)
        outputiamge = os.path.join(temp_outputiamge + "/" + filename + "_rotate_" + str(nAgree[numAngle]) + ".png")
        outputtxt = os.path.join(temp_outputtxt + "/" + filename + "_rotate_" + str(nAgree[numAngle]) + extension)

        rot_mat = cv.getRotationMatrix2D((nw * 0.5, nh * 0.5), nAgree[numAngle], scale)
        rot_move = np.dot(rot_mat, np.array([(nw - w) * 0.5, (nh - h) * 0.5, 0]))
        rot_mat[0, 2] += rot_move[0]
        rot_mat[1, 2] += rot_move[1]

        rotat_img = cv.warpAffine(img, rot_mat, (int(math.ceil(nw)), int(math.ceil(nh))), flags=cv.INTER_LANCZOS4)
        cv.imwrite(outputiamge, rotat_img)

        save_txt = open(outputtxt, 'w')
        f = open(inputtxt)
        for line in f.readlines():
            line = line.split(" ")
            x1 = float(line[0])
            y1 = float(line[1])
            x2 = float(line[2])
            y2 = float(line[3])
            x3 = float(line[4])
            y3 = float(line[5])
            x4 = float(line[6])
            y4 = float(line[7])
            category = str(line[8])
            dif = str(line[9])

            point1 = np.dot(rot_mat, np.array([x1, y1, 1]))
            point2 = np.dot(rot_mat, np.array([x2, y2, 1]))
            point3 = np.dot(rot_mat, np.array([x3, y3, 1]))
            point4 = np.dot(rot_mat, np.array([x4, y4, 1]))
            x1 = round(point1[0], 3)
            y1 = round(point1[1], 3)
            x2 = round(point2[0], 3)
            y2 = round(point2[1], 3)
            x3 = round(point3[0], 3)
            y3 = round(point3[1], 3)
            x4 = round(point4[0], 3)
            y4 = round(point4[1], 3)

            string = str(int(x1)) + " " + str(int(y1)) + " " + str(int(x2)) + " " + str(int(y2)) + " " + str(
                int(x3)) + " " + str(int(y3)) + " " + str(int(x4)) + " " + str(int(y4)) + " " + category + " " + dif
            save_txt.write(string)

翻转图像

翻转图像使用 cv.flip进行


def filp_pic_bboxes(img, inputtxt, outputiamge, outputtxt):
    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    output_vert_flip_img = os.path.join(outputiamge + "/" + filename + "_vert_flip" + ".png")
    output_vert_flip_txt = os.path.join(outputtxt + "/" + filename + "_vert_flip" + extension)
    output_horiz_flip_img = os.path.join(outputiamge + "/" + filename + "_horiz_flip" + ".png")
    output_horiz_flip_txt = os.path.join(outputtxt + "/" + filename + "_horiz_flip" + extension)

    h, w, _ = img.shape

    vert_flip_img = cv.flip(img, 1)
    cv.imwrite(output_vert_flip_img, vert_flip_img)

    horiz_flip_img = cv.flip(img, 0)
    cv.imwrite(output_horiz_flip_img, horiz_flip_img)

    save_vert_txt = open(output_vert_flip_txt, 'w')
    save_horiz_txt = open(output_horiz_flip_txt, 'w')
    f = open(inputtxt)
    for line in f.readlines():
        line = line.split(" ")
        x1 = float(line[0])
        y1 = float(line[1])
        x2 = float(line[2])
        y2 = float(line[3])
        x3 = float(line[4])
        y3 = float(line[5])
        x4 = float(line[6])
        y4 = float(line[7])
        category = str(line[8])
        dif = str(line[9])

        vert_string = str(int(round(w - x1, 3))) + " " + str(int(y1)) + " " + str(int(round(w - x2, 3))) + " " + str(
            int(y2)) + " " + str(int(round(w - x3, 3))) + " " + str(int(y3)) + " " + str(
            int(round(w - x4, 3))) + " " + str(int(y4)) + " " + category + " " + dif
        horiz_string = str(int(x1)) + " " + str(int(round(h - y1, 3))) + " " + str(int(x2)) + " " + str(
            int(round(h - y2, 3))) + " " + str(int(x3)) + " " + str(int(round(h - y3, 3))) + " " + str(
            int(x4)) + " " + str(int(round(h - y4, 3))) + " " + category + " " + dif

        save_horiz_txt.write(horiz_string)
        save_vert_txt.write(vert_string)

平移图像

平移图像本质也是利用仿射矩阵


def shift_pic_bboxes(img, inputtxt, outputiamge, outputtxt):
    w = img.shape[1]
    h = img.shape[0]
    x_min = w
    x_max = 0
    y_min = h
    y_max = 0
    f = open(inputtxt)
    for line in f.readlines():
        line = line.split(" ")
        x1 = float(line[0])
        y1 = float(line[1])
        x2 = float(line[2])
        y2 = float(line[3])
        x3 = float(line[4])
        y3 = float(line[5])
        x4 = float(line[6])
        y4 = float(line[7])
        category = str(line[8])
        dif = str(line[9])

        x_min = min(x_min, x1, x2, x3, x4)
        y_min = min(y_min, y1, y2, y3, y4)
        x_max = max(x_max, x1, x2, x3, x4)
        y_max = max(y_max, y1, y2, y3, y4)

    d_to_left = x_min
    d_to_right = w - x_max
    d_to_top = y_min
    d_to_bottom = h - y_max

    x = random.uniform(-(d_to_left - 1) / 3, (d_to_right - 1) / 3)
    y = random.uniform(-(d_to_top - 1) / 3, (d_to_bottom - 1) / 3)

    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    if x >= 0 and y >= 0:
        outputiamge = os.path.join(
            outputiamge + "/" + filename + "_shift_" + str(round(x, 3)) + "_" + str(round(y, 3)) + ".png")
        outputtxt = os.path.join(
            outputtxt + "/" + filename + "_shift_" + str(round(x, 3)) + "_" + str(round(y, 3)) + extension)
    elif x >= 0 and y < 0:
        outputiamge = os.path.join(
            outputiamge + "/" + filename + "_shift_" + str(round(x, 3)) + "__" + str(round(abs(y), 3)) + ".png")
        outputtxt = os.path.join(
            outputtxt + "/" + filename + "_shift_" + str(round(x, 3)) + "__" + str(round(abs(y), 3)) + extension)
    elif x < 0 and y >= 0:
        outputiamge = os.path.join(
            outputiamge + "/" + filename + "_shift__" + str(round(abs(x), 3)) + "_" + str(round(y, 3)) + ".png")
        outputtxt = os.path.join(
            outputtxt + "/" + filename + "_shift__" + str(round(abs(x), 3)) + "_" + str(round(y, 3)) + extension)
    elif x < 0 and y < 0:
        outputiamge = os.path.join(
            outputiamge + "/" + filename + "_shift__" + str(round(abs(x), 3)) + "__" + str(round(abs(y), 3)) + ".png")
        outputtxt = os.path.join(
            outputtxt + "/" + filename + "_shift__" + str(round(abs(x), 3)) + "__" + str(round(abs(y), 3)) + extension)

    M = np.float32([[1, 0, x], [0, 1, y]])
    shift_img = cv.warpAffine(img, M, (img.shape[1], img.shape[0]))
    cv.imwrite(outputiamge, shift_img)

    save_txt = open(outputtxt, "w")
    f = open(inputtxt)
    for line in f.readlines():
        line = line.split(" ")
        x1 = float(line[0])
        y1 = float(line[1])
        x2 = float(line[2])
        y2 = float(line[3])
        x3 = float(line[4])
        y3 = float(line[5])
        x4 = float(line[6])
        y4 = float(line[7])
        category = str(line[8])
        dif = str(line[9])
        shift_str = str(int(round(x1 + x, 3))) + " " + str(int(round(y1 + y, 3))) + " " + str(int(round(x2 + x, 3))) + " " + str(int(round(y2 + y, 3))) + " " + str(int(round(x3 + x, 3))) + " " + str(int(round(y3 + y, 3))) + " " + str(int(round(x4 + x, 3))) + " " + str(int(round(y4 + y, 3))) + " " + category + " " + dif
        save_txt.write(shift_str)

裁剪图像

这里的裁剪图像指的是裁剪目标周围的背景，从外面向内收缩，直到碰到第一个对象框停止


def crop_img_bboxes(img, inputtxt, outputiamge, outputtxt):

    w = img.shape[1]
    h = img.shape[0]
    x_min = 0
    x_max = w
    y_min = 0
    y_max = h
    f = open(inputtxt)
    for line in f.readlines():
        line = line.split(" ")
        x1 = float(line[0])
        y1 = float(line[1])
        x2 = float(line[2])
        y2 = float(line[3])
        x3 = float(line[4])
        y3 = float(line[5])
        x4 = float(line[6])
        y4 = float(line[7])
        category = str(line[8])
        dif = str(line[9])

        x_min = min(x_min, x1, x2, x3, x4)
        y_min = min(y_min, y1, y2, y3, y4)
        x_max = max(x_max, x1, x2, x3, x4)
        y_max = max(y_max, y1, y2, y3, y4)

    d_to_left = x_min
    d_to_right = w - x_max
    d_to_top = y_min
    d_to_bottom = h - y_max

    crop_x_min = int(x_min - random.uniform(0, d_to_left))
    crop_y_min = int(y_min - random.uniform(0, d_to_top))
    crop_x_max = int(x_max + random.uniform(0, d_to_right))
    crop_y_max = int(y_max + random.uniform(0, d_to_bottom))

    crop_x_min = max(0, crop_x_min)
    crop_y_min = max(0, crop_y_min)
    crop_x_max = min(w, crop_x_max)
    crop_y_max = min(h, crop_y_max)
    crop_img = img[crop_y_min:crop_y_max, crop_x_min:crop_x_max]

    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    outputiamge = os.path.join(outputiamge + "/" + filename + "_crop_" + str(crop_x_min) + "_" +
                               str(crop_y_min) + "_" + str(crop_x_max) + "_" +
                               str(crop_y_max) + ".png")
    outputtxt = os.path.join(outputtxt + "/" + filename + "_crop_" + str(crop_x_min) + "_" +
                             str(crop_y_min) + "_" + str(crop_x_max) + "_" +
                             str(crop_y_max) + extension)
    cv.imwrite(outputiamge, crop_img)

    save_txt = open(outputtxt, "w")
    f = open(inputtxt)
    for line in f.readlines():
        line = line.split(" ")
        x1 = float(line[0])
        y1 = float(line[1])
        x2 = float(line[2])
        y2 = float(line[3])
        x3 = float(line[4])
        y3 = float(line[5])
        x4 = float(line[6])
        y4 = float(line[7])
        category = str(line[8])
        dif = str(line[9])
        crop_str = str(int(round(x1 - crop_x_min, 3))) + " " + str(int(round(y1 - crop_y_min, 3))) + " " + str(int(round(x2 - crop_x_min, 3))) + " " + str(int(round(y2 - crop_y_min, 3))) + " " + str(int(round(x3 - crop_x_min, 3))) + " " + str(int(round(y3 - crop_y_min, 3))) + " " + str(int(round(x4 - crop_x_min, 3))) + " " + str(int(round(y4 - crop_y_min, 3))) + " " + category + " " + dif
        save_txt.write(crop_str)

完整代码

图片放置结构如下图所示：

import time
import random
import cv2 as cv
import os
import math
import numpy as np
from skimage import exposure
import shutil

def changeLight(img, inputtxt, outputiamge, outputtxt):

    flag = random.uniform(0.5, 1.5)
    label = round(flag, 2)
    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    outputiamge = os.path.join(outputiamge + "/" + filename + "_" + str(label) + ".png")
    outputtxt = os.path.join(outputtxt + "/" + filename + "_" + str(label) + extension)

    ima_gamma = exposure.adjust_gamma(img, 0.5)

    shutil.copyfile(inputtxt, outputtxt)
    cv.imwrite(outputiamge, ima_gamma)

def gasuss_noise(image, inputtxt, outputiamge, outputtxt, mean=0, var=0.01):
    '''
    mean : 均值
    var : 方差
    '''
    image = np.array(image / 255, dtype=float)
    noise = np.random.normal(mean, var ** 0.5, image.shape)
    out = image + noise

    if out.min() < 0:
        low_clip = -1.

    else:
        low_clip = 0.

    out = np.clip(out, low_clip, 1.0)
    out = np.uint8(out * 255)

    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    outputiamge = os.path.join(outputiamge + "/" + filename + "_gasunoise_" + str(mean) + "_" + str(var) + ".png")
    outputtxt = os.path.join(outputtxt + "/" + filename + "_gasunoise_" + str(mean) + "_" + str(var) + extension)

    shutil.copyfile(inputtxt, outputtxt)
    cv.imwrite(outputiamge, out)

def ContrastAlgorithm(rgb_img, inputtxt, outputiamge, outputtxt):
    img_shape = rgb_img.shape
    temp_imag = np.zeros(img_shape, dtype=float)
    for num in range(0, 3):

        in_image = rgb_img[:, :, num]

        Imax = np.max(in_image)
        Imin = np.min(in_image)

        Omin, Omax = 0, 255

        a = float(Omax - Omin) / (Imax - Imin)
        b = Omin - a * Imin

        out_image = a * in_image + b

        out_image = out_image.astype(np.uint8)
        temp_imag[:, :, num] = out_image
    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    outputiamge = os.path.join(outputiamge + "/" + filename + "_contrastAlgorithm" + ".png")
    outputtxt = os.path.join(outputtxt + "/" + filename + "_contrastAlgorithm" + extension)
    shutil.copyfile(inputtxt, outputtxt)
    cv.imwrite(outputiamge, temp_imag)

def rotate_img_bbox(img, inputtxt, temp_outputiamge, temp_outputtxt, angle, scale=1):
    nAgree = angle
    size = img.shape
    w = size[1]
    h = size[0]
    for numAngle in range(0, len(nAgree)):
        dRot = nAgree[numAngle] * np.pi / 180
        dSinRot = math.sin(dRot)
        dCosRot = math.cos(dRot)

        nw = (abs(np.sin(dRot) * h) + abs(np.cos(dRot) * w)) * scale
        nh = (abs(np.cos(dRot) * h) + abs(np.sin(dRot) * w)) * scale

        (filepath, tempfilename) = os.path.split(inputtxt)
        (filename, extension) = os.path.splitext(tempfilename)
        outputiamge = os.path.join(temp_outputiamge + "/" + filename + "_rotate_" + str(nAgree[numAngle]) + ".png")
        outputtxt = os.path.join(temp_outputtxt + "/" + filename + "_rotate_" + str(nAgree[numAngle]) + extension)

        rot_mat = cv.getRotationMatrix2D((nw * 0.5, nh * 0.5), nAgree[numAngle], scale)
        rot_move = np.dot(rot_mat, np.array([(nw - w) * 0.5, (nh - h) * 0.5, 0]))
        rot_mat[0, 2] += rot_move[0]
        rot_mat[1, 2] += rot_move[1]

        rotat_img = cv.warpAffine(img, rot_mat, (int(math.ceil(nw)), int(math.ceil(nh))), flags=cv.INTER_LANCZOS4)
        cv.imwrite(outputiamge, rotat_img)

        save_txt = open(outputtxt, 'w')
        f = open(inputtxt)
        for line in f.readlines():
            line = line.split(" ")
            x1 = float(line[0])
            y1 = float(line[1])
            x2 = float(line[2])
            y2 = float(line[3])
            x3 = float(line[4])
            y3 = float(line[5])
            x4 = float(line[6])
            y4 = float(line[7])
            category = str(line[8])
            dif = str(line[9])

            point1 = np.dot(rot_mat, np.array([x1, y1, 1]))
            point2 = np.dot(rot_mat, np.array([x2, y2, 1]))
            point3 = np.dot(rot_mat, np.array([x3, y3, 1]))
            point4 = np.dot(rot_mat, np.array([x4, y4, 1]))
            x1 = round(point1[0], 3)
            y1 = round(point1[1], 3)
            x2 = round(point2[0], 3)
            y2 = round(point2[1], 3)
            x3 = round(point3[0], 3)
            y3 = round(point3[1], 3)
            x4 = round(point4[0], 3)
            y4 = round(point4[1], 3)

            string = str(int(x1)) + " " + str(int(y1)) + " " + str(int(x2)) + " " + str(int(y2)) + " " + str(
                int(x3)) + " " + str(int(y3)) + " " + str(int(x4)) + " " + str(int(y4)) + " " + category + " " + dif
            save_txt.write(string)

def filp_pic_bboxes(img, inputtxt, outputiamge, outputtxt):
    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    output_vert_flip_img = os.path.join(outputiamge + "/" + filename + "_vert_flip" + ".png")
    output_vert_flip_txt = os.path.join(outputtxt + "/" + filename + "_vert_flip" + extension)
    output_horiz_flip_img = os.path.join(outputiamge + "/" + filename + "_horiz_flip" + ".png")
    output_horiz_flip_txt = os.path.join(outputtxt + "/" + filename + "_horiz_flip" + extension)

    h, w, _ = img.shape

    vert_flip_img = cv.flip(img, 1)
    cv.imwrite(output_vert_flip_img, vert_flip_img)

    horiz_flip_img = cv.flip(img, 0)
    cv.imwrite(output_horiz_flip_img, horiz_flip_img)

    save_vert_txt = open(output_vert_flip_txt, 'w')
    save_horiz_txt = open(output_horiz_flip_txt, 'w')
    f = open(inputtxt)
    for line in f.readlines():
        line = line.split(" ")
        x1 = float(line[0])
        y1 = float(line[1])
        x2 = float(line[2])
        y2 = float(line[3])
        x3 = float(line[4])
        y3 = float(line[5])
        x4 = float(line[6])
        y4 = float(line[7])
        category = str(line[8])
        dif = str(line[9])

        vert_string = str(int(round(w - x1, 3))) + " " + str(int(y1)) + " " + str(int(round(w - x2, 3))) + " " + str(
            int(y2)) + " " + str(int(round(w - x3, 3))) + " " + str(int(y3)) + " " + str(
            int(round(w - x4, 3))) + " " + str(int(y4)) + " " + category + " " + dif
        horiz_string = str(int(x1)) + " " + str(int(round(h - y1, 3))) + " " + str(int(x2)) + " " + str(
            int(round(h - y2, 3))) + " " + str(int(x3)) + " " + str(int(round(h - y3, 3))) + " " + str(
            int(x4)) + " " + str(int(round(h - y4, 3))) + " " + category + " " + dif

        save_horiz_txt.write(horiz_string)
        save_vert_txt.write(vert_string)

def shift_pic_bboxes(img, inputtxt, outputiamge, outputtxt):
    w = img.shape[1]
    h = img.shape[0]
    x_min = w
    x_max = 0
    y_min = h
    y_max = 0
    f = open(inputtxt)
    for line in f.readlines():
        line = line.split(" ")
        x1 = float(line[0])
        y1 = float(line[1])
        x2 = float(line[2])
        y2 = float(line[3])
        x3 = float(line[4])
        y3 = float(line[5])
        x4 = float(line[6])
        y4 = float(line[7])
        category = str(line[8])
        dif = str(line[9])

        x_min = min(x_min, x1, x2, x3, x4)
        y_min = min(y_min, y1, y2, y3, y4)
        x_max = max(x_max, x1, x2, x3, x4)
        y_max = max(y_max, y1, y2, y3, y4)

    d_to_left = x_min
    d_to_right = w - x_max
    d_to_top = y_min
    d_to_bottom = h - y_max

    x = random.uniform(-(d_to_left - 1) / 3, (d_to_right - 1) / 3)
    y = random.uniform(-(d_to_top - 1) / 3, (d_to_bottom - 1) / 3)

    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    if x >= 0 and y >= 0:
        outputiamge = os.path.join(
            outputiamge + "/" + filename + "_shift_" + str(round(x, 3)) + "_" + str(round(y, 3)) + ".png")
        outputtxt = os.path.join(
            outputtxt + "/" + filename + "_shift_" + str(round(x, 3)) + "_" + str(round(y, 3)) + extension)
    elif x >= 0 and y < 0:
        outputiamge = os.path.join(
            outputiamge + "/" + filename + "_shift_" + str(round(x, 3)) + "__" + str(round(abs(y), 3)) + ".png")
        outputtxt = os.path.join(
            outputtxt + "/" + filename + "_shift_" + str(round(x, 3)) + "__" + str(round(abs(y), 3)) + extension)
    elif x < 0 and y >= 0:
        outputiamge = os.path.join(
            outputiamge + "/" + filename + "_shift__" + str(round(abs(x), 3)) + "_" + str(round(y, 3)) + ".png")
        outputtxt = os.path.join(
            outputtxt + "/" + filename + "_shift__" + str(round(abs(x), 3)) + "_" + str(round(y, 3)) + extension)
    elif x < 0 and y < 0:
        outputiamge = os.path.join(
            outputiamge + "/" + filename + "_shift__" + str(round(abs(x), 3)) + "__" + str(round(abs(y), 3)) + ".png")
        outputtxt = os.path.join(
            outputtxt + "/" + filename + "_shift__" + str(round(abs(x), 3)) + "__" + str(round(abs(y), 3)) + extension)

    M = np.float32([[1, 0, x], [0, 1, y]])
    shift_img = cv.warpAffine(img, M, (img.shape[1], img.shape[0]))
    cv.imwrite(outputiamge, shift_img)

    save_txt = open(outputtxt, "w")
    f = open(inputtxt)
    for line in f.readlines():
        line = line.split(" ")
        x1 = float(line[0])
        y1 = float(line[1])
        x2 = float(line[2])
        y2 = float(line[3])
        x3 = float(line[4])
        y3 = float(line[5])
        x4 = float(line[6])
        y4 = float(line[7])
        category = str(line[8])
        dif = str(line[9])
        shift_str = str(int(round(x1 + x, 3))) + " " + str(int(round(y1 + y, 3))) + " " + str(int(round(x2 + x, 3))) + " " + str(int(round(y2 + y, 3))) + " " + str(int(round(x3 + x, 3))) + " " + str(int(round(y3 + y, 3))) + " " + str(int(round(x4 + x, 3))) + " " + str(int(round(y4 + y, 3))) + " " + category + " " + dif
        save_txt.write(shift_str)

def crop_img_bboxes(img, inputtxt, outputiamge, outputtxt):

    w = img.shape[1]
    h = img.shape[0]
    x_min = 0
    x_max = w
    y_min = 0
    y_max = h
    f = open(inputtxt)
    for line in f.readlines():
        line = line.split(" ")
        x1 = float(line[0])
        y1 = float(line[1])
        x2 = float(line[2])
        y2 = float(line[3])
        x3 = float(line[4])
        y3 = float(line[5])
        x4 = float(line[6])
        y4 = float(line[7])
        category = str(line[8])
        dif = str(line[9])

        x_min = min(x_min, x1, x2, x3, x4)
        y_min = min(y_min, y1, y2, y3, y4)
        x_max = max(x_max, x1, x2, x3, x4)
        y_max = max(y_max, y1, y2, y3, y4)

    d_to_left = x_min
    d_to_right = w - x_max
    d_to_top = y_min
    d_to_bottom = h - y_max

    crop_x_min = int(x_min - random.uniform(0, d_to_left))
    crop_y_min = int(y_min - random.uniform(0, d_to_top))
    crop_x_max = int(x_max + random.uniform(0, d_to_right))
    crop_y_max = int(y_max + random.uniform(0, d_to_bottom))

    crop_x_min = max(0, crop_x_min)
    crop_y_min = max(0, crop_y_min)
    crop_x_max = min(w, crop_x_max)
    crop_y_max = min(h, crop_y_max)
    crop_img = img[crop_y_min:crop_y_max, crop_x_min:crop_x_max]

    (filepath, tempfilename) = os.path.split(inputtxt)
    (filename, extension) = os.path.splitext(tempfilename)
    outputiamge = os.path.join(outputiamge + "/" + filename + "_crop_" + str(crop_x_min) + "_" +
                               str(crop_y_min) + "_" + str(crop_x_max) + "_" +
                               str(crop_y_max) + ".png")
    outputtxt = os.path.join(outputtxt + "/" + filename + "_crop_" + str(crop_x_min) + "_" +
                             str(crop_y_min) + "_" + str(crop_x_max) + "_" +
                             str(crop_y_max) + extension)
    cv.imwrite(outputiamge, crop_img)

    save_txt = open(outputtxt, "w")
    f = open(inputtxt)
    for line in f.readlines():
        line = line.split(" ")
        x1 = float(line[0])
        y1 = float(line[1])
        x2 = float(line[2])
        y2 = float(line[3])
        x3 = float(line[4])
        y3 = float(line[5])
        x4 = float(line[6])
        y4 = float(line[7])
        category = str(line[8])
        dif = str(line[9])
        crop_str = str(int(round(x1 - crop_x_min, 3))) + " " + str(int(round(y1 - crop_y_min, 3))) + " " + str(int(round(x2 - crop_x_min, 3))) + " " + str(int(round(y2 - crop_y_min, 3))) + " " + str(int(round(x3 - crop_x_min, 3))) + " " + str(int(round(y3 - crop_y_min, 3))) + " " + str(int(round(x4 - crop_x_min, 3))) + " " + str(int(round(y4 - crop_y_min, 3))) + " " + category + " " + dif
        save_txt.write(crop_str)

if __name__ == '__main__':
    inputiamge = "dataset/ceshi/images"
    inputtxt = "dataset/ceshi/labelTxt"
    outputiamge = "dataset/ceshi_aug/images"
    outputtxt = "dataset/ceshi_aug/labelTxt"
    angle = [30, 60, 90, 120, 150, 180]
    tempfilename = os.listdir(inputiamge)
    for file in tempfilename:
        (filename, extension) = os.path.splitext(file)
        input_image = os.path.join(inputiamge + "/" + file)
        input_txt = os.path.join(inputtxt + "/" + filename + ".txt")

        img = cv.imread(input_image)

        changeLight(img,input_txt,outputiamge,outputtxt)

        gasuss_noise(img, input_txt, outputiamge, outputtxt, mean=0, var=0.001)

        ContrastAlgorithm(img, input_txt, outputiamge, outputtxt)

        rotate_img_bbox(img, input_txt, outputiamge, outputtxt, angle)

        filp_pic_bboxes(img, input_txt, outputiamge, outputtxt)

        shift_pic_bboxes(img, input_txt, outputiamge, outputtxt)

        crop_img_bboxes(img, input_txt, outputiamge, outputtxt)
    print("finished!")

Original: https://blog.csdn.net/qq1198768105/article/details/127027065
Author: zstar-_
Title: 【目标检测】数据增强：DOTA数据集

原创文章受到原创版权保护。转载请注明出处：https://www.johngo689.com/629404/

转载文章受原作者版权保护。转载请注明原作者出处！

人工智能

【自取】最近整理的，有需要可以领取学习：

Linux核心资料大放送~

全栈面试题汇总（持续更新&可下载）

一个提高学习100%效率的工具！

【超详细】深度学习面试题目！

LeetCode Python刷题答案下载！

LeetCode Java版刷题答案下载！

LeetCode C++ 版本，抓紧保存！

LeetCode GO语言刷题答案下载！

Centos安装python3导入ssl时解决 ModuleNotFoundError: No module named ‘_ssl‘问题

当装好python3导入ssl模块时报以下错误： ModuleNotFoundError: No module named ‘_ssl’ import _s…

人工智能 2023年7月5日
0083
HTML：1.初识web

认识WEB 「网页」主要是由 文字、 图像和 超&#x94FE…

人工智能 2023年6月4日
0078
安装spaCy（最简单的教程）

如果不是折腾了一下午 + 半个晚上，谁想发小水文 T_T，但是网上真的没有搜到我遇到的问题哎（枯）… 说白了，问题就是我们的 python找不到 model，官网给的下…

人工智能 2023年5月27日
00102
【PAT甲级 – C++题解】1101 Quick Sort

✍个人博客：https://blog.csdn.net/Newin2020?spm=1011.2415.3001.5343📚专栏地址：PAT题解集合📝原题地址：题目详情 &#821…

人工智能 2023年6月27日
0059
2022-4-21 vrep深度相机Kinect 远程c++（qtcreator） opencv 保存

从模型库里拉出来一个Kinect相机放在合适位置：设置好像素，不是标准像素值vrep有警告（可能数据有误），忽略即可同样的像素值，在c++ 端： int w = 640, h …

人工智能 2023年7月20日
0042
使用RGBD相机实现YOLOv3目标识别并测距，获取物体三维坐标

设备环境：Ubuntu18.04 + ros melodic 相机：乐视相机（乐视遗产，和奥比中光的Astra Pro 同方案，便宜）首先要安装一部分依赖 sudo apt in…

人工智能 2023年7月9日
0066
realsense D455深度相机+YOLO V5结合实现目标检测（一）

realsense D455深度相机+YOLO V5结合实现目标检测（一） 1.代码来源 2.环境配置 3.代码分析： * 3.1如何用realsense在python下面调用的问…

人工智能 2023年6月17日
0093
使用Mongoose populate实现多表关联存储与查询，内附完整代码

文章目录使用Mongoose populate实现多表关联与查询 * 一、数据模型创建 – 1. 创建一个PersonSchema 2. 创建一个StorySche…

人工智能 2023年7月29日
0056
【数据挖掘】2022年2023届秋招奇虎360机器学习算法工程师笔试题

公司：奇虎360 岗位：机器学习算法工程师笔试时间：2022年10月9号 1 选择题 1、E ( X 2 ) E(X^2)E (X 2 )的计算 P{X=1} = 2/3, P{…

人工智能 2023年6月19日
0076
因子分析后如何进行聚类分析？

一、案例说明 1.案例背景研究短视频平台用户行为的分类情况，调查搜集了200份数据其中20项可分为品牌活动，品牌代言人，社会责任感，品牌赞助和购买意愿品牌五个维度。案例数据中还包…

人工智能 2023年5月31日
00117
NLP Prompt范式，两种主要类型：填充文本字符串空白的完形填空（Cloze）prompt，和用于延续字符串前缀的前缀 (Prefix) prompt。

从 2017-2019 年开始，NLP 模型发生了翻天覆地的变化，这种全监督范式发挥的作用越来越小。具体而言，研究重点开始转向预训练、微调范式。在这一范式下，一个具有固定架构的模型…

人工智能 2023年5月30日
0077
神经网络是一种算法吗,神经网络包括哪些算法

1、神经网络的历史是什么？沃伦·麦卡洛克和沃尔特·皮茨（1943）基于数学和一种称为阈值逻辑的算法创造了一种神经网络的计算模型。这种模型使得神经网络的研究分裂为两种不同研究思路。…

人工智能 2023年7月13日
0072
手把手教你用pytorch实现k折交叉验证，解决类别不平衡

在用深度学习做分类的时候，常常需要进行交叉验证，目前pytorch没有通用的一套代码来实现这个功能。可以借助 sklearn中的 StratifiedKFold，KFold来实现，…

人工智能 2023年7月20日
0073
ROS运行管理之launch文件

ROS是多进程(节点)的分布式框架，一个完整的ROS系统实现：可能包含多台主机；每台主机上又有多个工作空间(workspace)；每个的工作空间中又包含多个功能包(package…

人工智能 2023年6月2日
0089
CS224W摘要11.Reasoning over Knowledge Graphs

文章目录 Reasoning over Knowledge Graphs * Predictive Queries on KG 从One-hop query到path querie…

人工智能 2023年6月1日
00148
pytorch正确的安装torch_geometric,无bug、多种类版本

网络差就第一种离线安装，但是我只收录了这2个版本；网络好就第二个方案在线安装，更全面一定要注意版本对应版本对应目录表格，找到每个 torch版本，点进去就可以找到对应的注意：安装…

人工智能 2023年7月24日
0058

2024 年 5 月
一	二	三	四	五	六	日
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30	31