Deep Learning: Object Segmentation | The UNet Network Model and a Worked Example

July 13, 2023, 5:33 PM • Artificial Intelligence

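1 UNet Network Architecture

The UNet network consists of an encoder on the left, a decoder on the right, and two convolution + activation layers at the bottom that connect them.

Encoder
As the architecture figure shows, the encoder is made up of 4 repeated structures. Each one contains two 3×3 convolution layers, each followed by a nonlinear ReLU layer, and one 2×2 max pooling layer with stride 2 (the blue and red arrows in the figure).
Each downsampling step doubles the number of feature channels.
Decoder
Like the encoder, the decoder is made up of 4 repeated structures.
Each repeated structure starts with a transposed convolution (deconvolution), which halves the number of feature channels and doubles the feature map size (green arrows).
The result of the transposed convolution is then concatenated with the feature map from the corresponding encoder step (the white/blue blocks).
If the encoder feature map is larger than the upsampled one, it is cropped before concatenation (the dark blue dashed lines on the left).
The concatenated feature map then goes through two 3×3 convolutions (the blue arrows on the right).
The last layer is a 1×1 convolution that maps the 64-channel feature map to the required number of classes (cyan arrow).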
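
To make the doubling and halving concrete, the sketch below traces the feature map size and channel count through the network. It assumes 'same' padding (as the code later in this article uses), a 160×160 input and 64 filters in the first stage; `unet_shape_trace` is a hypothetical helper written only for this illustration, and it involves no TensorFlow.

```python
def unet_shape_trace(size=160, base_filters=64, depth=4):
    """Return (stage, height/width, channels) tuples for a 'same'-padded UNet."""
    trace = []
    filters = base_filters
    # Encoder: the two 3x3 convolutions keep the size, 2x2 pooling halves it.
    for level in range(depth):
        trace.append((f"enc{level + 1}", size, filters))
        size //= 2
        filters *= 2
    # Bottom of the "U": two more convolutions at the smallest resolution.
    trace.append(("bottom", size, filters))
    # Decoder: each transposed convolution doubles the size, halves the channels.
    for level in reversed(range(depth)):
        size *= 2
        filters //= 2
        trace.append((f"dec{level + 1}", size, filters))
    return trace


for stage, hw, channels in unet_shape_trace():
    print(f"{stage:>6}: {hw}x{hw}x{channels}")
```

The last decoder stage ends with 64 channels at the input resolution, which is the feature map that the final 1×1 convolution turns into class scores.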

2 Model Construction

2.1 Getting the Dataset

import os
import random

# Set these before importing TensorFlow so that they take effect
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'      # hide TensorFlow INFO and WARNING logs
os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'   # tolerate duplicate OpenMP runtimes

import numpy as np
import PIL
import tensorflow as tf
from PIL import ImageOps
from tensorflow import keras
from tensorflow.keras.layers import Input, Conv2D, Conv2DTranspose
from tensorflow.keras.layers import MaxPooling2D, Cropping2D, Concatenate
from tensorflow.keras.layers import Lambda, Activation, BatchNormalization, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing.image import load_img

input_dir = 'segdata/images/'

input_img_path = sorted([os.path.join(input_dir, fname)
                         for fname in os.listdir(input_dir) if fname.endswith('.jpg')])

target_dir = 'segdata/annotations/trimaps/'

target_img_path = sorted(os.path.join(target_dir, fname) for fname in os.listdir(
    target_dir) if fname.endswith('.png') and not fname.startswith('.'))

img_size = (160, 160)
batch_size = 32
num_classes = 4

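The dataset used is the Oxford-IIIT Pet Dataset, a pet image segmentation dataset with 37 pet categories (12 cat breeds and 25 dog breeds) and roughly 200 images per category. Every image is annotated with its breed, a head ROI, and a pixel-level segmentation.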
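
One detail worth spelling out is why `num_classes = 4` works here. The trimap annotations store the labels 1 (pet), 2 (background) and 3 (border / not classified), so with sparse integer labels the network needs output indices 0 to 3, and index 0 simply goes unused. A common alternative is to shift the labels to 0..2 and use 3 classes. The sketch below shows that shift on a tiny made-up mask; `shift_labels` and `toy_mask` are hypothetical names, not part of the article's code.

```python
# Trimap pixel values in the Oxford-IIIT Pet annotations:
# 1 = pet (foreground), 2 = background, 3 = border / not classified.
TRIMAP_NAMES = {1: "pet", 2: "background", 3: "border"}


def shift_labels(mask):
    """Map trimap values 1..3 to class ids 0..2 (hypothetical helper)."""
    return [[value - 1 for value in row] for row in mask]


# A tiny made-up 3x4 mask, only for illustration.
toy_mask = [
    [2, 2, 3, 1],
    [2, 3, 1, 1],
    [3, 1, 1, 1],
]
shifted = shift_labels(toy_mask)

print(sorted({v for row in toy_mask for v in row}))  # labels as stored on disk
print(sorted({v for row in shifted for v in row}))   # labels after shifting
```

Either convention trains; the article keeps the labels as stored and pays for it with one unused output channel.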

2.2 Building the Dataset Generator


class OxfordPets(keras.utils.Sequence):
    """Iterates over the dataset and returns (images, masks) batches as NumPy arrays."""

    def __init__(self, batch_size, img_size, input_img_paths, target_img_paths):
        self.batch_size = batch_size
        self.img_size = img_size
        self.input_img_paths = input_img_paths
        self.target_img_paths = target_img_paths

    def __len__(self):
        # Number of full batches per epoch; a trailing partial batch is dropped
        return len(self.target_img_paths) // self.batch_size

    def __getitem__(self, idx):
        """Return the batch (x, y) at position idx."""
        i = idx * self.batch_size
        batch_input_img_paths = self.input_img_paths[i:i + self.batch_size]
        batch_target_img_paths = self.target_img_paths[i:i + self.batch_size]

        # Input images: (batch, height, width, 3)
        x = np.zeros((self.batch_size,) + self.img_size + (3,), dtype='float32')
        for j, path in enumerate(batch_input_img_paths):
            img = load_img(path, target_size=self.img_size)
            x[j] = img

        # Trimap masks: (batch, height, width, 1), integer class labels
        y = np.zeros((self.batch_size,) + self.img_size + (1,), dtype='uint8')
        for j, path in enumerate(batch_target_img_paths):
            img = load_img(path, target_size=self.img_size,
                           color_mode='grayscale')
            y[j] = np.expand_dims(img, 2)
        return x, y
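
The indexing in `__len__` and `__getitem__` can be checked without loading any images. The sketch below reproduces only that arithmetic, with a made-up sample count of 100; `batch_slices` is a hypothetical helper for illustration.

```python
def batch_slices(num_samples, batch_size):
    """Return the (start, stop) index pairs a Sequence like OxfordPets would serve."""
    num_batches = num_samples // batch_size  # same floor division as __len__
    return [(idx * batch_size, idx * batch_size + batch_size)
            for idx in range(num_batches)]


slices = batch_slices(num_samples=100, batch_size=32)
print(slices)
print("samples dropped:", 100 - slices[-1][1])
```

Because of the floor division, samples that do not fill a complete batch are never served within an epoch.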

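2.3 Encoder

The encoder has the following characteristics:

It is made up of 4 repeated structures, each containing two 3×3 convolution layers, each followed by a nonlinear ReLU layer, and one 2×2 max pooling layer with stride 2.
After each downsampling step the number of feature channels is doubled.
Each repetition produces two outputs: one is passed on through the encoder for further feature extraction, and the other is kept for feature fusion in the decoder.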


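def downsampling_block(input_tensor, filters):
    """One encoder stage: returns (pooled output, pre-pooling features for the skip connection)."""
    # First 3x3 convolution + batch normalization + ReLU
    x = Conv2D(filters, kernel_size=3, padding='same')(input_tensor)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)

    # Second 3x3 convolution + batch normalization + ReLU
    x = Conv2D(filters, kernel_size=3, padding='same')(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)

    # 2x2 max pooling (stride defaults to the pool size) halves height and width
    return MaxPooling2D(pool_size=2)(x), x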
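
To see what the 2×2, stride-2 max pooling at the end of each encoder stage does to a feature map, here is a plain-Python sketch on a made-up 4×4 grid. `max_pool_2x2` is a hypothetical helper for illustration and is not how Keras implements the layer.

```python
def max_pool_2x2(grid):
    """2x2 max pooling with stride 2 on a 2-D list with even height and width."""
    pooled = []
    for r in range(0, len(grid), 2):
        row = []
        for c in range(0, len(grid[0]), 2):
            window = (grid[r][c], grid[r][c + 1],
                      grid[r + 1][c], grid[r + 1][c + 1])
            row.append(max(window))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [6, 2, 3, 3],
]
print(max_pool_2x2(feature_map))  # [[4, 5], [6, 7]]
```

Each non-overlapping 2×2 window collapses to its maximum, so height and width are both halved.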

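2.4 Decoder

Like the encoder, the decoder is made up of 4 repeated structures.
Each repeated structure starts with a transposed convolution (deconvolution), which halves the number of feature channels and doubles the feature map size (green arrows).
The result of the transposed convolution is then concatenated with the feature map from the corresponding encoder step (the white/blue blocks).
If the encoder feature map is larger than the upsampled one, it is cropped before concatenation (the dark blue dashed lines on the left).
The concatenated feature map then goes through two 3×3 convolutions (the blue arrows on the right).
The last layer is a 1×1 convolution that maps the 64-channel feature map to the required number of classes (cyan arrow).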


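def upsampling_block(input_tensor, skip_tensor, filters):
    """One decoder stage: upsample, concatenate with the skip features, then two 3x3 convolutions."""
    # Transposed convolution doubles the height and width
    x = Conv2DTranspose(filters, kernel_size=2, strides=2,
                        padding='same')(input_tensor)

    # Compare the spatial sizes of the upsampled tensor and the skip tensor
    _, x_height, x_width, _ = x.shape
    _, s_height, s_width, _ = skip_tensor.shape
    h_crop = s_height - x_height
    w_crop = s_width - x_width

    if h_crop == 0 and w_crop == 0:
        # Sizes already match, no cropping needed
        y = skip_tensor
    else:
        # Center-crop the skip tensor to the size of the upsampled tensor
        cropping = ((h_crop // 2, h_crop - h_crop // 2),
                    (w_crop // 2, w_crop - w_crop // 2))
        y = Cropping2D(cropping=cropping)(skip_tensor)

    # Feature fusion along the channel axis
    x = Concatenate()([x, y])

    # Two 3x3 convolutions, each followed by batch normalization and ReLU
    x = Conv2D(filters, kernel_size=3, padding='same')(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = Conv2D(filters, kernel_size=3, padding='same')(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    return x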
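
The cropping arithmetic in `upsampling_block` is easy to check in isolation. The sketch below applies the same formula to one spatial axis. The sizes are illustrative: a gap such as 64 versus 56 arises when the convolutions are unpadded, while with 'same' padding and an input size divisible by 2 at every level the gap is zero. `center_crop_amounts` is a hypothetical helper.

```python
def center_crop_amounts(skip_size, up_size):
    """Return the (before, after) crop for one spatial axis, as in upsampling_block."""
    diff = skip_size - up_size
    if diff < 0:
        raise ValueError("skip feature map is smaller than the upsampled one")
    return diff // 2, diff - diff // 2


print(center_crop_amounts(64, 56))    # (4, 4): even gap, split equally
print(center_crop_amounts(160, 160))  # (0, 0): nothing to crop
print(center_crop_amounts(137, 136))  # (0, 1): odd gap, extra pixel removed at the end
```

Splitting the difference this way keeps the crop centered, with any odd leftover pixel taken from the far edge.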

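2.5 Assembling the Model

Combining the encoder and the decoder gives the UNet network. The depth of the network is set by `depth`, and the number of convolution filters in the first encoder block is set by the third parameter of `unet`. The following function puts the two parts together: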


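def unet(imagesize, classes, features=64, depth=3):
    """Build a UNet with `depth` encoder/decoder stages and `features` filters in the first stage."""
    inputs = keras.Input(shape=(imagesize + (3,)))
    x = inputs

    # Encoder: keep each pre-pooling output for the skip connections
    skips = []
    for i in range(depth):
        x, x0 = downsampling_block(x, features)
        skips.append(x0)
        features *= 2   # double the channels after every downsampling

    # Bottom of the "U": two 3x3 convolutions at the lowest resolution
    x = Conv2D(filters=features, kernel_size=3, padding='same')(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = Conv2D(filters=features, kernel_size=3, padding='same')(x)
    x = Activation('relu')(x)

    # Decoder: walk the skip connections from deepest to shallowest
    for i in reversed(range(depth)):
        features //= 2  # halve the channels after every upsampling
        x = upsampling_block(x, skips[i], features)

    # 1x1 convolution maps the features to per-pixel class scores
    x = Conv2D(filters=classes, kernel_size=1, padding='same')(x)
    outputs = Activation('softmax')(x)
    return keras.Model(inputs=inputs, outputs=outputs)

model = unet(img_size, num_classes)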

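2.6 Model Training

2.6.1 Splitting the Dataset

The images in the dataset are stored in order. Here we shuffle the dataset, hold out 1200 samples as the validation set, and use the remainder as the training set: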


val_samples = 1200

random.Random(1).shuffle(input_img_path)
random.Random(1).shuffle(target_img_path)

train_input_img_paths = input_img_path[:-val_samples]
train_target_img_paths = target_img_path[:-val_samples]

val_input_img_paths = input_img_path[-val_samples:]
val_target_img_paths = target_img_path[-val_samples:]
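
The split above relies on one property: shuffling two lists of equal length with two generators created from the same seed applies the same permutation to both, so every image stays paired with its mask. A small sketch with made-up file names shows this:

```python
import random

# Made-up file names, only for illustration.
images = [f"img_{i}.jpg" for i in range(6)]
masks = [f"img_{i}.png" for i in range(6)]

# Two independent generators with the same seed produce the same permutation.
random.Random(1).shuffle(images)
random.Random(1).shuffle(masks)

# Compare the names without their extensions.
pairs_aligned = all(img[:-4] == msk[:-4] for img, msk in zip(images, masks))
print(pairs_aligned)  # True
```

This holds only when both lists have the same length and the same initial order, so it is worth checking that the image and mask directories contain matching files before shuffling.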

2.6.2 Creating the Data Generators

train_gen = OxfordPets(batch_size, img_size, train_input_img_paths, train_target_img_paths)
val_gen = OxfordPets(batch_size, img_size, val_input_img_paths, val_target_img_paths)

2.6.3 Compiling and Training the Model

model.compile(optimizer='rmsprop', loss='sparse_categorical_crossentropy')
model.fit(train_gen, epochs=10, validation_data=val_gen)
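
For intuition about the loss used here: for a single pixel, sparse categorical cross-entropy is the negative log of the probability that the softmax output assigns to the true class, where the label is an integer index rather than a one-hot vector. The sketch below computes it by hand for one made-up pixel; it illustrates the formula and is not the Keras implementation, which also averages over pixels and batches.

```python
import math


def sparse_categorical_crossentropy(probs, label):
    """Cross-entropy for one pixel: -log of the probability of the true class."""
    return -math.log(probs[label])


# Made-up softmax output over 4 classes for a single pixel.
pixel_probs = [0.05, 0.80, 0.10, 0.05]

loss_if_label_1 = sparse_categorical_crossentropy(pixel_probs, 1)
loss_if_label_2 = sparse_categorical_crossentropy(pixel_probs, 2)
print(round(loss_if_label_1, 4), round(loss_if_label_2, 4))
```

The loss is small when the true class already has high probability and grows quickly as that probability approaches zero.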

2.7 Model Prediction


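from IPython.display import Image, display

# Predict class probabilities for every image in the validation set
val_gen = OxfordPets(batch_size, img_size, val_input_img_paths, val_target_img_paths)
val_preds = model.predict(val_gen)

def display_mask(i):
    """Show the predicted mask for validation sample i."""
    mask = np.argmax(val_preds[i], axis=-1)   # most likely class per pixel
    mask = np.expand_dims(mask, axis=-1)      # restore the channel axis
    img = ImageOps.autocontrast(keras.preprocessing.image.array_to_img(mask))
    display(img)

i = 10

# Input image
display(Image(filename=val_input_img_paths[i]))

# Ground-truth mask
img = ImageOps.autocontrast(keras.preprocessing.image.load_img(val_target_img_paths[i]))
display(img)

# Predicted mask
display_mask(i)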
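
The key step in `display_mask` is the argmax over the last axis, which turns the per-pixel class probabilities into a single label per pixel. The plain-Python sketch below does the same on a made-up 2×2 prediction with 4 classes; `argmax_mask` and `toy_pred` are hypothetical names for illustration.

```python
def argmax_mask(probabilities):
    """Turn an HxWxC nested list of class probabilities into an HxW label mask."""
    return [[max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
            for row in probabilities]


# Made-up softmax outputs for a 2x2 image with 4 classes.
toy_pred = [
    [[0.0, 0.7, 0.2, 0.1], [0.0, 0.2, 0.7, 0.1]],
    [[0.0, 0.1, 0.3, 0.6], [0.0, 0.9, 0.05, 0.05]],
]
print(argmax_mask(toy_pred))  # [[1, 2], [3, 1]]
```

The resulting integers use the same label convention as the training masks, which is why the predicted mask can be compared directly with the ground-truth trimap.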

Original: https://blog.csdn.net/m0_58475958/article/details/119709627
Author: 示木007
Title: Deep Learning: Object Segmentation | The UNet Network Model and a Worked Example

