Image Semantic Segmentation in Practice: Training TensorFlow DeepLabv3+ on Your Own Dataset

May 23, 2023, 6:13 PM • Artificial Intelligence • 73 views

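Table of Contents

Preface
I. Environment Setup
II. Training Process
1. Getting the library
2. Preparing the dataset
3. Preparing the code before training
4. Main training parameters
5. Pretrained model
6. Running model_test.py
7. Training
8. Visualization
9. Evaluation
10. Viewing logs
11. Exporting the model
III. Testing
Summary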

Preface

This article records the process of training DeepLabv3+.

I. Environment Setup

My environment:

Ubuntu 16.04
Anaconda3
Python 3.5
tensorflow-gpu 1.10.0

Installation instructions for Anaconda3 are easy to find online.

II. Training Process

1. Getting the library

First, clone the official tensorflow/models repository.

git clone https://github.com/tensorflow/models.git

If the download is slow, you can download it from Ciyun (慈云) instead.

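2. Preparing the dataset

Converting to a VOC-format dataset

Annotate the data and produce mask images that meet the requirements. Using the JSON file that corresponds to each image, convert the data to VOC format; this makes the later conversion to the grayscale label format that DeepLab trains on easier.

Download the labelme project:

git clone https://github.com/wkentaro/labelme.git

Go to /labelme/examples/semantic_segmentation, which contains a complete conversion example. Following that example, put your own data (the original images and their JSON annotations) into the data_annotated folder, write your own labels.txt, and copy labelme2voc.py without changes.

Open a terminal in that directory and run:

python labelme2voc.py data_annotated data_dataset_voc --labels labels.txt

This generates the data_dataset_voc folder, which holds the converted dataset.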

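Converting to grayscale

DeepLab uses single-channel annotation images, i.e. grayscale images, in which the pixel values of the classes must be 0, 1, 2, 3, ..., n (n+1 classes in total: one background class and n object classes). Run remove_gt_colormap.py to convert the masks into this format, adjusting the paths as needed.

python remove_gt_colormap.py \
  --original_gt_folder="/media/dell/2T/test/testlabel" \
  --output_dir="/media/dell/2T/test/mask"

original_gt_folder: the folder containing the original label images.
output_dir: the folder where the converted label images are written.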
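
As a rough illustration of what this conversion does, the sketch below reads a palette ('P' mode) PNG with Pillow and keeps only the palette indices, which are the class ids. It is a minimal sketch and not the script itself; the function name remove_colormap and the tiny in-memory image are made up for the example.

```python
import numpy as np
from PIL import Image


def remove_colormap(palette_img):
    """Return a single-channel image whose pixel values are the class indices."""
    if palette_img.mode != "P":
        raise ValueError("expected a palette ('P' mode) image")
    # For a 'P' image, np.array() yields the palette indices, not RGB colors.
    indices = np.array(palette_img, dtype=np.uint8)
    return Image.fromarray(indices)  # a 2-D uint8 array becomes mode 'L'


# Hypothetical 2x2 mask with three classes, colored with a small palette.
demo = Image.new("P", (2, 2))
demo.putdata([0, 1, 2, 1])
demo.putpalette([0, 0, 0, 128, 0, 0, 0, 128, 0] + [0] * (768 - 9))

gray = remove_colormap(demo)
print(gray.mode, np.array(gray).tolist())
```

If a mask is an ordinary RGB image rather than a palette image, this approach does not apply and the colors would have to be mapped to class ids explicitly.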

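Converting to TFRecord

Before building the TFRecord files, split the dataset into training and validation sets.

The dataset directory is laid out as follows:

data
  image
  mask
  index
    train.txt
    trainval.txt
    val.txt
  tfrecord

image: all input images, for every split.
mask: all label (grayscale) images, in one-to-one correspondence with the input images in image and with the same file names.
tfrecord: the data in TFRecord format.
train.txt: the file names (without extension) of the training set.
trainval.txt: the file names (without extension) of the training and validation sets combined.
val.txt: the file names (without extension) of the validation set.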
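
The index files can be written by hand, but a small script is less error-prone. The sketch below is one possible way to do it, assuming trainval is simply the union of train and val; the helper names split_names and write_index and the img_000-style file names are hypothetical.

```python
import os
import random


def split_names(names, num_val, seed=0):
    """Split file names (no extension) into train / val / trainval lists."""
    names = sorted(names)
    shuffled = list(names)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a repeatable split
    return {
        "train": sorted(shuffled[num_val:]),
        "val": sorted(shuffled[:num_val]),
        "trainval": names,
    }


def write_index(splits, index_dir):
    """Write one <split>.txt per split, one file name per line."""
    os.makedirs(index_dir, exist_ok=True)
    for split, items in splits.items():
        with open(os.path.join(index_dir, split + ".txt"), "w") as f:
            f.write("\n".join(items) + "\n")


names = ["img_%03d" % i for i in range(168)]  # hypothetical file names
splits = split_names(names, num_val=18)
print({k: len(v) for k, v in splits.items()})
```

Calling write_index(splits, "data/index") would then create the three txt files; in practice names would come from listing the image folder and stripping the extensions.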

Using the txt files under index, run build_voc2012_data.py to convert the data to TFRecord format. In a terminal, run:

python ./build_voc2012_data.py \
  --image_folder="/home/dell/models/research/deeplab/data/image" \
  --semantic_segmentation_folder="/home/dell/models/research/deeplab/data/mask" \
  --list_folder="/home/dell/models/research/deeplab/data/index" \
  --image_format="png" \
  --output_dir="/home/dell/models/research/deeplab/data/tfrecord"

image_folder: directory of the dataset images.
semantic_segmentation_folder: directory of the dataset masks.
list_folder: the index directory whose txt files define the training and validation splits.
image_format: format of the input images; my dataset uses png.
output_dir: directory where the generated TFRecord files are stored.

3. Preparing the code before training

Edit deeplab/datasets/data_generator.py.
Around line 100, add a descriptor for your own dataset:

_MYDATA = DatasetDescriptor(
    splits_to_sizes={
        'train': 150,
        'trainval': 168,
        'val': 18,
    },
    num_classes=3,
    ignore_label=255,
)

Then, around line 110, register the dataset:

_DATASETS_INFORMATION = {
    'cityscapes': _CITYSCAPES_INFORMATION,
    'pascal_voc_seg': _PASCAL_VOC_SEG_INFORMATION,
    'ade20k': _ADE20K_INFORMATION,
    'mydata': _MYDATA,
}
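
Since num_classes and ignore_label must agree with the pixel values actually stored in the masks, it can help to check the masks before training. The following is a minimal sketch of such a check; check_label_values is a hypothetical helper, and the two small arrays stand in for masks loaded from disk.

```python
import numpy as np


def check_label_values(mask, num_classes, ignore_label=255):
    """Return the pixel values that are neither a class id nor ignore_label."""
    values = np.unique(np.asarray(mask))  # sorted unique pixel values
    return [int(v) for v in values if v >= num_classes and v != ignore_label]


# Stand-ins for masks read from the mask folder.
good_mask = np.array([[0, 1], [2, 255]], dtype=np.uint8)
bad_mask = np.array([[0, 38], [75, 255]], dtype=np.uint8)

print(check_label_values(good_mask, num_classes=3))  # []
print(check_label_values(bad_mask, num_classes=3))   # [38, 75]
```

A non-empty result usually means the mask still carries color or intensity values instead of class ids.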

Edit train_utils.py.
In train_utils.py, change the exclude_list setting at around line 209 so that the logits layer is not loaded when using pretrained weights:

exclude_list = ['global_step', 'logits']
if not initialize_last_layer:
    exclude_list.extend(last_layers)

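4. Main training parameters

train.py and common.py contain all the parameters needed to train the segmentation network.

model_variant: the DeepLab model variant; the available values are listed in core/feature_extractor.py.
When using mobilenet_v2, set atrous_rates=decoder_output_stride=None.
When using xception_65 or resnet_v1, set atrous_rates=[6,12,18] (output stride 16) and decoder_output_stride=4.
label_weights: the weight of each label. When the classes in the dataset are imbalanced, this variable assigns a weight to each class label; for example, label_weights=[0.1, 0.5] gives label 0 a weight of 0.1 and label 1 a weight of 0.5. If it is None, all labels have the same weight, 1.0.
Tip: I did not modify this variable during training; I will try setting it next time.
train_logdir: the path where checkpoints and logs are stored.
log_steps: how often, in steps, log information is output.
save_interval_secs: how often, in seconds, the model is saved to disk.
optimizer: the optimizer; choices are ['momentum', 'adam']. The default is momentum.
learning_policy: the learning-rate policy; choices are ['poly', 'step'].
base_learning_rate: the base learning rate; default 0.0001.
training_number_of_steps: the number of training iterations.
train_batch_size: the number of images per training batch.
train_crop_size: the image size used during training; default '513,513'.
tf_initial_checkpoint: the pretrained model checkpoint.
initialize_last_layer: whether to initialize the last layer.
last_layers_contain_logits_only: whether to treat only the logits as the last layer.
fine_tune_batch_norm: whether to fine-tune the batch norm parameters.
atrous_rates: default [6, 12, 18].
output_stride: default 16; the ratio of input to output spatial resolution.
For xception_65, if output_stride=8, use atrous_rates=[12, 24, 36];
if output_stride=16, use atrous_rates=[6, 12, 18].
For mobilenet_v2, use None.
Note: different atrous_rates and output_stride values can be used for training and for evaluation.
dataset: the segmentation dataset to use; it must match the name used when the dataset was registered.
train_split: which split to train on; one of the values registered for the dataset, such as train or trainval.
dataset_dir: the path where the dataset is stored.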
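
To give a feel for the 'poly' learning policy, the sketch below evaluates the usual polynomial decay formula, lr = base_lr * (1 - step / max_steps) ** power. The power of 0.9 and the end value of 0 are assumptions made for this illustration; check train_utils.py and the flags in train.py of your checkout for the values actually used.

```python
def poly_learning_rate(base_lr, step, max_steps, power=0.9):
    """Polynomial ('poly') decay from base_lr down to 0 at max_steps."""
    step = min(step, max_steps)  # hold the final value after max_steps
    return base_lr * (1.0 - step / max_steps) ** power


for step in (0, 20000, 40000, 60000, 80000):
    print(step, poly_learning_rate(0.01, step, 80000))
```

The rate starts at base_learning_rate and decreases smoothly to zero at training_number_of_steps, so changing the number of steps also changes how quickly the rate falls.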
With these parameters in mind, note the following points:

1. Whether to load the pretrained network's weights
To fine-tune the network on another dataset, pay attention to these parameters:

To use the pretrained network's weights in full, set initialize_last_layer=True.
To use only the network backbone, set initialize_last_layer=False and last_layers_contain_logits_only=False.
To use all the pretrained weights except the logits (your own dataset has a different number of classes, and the logits were already excluded from loading above), set initialize_last_layer=False and last_layers_contain_logits_only=True.
Because the number of classes in my dataset differs from the default, I used:

--initialize_last_layer=false
--last_layers_contain_logits_only=true
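
The three cases can be summarized as a small decision function. This is a simplified, pure-Python illustration of the logic, not the DeepLab source; layers_to_skip is a hypothetical name, and the scope names other than logits and global_step are assumptions chosen for the example.

```python
def layers_to_skip(initialize_last_layer, last_layers_contain_logits_only):
    """List the variable scopes that are NOT restored from the checkpoint."""
    exclude = ["global_step"]
    if not initialize_last_layer:
        if last_layers_contain_logits_only:
            # Only the classifier is re-initialized.
            exclude += ["logits"]
        else:
            # Everything after the backbone is re-initialized
            # (scope names assumed for illustration).
            exclude += ["logits", "image_pooling", "aspp",
                        "concat_projection", "decoder"]
    return exclude


print(layers_to_skip(True, False))   # full pretrained weights
print(layers_to_skip(False, False))  # backbone only
print(layers_to_skip(False, True))   # everything except the logits
```

With a custom number of classes the logits layer has a different shape from the checkpoint, which is why it has to appear in the skipped list.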

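2. Suggestions for training on your own dataset with limited resources:

Set output_stride=16 or even 32 (and change atrous_rates accordingly; for example, with output_stride=32, use atrous_rates=[3, 6, 9]).
Use as many GPUs as possible: change the num_clones flag and make train_batch_size as large as possible.
Adjust train_crop_size; it can be made smaller, for example 513x513 (or even 321x321), which allows a larger batch size.
Use a smaller network backbone such as mobilenet_v2.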
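
The atrous_rates values quoted in this article for each output_stride follow a simple pattern: they scale inversely with the stride. The sketch below only reproduces the three combinations mentioned here; atrous_rates_for is a hypothetical helper, not a DeepLab function.

```python
def atrous_rates_for(output_stride):
    """Return the atrous rates quoted in this article for an output stride."""
    if output_stride not in (8, 16, 32):
        raise ValueError("only output_stride 8, 16 and 32 are covered here")
    base = (6, 12, 18)  # rates for output_stride=16
    return [rate * 16 // output_stride for rate in base]


for stride in (8, 16, 32):
    print(stride, atrous_rates_for(stride))
```

On the command line the list is passed as three repeated flags, e.g. --atrous_rates=3 --atrous_rates=6 --atrous_rates=9.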

3. Whether to fine-tune batch norm

If train_batch_size is larger than 12 (preferably larger than 16), set fine_tune_batch_norm=True. Otherwise, set fine_tune_batch_norm=False.

5. Pretrained model

Choose a pretrained model that suits your situation and download it from: https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/model_zoo.md
I chose xception_71.

6. Running model_test.py

Check that the environment is configured correctly.

Add the dependencies to PYTHONPATH from the /home/user/models/research/ directory:

export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
source ~/.bashrc

Then run model_test.py:

python deeplab/model_test.py
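
If model_test.py fails with an import error, PYTHONPATH is the usual suspect. The following sketch checks whether the top-level packages can be found on the current path; missing_modules is a hypothetical helper, and the package names deeplab and nets (the latter from the slim directory) are what I would expect to be importable once the path is set.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on sys.path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means both packages are visible to Python.
print(missing_modules(["deeplab", "nets"]))
```

Run it from the same shell in which PYTHONPATH was exported, since the setting does not carry over to new terminals.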

7. Training

train.py is the training script; pass the training parameters on the command line.


python deeplab/train.py \
    --logtostderr \
    --training_number_of_steps=80000 \
    --train_split="train" \
    --model_variant="xception_71" \
    --atrous_rates=6 \
    --atrous_rates=12 \
    --atrous_rates=18 \
    --output_stride=16 \
    --decoder_output_stride=4 \
    --train_crop_size="321,321" \
    --train_batch_size=8 \
    --fine_tune_batch_norm=False \
    --base_learning_rate=0.01 \
    --dataset="mydata" \
    --tf_initial_checkpoint='/home/dell/models/research/deeplab/backbone/xception_71/model.ckpt' \
    --train_logdir='/home/dell/models/research/deeplab/exp/mydata_train/train' \
    --dataset_dir='/home/dell/models/research/deeplab/data/tfrecord'
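
To judge whether a given number of steps is reasonable, it helps to convert it to passes over the training set. A quick calculation with the values from this article (80000 steps, batch size 8, 150 training images):

```python
def approx_epochs(num_steps, batch_size, num_train_images):
    """Approximate number of passes over the training set."""
    return num_steps * batch_size / num_train_images


print(round(approx_epochs(80000, 8, 150)))  # 4267
```

For a dataset this small that is a very large number of passes, so far fewer steps may well be enough; the validation mIOU is the better guide.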

8. Visualization

vis.py is the visualization script.


python deeplab/vis.py \
    --logtostderr \
    --vis_split="val" \
    --model_variant="xception_71" \
    --atrous_rates=6 \
    --atrous_rates=12 \
    --atrous_rates=18 \
    --output_stride=16 \
    --decoder_output_stride=4 \
    --vis_crop_size="512,512" \
    --dataset="mydata" \
    --colormap_type="pascal" \
    --checkpoint_dir='/home/dell/models/research/deeplab/exp/mydata_train/train/' \
    --vis_logdir='/home/dell/models/research/deeplab/exp/mydata_train/vis/' \
    --dataset_dir='/home/dell/models/research/deeplab/data/tfrecord/' \
    --max_number_of_iterations=1

9. Evaluation

eval.py is the evaluation script; it outputs mIOU, which is used to judge the quality of the model.


python deeplab/eval.py \
    --logtostderr \
    --eval_split="val" \
    --model_variant="xception_71" \
    --atrous_rates=6 \
    --atrous_rates=12 \
    --atrous_rates=18 \
    --output_stride=16 \
    --decoder_output_stride=4 \
    --eval_crop_size="512,512" \
    --dataset="mydata" \
    --checkpoint_dir='/home/dell/models/research/deeplab/exp/mydata_train/train/' \
    --eval_logdir='/home/dell/models/research/deeplab/exp/mydata_train/eval/' \
    --dataset_dir='/home/dell/models/research/deeplab/data/tfrecord/' \
    --max_number_of_iterations=1
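
For readers unfamiliar with the metric, mIOU is the intersection over union computed per class and then averaged. The sketch below computes it from a confusion matrix with NumPy; it is a simplified illustration under the assumption that pixels labeled ignore_label are dropped, not the code eval.py runs.

```python
import numpy as np


def mean_iou(label, pred, num_classes, ignore_label=255):
    """Mean intersection-over-union over the classes that occur."""
    label = np.asarray(label).ravel().astype(np.int64)
    pred = np.asarray(pred).ravel().astype(np.int64)
    keep = label != ignore_label  # drop ignored pixels
    label, pred = label[keep], pred[keep]
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    cm = np.bincount(num_classes * label + pred,
                     minlength=num_classes ** 2).reshape(num_classes, num_classes)
    inter = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - inter
    valid = union > 0  # skip classes absent from both label and prediction
    return float((inter[valid] / union[valid]).mean())


label = [[0, 0], [1, 1]]
pred = [[0, 1], [1, 1]]
print(round(mean_iou(label, pred, num_classes=2), 4))  # 0.5833
```

Here class 0 has IoU 1/2 and class 1 has IoU 2/3, giving a mean of 7/12.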

10. Viewing logs

Use TensorBoard to monitor the progress of training and evaluation.

tensorboard --logdir=${PATH_TO_LOG_DIRECTORY}

tensorboard --logdir="./train_logs"

For the paths used in this article, run:


tensorboard --logdir=/home/dell/models/research/deeplab/exp/mydata_train/eval --port 6007


tensorboard --logdir=/home/dell/models/research/deeplab/exp/mydata_train/train

11. Exporting the model

During training, model files (checkpoints) are saved to disk in train_logdir.

The code base provides a script (export_model.py) that converts a checkpoint to .pb format.
Under ./models/research/, create a script export_model.sh that calls export_model.py with the main parameters set; its contents are:

export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
python deeplab/export_model.py \
    --logtostderr \
    --checkpoint_path="/media/dell/2T/models/research/deeplab/exp/mydata_train/train/model.ckpt-$1" \
    --export_path="/media/dell/2T/models/research/deeplab/exp/mydata_train/export/inference_graph-$1.pb" \
    --model_variant="xception_71" \
    --atrous_rates=6 \
    --atrous_rates=12 \
    --atrous_rates=18 \
    --output_stride=16 \
    --decoder_output_stride=4 \
    --num_classes=3 \
    --crop_size=3000 \
    --crop_size=3000 \
    --inference_scales=1.0

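Run export_model.sh to export the model, passing the checkpoint step as the argument:

sh export_model.sh 80000

This writes the .pb file to the export directory.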
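
The step number passed to export_model.sh has to match a checkpoint that actually exists in train_logdir. A small sketch for finding the latest one from the file names, assuming the usual model.ckpt-<step>.index naming; latest_checkpoint_step is a hypothetical helper.

```python
import re

# Matches checkpoint index files such as "model.ckpt-80000.index".
_CKPT_INDEX = re.compile(r"^model\.ckpt-(\d+)\.index$")


def latest_checkpoint_step(filenames):
    """Return the highest checkpoint step among the file names, or None."""
    matches = (_CKPT_INDEX.match(name) for name in filenames)
    steps = [int(m.group(1)) for m in matches if m]
    return max(steps) if steps else None


files = ["checkpoint", "model.ckpt-79000.index", "model.ckpt-80000.index",
         "model.ckpt-80000.meta", "graph.pbtxt"]
print(latest_checkpoint_step(files))  # 80000
```

In practice the list would come from os.listdir on the train_logdir directory.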

III. Testing

You can write your own test code. The code below was adapted from other articles and is provided for reference; modify and trim it as needed.


import os

from matplotlib import gridspec
from matplotlib import pyplot as plt
import numpy as np
from PIL import Image
import tensorflow as tf

LABEL_NAMES = np.asarray(["background", "class1", "class2"])

class DeepLabModel(object):
    """Class to load deeplab model and run inference."""

    INPUT_TENSOR_NAME = "ImageTensor:0"
    OUTPUT_TENSOR_NAME = "SemanticPredictions:0"
    INPUT_SIZE = 321
    FROZEN_GRAPH_NAME = "frozen_inference_graph"

    def __init__(self, modelname):
        """Creates and loads pretrained deeplab model."""
        self.graph = tf.Graph()
        graph_def = None

        with open(modelname, "rb") as fd:
            graph_def = tf.GraphDef.FromString(fd.read())

        if graph_def is None:
            raise RuntimeError("Cannot load inference graph from " + modelname)

        with self.graph.as_default():
            tf.import_graph_def(graph_def, name="")

        self.sess = tf.Session(graph=self.graph)

    def run(self, image):
        """Runs inference on a single image.

        Args:
            image: A PIL.Image object, raw input image.

        Returns:
            resized_image: RGB image resized from original input image.
            seg_map: Segmentation map of resized_image.
        """
        width, height = image.size
        resize_ratio = 1.0 * self.INPUT_SIZE / max(width, height)
        target_size = (int(resize_ratio * width), int(resize_ratio * height))
        resized_image = image.convert("RGB").resize(target_size, Image.ANTIALIAS)
        batch_seg_map = self.sess.run(
            self.OUTPUT_TENSOR_NAME,
            feed_dict={self.INPUT_TENSOR_NAME: [np.asarray(resized_image)]},
        )
        seg_map = batch_seg_map[0]
        return resized_image, seg_map

def create_pascal_label_colormap():
    """Creates a label colormap used in PASCAL VOC segmentation benchmark.

    Returns:
        A Colormap for visualizing segmentation results.
    """
    colormap = np.zeros((256, 3), dtype=int)
    ind = np.arange(256, dtype=int)

    for shift in reversed(range(8)):
        for channel in range(3):
            colormap[:, channel] |= ((ind >> channel) & 1) << shift
        ind >>= 3

    return colormap

def label_to_color_image(label):
    """Adds color defined by the dataset colormap to the label.

    Args:
        label: A 2D array with integer type, storing the segmentation label.

    Returns:
        result: A 2D array with floating type. The element of the array
        is the color indexed by the corresponding element in the input label
        to the PASCAL color map.

    Raises:
        ValueError: If label is not of rank 2 or its value is larger than color
        map maximum entry.
    """
    if label.ndim != 2:
        raise ValueError("Expect 2-D input label")

    colormap = create_pascal_label_colormap()

    if np.max(label) >= len(colormap):
        raise ValueError("label value too large.")

    return colormap[label]

def vis_segmentation(image, seg_map, name):
    """Visualizes input image, segmentation map and overlay view."""
    plt.figure(figsize=(15, 5))
    grid_spec = gridspec.GridSpec(1, 4, width_ratios=[6, 6, 6, 1])

    plt.subplot(grid_spec[0])
    plt.imshow(image)
    plt.axis("off")
    plt.title("input image")

    plt.subplot(grid_spec[1])
    seg_image = label_to_color_image(seg_map).astype(np.uint8)
    plt.imshow(seg_image)
    plt.axis("off")
    plt.title("segmentation map")

    plt.subplot(grid_spec[2])
    plt.imshow(image)
    plt.imshow(seg_image, alpha=0.7)
    plt.axis("off")
    plt.title("segmentation overlay")

    unique_labels = np.unique(seg_map)
    ax = plt.subplot(grid_spec[3])
    plt.imshow(FULL_COLOR_MAP[unique_labels].astype(np.uint8), interpolation="nearest")
    ax.yaxis.tick_right()
    plt.yticks(range(len(unique_labels)), LABEL_NAMES[unique_labels])
    plt.xticks([], [])
    ax.tick_params(width=0.0)
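    plt.grid(False)

    # Create the output directory if it does not exist yet, then save and
    # close the figure so that figures do not pile up across many images.
    os.makedirs("./seg_map_result", exist_ok=True)
    plt.savefig("./seg_map_result/" + name + ".png")
    plt.close()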

FULL_LABEL_MAP = np.arange(len(LABEL_NAMES)).reshape(len(LABEL_NAMES), 1)
FULL_COLOR_MAP = label_to_color_image(FULL_LABEL_MAP)

def main_test(filepath):

    modelname = "./datasets/quekou/export/inference_graph-80000.pb"
    MODEL = DeepLabModel(modelname)
    print("model loaded successfully!")

    filelist = os.listdir(filepath)
    for item in filelist:
        print("process image of ", item)
        # Strip the extension, whatever it is, to get the output name.
        name = os.path.splitext(item)[0]
        original_im = Image.open(os.path.join(filepath, item))
        resized_im, seg_map = MODEL.run(original_im)

        vis_segmentation(resized_im, seg_map, name)

if __name__ == "__main__":
    filepath = "./datasets/quekou/dataset/JPEGImages/"
    main_test(filepath)
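
Beyond the picture, a numeric summary of each prediction is often handy. The sketch below reports the fraction of pixels assigned to each class in a segmentation map; class_pixel_share is a hypothetical helper, and the small array stands in for the seg_map returned by the model.

```python
import numpy as np

LABEL_NAMES = ["background", "class1", "class2"]


def class_pixel_share(seg_map, label_names):
    """Return the fraction of pixels predicted for each named class."""
    counts = np.bincount(np.asarray(seg_map).ravel(),
                         minlength=len(label_names))
    total = counts.sum()
    return {name: float(counts[i]) / total
            for i, name in enumerate(label_names)}


seg_map = np.array([[0, 0, 1], [0, 2, 2]])  # stand-in for a model output
for name, share in class_pixel_share(seg_map, LABEL_NAMES).items():
    print(name, round(share, 3))
```

Calling it on the seg_map inside main_test would give a quick per-image check on whether any class is being predicted at all.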

Reference articles:
Original link: https://blog.csdn.net/malvas/article/details/90776327
Original link: https://blog.csdn.net/ling620/article/details/105635780
Original link: https://blog.csdn.net/zong596568821xp/article/details/83350820

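Summary

This article recorded how to train and test DeepLabv3+ on your own dataset with TensorFlow. Other problems may come up during training; I will add to this article as I run into them.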

Original: https://blog.csdn.net/qq_28262763/article/details/122695633
Author: 卖报的小王
Title: 图像语义分割实战：TensorFlow Deeplabv3+ 训练自己数据集

