Does a More Complex Architecture Guarantee a Better Model?

May 25, 2023, 12:54 PM • Artificial Intelligence

Dataset and data preprocessing

We'll use the Dogs vs. Cats dataset from Kaggle. It is licensed under a Creative Commons license, which means you can use it for free.

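The dataset is fairly large: 25,000 images split evenly between the two classes (12,500 dog images and 12,500 cat images). That should be enough to train a decent image classifier.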
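If you want to confirm the class balance on your own copy of the data, a small helper like the one below can count the images per class folder. This is a minimal sketch, not code from the original article: the function name `count_images_per_class` and the list of accepted extensions are assumptions, and it expects one subfolder per class.

```python
import os
from collections import Counter

# Assumed set of image extensions; adjust to your copy of the dataset
IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png')

def count_images_per_class(root_dir):
    """Count image files in each class subdirectory of root_dir."""
    counts = Counter()
    for class_name in sorted(os.listdir(root_dir)):
        class_dir = os.path.join(root_dir, class_name)
        if not os.path.isdir(class_dir):
            continue  # skip stray files next to the class folders
        counts[class_name] = sum(
            1 for name in os.listdir(class_dir)
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts
```

Calling `count_images_per_class('data/train/')` should return roughly equal counts for the two classes if the split preserved the balance.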

You can create the appropriate directory structure by following the previous article, which splits the data into training, validation, and test sets:

https://towardsdatascience.com/tensorflow-for-image-classification-top-3-prerequisites-for-deep-learning-projects-34c549c89e42
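For readers who don't want to switch articles, the split can be sketched roughly as follows. This is not the linked article's code: the helper name `split_class_images`, the source folder argument, and the 80/10/10 ratios are assumptions chosen for illustration. Only the `train`, `validation`, and `test` folder names with one subfolder per class follow the paths used later in this article.

```python
import os
import random
import shutil

def split_class_images(src_dir, dst_root, class_name,
                       ratios=(0.8, 0.1, 0.1), seed=42):
    """Copy the images of one class into train/validation/test subfolders.

    The 80/10/10 ratios are an assumption made for this sketch.
    """
    file_names = sorted(os.listdir(src_dir))
    random.Random(seed).shuffle(file_names)  # reproducible shuffle

    n_train = int(len(file_names) * ratios[0])
    n_valid = int(len(file_names) * ratios[1])
    subsets = {
        'train': file_names[:n_train],
        'validation': file_names[n_train:n_train + n_valid],
        'test': file_names[n_train + n_valid:],
    }

    for subset, names in subsets.items():
        target_dir = os.path.join(dst_root, subset, class_name)
        os.makedirs(target_dir, exist_ok=True)
        for name in names:
            shutil.copy(os.path.join(src_dir, name),
                        os.path.join(target_dir, name))

    # Report how many files landed in each subset
    return {subset: len(names) for subset, names in subsets.items()}
```

Run it once per class, for example with `class_name='cat'` and again with `class_name='dog'`, pointing `dst_root` at `data`.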

You should also delete the train/cat/666.jpg and train/dog/11702.jpg images. They are corrupted, and your model won't be able to train on them.
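To look for other problem files, a cheap first pass is to flag every .jpg that is empty or doesn't start with the JPEG start-of-image marker. The sketch below is an illustration under that assumption, and `find_suspicious_jpegs` is a hypothetical helper name. It only reads the first two bytes, so it will not catch images that are truncated further in; a full decode would be stricter. The article does not say whether the two files above would be caught by this check.

```python
import os

# Every valid JPEG file starts with these two bytes
JPEG_START_MARKER = b'\xff\xd8'

def find_suspicious_jpegs(root_dir):
    """Return paths of .jpg files that are empty or lack the JPEG start marker."""
    suspicious = []
    for dir_path, _, file_names in os.walk(root_dir):
        for name in file_names:
            if not name.lower().endswith(('.jpg', '.jpeg')):
                continue
            path = os.path.join(dir_path, name)
            with open(path, 'rb') as f:
                header = f.read(2)  # empty files yield b''
            if header != JPEG_START_MARKER:
                suspicious.append(path)
    return sorted(suspicious)
```

Review the returned paths by hand before deleting anything.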

Next, let's see how to load the images with TensorFlow.

How to load image data with TensorFlow

The models you'll see today have more layers than the ones in the previous articles.

For readability, we'll import individual classes from TensorFlow. If you're following along, make sure you have a system with a GPU, or at least use Google Colab.

Let's get the library imports out of the way:

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

import warnings
warnings.filterwarnings('ignore')

import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense, Dropout
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.losses import categorical_crossentropy
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.metrics import BinaryAccuracy

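# Fix the random seed for reproducibility
tf.random.set_seed(42)
physical_devices = tf.config.list_physical_devices('GPU')

# Let GPU memory grow on demand; do nothing if no GPU is available
try:
    tf.config.experimental.set_memory_growth(physical_devices[0], True)
except (IndexError, RuntimeError, ValueError):
    pass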

That's a lot of imports, but the model code will look extra clean because of them.

We'll now load the image data as usual, with the ImageDataGenerator class.

We'll rescale the image matrices to the 0 to 1 range and resize all images to 224×224 with three color channels. For memory reasons, we'll lower the batch size to 32:

train_datagen = ImageDataGenerator(rescale=1/255.0)
valid_datagen = ImageDataGenerator(rescale=1/255.0)

train_data = train_datagen.flow_from_directory(
    directory='data/train/',
    target_size=(224, 224),
    class_mode='categorical',
    batch_size=32,
    shuffle=True,
    seed=42
)

valid_data = valid_datagen.flow_from_directory(
    directory='data/validation/',
    target_size=(224, 224),
    class_mode='categorical',
    batch_size=32,
    seed=42
)
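As a side note on the batch size, the following back-of-the-envelope sketch shows how much memory one input batch takes and how many batches make up an epoch. The helper names are made up for this example, the float32 assumption (4 bytes per value) matches the Keras default float type, and the 20,000 figure is the approximate training set size mentioned in the conclusion. It covers the input batch only; the activations inside the network take considerably more.

```python
import math

def batch_size_in_mib(batch_size, height, width, channels, bytes_per_value=4):
    """Memory taken by one batch of images, in MiB (float32 by default)."""
    n_values = batch_size * height * width * channels
    return n_values * bytes_per_value / 2**20

def steps_per_epoch(n_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(n_images / batch_size)

print(batch_size_in_mib(32, 224, 224, 3))  # 18.375
print(steps_per_epoch(20_000, 32))         # 625
```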

Here is the output you should see:

Let's get our hands dirty with the first model!

Does adding layers to a TensorFlow model make any difference?

Writing convolutional models from scratch is always a tricky task. Grid searching for the optimal architecture isn't feasible, because convolutional models take a long time to train and there are too many parameters to check. In practice, you're far more likely to use transfer learning, which is a topic we'll cover in the near future.
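To put a rough number on why an exhaustive search gets out of hand, the sketch below counts the configurations in a small grid and multiplies by a per-run training time. Everything in it is hypothetical: the search space values and the 10 minutes per run are assumptions picked for illustration, not measurements from this article.

```python
from itertools import product

# Hypothetical search space; none of these values come from the article
search_space = {
    'conv_blocks': [1, 2, 3, 4],
    'base_filters': [16, 32, 64],
    'dense_units': [64, 128, 256, 512],
    'dropout_rate': [0.0, 0.3, 0.5],
    'learning_rate': [1e-2, 1e-3, 1e-4],
}

def count_grid_configurations(space):
    """Number of distinct configurations in an exhaustive grid search."""
    return sum(1 for _ in product(*space.values()))

def total_training_hours(n_configurations, minutes_per_run):
    """Total training time if every configuration is trained once."""
    return n_configurations * minutes_per_run / 60

n_configs = count_grid_configurations(search_space)
print(n_configs)                            # 432
print(total_training_hours(n_configs, 10))  # 72.0
```

Even this modest grid needs three days of training under these assumptions, and each extra hyperparameter multiplies the total again.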

Today is all about understanding why going big on the model architecture isn't worth it. We got 75% accuracy with a simple model, so that's the baseline we have to beat:

https://towardsdatascience.com/tensorflow-for-computer-vision-how-to-train-image-classifier-with-convolutional-neural-networks-77f2fd6ed152

Model 1 - Two convolutional blocks

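We'll declare the first model so that it somewhat resembles the VGG architecture: each block has two convolutional layers followed by a pooling layer. The first block uses 32 filters and the second uses 64.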

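As for the loss and optimizer, we'll stick to the basics: categorical cross-entropy and Adam. The classes in the dataset are perfectly balanced, which means we can get away with tracking accuracy only: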

model_1 = tf.keras.Sequential([
    Conv2D(filters=32, kernel_size=(3, 3), input_shape=(224, 224, 3), activation='relu'),
    Conv2D(filters=32, kernel_size=(3, 3), activation='relu'),
    MaxPool2D(pool_size=(2, 2), padding='same'),

    Conv2D(filters=64, kernel_size=(3, 3), activation='relu'),
    Conv2D(filters=64, kernel_size=(3, 3), activation='relu'),
    MaxPool2D(pool_size=(2, 2), padding='same'),

    Flatten(),
    Dense(units=128, activation='relu'),
    Dense(units=2, activation='softmax')
])

model_1.compile(
    loss=categorical_crossentropy,
    optimizer=Adam(),
    metrics=[BinaryAccuracy(name='accuracy')]
)
model_1_history = model_1.fit(
    train_data,
    validation_data=valid_data,
    epochs=10
)
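To see where the parameters of such a model end up, the sketch below traces the spatial size through the blocks and counts trainable parameters by hand. It is an illustrative calculation, not output from Keras: the helper names are made up, and it relies on the Keras defaults used above (3×3 convolutions with stride 1 and 'valid' padding, 2×2 pooling with stride 2 and 'same' padding).

```python
def conv_output_size(size, kernel=3):
    """Spatial size after a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1

def pool_output_size(size, pool=2):
    """Spatial size after max pooling with stride == pool and 'same' padding."""
    return -(-size // pool)  # ceiling division

def count_parameters(input_size, input_channels, block_filters,
                     dense_units, n_classes=2):
    """Trace shapes through VGG-like blocks and count trainable parameters."""
    size, channels = input_size, input_channels
    conv_params = 0
    for filters in block_filters:
        for _ in range(2):  # two convolutional layers per block
            conv_params += 3 * 3 * channels * filters + filters
            size, channels = conv_output_size(size), filters
        size = pool_output_size(size)

    features = size * size * channels  # length of the flattened vector
    dense_params = 0
    for units in list(dense_units) + [n_classes]:
        dense_params += features * units + units
        features = units
    return size, conv_params, dense_params

# Final feature map size, convolutional parameters, dense parameters
print(count_parameters(224, 3, [32, 64], [128]))
# (53, 65568, 23011714)
```

For this architecture, almost all of the roughly 23 million parameters sit in the first dense layer after Flatten, not in the convolutional layers.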

Here are the training results after 10 epochs:

It looks like we haven't beaten the baseline, as the validation accuracy is still around 75%. What happens if we add another convolutional block?

Model 2 - Three convolutional blocks

We'll keep the rest of the architecture identical. The only difference is an additional convolutional block with 128 filters:

model_2 = Sequential([
    Conv2D(filters=32, kernel_size=(3, 3), input_shape=(224, 224, 3), activation='relu'),
    Conv2D(filters=32, kernel_size=(3, 3), activation='relu'),
    MaxPool2D(pool_size=(2, 2), padding='same'),

    Conv2D(filters=64, kernel_size=(3, 3), activation='relu'),
    Conv2D(filters=64, kernel_size=(3, 3), activation='relu'),
    MaxPool2D(pool_size=(2, 2), padding='same'),

    Conv2D(filters=128, kernel_size=(3, 3), activation='relu'),
    Conv2D(filters=128, kernel_size=(3, 3), activation='relu'),
    MaxPool2D(pool_size=(2, 2), padding='same'),

    Flatten(),
    Dense(units=128, activation='relu'),
    Dense(units=2, activation='softmax')
])

model_2.compile(
    loss=categorical_crossentropy,
    optimizer=Adam(),
    metrics=[BinaryAccuracy(name='accuracy')]
)
model_2_history = model_2.fit(
    train_data,
    validation_data=valid_data,
    epochs=10
)

Here are the logs:

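The results got worse. You're free to tweak the batch size and learning rate, but the results probably still won't be good. The first architecture worked better on our dataset, so let's try tweaking that one further.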

Model 3 - Convolutional blocks with dropout

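The third model has the same convolutional architecture as the first one. The only difference is an additional fully connected layer and a Dropout layer. Let's see whether that makes a difference: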

model_3 = tf.keras.Sequential([
    Conv2D(filters=32, kernel_size=(3, 3), input_shape=(224, 224, 3), activation='relu'),
    Conv2D(filters=32, kernel_size=(3, 3), activation='relu'),
    MaxPool2D(pool_size=(2, 2), padding='same'),

    Conv2D(filters=64, kernel_size=(3, 3), activation='relu'),
    Conv2D(filters=64, kernel_size=(3, 3), activation='relu'),
    MaxPool2D(pool_size=(2, 2), padding='same'),

    Flatten(),
    Dense(units=512, activation='relu'),
    Dropout(rate=0.3),
    Dense(units=128),
    Dense(units=2, activation='softmax')
])

model_3.compile(
    loss=categorical_crossentropy,
    optimizer=Adam(),
    metrics=[BinaryAccuracy(name='accuracy')]
)

model_3_history = model_3.fit(
    train_data,
    validation_data=valid_data,
    epochs=10
)
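For readers new to dropout, here is a minimal pure-Python sketch of what a layer like Dropout(rate=0.3) does during training: each activation is zeroed with probability `rate`, and the survivors are scaled by 1 / (1 - rate) so the expected sum stays the same. The function name `inverted_dropout` is made up for this illustration, and this is not how TensorFlow implements the layer internally.

```python
import random

def inverted_dropout(activations, rate, rng):
    """Zero each activation with probability `rate`; scale the rest by 1 / (1 - rate)."""
    keep_scale = 1.0 / (1.0 - rate)
    return [
        0.0 if rng.random() < rate else value * keep_scale
        for value in activations
    ]

rng = random.Random(42)
activations = [1.0] * 10
# Roughly 30% of the values become 0.0, the others become about 1.43
print(inverted_dropout(activations, rate=0.3, rng=rng))
```

At inference time the layer passes values through unchanged, which is why the scaling is applied during training.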

Here are the training logs:

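That's terrible: we're now below 70%! The simple architecture from the previous article was perfectly fine. It is the quality of the data, not the architecture, that limits the model's predictive power.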

Conclusion

This shows that a more complex model architecture doesn't necessarily produce a better-performing model. Maybe you could find an architecture that suits the Dogs vs. Cats dataset better, but the search would likely be a waste of effort.

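You should shift your focus to improving the quality of the dataset. Sure, there are 20K training images, but we can still add variety. That's where data augmentation comes in.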
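As a taste of what augmentation means, the toy sketch below mirrors an image left to right, which yields a new, equally valid training sample for a cat or dog photo. It is a pure-Python illustration on a nested list, with a made-up helper name. In a real pipeline you would rather rely on the augmentation options of ImageDataGenerator, such as `horizontal_flip=True`.

```python
def flip_horizontally(image):
    """Mirror an image stored as rows of pixels (height x width x channels)."""
    return [list(reversed(row)) for row in image]

# A toy 2x3 "image" with one channel per pixel
image = [
    [[1], [2], [3]],
    [[4], [5], [6]],
]
print(flip_horizontally(image))
# [[[3], [2], [1]], [[6], [5], [4]]]
```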

Thanks for reading!


Original: https://blog.csdn.net/woshicver/article/details/124240185
Author: woshicver
Title: Does a More Complex Architecture Guarantee a Better Model?


