Classifying the Dogs vs. Cats Dataset with a Convolutional Neural Network

May 26, 2023, 8:36 PM • Artificial Intelligence

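Contents

– I. Environment Setup
– II. The Cats-and-Dogs Dataset
  + (1) Building the dataset
  + (2) Convolutional neural network (CNN)
    * 1. Building the network model
    * 2. Reading data from files with image generators
    * 3. Training
    * 4. Saving the model
    * 5. Visualizing the results
  + (3) Adjusting the baseline model
    * 1. Image augmentation
    * 2. Model adjustments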

I. Environment Setup

Set up TensorFlow and Keras.

Open the Anaconda3 command prompt (run it as administrator).
Create a new conda environment:

conda create -n tensorflow python=3.7

Activate the environment:

activate tensorflow

Install TensorFlow and Keras:

pip install tensorflow==1.14.0
pip install keras==2.2.5

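Make sure the TensorFlow and Keras versions match; see the blog post "TensorFlow and Keras version correspondence" (TensorFlow与Keras版本对应).

Open Jupyter Notebook in this environment
and check the TensorFlow and Keras versions.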
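The version check itself is not shown in the post. A minimal sketch of one way to do it from a notebook cell follows; the helper name `installed_version` is hypothetical, and it only reads each package's `__version__` attribute, returning `None` when the package cannot be imported.

```python
import importlib


def installed_version(package_name):
    """Return the package's __version__ string, or None if it is not installed."""
    try:
        module = importlib.import_module(package_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


# With the environment above, these should report 1.14.0 and 2.2.5.
for name in ("tensorflow", "keras"):
    print(name, installed_version(name))
```

If either line prints `None`, the notebook is probably running in a different conda environment than the one where the packages were installed.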

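II. The Cats-and-Dogs Dataset

(1) Building the dataset

Download the cats-and-dogs dataset from the Kaggle website,
or from: https://pan.baidu.com/s/1JTyY259L58JfVLB98Iw7GQ
Extraction code: eaf4

Sort the images into training, validation, and test folders: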

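import os, shutil

# Folder holding the unpacked Kaggle training images
original_dataset_dir = 'D:/py/kaggle_Dog&Cat/train/train'
# Folder that will hold the smaller dataset
base_dir = 'D:/py/kaggle_Dog&Cat/find_cats_and_dogs'
os.mkdir(base_dir)

# Directories for the training, validation, and test splits
train_dir = os.path.join(base_dir, 'train')
os.mkdir(train_dir)
validation_dir = os.path.join(base_dir, 'validation')
os.mkdir(validation_dir)
test_dir = os.path.join(base_dir, 'test')
os.mkdir(test_dir)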

train_cats_dir = os.path.join(train_dir, 'cats')
os.mkdir(train_cats_dir)

train_dogs_dir = os.path.join(train_dir, 'dogs')
os.mkdir(train_dogs_dir)

validation_cats_dir = os.path.join(validation_dir, 'cats')
os.mkdir(validation_cats_dir)

validation_dogs_dir = os.path.join(validation_dir, 'dogs')
os.mkdir(validation_dogs_dir)

test_cats_dir = os.path.join(test_dir, 'cats')
os.mkdir(test_cats_dir)

test_dogs_dir = os.path.join(test_dir, 'dogs')
os.mkdir(test_dogs_dir)

# Copy the first 1000 cat images to train_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(train_cats_dir, fname)
    shutil.copyfile(src, dst)

# Copy the next 500 cat images to validation_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(validation_cats_dir, fname)
    shutil.copyfile(src, dst)

# Copy the next 500 cat images to test_cats_dir
fnames = ['cat.{}.jpg'.format(i) for i in range(1500, 2000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(test_cats_dir, fname)
    shutil.copyfile(src, dst)

# Copy the first 1000 dog images to train_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(train_dogs_dir, fname)
    shutil.copyfile(src, dst)

# Copy the next 500 dog images to validation_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(validation_dogs_dir, fname)
    shutil.copyfile(src, dst)

# Copy the next 500 dog images to test_dogs_dir
fnames = ['dog.{}.jpg'.format(i) for i in range(1500, 2000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(test_dogs_dir, fname)
    shutil.copyfile(src, dst)

Count the images in each folder:

print('total training cat images:', len(os.listdir(train_cats_dir)))
print('total training dog images:', len(os.listdir(train_dogs_dir)))
print('total validation cat images:', len(os.listdir(validation_cats_dir)))
print('total validation dog images:', len(os.listdir(validation_dogs_dir)))
print('total test cat images:', len(os.listdir(test_cats_dir)))
print('total test dog images:', len(os.listdir(test_dogs_dir)))

For each class (cats and dogs) there are 1000 training images, 500 validation images, and 500 test images.
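The six copy loops above differ only in the label, the index range, and the destination folder, so they can be folded into one helper. The following is a sketch of that idea rather than the code used in this post: the names `copy_range`, `build_small_dataset`, and `DEFAULT_SPLITS` are hypothetical, and `os.makedirs(..., exist_ok=True)` is swapped in for `os.mkdir` so the script can be re-run without failing on existing folders.

```python
import os
import shutil

# Index ranges [start, stop) per split, matching the split sizes used above.
DEFAULT_SPLITS = (('train', 0, 1000),
                  ('validation', 1000, 1500),
                  ('test', 1500, 2000))


def copy_range(src_dir, dst_dir, label, start, stop):
    """Copy '<label>.<i>.jpg' for i in [start, stop) from src_dir to dst_dir."""
    os.makedirs(dst_dir, exist_ok=True)  # does not fail if the folder exists
    copied = 0
    for i in range(start, stop):
        fname = '{}.{}.jpg'.format(label, i)
        shutil.copyfile(os.path.join(src_dir, fname),
                        os.path.join(dst_dir, fname))
        copied += 1
    return copied


def build_small_dataset(src_dir, base_dir, splits=DEFAULT_SPLITS):
    """Fill base_dir/<split>/cats and base_dir/<split>/dogs; return file counts."""
    counts = {}
    for split, start, stop in splits:
        for label in ('cat', 'dog'):
            dst_dir = os.path.join(base_dir, split, label + 's')
            counts[(split, label)] = copy_range(src_dir, dst_dir,
                                                label, start, stop)
    return counts
```

Calling `build_small_dataset(original_dataset_dir, base_dir)` would then produce the same folder layout as the loops above, and the returned dictionary gives the per-folder counts directly.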

(2) Convolutional neural network (CNN)

1. Building the network model

from keras import layers
from keras import models

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
model.summary()
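The shapes and parameter counts that `model.summary()` reports can be checked by hand. The sketch below redoes that arithmetic in plain Python, assuming the Keras defaults used above (3x3 convolutions with 'valid' padding and stride 1, 2x2 max pooling); the helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Output width/height of a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Output width/height of max pooling with stride equal to the pool size."""
    return size // pool


def conv_params(in_channels, filters, kernel=3):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """Weight matrix plus one bias per unit."""
    return inputs * units + units


size, channels, total = 150, 3, 0
for filters in (32, 64, 128, 128):
    size = conv_out(size)
    total += conv_params(channels, filters)
    print('conv -> ({0}, {0}, {1})'.format(size, filters))
    size = pool_out(size)
    channels = filters
    print('pool -> ({0}, {0}, {1})'.format(size, channels))

flat = size * size * channels
total += dense_params(flat, 512) + dense_params(512, 1)
print('flatten ->', flat)            # 6272
print('total parameters:', total)    # 3453121
```

Most of the roughly 3.45 million parameters sit in the first Dense layer (6272 x 512 weights), which is one reason this small-data model overfits easily.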

2. Reading data from files with image generators

from keras import optimizers

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['acc'])
from keras.preprocessing.image import ImageDataGenerator

# All images will be rescaled by 1./255
train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
        # This is the target directory
        train_dir,
        # All images will be resized to 150x150
        target_size=(150, 150),
        batch_size=20,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(150, 150),
        batch_size=20,
        class_mode='binary')
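With `class_mode='binary'`, `flow_from_directory` derives the 0/1 labels from the subfolder names in alphanumeric order, so `cats` becomes 0 and `dogs` becomes 1 (the generator exposes this mapping as `train_generator.class_indices`). The sketch below imitates that rule with the standard library only; `class_indices_from_directory` is a hypothetical helper and the folder is a temporary stand-in for `train_dir`.

```python
import os
import tempfile


def class_indices_from_directory(directory):
    """Map each class subfolder to an integer label, in sorted name order."""
    classes = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(classes)}


# Tiny stand-in for train_dir, created in a temporary folder.
demo_dir = tempfile.mkdtemp()
for name in ('dogs', 'cats'):
    os.mkdir(os.path.join(demo_dir, name))

print(class_indices_from_directory(demo_dir))  # {'cats': 0, 'dogs': 1}
```

Under this mapping, a sigmoid output above 0.5 from the model means "dog" and below 0.5 means "cat".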

3. Training

for data_batch, labels_batch in train_generator:
    print('data batch shape:', data_batch.shape)
    print('labels batch shape:', labels_batch.shape)
    break

history = model.fit_generator(
      train_generator,
      steps_per_epoch=100,
      epochs=30,
      validation_data=validation_generator,
      validation_steps=50)
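The values `steps_per_epoch=100` and `validation_steps=50` follow from the dataset sizes: 2000 training images and 1000 validation images, read in batches of 20. A small sketch of that arithmetic, with a hypothetical helper name:

```python
import math


def steps_for_one_pass(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


print(steps_for_one_pass(2000, 20))  # 100 training steps per epoch
print(steps_for_one_pass(1000, 20))  # 50 validation steps
print(steps_for_one_pass(2000, 32))  # 63
print(steps_for_one_pass(1000, 32))  # 32
```

The last two lines use the batch size of 32 that appears later in this post: with that batch size, one pass over the data takes 63 training steps and 32 validation steps, so keeping 100 and 50 means each epoch cycles through the images more than once.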

4. Saving the model

model.save('cats_and_dogs_small_1.h5')

5. Visualizing the results

import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()
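Besides looking at the curves, the same `history.history` dictionary can be summarized numerically. The sketch below finds the epoch with the lowest validation loss and the gap between training and validation accuracy at that point; a large gap is a sign of overfitting. The function name `summarize_history` is hypothetical, and the numbers in `example` are made up for illustration only; they are not results from this experiment.

```python
def summarize_history(history_dict):
    """Return (epoch with lowest val_loss, train/validation accuracy gap there)."""
    val_loss = history_dict['val_loss']
    best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
    gap = history_dict['acc'][best_epoch] - history_dict['val_acc'][best_epoch]
    return best_epoch, gap


# Made-up values, shaped like history.history, for illustration only.
example = {
    'acc': [0.55, 0.65, 0.75, 0.85, 0.95],
    'val_acc': [0.54, 0.63, 0.70, 0.71, 0.70],
    'loss': [0.69, 0.62, 0.52, 0.38, 0.20],
    'val_loss': [0.69, 0.64, 0.58, 0.60, 0.75],
}

best_epoch, gap = summarize_history(example)
print('lowest validation loss at epoch index', best_epoch)
print('accuracy gap at that epoch: {:.2f}'.format(gap))
```

On a real run, `summarize_history(history.history)` would be called right after training; epochs are zero-indexed here, as in the plots.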

(3) Adjusting the baseline model

1. Image augmentation

datagen = ImageDataGenerator(
      rotation_range=40,
      width_shift_range=0.2,
      height_shift_range=0.2,
      shear_range=0.2,
      zoom_range=0.2,
      horizontal_flip=True,
      fill_mode='nearest')

from keras.preprocessing import image

fnames = [os.path.join(train_cats_dir, fname) for fname in os.listdir(train_cats_dir)]

# We pick one image to "augment"
img_path = fnames[3]

# Read the image and resize it
img = image.load_img(img_path, target_size=(150, 150))

# Convert it to a Numpy array with shape (150, 150, 3)
x = image.img_to_array(img)

# Reshape it to (1, 150, 150, 3)
x = x.reshape((1,) + x.shape)

# The .flow() command below generates batches of randomly transformed images.
# It will loop indefinitely, so we need to break the loop at some point!

i = 0
for batch in datagen.flow(x, batch_size=1):
    plt.figure(i)
    imgplot = plt.imshow(image.array_to_img(batch[0]))
    i += 1
    if i % 4 == 0:
        break

plt.show()
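To make two of the augmentation options more concrete, here is a simplified sketch of what `horizontal_flip=True` and a width shift with `fill_mode='nearest'` do to pixel values. It works on a tiny grayscale "image" stored as nested lists and only handles whole-pixel shifts; Keras itself applies random, real-valued affine transforms, so this illustrates the idea and is not its implementation. Both function names are hypothetical.

```python
def horizontal_flip(image):
    """Mirror every row left to right."""
    return [row[::-1] for row in image]


def shift_width_nearest(image, shift):
    """Shift columns right by `shift` pixels (left if negative).

    Positions that would read from outside the image take the nearest edge
    pixel instead, which is the idea behind fill_mode='nearest'.
    """
    width = len(image[0])
    return [
        [row[min(max(col - shift, 0), width - 1)] for col in range(width)]
        for row in image
    ]


tiny = [[1, 2, 3, 4],
        [5, 6, 7, 8]]

print(horizontal_flip(tiny))          # [[4, 3, 2, 1], [8, 7, 6, 5]]
print(shift_width_nearest(tiny, 1))   # [[1, 1, 2, 3], [5, 5, 6, 7]]
```

In the shifted result the left column is duplicated rather than filled with zeros, which avoids introducing black borders into the training images.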

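2. Model adjustments

To further reduce overfitting, we add a Dropout layer to the model, placed after the Flatten layer and before the densely connected classifier: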

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['acc'])
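What `layers.Dropout(0.5)` does can be sketched in a few lines of plain Python. During training each input value is set to zero with probability 0.5 and the surviving values are scaled by 1 / (1 - 0.5), so the expected sum stays the same; at inference time the layer passes its input through unchanged. The function below is an illustration with a hypothetical name and signature, not the Keras implementation.

```python
import random


def dropout(values, rate, training, rng=random):
    """Inverted dropout on a list of activations.

    While training, zero each value with probability `rate` and scale the
    survivors by 1 / (1 - rate); otherwise return the values unchanged.
    """
    if not training:
        return list(values)
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


rng = random.Random(0)  # fixed seed so the demo is repeatable
activations = [1.0, 2.0, 3.0, 4.0]

print(dropout(activations, 0.5, training=True, rng=rng))   # some zeros, rest doubled
print(dropout(activations, 0.5, training=False))           # unchanged
```

Because a different random subset of the 6272 flattened features is dropped on every batch, the Dense layer cannot rely on any single feature, which works against memorizing the small training set.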

Train the network using data augmentation and dropout:

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,)

# Note that the validation data should not be augmented!

test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
        # This is the target directory
        train_dir,
        # All images will be resized to 150x150
        target_size=(150, 150),
        batch_size=32,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(150, 150),
        batch_size=32,
        class_mode='binary')

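# Note: with batch_size=32, 100 steps draw roughly 3200 samples per epoch,
# more than the 2000 training images, so the endlessly looping generator
# cycles through the training set about 1.6 times per epoch. Likewise,
# 50 validation steps draw roughly 1600 samples from 1000 validation images.
history = model.fit_generator(
      train_generator,
      steps_per_epoch=100,
      epochs=100,
      validation_data=validation_generator,
      validation_steps=50)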

Save the model, as in step 4.

Plot the results:

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()

As the plots show, the model now reaches about 82% accuracy, a 15% relative improvement over the non-regularized model.
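Since the 15% is a relative improvement, the baseline accuracy it implies can be worked out from the two numbers above. The post does not state the baseline figure, so the value below is derived from the 82% and 15% only, and the helper name is hypothetical.

```python
def implied_baseline(new_accuracy, relative_gain):
    """Baseline accuracy such that baseline * (1 + relative_gain) == new_accuracy."""
    return new_accuracy / (1.0 + relative_gain)


baseline = implied_baseline(0.82, 0.15)
print(round(baseline, 3))         # 0.713
print(round(0.82 - baseline, 3))  # 0.107
```

In other words, a 15% relative gain corresponds to going from roughly 71% to 82% accuracy, which is an absolute gain of about 11 percentage points.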

Original: https://blog.csdn.net/qq_43678923/article/details/117305118
Author: LUY-10
Title: 卷积神经网络之狗猫数据集的分类实验
