Image Classification: AlexNet, 5-Class Flower Dataset, PyTorch

Table of Contents

1. Code Structure
2. Dataset Processing

+ 2.1 Downloading and splitting the dataset: split_data.py
+ 2.2 Loading the dataset: dataset.py
+ 2.3 Visualizing dataset images: imgs_vasual.py
3. AlexNet Overview and Building the Network: model.py

+ 3.1 AlexNet network structure
+ 3.2 Highlights of AlexNet
+ 3.3 Building the network
4. Training and Saving the Best-Accuracy Weights: train.py
5. Testing with an Image Outside the Dataset: predict.py

Code source:
Building AlexNet with PyTorch and training it on the flower classification dataset

1. Code Structure

[Figure: code structure]
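The original figure is not reproduced here; as a rough substitute, the project layout can be inferred from the scripts covered in the following sections:

├── split_data.py     # splits the extracted flower_photos into train/ and val/
├── dataset.py        # builds the train/val DataLoaders and writes class_indices.json
├── imgs_vasual.py    # visualizes a batch of training images with their labels
├── model.py          # AlexNet definition
├── train.py          # training loop; saves the best weights to AlexNet.pth
├── predict.py        # single-image inference with the saved weights
└── flower_data/      # dataset folder created in section 2.1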

2. Dataset Processing

2.1 Downloading and Splitting the Dataset: split_data.py

"""
视频教程:https://www.bilibili.com/video/BV1p7411T7Pc/?spm_id_from=333.788
flower数据集为5分类数据集,共有 {'daisy':0, 'dandelion':1, 'roses':2, 'sunflower':3, 'tulips':4} 5个分类。

该程序用于将数据集切分为训练集和验证集,使用步骤如下:
(1)在"split_data.py"的同级路径下创建新文件夹"flower_data"
(2)点击链接下载花分类数据集 http://download.tensorflow.org/example_images/flower_photos.tgz
(3)解压数据集到flower_data文件夹下
(4)执行"split_data.py"脚本自动将数据集划分为训练集train和验证集val

切分后的数据集结构:
├── split_data.py
├── flower_data
       ├── flower_photos.tgz (下载的未解压的原始数据集)
       ├── flower_photos(解压的数据集文件夹,3670个样本)
       ├── train(生成的训练集,3306个样本)
       └── val(生成的验证集,364个样本)
"""""

import os
from shutil import copy, rmtree
import random

def mk_file(file_path: str):
    # If the folder already exists, delete it and its contents, then recreate it empty.
    if os.path.exists(file_path):
        rmtree(file_path)
    os.makedirs(file_path)

def main():
    random.seed(0)

    # 10% of each class is held out for the validation set.
    split_rate = 0.1

    cwd = os.getcwd()
    # Path of the extracted images; adjust if the archive was extracted to a different depth.
    data_path = os.path.join(cwd, "flower_data/flower_photos/flower_photos")
    data_root = os.path.join(cwd, "flower_data")
    origin_flower_path = data_path
    assert os.path.exists(origin_flower_path), "path '{}' does not exist.".format(origin_flower_path)

    # Each subfolder of the extracted dataset corresponds to one flower class.
    flower_class = [cla for cla in os.listdir(origin_flower_path)
                    if os.path.isdir(os.path.join(origin_flower_path, cla))]

    # Create an empty train/<class> and val/<class> folder for every class.
    train_root = os.path.join(data_root, "train")
    mk_file(train_root)
    for cla in flower_class:
        mk_file(os.path.join(train_root, cla))

    val_root = os.path.join(data_root, "val")
    mk_file(val_root)
    for cla in flower_class:
        mk_file(os.path.join(val_root, cla))

    # Copy each image into train/ or val/, sampling split_rate of every class for validation.
    for cla in flower_class:
        cla_path = os.path.join(origin_flower_path, cla)
        images = os.listdir(cla_path)
        num = len(images)

        eval_index = random.sample(images, k=int(num * split_rate))
        for index, image in enumerate(images):
            image_path = os.path.join(cla_path, image)
            if image in eval_index:
                copy(image_path, os.path.join(val_root, cla))
            else:
                copy(image_path, os.path.join(train_root, cla))
            print("\r[{}] processing [{}/{}]".format(cla, index + 1, num), end="")
        print()

    print("processing done!")

if __name__ == '__main__':
    main()
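
After the script finishes, a quick sanity check (a minimal sketch, not part of the original tutorial) can confirm that the per-class counts in train/ and val/ add up to the 3306/364 split mentioned in the docstring:

import os

# Count images per class in the generated splits (paths follow the structure produced by split_data.py).
for split in ("train", "val"):
    split_dir = os.path.join("flower_data", split)
    counts = {cla: len(os.listdir(os.path.join(split_dir, cla)))
              for cla in sorted(os.listdir(split_dir))}
    print(split, sum(counts.values()), counts)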

2.2 Loading the Dataset: dataset.py

import os
import json
import torch
from torchvision import transforms, datasets

def dataset(batch_size):
    train_path = "flower_data/train"
    val_path = "flower_data/val"
    assert os.path.exists(train_path), "{} path does not exist.".format(train_path)

    # Number of DataLoader worker processes, capped by the CPU count, the batch size, and 8.
    nw = min([os.cpu_count(), batch_size if batch_size > 1 else 0, 8])
    print('Using {} dataloader workers per process'.format(nw))

"""
    数据预处理,训练集做随机裁剪和随机翻转用来数据增强
    RandomResizedCrop(224) 表示先随机裁剪为不同的大小和宽高比,然后缩放为(224,224)大小
    RandomHorizontalFlip() 表示随机水平翻转(即左右翻转),默认概率为 0.5
"""

    data_transform = {
        "train": transforms.Compose([transforms.RandomResizedCrop(224),
                                     transforms.RandomHorizontalFlip(),
                                     transforms.ToTensor(),
                                     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]),
        "val": transforms.Compose([transforms.Resize((224, 224)),
                                   transforms.ToTensor(),
                                   transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])}

"""
    torchvision.datasets.ImageFolder 适用于加载特定存储格式的数据集,具体使用可参考博客:
    https://blog.csdn.net/qq_39507748/article/details/105394808
"""

    train_dataset = datasets.ImageFolder(root=train_path, transform=data_transform["train"])
    train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size,
                                               shuffle=True, num_workers=nw)
    validate_dataset = datasets.ImageFolder(root=val_path, transform=data_transform["val"])
    valid_loader = torch.utils.data.DataLoader(validate_dataset, batch_size=batch_size,
                                               shuffle=True, num_workers=nw)
    train_num = len(train_dataset)
    val_num = len(validate_dataset)
    print(f"using {train_num} images for training, {val_num} images for valid.")

    # class_to_idx maps class name -> index; invert it to get index -> class name.
    flower_class_id = train_dataset.class_to_idx

    cla_dict = dict((val, key) for key, val in flower_class_id.items())

    json_str = json.dumps(cla_dict, indent=4)
"""
    json.dumps() 将 python对象转换成 json对象,生成一个字符串。
    indent=4 表示缩进4个空格,方便阅读。
    json_str的内容为:
        {
            "0": "daisy",
            "1": "dandelion",
            "2": "roses",
            "3": "sunflowers",
            "4": "tulips"
        }
"""

    with open('class_indices.json', 'w') as json_file:
        json_file.write(json_str)

    return train_loader, valid_loader, val_num
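
A minimal usage sketch (assuming the split from section 2.1 exists; batch_size=32 is just an example value) shows what the function returns — train.py consumes it the same way in section 4:

from dataset import dataset

train_loader, valid_loader, val_num = dataset(batch_size=32)
images, labels = next(iter(train_loader))
print(images.shape, labels.shape)  # expected: torch.Size([32, 3, 224, 224]) torch.Size([32])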

2.3 Visualizing Dataset Images: imgs_vasual.py

"""
图片可视化函数,用于imshow多张图片,并输出每张图片对应的label
"""""

import os
import torch
from torchvision import transforms, datasets, utils
import matplotlib.pyplot as plt
import numpy as np

def imgs_imshow(batch_size):

    train_path = "flower_data/train"
    assert os.path.exists(train_path), "{} path does not exist.".format(train_path)
    transform = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor(),
                                    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
    train_dataset = datasets.ImageFolder(root=train_path, transform=transform)
    train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size,
                                               shuffle=True, num_workers=0)

    data_iter = iter(train_loader)
    image, label = next(data_iter)  # DataLoader iterators no longer expose .next(); use the built-in next()

    flower_list = train_dataset.class_to_idx
    cla_dict = dict((val, key) for key, val in flower_list.items())
    print('   '.join('%5s' % cla_dict[label[j].item()] for j in range(batch_size)))

    img = utils.make_grid(image)
    img = img / 2 + 0.5  # undo Normalize((0.5, ...), (0.5, ...)) so the colors display correctly
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))  # CHW -> HWC for matplotlib
    plt.show()

if __name__ == '__main__':
    imgs_imshow(batch_size=6)

3. AlexNet Overview and Building the Network: model.py

3.1 AlexNet Network Structure

[Figure: AlexNet network architecture]
In this program the input image size is 224*224 and the output has 5 classes instead of 1000; all other settings follow the figure.
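
As a sanity check on the in_features=9216 used by the classifier in section 3.3, the feature-map sizes can be traced with the standard conv/pool output-size formula (a small sketch; the layer settings match the model built below):

def out_size(size, kernel, stride, padding=0):
    # floor((size + 2*padding - kernel) / stride) + 1
    return (size + 2 * padding - kernel) // stride + 1

s = 224 + 3               # ZeroPad2d((2, 1, 2, 1)) adds 2+1 pixels per spatial dim -> 227
s = out_size(s, 11, 4)    # conv1 (11x11, stride 4)   -> 55
s = out_size(s, 3, 2)     # maxpool (3x3, stride 2)   -> 27
s = out_size(s, 5, 1, 2)  # conv2 (5x5, padding 2)    -> 27
s = out_size(s, 3, 2)     # maxpool (3x3, stride 2)   -> 13
s = out_size(s, 3, 1, 1)  # conv3 (3x3, padding 1)    -> 13
s = out_size(s, 3, 1, 1)  # conv4 (3x3, padding 1)    -> 13
s = out_size(s, 3, 2)     # maxpool (3x3, stride 2)   -> 6
print(256 * s * s)        # 256 * 6 * 6 = 9216 flattened features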

3.2 Highlights of AlexNet

(1) First to use GPUs to accelerate network training; the authors trained on two GPUs in parallel.

(2) Uses the ReLU activation function instead of the traditional Sigmoid and Tanh activations.

(3) Uses Local Response Normalization (LRN). This program does not use LRN because the method is rarely used nowadays; a sketch of how it could be added back follows this list.

(4) Applies Dropout to the first two fully connected layers to randomly drop neurons and reduce overfitting.
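
For completeness, here is a minimal sketch of how LRN could be added back after the first convolution using torch.nn.LocalResponseNorm, with the hyperparameters from the AlexNet paper; this variant is not part of the model built in section 3.3:

import torch.nn as nn

# Hypothetical first block of the feature extractor with LRN re-inserted.
features_with_lrn = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4),
    nn.ReLU(inplace=True),
    nn.LocalResponseNorm(size=5, alpha=1e-4, beta=0.75, k=2.0),  # paper values: n=5, alpha=1e-4, beta=0.75, k=2
    nn.MaxPool2d(kernel_size=3, stride=2),
)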

3.3 Building the Network

import torch.nn as nn

"""
本程序中没有使用LRN归一化,因为这个方法现在已经用的很少了。
"""

class AlexNet(nn.Module):
    def __init__(self,class_num=1000,init_weights=False):
        super(AlexNet,self).__init__()
        self.dropout=0.1

        self.features = nn.Sequential(
            # Pad the 224x224 input to 227x227 so the first conv produces 55x55 feature maps.
            nn.ZeroPad2d((2, 1, 2, 1)),
            nn.Conv2d(in_channels=3, out_channels=96, kernel_size=11, stride=4),  # -> [96, 55, 55]
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                                # -> [96, 27, 27]
            nn.Conv2d(96, 256, kernel_size=5, padding=2),                         # -> [256, 27, 27]
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                                # -> [256, 13, 13]
            nn.Conv2d(256, 384, kernel_size=3, padding=1),                        # -> [384, 13, 13]
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),                        # -> [256, 13, 13]
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                                # -> [256, 6, 6]
        )

        self.classifier=nn.Sequential(
            nn.Dropout(p=self.dropout),
            nn.Linear(in_features=9216, out_features=4096),  # 9216 = 256 * 6 * 6 from the feature extractor
            nn.ReLU(inplace=True),
            nn.Dropout(p=self.dropout),
            nn.Linear(in_features=4096, out_features=4096),
            nn.ReLU(inplace=True),
            nn.Linear(in_features=4096, out_features=class_num),
        )

        if init_weights:
            self._initialize_weights()

    def forward(self,x):
        x=self.features(x)
        x = x.view(-1, 256 * 6 * 6)  # flatten the feature maps to [batch, 9216]
        x=self.classifier(x)
        return x

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)

"""
    _initialize_weights()方法的解释:
    self.modules():  Returns an iterator over all modules in the network,即遍历网络中的所有层,并返回一个迭代器。
    for m in self.modules(): 遍历网络中的每一层
    if isinstance(m, nn.Conv2d): 判断m是否是 nn.Conv2d层
    其实并不需要用_initialize_weights()方法进行初始化,因为pytorch会默认以 nn.init.kaiming_normal_() 进行初始化。
"""

4. Training and Saving the Best-Accuracy Weights: train.py

import torch
import torch.nn as nn
import torch.optim as optim
from tqdm import tqdm

from model import AlexNet
from dataset import dataset

def train(batch_size, epochs, lr=0.001):
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    print("using {} device.".format(device))

    train_loader, valid_loader, val_num = dataset(batch_size=batch_size)
    model = AlexNet(class_num=5, init_weights=True)
    model.to(device)
    loss_function = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=lr)

    save_path = './AlexNet.pth'
    best_acc = 0.0
    train_steps = len(train_loader)
    for epoch in range(epochs):

        model.train()
        running_loss = 0.0
        train_bar = tqdm(train_loader)
        for step, (images, labels) in enumerate(train_bar):
            optimizer.zero_grad()
            outputs = model(images.to(device))
            loss = loss_function(outputs, labels.to(device))
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
            train_bar.desc = f"train epoch [{epoch+1}/{epochs}]   loss= {loss:.3f}"

        model.eval()
        acc = 0.0  # number of correctly classified validation images
        with torch.no_grad():
            val_bar = tqdm(valid_loader)
            for val_data in val_bar:
                val_images, val_labels = val_data
                outputs = model(val_images.to(device))
                predict_y = torch.max(outputs, dim=1)[1]  # index of the largest logit = predicted class
                acc += torch.eq(predict_y, val_labels.to(device)).sum().item()

        val_accurate = acc / val_num
        print('[epoch %d]   train_loss= %.3f   val_accuracy= %.3f' %
              (epoch + 1, running_loss / train_steps, val_accurate))

        # Save the weights whenever the validation accuracy improves.
        if val_accurate > best_acc:
            best_acc = val_accurate
            torch.save(model.state_dict(), save_path)

    print('Finished Training')

if __name__ == '__main__':
    train(batch_size=16, epochs=10, lr=0.0002)

Training results (training was not run to completion):

[Figure: training output]

5. Testing with an Image Outside the Dataset: predict.py

import os
import json
import torch
from PIL import Image
from torchvision import transforms
import matplotlib.pyplot as plt

from model import AlexNet

def predict():
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    data_transform = transforms.Compose(
        [transforms.Resize((224, 224)),
         transforms.ToTensor(),
         transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

    img_path = "./tulip.png"
    assert os.path.exists(img_path), f"file: '{img_path}' dose not exist."
    img = Image.open(img_path)
    plt.imshow(img)
    img = data_transform(img)
    img = torch.unsqueeze(img, dim=0)

    json_path = './class_indices.json'
    assert os.path.exists(json_path), "file: '{}' does not exist.".format(json_path)
    with open(json_path, "r") as json_file:
        class_indict = json.load(json_file)

    model = AlexNet(class_num=5).to(device)
    weights_path = "./AlexNet.pth"
    assert os.path.exists(weights_path), f"file: '{weights_path}' does not exist."
    # map_location=device lets GPU-trained weights load on a CPU-only machine as well.
    model.load_state_dict(torch.load(weights_path, map_location=device))

    model.eval()
    with torch.no_grad():
        output = torch.squeeze(model(img.to(device))).cpu()  # remove the batch dimension -> [5]
        predict = torch.softmax(output, dim=0)                # logits -> class probabilities
        predict_cla = torch.argmax(predict).item()            # index of the most probable class

    img_class = class_indict[str(predict_cla)]
    img_prob = predict[predict_cla].item()
    print_res = f"class: {img_class}    prob: {img_prob:.3}"
    plt.title(print_res)
    for i in range(len(predict)):
        print(f"class: {class_indict[str(i)]:12}   prob: {predict[i].item():.3}")
    plt.show()

if __name__ == '__main__':
    predict()
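
One caveat not mentioned in the original post: a PNG test image may carry an alpha channel, which would not match the 3-channel Normalize in data_transform. Converting to RGB right after loading avoids this:

img = Image.open(img_path).convert("RGB")  # drop any alpha channel before applying data_transform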

Test results:

class: daisy          prob: 0.00238
class: dandelion      prob: 0.000163
class: roses          prob: 0.199
class: sunflowers     prob: 0.00173
class: tulips         prob: 0.797

Test image and predicted class:

[Figure: test image with the predicted class shown in the title]

Original: https://blog.csdn.net/qq_43799400/article/details/123555090
Author: ctrl A_ctrl C_ctrl V
Title: Image Classification: AlexNet, 5-Class Flower Dataset, PyTorch
