Garbage Image Classification with ResNet34 in Python

July 17, 2023, 9:35 AM • Artificial Intelligence • 66 views

Data download link: https://pan.baidu.com/s/1wr3h2Wc720uqUeIroTCIJA

Extraction code: mqic

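Why sort waste?
Recycling contamination happens when waste is disposed of incorrectly, like recycling a pizza box with oil on it (which belongs in compost), or when waste is disposed of correctly but prepared incorrectly, like recycling an unrinsed jam jar.

Contamination is a huge problem in the recycling industry, and automated waste sorting can help mitigate it. Just for kicks, I thought I'd try prototyping an image classifier that sorts trash and recyclables; such a classifier could be used in an optical sorting system.

Building the image classifier
In this project, I'll train a convolutional neural network with the fastai library (built on PyTorch) to classify images into the following 40 categories: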

waste_types = ['hazardous_waste_dry_battery',
 'hazardous_waste_expired_drugs',
 'hazardous_waste_ointment',
 'kitchen_waste_bone',
 'kitchen_waste_eggshell',
 'kitchen_waste_fish_bone',
 'kitchen_waste_fruit_peel',
 'kitchen_waste_meal',
 'kitchen_waste_pulp',
 'kitchen_waste_tea',
 'kitchen_waste_vegetable',
 'other_garbage_bamboo_chopsticks',
 'other_garbage_cigarette',
 'other_garbage_fast_food_box',
 'other_garbage_flowerpot',
 'other_garbage_soiled_plastic',
 'other_garbage_toothpick',
 'recyclables_anvil',
 'recyclables_bag',
 'recyclables_bottle',
 'recyclables_can',
 'recyclables_cardboard',
 'recyclables_cosmetic_bottles',
 'recyclables_drink_bottle',
 'recyclables_edible_oil_barrel',
 'recyclables_glass_cup',
 'recyclables_metal_food_cans',
 'recyclables_old_clothes',
 'recyclables_paper_bags',
 'recyclables_pillow',
 'recyclables_plastic_bowl',
 'recyclables_plastic_hanger',
 'recyclables_plug_wire',
 'recyclables_plush_toys',
 'recyclables_pot',
 'recyclables_powerbank',
 'recyclables_seasoning_bottle',
 'recyclables_shampoo_bottle',
 'recyclables_shoes',
 'recyclables_toys']

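My modeling pipeline:
Download and extract the images
Organize the images into different folders
Train the model
Make and evaluate test predictions
Next steps

Some basic setup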

%reload_ext autoreload
%autoreload 2
%matplotlib inline

%config InlineBackend.figure_format = 'retina'

from fastai.vision import *
from fastai.metrics import error_rate
from pathlib import Path
from glob2 import glob
from sklearn.metrics import confusion_matrix

import pandas as pd
import numpy as np
import os
import zipfile as zf
import shutil
import random
import re
import seaborn as sns

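Extracting the data
First, we need to extract the contents of "train.zip". (The code below opens "dataset-resized.zip"; change the file name to match the archive you actually downloaded.)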

files = zf.ZipFile("dataset-resized.zip",'r')
files.extractall()
files.close()

After unzipping, the dataset-resized folder has 40 subfolders:

os.listdir(os.path.join(os.getcwd(),"dataset-resized"))
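Before splitting, it can help to check how many images each class folder holds, since very small or empty classes will distort the split. The following is a minimal sketch, not part of the original notebook: `count_images` is a hypothetical helper, and the list of accepted file extensions is an assumption.

```python
from collections import Counter
from pathlib import Path


def count_images(root, extensions=(".jpg", ".jpeg", ".png")):
    """Return {class_folder_name: number_of_image_files} for each subfolder of root.

    `extensions` is an assumption about which files count as images.
    """
    counts = Counter()
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # ignore stray files next to the class folders
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in extensions
        )
    return dict(counts)


# Usage (assumes the archive was extracted into ./dataset-resized):
# for name, n in count_images("dataset-resized").items():
#     print(name, n)
```

If the archive extracted cleanly, the returned dictionary should have one key per waste category.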

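Organizing the images into different folders
Now that the data is extracted, I'll split the images into training, validation, and test folders at a 50-25-25 ratio. First, I define a few helper functions that let me build this quickly. If you're not interested in how the dataset is built, you can simply run the code and skip the details.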

## helper functions ##

## splits indices for a folder into train, validation, and test indices with random sampling
    ## input: folder path
    ## output: train, valid, and test indices
def split_indices(folder,seed1,seed2):
    n = len(os.listdir(folder))
    full_set = list(range(1,n+1))

    ## train indices
    random.seed(seed1)
    train = random.sample(list(range(1,n+1)),int(.5*n))

    ## temp
    remain = list(set(full_set)-set(train))

    ## separate remaining into validation and test
    random.seed(seed2)
    valid = random.sample(remain,int(.5*len(remain)))
    test = list(set(remain)-set(valid))

    return(train,valid,test)

## gets file names for a particular type of trash, given indices
    ## input: waste category and indices
    ## output: file names
def get_names(waste_type,indices):
    file_names = [waste_type+str(i)+".jpg" for i in indices]
    return(file_names)

## moves group of source files to another folder
    ## input: list of source files and destination folder
    ## no output
def move_files(source_files,destination_folder):
    for file in source_files:
        shutil.move(file,destination_folder)
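To sanity-check the 50-25-25 split without touching the file system, here is a small sketch that mirrors the logic of `split_indices` but takes the image count directly. `split_by_count` is a hypothetical name introduced only for this illustration.

```python
import random


def split_by_count(n, seed1, seed2):
    """Split the indices 1..n into train/valid/test lists at roughly 50/25/25."""
    full_set = list(range(1, n + 1))

    # Half of the indices go to the training set.
    random.seed(seed1)
    train = random.sample(full_set, int(.5 * n))

    # The remainder is split evenly between validation and test.
    remain = sorted(set(full_set) - set(train))
    random.seed(seed2)
    valid = random.sample(remain, int(.5 * len(remain)))
    test = sorted(set(remain) - set(valid))

    return train, valid, test


train, valid, test = split_by_count(100, 1, 1)
print(len(train), len(valid), len(test))  # 50 25 25
```

The three lists are disjoint and together cover every index, and calling the function again with the same seeds yields the same split.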

Afterwards, the training and validation sets each contain 40 folders

## paths will be data/train/recyclables_cardboard, data/train/recyclables_glass_cup, etc...

subsets = ['train','valid']
waste_types = ['hazardous_waste_dry_battery',
 'hazardous_waste_expired_drugs',
 'hazardous_waste_ointment',
 'kitchen_waste_bone',
 'kitchen_waste_eggshell',
 'kitchen_waste_fish_bone',
 'kitchen_waste_fruit_peel',
 'kitchen_waste_meal',
 'kitchen_waste_pulp',
 'kitchen_waste_tea',
 'kitchen_waste_vegetable',
 'other_garbage_bamboo_chopsticks',
 'other_garbage_cigarette',
 'other_garbage_fast_food_box',
 'other_garbage_flowerpot',
 'other_garbage_soiled_plastic',
 'other_garbage_toothpick',
 'recyclables_anvil',
 'recyclables_bag',
 'recyclables_bottle',
 'recyclables_can',
 'recyclables_cardboard',
 'recyclables_cosmetic_bottles',
 'recyclables_drink_bottle',
 'recyclables_edible_oil_barrel',
 'recyclables_glass_cup',
 'recyclables_metal_food_cans',
 'recyclables_old_clothes',
 'recyclables_paper_bags',
 'recyclables_pillow',
 'recyclables_plastic_bowl',
 'recyclables_plastic_hanger',
 'recyclables_plug_wire',
 'recyclables_plush_toys',
 'recyclables_pot',
 'recyclables_powerbank',
 'recyclables_seasoning_bottle',
 'recyclables_shampoo_bottle',
 'recyclables_shoes',
 'recyclables_toys']

## create destination folders for data subset and waste type
for subset in subsets:
    for waste_type in waste_types:
        folder = os.path.join('data',subset,waste_type)
        if not os.path.exists(folder):
            os.makedirs(folder)

if not os.path.exists(os.path.join('data','test')):
    os.makedirs(os.path.join('data','test'))

## move files to destination folders for each waste type
for waste_type in waste_types:
    source_folder = os.path.join('train',waste_type)
    train_ind, valid_ind, test_ind = split_indices(source_folder,1,1)

    ## move source files to train
    train_names = get_names(waste_type,train_ind)
    train_source_files = [os.path.join(source_folder,name) for name in train_names]
    train_dest = "data/train/"+waste_type
    move_files(train_source_files,train_dest)

    ## move source files to valid
    valid_names = get_names(waste_type,valid_ind)
    valid_source_files = [os.path.join(source_folder,name) for name in valid_names]
    valid_dest = "data/valid/"+waste_type
    move_files(valid_source_files,valid_dest)

    ## move source files to test
    test_names = get_names(waste_type,test_ind)
    test_source_files = [os.path.join(source_folder,name) for name in test_names]
    ## I use data/test here because the images can be mixed up
    move_files(test_source_files,"data/test")

For reproducibility, I set the seed of both random samples to 1. Now that the data is organized, we can start training the model.

## get a path to the folder with images
path = Path(os.getcwd())/"data"
path

out:PosixPath('/home/jupyter/data')

tfms = get_transforms(do_flip=True,flip_vert=True)
data = ImageDataBunch.from_folder(path,test="test",ds_tfms=tfms,bs=16)
data    # print data to inspect it

data.show_batch(rows=4,figsize=(10,8))    # show images

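Training the model!

What is resnet34?
A residual neural network is a convolutional neural network (CNN) with many layers. In particular, resnet34 is a 34-layer CNN that has been pretrained on the ImageNet database. A pretrained CNN tends to perform better on a new image classification task because it has already learned some visual features and can transfer that knowledge (hence "transfer learning").

Because they can describe more complexity, deep neural networks should in theory perform better than shallow networks on the training data. In practice, though, very deep plain networks often perform worse than shallower ones.

Resnets were created to get around this problem using a trick called shortcut connections. If some nodes in a layer have suboptimal values, their weights and biases can be adjusted; but if a node is already optimal (its residual is 0), why not leave it alone? Adjustments are made to nodes only as needed (when the residual is nonzero).

Shortcut connections apply the identity function to pass information on to subsequent layers, so a layer only has to learn an adjustment (the residual) on top of its input. This effectively shortens the network where possible and lets resnets have deep architectures while behaving more like shallow networks. The 34 in resnet34 simply refers to the number of layers.

Anand Saha gives a more in-depth explanation.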
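The shortcut idea described above can be illustrated with a toy residual block. This is only a sketch under simplifying assumptions: real resnet34 blocks use convolutions and batch normalization, whereas this one uses small dense weight matrices in plain Python, and all function names are made up for the example.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def residual_block(x, W1, W2):
    """Compute y = relu(x + F(x)), where F(x) = W2 @ relu(W1 @ x).

    The `x + ...` term is the shortcut connection: the input is passed
    through unchanged and the layers only learn the residual F(x).
    """
    fx = matvec(W2, relu(matvec(W1, x)))
    return relu([a + b for a, b in zip(x, fx)])


x = [1.0, 2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]

# With all-zero weights the residual F(x) is 0, so the block is the identity
# (for non-negative inputs): the output equals the input.
print(residual_block(x, zeros, zeros))  # [1.0, 2.0, 3.0]
```

Because a block with a zero residual simply forwards its input, stacking more blocks does not have to make the network worse, which is the intuition behind training very deep resnets.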

learn = create_cnn(data,models.resnet34,metrics=error_rate)
learn.model

learn.lr_find(start_lr=1e-6,end_lr=1e1)
learn.recorder.plot()

learn.fit_one_cycle(20,max_lr=5.13e-03)

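My model ran for 20 epochs. The cool thing about this fitting method is that the learning rate follows the one-cycle policy: it ramps up to max_lr early in training and then decays, which helps the model settle closer to an optimum. At 8.6%, the validation error looks very good... let's see how it does on the test data.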
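To make the shape of a one-cycle schedule concrete, here is a simplified sketch with a cosine warm-up followed by a cosine decay. It is not fastai's implementation; the constants `pct_start=0.3` and `div_factor=25.0` and the final-rate divisor are assumptions chosen for illustration, and `one_cycle_lr` is a hypothetical name.

```python
import math


def cos_anneal(start, end, pct):
    """Cosine interpolation from start (pct=0) to end (pct=1)."""
    return end + (start - end) / 2 * (math.cos(math.pi * pct) + 1)


def one_cycle_lr(step, total_steps, max_lr, pct_start=0.3, div_factor=25.0):
    """Learning rate at a given step of a simplified one-cycle schedule.

    Assumed constants: the first pct_start of training warms up from
    max_lr / div_factor to max_lr; the rest decays to a much smaller rate.
    """
    warm_steps = int(total_steps * pct_start)
    low_lr = max_lr / div_factor
    if step < warm_steps:
        return cos_anneal(low_lr, max_lr, step / warm_steps)
    final_lr = low_lr / 1e4  # assumed final learning rate
    decay_pct = (step - warm_steps) / (total_steps - warm_steps)
    return cos_anneal(max_lr, final_lr, decay_pct)


# Print the schedule at a few points of a 100-step run with max_lr=5.13e-03.
for step in (0, 15, 30, 65, 99):
    print(step, one_cycle_lr(step, 100, 5.13e-03))
```

In this sketch the rate starts low, peaks at `max_lr` after 30% of the steps, and then decreases for the remainder of training.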

First, we can look at which images were misclassified the most.

interp = ClassificationInterpretation.from_learner(learn)
losses,idxs = interp.top_losses()
interp.plot_top_losses(9, figsize=(15,11))

doc(interp.plot_top_losses)
interp.plot_confusion_matrix(figsize=(12,12), dpi=60)

interp.most_confused(min_val=2)
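The idea behind a "most confused" listing is to count how often each (actual, predicted) pair of different classes occurs and sort by frequency. The sketch below reproduces that idea on plain Python lists; it is an illustration with made-up labels and a locally defined `most_confused` function, not fastai's code.

```python
from collections import Counter


def most_confused(y_true, y_pred, min_val=1):
    """Return (actual, predicted, count) for misclassified pairs, most frequent first."""
    pairs = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return sorted(
        ((t, p, c) for (t, p), c in pairs.items() if c >= min_val),
        key=lambda item: (-item[2], item[0], item[1]),
    )


# Toy labels, invented for this example.
y_true = ["can", "can", "can", "bottle", "bottle", "cup"]
y_pred = ["can", "bottle", "bottle", "bottle", "cup", "cup"]

print(most_confused(y_true, y_pred, min_val=2))  # [('can', 'bottle', 2)]
```

Raising `min_val` hides rare mistakes and keeps only the class pairs the model mixes up repeatedly.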

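Now comes the most exciting moment

Making new predictions on the test data
To see how this model really performs, we need to make predictions on the test data. First, I'll use the learner.get_preds() method to make predictions on the test data.

Note: learner.predict() only predicts on a single image, while learner.get_preds() predicts on a set of images. I highly recommend reading the documentation to learn more about predict() and get_preds().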

preds = learn.get_preds(ds_type=DatasetType.Test)

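The ds_type argument in get_preds(ds_type) takes a DatasetType value. Example values are DatasetType.Train, DatasetType.Valid, and DatasetType.Test. I mention this because I made the mistake of passing in the actual data (learn.data.test_ds), which gave me the wrong output and took a long time to debug. Don't make this mistake! Don't pass in the data; pass in the dataset type!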

print(preds[0].shape)
preds[0]

The results are in yhat!

## saves the index (0 to 39) of most likely (max) predicted class for each image
max_idxs = np.asarray(np.argmax(preds[0],axis=1))

yhat = []
for max_idx in max_idxs:
    yhat.append(data.classes[max_idx])
yhat

…..

recyclables_plush_toys
kitchen_waste_eggshell
recyclables_edible_oil_barrel
hazardous_waste_expired_drugs
kitchen_waste_eggshell
recyclables_shoes
recyclables_plug_wire
kitchen_waste_vegetable
recyclables_shoes
recyclables_toys
recyclables_seasoning_bottle
recyclables_bag
kitchen_waste_fruit_peel
other_garbage_cigarette
recyclables_can
recyclables_anvil
other_garbage_cigarette
recyclables_shoes
recyclables_paper_bags
kitchen_waste_fish_bone
other_garbage_bamboo_chopsticks
other_garbage_bamboo_chopsticks
other_garbage_flowerpot
recyclables_bottle
kitchen_waste_vegetable
kitchen_waste_pulp
recyclables_edible_oil_barrel
recyclables_plastic_bowl
other_garbage_fast_food_box
recyclables_pot
recyclables_cardboard
recyclables_glass_cup
recyclables_plastic_hanger
recyclables_paper_bags
recyclables_seasoning_bottle
kitchen_waste_bone
recyclables_seasoning_bottle
recyclables_powerbank
recyclables_drink_bottle
kitchen_waste_fruit_peel
recyclables_seasoning_bottle
recyclables_powerbank
recyclables_plush_toys
recyclables_plush_toys
recyclables_seasoning_bottle
recyclables_bottle
recyclables_edible_oil_barrel
recyclables_edible_oil_barrel
recyclables_plastic_bowl
recyclables_metal_food_cans
other_garbage_cigarette
hazardous_waste_expired_drugs
kitchen_waste_fish_bone
recyclables_shoes
kitchen_waste_pulp
recyclables_plastic_bowl
recyclables_powerbank

….
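To evaluate these predictions, the true label of each test image can be recovered from its file name, since the helper `get_names` above builds names as the waste type followed by a number and ".jpg". The sketch below is a pure-Python illustration under that naming assumption; the function names and the toy data are made up. In fastai v1 the test file paths should be available from `learn.data.test_ds.items`, but treat that as an assumption and check that the order matches the predictions.

```python
import os
import re


def label_from_filename(path):
    """'data/test/recyclables_can12.jpg' -> 'recyclables_can'.

    Assumes names of the form <waste_type><number>.jpg.
    """
    base = os.path.basename(path)
    match = re.fullmatch(r"(.*\D)\d+\.jpg", base)
    if match is None:
        raise ValueError(f"unexpected file name: {base}")
    return match.group(1)


def predicted_labels(probabilities, classes):
    """Pick the class with the highest probability for each row."""
    return [
        classes[max(range(len(row)), key=row.__getitem__)]
        for row in probabilities
    ]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data, invented for this example.
classes = ["kitchen_waste_bone", "recyclables_can"]
probabilities = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
files = ["kitchen_waste_bone3.jpg", "recyclables_can12.jpg", "recyclables_can7.jpg"]

y_true = [label_from_filename(f) for f in files]
y_pred = predicted_labels(probabilities, classes)
print(accuracy(y_true, y_pred))
```

With the real model, `probabilities` would be `preds[0]` converted to nested lists and `classes` would be `data.classes`.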

You can also look at two other image classification posts

The simplest one directly uses an already-trained model

Cat vs. dog image classification:

CNN cat vs. dog image classification (long_songs's blog on CSDN): https://blog.csdn.net/long_songs/article/details/122104681?spm=1001.2014.3001.5501

Original: https://blog.csdn.net/long_songs/article/details/122127217
Author: long_songs
Title: Garbage Image Classification with ResNet34 in Python

