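I. Dataset Overview
The Pokémon dataset contains 1,168 images in five classes: bulbasaur (234), charmander (238), mewtwo (239), pikachu (234), and squirtle (223).
II. Data Preprocessing
pokmon.py scans the dataset directory for image paths, derives each image's label from its class subdirectory, shuffles the (path, label) pairs, and caches them in a CSV file.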
import os, glob
import random, csv
import tensorflow as tf
def load_csv(root, filename, name2label):
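    # root: dataset root directory
    # filename: name of the CSV file
    # name2label: mapping from class name to integer label
    # Build the CSV on first use by scanning every class folder for images
    if not os.path.exists(os.path.join(root, filename)):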
        images = []
        for name in name2label.keys():
            images += glob.glob(os.path.join(root, name, '*.png'))
            images += glob.glob(os.path.join(root, name, '*.jpg'))
            images += glob.glob(os.path.join(root, name, '*.jpeg'))
        print(len(images), images)
        random.shuffle(images)
        with open(os.path.join(root, filename), mode='w', newline='') as f:
            writer = csv.writer(f)
            for img in images:
                name = img.split(os.sep)[-2]
                label = name2label[name]
                writer.writerow([img, label])
            print('written into csv file:', filename)
    images, labels = [], []
    with open(os.path.join(root, filename)) as f:
        reader = csv.reader(f)
        for row in reader:
            img, label = row
            label = int(label)
            images.append(img)
            labels.append(label)
    assert len(images) == len(labels)
    return images, labels
def load_pokemon(root, mode='train'):
    # Build the class-name -> integer label table
    name2label = {}  # "sq...":0
    for name in sorted(os.listdir(os.path.join(root))):
        if not os.path.isdir(os.path.join(root, name)):
            continue
        # Assign the next unused integer to each class
        name2label[name] = len(name2label)
    # Load the (path, label) pairs
    # [file1,file2,], [3,1]
    images, labels = load_csv(root, 'images.csv', name2label)
    if mode == 'train':  # 60%
        images = images[:int(0.6 * len(images))]
        labels = labels[:int(0.6 * len(labels))]
    elif mode == 'val':  # 20% = 60%->80%
        images = images[int(0.6 * len(images)):int(0.8 * len(images))]
        labels = labels[int(0.6 * len(labels)):int(0.8 * len(labels))]
    else:  # 20% = 80%->100%
        images = images[int(0.8 * len(images)):]
        labels = labels[int(0.8 * len(labels)):]
    return images, labels, name2label
img_mean = tf.constant([0.485, 0.456, 0.406])
img_std = tf.constant([0.229, 0.224, 0.225])
def normalize(x, mean=img_mean, std=img_std):
    x = (x - mean)/std
    return x
def denormalize(x, mean=img_mean, std=img_std):
    x = x * std + mean
    return x
def main():
    # Quick check: print the training split and the label table
    images, labels, table = load_pokemon('pokemon', 'train')
    print('images', len(images), images)
    print('labels', len(labels), labels)
    print(table)
if __name__ == '__main__':
    main()
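To make the 60/20/20 split in load_pokemon concrete, the sketch below applies the same slice arithmetic to a list of placeholder indices. The helper name split_indices is hypothetical and exists only for illustration; the sketch assumes the full dataset of 1,168 shuffled entries.

```python
def split_indices(n, mode='train'):
    """Return the indices that load_pokemon's slicing would select (illustrative helper)."""
    items = list(range(n))          # stand-in for the shuffled image list
    if mode == 'train':             # first 60%
        return items[:int(0.6 * n)]
    elif mode == 'val':             # 60% -> 80%
        return items[int(0.6 * n):int(0.8 * n)]
    else:                           # 80% -> 100%
        return items[int(0.8 * n):]

sizes = {mode: len(split_indices(1168, mode)) for mode in ('train', 'val', 'test')}
print(sizes)  # {'train': 700, 'val': 234, 'test': 234}
```

Because each split is a slice of the shuffled list cached in images.csv, the three subsets stay disjoint as long as that CSV file is not regenerated between calls.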
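III. Building the Convolutional Neural Network
A simple convolutional neural network is built with keras.Sequential.
from tensorflow import keras
from tensorflow.keras import layers
network = keras.Sequential([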
    layers.Conv2D(16,5,3),
    layers.MaxPool2D(3,3),
    layers.ReLU(),
    layers.Conv2D(64,5,3),
    layers.MaxPool2D(2,2),
    layers.ReLU(),
    layers.Flatten(),
    layers.Dense(64),
    layers.ReLU(),
    layers.Dense(5)
])
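As a rough check on the architecture, the feature-map sizes and parameter counts can be worked out by hand. The sketch below assumes a 224x224 RGB input (the size used in the training section) and Keras's default 'valid' padding; the helper conv_out is an illustrative name, not part of the original code.

```python
def conv_out(size, kernel, stride):
    """Spatial output size of a 'valid' (no padding) convolution or pooling window."""
    return (size - kernel) // stride + 1

size, channels = 224, 3            # assumed input: 224x224 RGB
params = []

# Conv2D(16, 5, 3): 16 filters, 5x5 kernel, stride 3
params.append(5 * 5 * channels * 16 + 16)
size, channels = conv_out(size, 5, 3), 16
size = conv_out(size, 3, 3)        # MaxPool2D(3, 3)

# Conv2D(64, 5, 3): 64 filters, 5x5 kernel, stride 3
params.append(5 * 5 * channels * 64 + 64)
size, channels = conv_out(size, 5, 3), 64
size = conv_out(size, 2, 2)        # MaxPool2D(2, 2)

flat = size * size * channels      # Flatten
params.append(flat * 64 + 64)      # Dense(64)
params.append(64 * 5 + 5)          # Dense(5)

print(size, flat, params, sum(params))  # 3 576 [1216, 25664, 36928, 325] 64133
```

The per-layer counts and the total of 64,133 agree with the network.summary() output shown in the training section.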
IV. Model Training
1. Load the training data. Choose the batch size according to the available RAM or GPU memory.
batchsz = 256
images, labels, table = load_pokemon('pokemon',mode='train')
db_train = tf.data.Dataset.from_tensor_slices((images, labels))
db_train = db_train.shuffle(1000).map(preprocess).batch(batchsz)
2. Load the validation data
images2, labels2, table = load_pokemon('pokemon',mode='val')
db_val = tf.data.Dataset.from_tensor_slices((images2, labels2))
db_val = db_val.map(preprocess).batch(batchsz)
3. Load the test data
images3, labels3, table = load_pokemon('pokemon',mode='test')
db_test = tf.data.Dataset.from_tensor_slices((images3, labels3))
db_test = db_test.map(preprocess).batch(100)
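The batch sizes above determine how many steps each pass over a dataset takes. The following sketch estimates those step counts, assuming the 1,168 images are split 60/20/20 (700/234/234) and that batch() keeps the final partial batch, which is its default behaviour. The helper steps_per_epoch is a hypothetical name used only here.

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Number of batches produced when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)

# Split sizes assume 1,168 images divided 60/20/20 as in load_pokemon
train_steps = steps_per_epoch(700, 256)
val_steps = steps_per_epoch(234, 256)
test_steps = steps_per_epoch(234, 100)
print(train_steps, val_steps, test_steps)  # 3 1 3
```

Three training steps per epoch and three evaluation steps on the test set match the 3/3 progress counters in the training log.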
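4. Data preprocessing. The preprocess function mapped over the datasets above must be defined before those pipelines are built.
def preprocess(x, y):
    # x: image file path, y: integer class label
    x = tf.io.read_file(x)
    x = tf.image.decode_jpeg(x, channels=3)
    # Resize slightly larger than the target, then flip and crop at random (augmentation)
    x = tf.image.resize(x, [244, 244])
    x = tf.image.random_flip_left_right(x)
    x = tf.image.random_crop(x, [224, 224, 3])
    # Scale to [0, 1], then standardize with the per-channel mean and std
    x = tf.cast(x, dtype=tf.float32) / 255.
    x = normalize(x)
    # One-hot encode the label for CategoricalCrossentropy
    y = tf.convert_to_tensor(y)
    y = tf.one_hot(y, depth=5)
    return x, y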
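To show what the last steps of preprocess do numerically, here is a small pure-Python sketch that standardizes a single pixel and one-hot encodes a label. The function names normalize_pixel, denormalize_pixel and one_hot are illustrative stand-ins for the tensor operations, and the sample pixel is made up.

```python
IMG_MEAN = [0.485, 0.456, 0.406]   # per-channel mean (R, G, B), as in the preprocessing script
IMG_STD = [0.229, 0.224, 0.225]    # per-channel standard deviation

def normalize_pixel(rgb):
    """Standardize one RGB pixel whose channels are already scaled to [0, 1]."""
    return [(v - m) / s for v, m, s in zip(rgb, IMG_MEAN, IMG_STD)]

def denormalize_pixel(rgb):
    """Invert normalize_pixel."""
    return [v * s + m for v, m, s in zip(rgb, IMG_MEAN, IMG_STD)]

def one_hot(label, depth=5):
    """One-hot encode an integer label, like tf.one_hot for a scalar."""
    return [1.0 if i == label else 0.0 for i in range(depth)]

pixel = [255 / 255., 128 / 255., 0 / 255.]   # a raw (255, 128, 0) pixel scaled to [0, 1]
normed = normalize_pixel(pixel)
restored = denormalize_pixel(normed)
print([round(v, 4) for v in normed])
print(one_hot(3))  # [0.0, 0.0, 0.0, 1.0, 0.0]
```

After standardization the values are no longer confined to [0, 1], which is why denormalize is needed before displaying a preprocessed image.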
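5. Train the model with a cross-entropy loss, using early stopping to limit overfitting.
from tensorflow.keras import optimizers, losses
from tensorflow.keras.callbacks import EarlyStopping
network.build(input_shape=(4, 224, 224, 3))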
network.summary()
early_stopping = EarlyStopping(
    monitor='val_accuracy',
    min_delta=0.001,
    patience=5
)
network.compile(optimizer=optimizers.Adam(learning_rate=1e-3),
               loss=losses.CategoricalCrossentropy(from_logits=True),
               metrics=['accuracy'])
network.fit(db_train, validation_data=db_val, validation_freq=1, epochs=100,
           callbacks=[early_stopping])
network.evaluate(db_test)
Model structure:
Model: "sequential"
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              multiple                  1216
max_pooling2d (MaxPooling2D) multiple                  0
re_lu (ReLU)                 multiple                  0
conv2d_1 (Conv2D)            multiple                  25664
max_pooling2d_1 (MaxPooling2 multiple                  0
re_lu_1 (ReLU)               multiple                  0
flatten (Flatten)            multiple                  0
dense (Dense)                multiple                  36928
re_lu_2 (ReLU)               multiple                  0
dense_1 (Dense)              multiple                  325
=================================================================
Total params: 64,133
Trainable params: 64,133
Non-trainable params: 0
Training results:
Epoch 16/100
1/3 [=========>....................] - ETA: 6s - loss: 0.1232 - accuracy: 0.9805
2/3 [===================>..........] - ETA: 3s - loss: 0.1455 - accuracy: 0.9785
3/3 [==============================] - 11s 4s/step - loss: 0.1241 - accuracy: 0.9793 - val_loss: 0.3912 - val_accuracy: 0.8798
1/3 [=========>....................] - ETA: 2s - loss: 0.4005 - accuracy: 0.8700
2/3 [===================>..........] - ETA: 1s - loss: 0.4779 - accuracy: 0.8450
3/3 [==============================] - 3s 899ms/step - loss: 0.4673 - accuracy: 0.8504
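The early-stopping settings used for training (monitor='val_accuracy', min_delta=0.001, patience=5) can be illustrated with a simplified pure-Python model of patience-based stopping. This is a sketch of the general idea rather than Keras's implementation, and the accuracy values are hypothetical, made up for illustration only.

```python
def early_stop_epoch(val_acc, min_delta=0.001, patience=5):
    """Return the 0-based epoch at which training would stop, or None if it runs to the end.

    Simplified model of patience-based early stopping on a metric to maximize:
    an epoch counts as an improvement only if it beats the best value by more than min_delta.
    """
    best = float('-inf')
    wait = 0
    for epoch, acc in enumerate(val_acc):
        if acc - min_delta > best:
            best = acc
            wait = 0               # improvement: reset the patience counter
        else:
            wait += 1              # no improvement this epoch
            if wait >= patience:
                return epoch
    return None

# Hypothetical validation accuracies, not taken from the article's run
history = [0.60, 0.72, 0.80, 0.85, 0.88, 0.879, 0.8805, 0.87, 0.86, 0.875, 0.88]
print(early_stop_epoch(history))  # 9
```

In this made-up history the best value (0.88) is reached at epoch 4; the following five epochs never exceed it by more than min_delta, so training stops at epoch 9.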
6. Save the model
network.save('model.h5')
V. Prediction
1. Load and preprocess the image
def preprocess(img):
    img = tf.io.read_file(img)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, [244, 244])
    img = tf.image.random_flip_left_right(img)
    img = tf.image.random_crop(img, [224,224,3])
    img = tf.cast(img, dtype=tf.float32) / 255.
    return img
img = '3.jpg'
x = preprocess(img)
x = tf.reshape(x, [1, 224, 224, 3])
2. Load the trained model
network = tf.keras.models.load_model('model.h5')
3. Predict the class and its probability. softmax converts the output logits into a probability for each class.
import numpy as np
logits = network.predict(x)
prob = tf.nn.softmax(logits, axis=1)
print(prob)
prob = prob.numpy()
# Index and probability of the most likely class
max_index = np.argmax(prob, axis=-1)[0]
max_prob = prob[0][max_index]
name = ['bulbasaur', 'charmander', 'mewtwo', 'pikachu', 'squirtle']
print(max_prob)
print(name[max_index])
Test image: a clear picture of Pikachu (3.jpg)
Prediction result:
tf.Tensor([[0.02942971 0.29606345 0.02201815 0.57856214 0.07392654]], shape=(1, 5), dtype=float32)
0.57856214
pikachu
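For readers who want to see what the softmax step does, here is a minimal pure-Python version. The logits are hypothetical values chosen for illustration, not outputs of the article's model, and the class order assumes the alphabetical label table built by load_pokemon.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)                          # subtract the max to avoid overflow in exp
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

names = ['bulbasaur', 'charmander', 'mewtwo', 'pikachu', 'squirtle']
logits = [-1.2, 1.1, -1.5, 1.8, -0.3]       # hypothetical logits, made up for this example
probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)
print(names[best], round(probs[best], 4))
```

softmax preserves the ordering of the logits, so taking the argmax of the logits or of the probabilities selects the same class; the probabilities are only needed to report a confidence.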
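VI. Analysis and Optimization
Training accuracy has reached about 98%, but validation and test accuracy are only about 88% and 85%. Even with early stopping, the model is clearly overfitting. The prediction tells the same story: an unmistakable Pikachu image is classified correctly, but with a probability of only 0.578, so the model is not yet well fitted.
1. Dataset and model structure
To keep training fast, a fairly shallow convolutional network was used here, and with so little training data (just over a thousand images in total) a good fit is hard to reach. Accuracy can be improved by enlarging the dataset or by training a deeper network.
2. Training parameters
Adjusting per-layer parameters, tuning the learning rate, or switching optimizers can also lead to better training results.
3. Transfer learning
For small-sample problems, transfer learning is a good option: take one of TensorFlow's built-in models together with the weights it learned on a public dataset, replace the final fully connected layer that maps to the original classes with a custom output layer, freeze the pretrained layers, and train on your own samples to get a better fit.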
Dataset and full code:
Link: https://pan.baidu.com/s/1s__J2FkaGNsisTG7UAbMiQ
Extraction code: curk
Original: https://blog.csdn.net/jameschen9051/article/details/119515204
Author: 追猫人
Title: Pokémon image classification with deep learning in TensorFlow (深度学习tensorflow实现宝可梦图像分类)