[Federated Learning with FATE in Practice] (3) MNIST Neural Network (Keras)

Data preparation: read the MNIST CSV files, add an idx column as the sample id, move the label column (renamed y) to the end, shuffle the training set, and split it into two halves so that each party holds part of the data.

import pandas as pd

train = pd.read_csv("data/mnist_train.csv")
test = pd.read_csv("data/mnist_test.csv")

# Add an idx column as the sample id and move it to the front.
train['idx'] = range(train.shape[0])
idx = train['idx']
train.drop(labels=['idx'], axis=1, inplace=True)
train.insert(0, 'idx', idx)

# Rename the label column to y and move it to the last position.
train = train.rename(columns={"label": "y"})
y = train["y"]
train.drop(labels=["y"], axis=1, inplace=True)
train.insert(train.shape[1], "y", y)

# Shuffle the training set, then split it in half, one part per party.
train = train.sample(frac=1)

train_1 = train.iloc[:30000]
train_2 = train.iloc[30000:]
train_1.to_csv("data/mnist_1_train.csv", index=False, header=True)
train_2.to_csv("data/mnist_2_train.csv", index=False, header=True)

# Apply the same processing to the test set (note: use idx_test here, not the idx from the training set).
test['idx'] = range(test.shape[0])
idx_test = test['idx']
test.drop(labels=['idx'], axis=1, inplace=True)
test.insert(0, 'idx', idx_test)
test = test.rename(columns={"label": "y"})
y_test = test["y"]
test.drop(labels=["y"], axis=1, inplace=True)
test.insert(test.shape[1], "y", y_test)

test.to_csv("mnist_test.csv", index=False, header=True)
3. FATE Task

3.1 Upload Data

  • Configure the upload conf files (one per party):
{
    "file": "workspace/HFL_nn/data/mnist_2_train.csv",
    "table_name": "homo_guest_mnist_train",
    "namespace": "experiment",
    "head": 1,
    "partition": 8,
    "work_mode": 0,
    "backend": 0
}
{
    "file": "workspace/HFL_nn/data/mnist_1_train.csv",
    "table_name": "homo_host_mnist_train",
    "namespace": "experiment",
    "head": 1,
    "partition": 8,
    "work_mode": 0,
    "backend": 0
}
  • Upload the data (workspace/HFL_nn/ is a directory I created under the FATE root directory):
$ flow data upload -c workspace/HFL_nn/upload_train_host_conf.json
$ flow data upload -c workspace/HFL_nn/upload_train_guest_conf.json

3.2 Model Training

3.2.1 Define the Model

  • A fully connected neural network is defined here with Keras:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# 784-512-256-10 fully connected network for MNIST classification.
model_nn = tf.keras.Sequential()
model_nn.add(layers.Dense(512, activation='relu', input_shape=(784,)))
model_nn.add(layers.Dense(256, activation='relu'))
model_nn.add(layers.Dense(10, activation='softmax'))

# Export the architecture as JSON; this will be pasted into the nn_define field of the conf file.
print(model_nn.to_json())
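Before submitting the federated job, the same architecture can optionally be trained locally for a couple of epochs as a sanity check. This is a minimal sketch under assumptions not in the original workflow: pixel values are scaled to [0, 1] and labels are one-hot encoded, mirroring the encode_label option used later; the local run is not part of the FATE job.

# Optional local sanity check of the architecture, using one party's prepared CSV (assumption, not part of the FATE job).
import pandas as pd
from tensorflow.keras.utils import to_categorical

df_local = pd.read_csv("data/mnist_1_train.csv")
X_local = df_local.drop(columns=["idx", "y"]).values.astype("float32") / 255.0
y_local = to_categorical(df_local["y"].values, num_classes=10)

model_nn.compile(optimizer=keras.optimizers.Adam(learning_rate=0.0015),
                 loss="categorical_crossentropy",
                 metrics=["accuracy"])
model_nn.fit(X_local, y_local, epochs=2, batch_size=128, validation_split=0.1)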

3.2.2 Configure the DSL File (v2)

  • The example file at /examples/dsl/v2/homo_nn/test_homo_dnn_single_layer_dsl.json can be used as-is:
{
  "components": {
    "reader_0": {
      "module": "Reader",
      "output": {
        "data": ["data"]
      }
    },
    "dataio_0": {
      "module": "DataIO",
      "input": {
        "data": {
          "data": ["reader_0.data"]
        }
      },
      "output": {
        "data": ["data"],
        "model": ["model"]
      }
    },
    "homo_nn_0": {
      "module": "HomoNN",
      "input": {
        "data": {
          "train_data": ["dataio_0.data"]
        }
      },
      "output": {
        "data": ["data"],
        "model": ["model"]
      }
    }
  }
}

3.2.3 Configure the conf File

  • The example file is at /examples/dsl/v2/homo_nn/test_homo_dnn_single_layer_conf.json
  • Modifications:
  • Change the party_id of each role to match your own deployment
  • In job_parameters.common, set work_mode (0 for standalone, 1 for cluster)
  • In component_parameters.role, point each data source to the name and namespace defined when uploading the data
  • Copy the model defined in 3.2.1 into the nn_define field (see the snippet after this list)
  • Add "encode_label": true in homo_nn_0 to one-hot encode the labels
  • Change loss to categorical_crossentropy
  • Tune the hyperparameters
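To fill nn_define, the architecture JSON printed in 3.2.1 has to be embedded as a JSON object (not as a string). A minimal sketch of one way to produce it for pasting:

# Convert the Keras architecture JSON string into a JSON object for the nn_define field.
import json

nn_define = json.loads(model_nn.to_json())
print(json.dumps(nn_define))

The resulting conf file: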
{
    "dsl_version": 2,
    "initiator": {
        "role": "guest",
        "party_id": 10000
    },
    "role": {
        "arbiter": [10000],
        "host": [10000],
        "guest": [10000]
    },
    "job_parameters": {
        "common": {
            "work_mode": 0,
            "backend": 0
        }
    },
    "component_parameters": {
        "common": {
            "dataio_0": {
                "with_label": true
            },
            "homo_nn_0": {
                "encode_label":true,
                "max_iter": 20,
                "batch_size": -1,
                "early_stop": {
                    "early_stop": "diff",
                    "eps": 0.0001
                },
                "optimizer": {
                    "learning_rate": 0.0015,
                    "decay": 0.0,
                    "beta_1": 0.9,
                    "beta_2": 0.999,
                    "epsilon": 1e-07,
                    "amsgrad": false,
                    "optimizer": "Adam"
                },
                "loss": "categorical_crossentropy",
                "metrics": ["accuracy", "AUC"],
                 "nn_define": {"class_name": "Sequential", "config": {"name": "sequential", "layers": [{"class_name": "InputLayer", "config": {"batch_input_shape": [null, 784], "dtype": "float32", "sparse": false, "ragged": false, "name": "dense_input"}}, {"class_name": "Dense", "config": {"name": "dense", "trainable": true, "batch_input_shape": [null, 784], "dtype": "float32", "units": 512, "activation": "relu", "use_bias": true, "kernel_initializer": {"class_name": "GlorotUniform", "config": {"seed": null}}, "bias_initializer": {"class_name": "Zeros", "config": {}}, "kernel_regularizer": null, "bias_regularizer": null, "activity_regularizer": null, "kernel_constraint": null, "bias_constraint": null}}, {"class_name": "Dense", "config": {"name": "dense_1", "trainable": true, "dtype": "float32", "units": 256, "activation": "relu", "use_bias": true, "kernel_initializer": {"class_name": "GlorotUniform", "config": {"seed": null}}, "bias_initializer": {"class_name": "Zeros", "config": {}}, "kernel_regularizer": null, "bias_regularizer": null, "activity_regularizer": null, "kernel_constraint": null, "bias_constraint": null}}, {"class_name": "Dense", "config": {"name": "dense_2", "trainable": true, "dtype": "float32", "units": 10, "activation": "softmax", "use_bias": true, "kernel_initializer": {"class_name": "GlorotUniform", "config": {"seed": null}}, "bias_initializer": {"class_name": "Zeros", "config": {}}, "kernel_regularizer": null, "bias_regularizer": null, "activity_regularizer": null, "kernel_constraint": null, "bias_constraint": null}}]}, "keras_version": "2.4.0", "backend": "tensorflow"},
                "config_type": "keras"
            }
        },
        "role": {
            "host": {
                "0": {
                    "reader_0": {
                        "table": {
                            "name": "homo_host_mnist_train",
                            "namespace": "experiment"
                        }
                    },
                    "dataio_0": {
                        "with_label": true
                    }
                }
            },
            "guest": {
                "0": {
                    "reader_0": {
                        "table": {
                            "name": "homo_guest_mnist_train",
                            "namespace": "experiment"
                        }
                    },
                    "dataio_0": {
                        "with_label": true,
                        "output_format": "dense"
                    }
                }
            }
        }
    }
}

3.2.4 Submit the Job and Train the Model

$ flow job submit -c workspace/HFL_nn/test_homo_dnn_single_layer_conf.json -d workspace/HFL_nn/test_homo_dnn_single_layer_dsl.json

  • Check the logs to follow the training process; if training stops too early, adjust the parameters and retrain.

Original: https://blog.csdn.net/Sisyphus_98/article/details/122998129
Author: HarrisonWu42
Title: [Federated Learning with FATE in Practice] (3) MNIST Neural Network (Keras)
