TensorFlow Model Quantization (1): Quantization Methods and Dynamic Range Quantization


1 Quantization Methods

The official TensorFlow documentation describes two types of quantization methods:

  1. Quantization-aware training
  2. Post-training quantization

As the names suggest, one method quantizes the model during training, and the other quantizes it after training.

1.1 Pros and Cons

  1. Post-training quantization: integrated into the TensorFlow Lite converter, so it is quick to iterate with and easy to use, but the loss in model accuracy is larger.
  2. Quantization-aware training: built on Keras (see the sketch below). It is harder to use and takes longer, since the model has to be retrained, but it preserves model accuracy better.
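For contrast only (the rest of this article sticks to post-training quantization), quantization-aware training is exposed through the tensorflow_model_optimization package. A minimal sketch, assuming a ready-to-train Keras model and MNIST-style arrays like those in the example later in this article:

import tensorflow_model_optimization as tfmot

# Insert fake-quantization nodes into an existing Keras model, then fine-tune
q_aware_model = tfmot.quantization.keras.quantize_model(model)
q_aware_model.compile(optimizer='adam',
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
q_aware_model.fit(train_images, train_labels, epochs=1, validation_split=0.1)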

2 Post-Training Quantization

Because post-training quantization is so easy to use, it is the natural place to start.

2.1 Methods

There are three post-training quantization techniques:

| Technique | Benefits | Hardware support |
| --- | --- | --- |
| Dynamic range quantization | 4x smaller, 2x-3x speedup | CPU |
| Full integer quantization | 4x smaller, 3x+ speedup | CPU, Edge TPU, Microcontrollers |
| Float16 quantization | 2x smaller, GPU acceleration | CPU, GPU |

The size reduction factors follow from the bit width of the data. For example, quantizing float32 values to int8 goes from 32 bits per value down to 8, so the model becomes about 4x smaller.
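As a quick sanity check of that arithmetic, compare the raw byte sizes of the same number of values in NumPy (an illustrative snippet, not part of the official docs):

import numpy as np

weights_fp32 = np.zeros(1000, dtype=np.float32)  # 32 bits per value
weights_int8 = np.zeros(1000, dtype=np.int8)     # 8 bits per value

print(weights_fp32.nbytes)  # 4000 bytes
print(weights_int8.nbytes)  # 1000 bytes, i.e. 4x smaller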

2.2 Selection Strategy

The figure below is the official TensorFlow decision diagram for choosing among these three techniques:

[Figure: official decision tree for selecting a post-training quantization technique, from the TensorFlow documentation]

2.3 Data Types

[Figure: data types used by each quantization technique, from the official TensorFlow documentation]
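For concreteness, the three techniques map onto TFLiteConverter settings roughly as follows. This is a minimal sketch based on the official post-training quantization docs; saved_model_dir is the path to your SavedModel, and representative_data_gen is a calibration-data generator you have to supply yourself:

import tensorflow as tf

# Dynamic range quantization: weights become int8, the interface stays float
converter_drq = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter_drq.optimizations = [tf.lite.Optimize.DEFAULT]

# Float16 quantization: weights become float16
converter_fp16 = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter_fp16.optimizations = [tf.lite.Optimize.DEFAULT]
converter_fp16.target_spec.supported_types = [tf.float16]

# Full integer quantization: a representative dataset is needed to
# calibrate the value ranges of the activation tensors
converter_int8 = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter_int8.optimizations = [tf.lite.Optimize.DEFAULT]
converter_int8.representative_dataset = representative_data_gen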

2.4 Dynamic Range Quantization

Note: the official site previously called the technique in this spot hybrid quantization (post-training "hybrid"), and a talk by a member of the TensorFlow team confirms that name (https://v.qq.com/x/page/h0927f3qzvg.html). The same spot in the docs now reads dynamic range quantization, so the two are presumably the same thing.

2.4.1 Data Type Changes

① At conversion time, the weight tensors are quantized to 8-bit integers, while the non-constant activation tensors are still represented in floating point.
② At inference time, the activation tensors are quantized dynamically: for ops that support quantized kernels, activations are quantized to 8-bit precision on the fly before processing and dequantized back to floating-point precision afterwards (see the sketch below).
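The NumPy sketch below illustrates the idea behind step ② with simple symmetric per-tensor quantization; TFLite's real quantized kernels are more sophisticated, so treat this purely as an illustration:

import numpy as np

def quantize_dynamic(x):
    # The scale is derived from the tensor's observed range at inference time
    scale = max(np.abs(x).max(), 1e-8) / 127.0
    q = np.clip(np.round(x / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

activations = np.random.randn(5).astype(np.float32)
q, scale = quantize_dynamic(activations)
print(activations)
print(dequantize(q, scale))  # close to the input, up to quantization error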

2.4.2 Usage

Read in the saved model with the TensorFlow Lite converter and configure its default optimization.

The official method:

import tensorflow as tf

# saved_model_dir is the path to a SavedModel; the default optimization
# enables dynamic range quantization
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quant_model = converter.convert()

The official example code is at https://www.tensorflow.org/lite/performance/post_training_quant.
To get a better feel for the method, let's walk through an example that uses the MNIST dataset.

import tensorflow as tf
from tensorflow import keras

# Load MNIST and normalize pixel values to [0, 1]
mnist = keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

train_images = train_images / 255.0
test_images = test_images / 255.0

# A small CNN classifier for 28x28 grayscale digits
model = keras.Sequential([
  keras.layers.InputLayer(input_shape=(28, 28)),
  keras.layers.Reshape(target_shape=(28, 28, 1)),
  keras.layers.Conv2D(filters=12, kernel_size=(3, 3), activation=tf.nn.relu),
  keras.layers.MaxPooling2D(pool_size=(2, 2)),
  keras.layers.Flatten(),
  keras.layers.Dense(10, activation=tf.nn.softmax)
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(
  train_images, train_labels,
  epochs=1, validation_split=0.1,
)
1688/1688 [==============================] - 12s 7ms/step - loss: 0.5289 - accuracy: 0.8666 - val_loss: 0.1034 - val_accuracy: 0.9710
<tensorflow.python.keras.callbacks.History at 0x7f435d13eb50>
# Baseline: convert the Keras model to TFLite without quantization
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
tflite_name = "tflite_model"
open(tflite_name, "wb").write(tflite_model)
84760

# Dynamic range quantization: enable the default optimization before converting
converter_quant = tf.lite.TFLiteConverter.from_keras_model(model)
converter_quant.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model_quant = converter_quant.convert()

DRQ_name = "quantify_DRQ.tflite"
open(DRQ_name, "wb").write(tflite_model_quant)
24176

The output above shows that the model size dropped from 84760 bytes to 24176 bytes, roughly 1/4 of the original.
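The sizes above are just the return values of write(); checking the files on disk directly tells the same story (an illustrative snippet reusing the file names defined above):

import os

print(os.path.getsize(tflite_name))  # 84760
print(os.path.getsize(DRQ_name))     # 24176
print(os.path.getsize(tflite_name) / os.path.getsize(DRQ_name))  # ~3.5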

We can inspect the quantized model's input and output details with a TFLite interpreter:

interpreter = tf.lite.Interpreter(model_path=DRQ_name)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

print(input_details)
print(output_details)
[{'name': 'input_1', 'index': 0, 'shape': array([ 1, 28, 28], dtype=int32), 'shape_signature': array([-1, 28, 28], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
[{'name': 'Identity', 'index': 18, 'shape': array([ 1, 10], dtype=int32), 'shape_signature': array([-1, 10], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]

Both tensors report dtype numpy.float32: dynamic range quantization stores the weights as int8, but the model's inputs and outputs remain floating point.

import numpy as np

def evaluate(interpreter_path):
    # Load the .tflite model and allocate its tensors
    interpreter = tf.lite.Interpreter(model_path=interpreter_path)
    interpreter.allocate_tensors()

    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    index = input_details[0]['index']
    shape = input_details[0]['shape']
    acc_count = 0
    image_count = test_images.shape[0]
    for i in range(image_count):
        # The interface is float32 even for the quantized model
        interpreter.set_tensor(index, test_images[i].reshape(shape).astype("float32"))
        interpreter.invoke()
        output_data = interpreter.get_tensor(output_details[0]['index'])
        label = np.argmax(output_data)
        if label == test_labels[i]:
            acc_count += 1
    print("test_images accuracy is {:.2%}".format(acc_count / image_count))

evaluate(DRQ_name)
evaluate(tflite_name)
test_images accuracy is 97.03%
test_images accuracy is 97.02%
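Accuracy is only one axis; to check the claimed CPU speedup as well, a rough timing loop along these lines can be used (the benchmark helper is hypothetical, not part of the original code, and real numbers vary by machine):

import time

def benchmark(interpreter_path, runs=100):
    interpreter = tf.lite.Interpreter(model_path=interpreter_path)
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()
    sample = test_images[0].reshape(input_details[0]['shape']).astype("float32")
    interpreter.set_tensor(input_details[0]['index'], sample)
    start = time.perf_counter()
    for _ in range(runs):
        interpreter.invoke()
    return (time.perf_counter() - start) / runs  # average seconds per inference

print(benchmark(tflite_name))  # float32 baseline
print(benchmark(DRQ_name))     # dynamic range quantized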

Putting it together: dynamic range quantization shrank the model to about 1/4 of its original size, while test accuracy changed by only 0.01 percentage points (97.03% for the quantized model vs. 97.02% for the float baseline; the quantized model happens to score marginally higher here, a difference well within noise).


Original: https://blog.csdn.net/weixin_43490422/article/details/114961890
Author: little student
Title: tensorflow模型量化篇(1)量化方法及动态范围量化
