t-SNE Visualization of Convolutional Layer Outputs


A question I get a lot: how do you extract the output of a convolutional layer so you can plot it, visualize it, or feed it into another network? I write my code in Spyder with Keras on TensorFlow, so I can't speak for other setups, but in Keras the features extracted after a convolutional layer have the shape samples × feature length × channels. If you simply print the tensor, each sample comes out as a stack of feature-length rows rather than a single curve, which makes direct plotting very awkward.
However, it is still possible to use this data for visualization and to connect it to other networks.
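As a minimal sketch (assuming a trained Keras Sequential model called `model` and an input array `X`; the helper name `layer_output` is mine), the extraction itself is one backend function, the same approach the full script below uses:

import keras

# Minimal sketch: pull out the output of layer `i` for every sample in X.
# `model` and `X` are assumed to exist already; layer indices depend on the model.
def layer_output(model, X, i):
    f = keras.backend.function([model.layers[0].input], [model.layers[i].output])
    return f([X])[0]   # a Conv1D layer yields shape (samples, length, channels)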

Take visualization as an example. I showed direct visualizations of the convolutional layers before, but no useful information could be read from those images, so let's try t-SNE on the same data instead. A convolutional layer outputs samples × feature length × channels, while t-SNE expects samples × features, so the tensor first has to be reshaped into samples × features, as in the sketch below.
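A minimal sketch of this reshape-plus-t-SNE step (the array shapes here are illustrative stand-ins, not the real dataset):

import numpy as np
from sklearn.manifold import TSNE

features = np.random.rand(100, 4, 64)           # stand-in for (samples, length, channels)
flat = features.reshape(features.shape[0], -1)  # -> (samples, length * channels)

# t-SNE expects (samples, features); embed into 2-D for plotting
embedded = TSNE(n_components=2, init='pca', random_state=0).fit_transform(flat)
print(embedded.shape)                           # (100, 2)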
Now the result figures. First, the network performs binary classification on the data, and we look at the t-SNE of each convolutional layer's output:

1. t-SNE of the raw data:

[Figure: t-SNE of the raw data]

2. The first convolutional layer, and likewise the later layers (not narrated one by one):

[Figures: t-SNE of each successive convolutional layer's output]
What can we conclude from these pictures? Not much, honestly; they only show that the two classes of data differ a lot. Moreover, a binary classification task doesn't really exercise a neural network's feature-extraction power at all, so it is no surprise that the training curves above converge very quickly, almost by the second epoch.

So let's increase the number of samples and study how a single convolutional layer's output looks under t-SNE at different numbers of training epochs:
1. No training:

[Figure: t-SNE, untrained network]

2. After 3 epochs:

[Figure: t-SNE after 3 epochs]

3. After 300 epochs:

[Figure: t-SNE after 300 epochs]

4. After 3000 epochs:

[Figure: t-SNE after 3000 epochs]
No need to belabor the comparison: 300 epochs clearly gives the best separation.
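To reproduce this comparison without retraining from scratch for every epoch count, one option is to train incrementally and visualize at each checkpoint. This is only a sketch: it assumes a compiled `model`, the training arrays, and the `visual` helper defined in the full script below, and it relies on the fact that calling `fit` again resumes training from the current weights:

checkpoints = [0, 3, 300, 3000]    # the epoch counts compared in the figures above
trained = 0
for target in checkpoints:
    if target > trained:
        # train only the additional epochs needed to reach this checkpoint
        model.fit(X_train, Y_train, epochs=target - trained, batch_size=128, verbose=0)
        trained = target
    visual(model, X_train, num_layer=3)   # one conv block's output; the index is model-specific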

The full code is below:


"""
Created on Wed Jul  7 11:55:08 2021

@author: 1
"""
import tensorflow as tf
from sklearn.manifold import TSNE
import numpy as np
import pandas as pd
import keras
from keras.models import Sequential
from keras.wrappers.scikit_learn import KerasClassifier
from keras.utils import np_utils,plot_model
from sklearn.model_selection import cross_val_score,train_test_split,KFold
from sklearn.preprocessing import LabelEncoder
from keras.models import model_from_json
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix
import itertools
from keras.optimizers import SGD
from keras.layers import Dense,LSTM, Activation, Flatten, Convolution1D, Dropout,MaxPooling1D,BatchNormalization
from keras.models import load_model
from sklearn import preprocessing

# Load the dataset: 1024 signal points per sample, label in the last column
df = pd.read_csv(r'C:/Users/1/Desktop/14改.csv')
X = np.expand_dims(df.values[:, 0:1024].astype(float), axis=2)  # (samples, 1024, 1)
Y = df.values[:, 1024]

# 70/30 train/test split
X_train, X_test, y_train_raw, y_test_raw = train_test_split(X, Y, test_size=0.3, random_state=0)

# Integer-encode the labels, then one-hot encode them;
# the encoder is fitted once on the training labels and reused for the test labels
encoder = LabelEncoder()
Y_train = np_utils.to_categorical(encoder.fit_transform(y_train_raw))
Y_test = np_utils.to_categorical(encoder.transform(y_test_raw))

def baseline_model():
    # Six Conv1D + MaxPooling + BatchNormalization blocks, then an LSTM head.
    # All BatchNormalization arguments in the original were Keras defaults.
    model = Sequential()
    model.add(Convolution1D(16, 64, strides=16, padding='same',
                            input_shape=(1024, 1), activation='relu'))
    model.add(MaxPooling1D(2, strides=2, padding='same'))
    model.add(BatchNormalization())

    for filters in (32, 64, 64, 64, 64):
        model.add(Convolution1D(filters, 3, padding='same', activation='relu'))
        model.add(MaxPooling1D(2, strides=2, padding='same'))
        model.add(BatchNormalization())

    model.add(Dense(100, activation='relu'))
    model.add(LSTM(64, return_sequences=True))
    model.add(Dropout(0.5))
    model.add(LSTM(32))
    model.add(Flatten())
    model.add(Dense(9, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.summary()
    return model

# Train with the scikit-learn wrapper; 3000 epochs matches the last experiment above
estimator = KerasClassifier(build_fn=baseline_model, epochs=3000, batch_size=128, verbose=1)
history = estimator.fit(X_train, Y_train, validation_data=(X_test, Y_test))

def visual(model, data, num_layer=1):
    # Grab the output of layer `num_layer` for every sample in `data`
    layer = keras.backend.function([model.layers[0].input], [model.layers[num_layer].output])
    f1 = layer([data])[0]                # (samples, length, channels)
    print(f1.shape)
    f2 = f1.reshape(f1.shape[0], -1)     # flatten to (samples, features) for t-SNE
    num = f1.shape[-1]                   # number of channels

    # Heat-map of each channel across all samples
    plt.figure(figsize=(6, 12), dpi=150)
    side = int(np.ceil(np.sqrt(num)))    # subplot() needs integer grid sizes
    for i in range(num):
        plt.subplot(side, side, i + 1)
        plt.imshow(f1[:, :, i] * 255, cmap='prism')
        plt.axis('off')
    plt.show()

    def plot_embedding(data, label, title):
        # Normalize the embedding to [0, 1] and draw each sample as its class label
        x_min, x_max = np.min(data, 0), np.max(data, 0)
        data = (data - x_min) / (x_max - x_min)
        fig = plt.figure()
        ax = plt.subplot(111)
        for i in range(data.shape[0]):
            plt.text(data[i, 0], data[i, 1], str(label[i]),
                     color=plt.cm.Set1(label[i] / 10),
                     fontdict={'weight': 'bold', 'size': 7})
        plt.xticks()
        plt.yticks()
        plt.title(title, fontsize=14)
        return fig

    # 2-D t-SNE embedding of the flattened features, colored by the training labels
    print('Starting to compute t-SNE embedding...')
    ts = TSNE(n_components=2, init='pca', random_state=0)
    result = ts.fit_transform(f2)
    fig = plot_embedding(result, y_train_raw, 't-SNE embedding of layer outputs')
    plt.show()

visual(estimator.model, X_train, 20)  # layer 20 sits after the first LSTM in this stack
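A note on the last line: `estimator.model` is the underlying Keras model that `KerasClassifier` builds during `fit`, and in this particular stack index 20 falls after the first LSTM block, where the sequence length has already collapsed to 1, so the flattened features come out as (samples, 64). If you change the architecture, recount the layer indices (for example via `model.summary()`) before picking `num_layer`.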

I originally wanted to turn this code into a core-journal paper as the innovation point, but it was rejected for lacking novelty, so I had to give up that attempt at something more ambitious.

I have also found that how hard it is to get accepted is not proportional to a journal's impact factor. This year several journals with impact factors close to 1 were dropped from the Peking University core list; for example, 《机械设计与研究》, with an impact factor above 1, is quite good. Yet journals with impact factors below 0.5, such as 《机械强度》 and 《制造自动化》, are extremely selective and hard to get into.

The data used in this post can be downloaded from the links below (valid indefinitely):
Link: https://pan.baidu.com/s/1jmOOKXFA27I6iGlvvMBiwg
Extraction code: SLBY
Link: https://pan.baidu.com/s/1plW9siresvTIvlNZsDA_tw
Extraction code: SLBY
The two downloads have the same file name but different contents.

Original: https://blog.csdn.net/qq_45714906/article/details/118568358
Author: 让我顺利毕业吧
Title: 卷积层TSNE可视化
