Time-Series Prediction with LSTM (Stock Forecasting)

Table of Contents

1. About the Author

Mi Hongmin, male, School of Electronic Information, Xi'an Polytechnic University, Class of 2019, member of Zhang Hongwei's artificial intelligence research group.

Research interests: machine vision and artificial intelligence.

Email: 1353197091@qq.com

2. Introduction to Tushare

Tushare is a free, open-source Python financial data interface package. It covers the full pipeline for stock and other financial data, from collection and cleaning to storage, providing financial analysts with fast, clean, and varied data that is easy to analyze. It greatly reduces their workload on data acquisition and lets them focus on researching and implementing strategies and models. Given the advantages of the pandas package in quantitative financial analysis, most of the data Tushare returns is in the pandas DataFrame format, which makes analysis and visualization with pandas, NumPy, and Matplotlib very convenient.
A more detailed introduction is available at http://tushare.org/index.html#id3

A visualization of some of the data downloaded through the Tushare API is shown in the figure:


3. Introduction to LSTM

3.1 Recurrent Neural Networks (RNNs)

People do not think about a problem from scratch every time. As you read this article, for example, you understand each sentence in light of the information you have already absorbed; you do not forget everything you have read and reason about the current sentence from a blank slate.

Traditional neural networks cannot do this, which is a drawback when predicting sequential information such as speech. For example, to classify the event taking place in each segment of a movie, a traditional neural network has difficulty using information about earlier events to classify later ones.

A recurrent neural network (RNN) solves this problem by continuously looping information through the network, which allows the information to persist. An RNN is shown in the figure below (the original text comes from here):

As the figure shows, A is a chunk of neural network (which can be understood as a network with a self-loop); its job is to repeatedly receive an input $x_t$ and emit an output $h_t$. The figure shows that A lets information loop continuously inside the network, which ensures that information from earlier steps is preserved at every step of the computation.

To understand this better, the RNN's self-loop can be unrolled: imagine copying the same network many times and connecting the copies into a chain, with each copy passing the information it has extracted to its successor, as shown in the figure below.

Such a chain of networks is exactly what a recurrent neural network represents: it can be thought of as multiple copies of the same network, with the network at each time step passing information on to the next.
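The unrolled recurrence above can be sketched in a few lines of PyTorch. This is a minimal illustration, not part of the article's model; all sizes and weights are arbitrary assumptions.

```python
import torch

# A minimal sketch of the RNN recurrence: the same cell (the same weights)
# is applied at every time step, and the hidden state h carries information
# forward from one step to the next.
torch.manual_seed(0)

input_size, hidden_size, seq_len = 3, 4, 5
W_xh = torch.randn(input_size, hidden_size) * 0.1   # input-to-hidden weights
W_hh = torch.randn(hidden_size, hidden_size) * 0.1  # hidden-to-hidden weights
b_h = torch.zeros(hidden_size)

xs = torch.randn(seq_len, input_size)  # one sequence of 5 inputs
h = torch.zeros(hidden_size)           # initial hidden state

outputs = []
for x_t in xs:                         # unrolling the self-loop in time
    h = torch.tanh(x_t @ W_xh + h @ W_hh + b_h)  # h_t depends on x_t and h_{t-1}
    outputs.append(h)

print(len(outputs), outputs[-1].shape)  # one hidden state per time step
```

Because the same `W_xh` and `W_hh` are reused at every step, the chain really is "multiple copies of the same network", as described above.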

Because of this memory, recurrent neural networks can be used to tackle problems such as speech recognition, language modeling, and machine translation. However, they do not handle the problem of long-term dependencies well.

Long-term dependency is the following problem: when the point to be predicted is far away from the relevant information, the network has difficulty learning that information. For example, in the sentence "I was born in France; I speak French", predicting the final word "French" requires the earlier context "France". In theory a recurrent neural network can handle this, but in practice a vanilla RNN copes poorly with long-term dependencies. The good news is that LSTMs solve this problem well.

3.2 LSTM Networks

Long Short-Term Memory networks (LSTMs) are a special kind of RNN designed to solve the long-term dependency problem. They were introduced by Hochreiter & Schmidhuber (1997) and have since been refined and popularized by many researchers. LSTMs have been applied to a wide variety of problems and remain widely used today.

All recurrent neural networks have the form of a chain of repeating neural-network modules. In a standard RNN the repeating module has a very simple structure, such as a single tanh layer. A standard RNN is shown in the figure below.

An LSTM has the same chain structure, but its repeating unit is different: whereas the unit in a standard RNN contains a single network layer, the LSTM unit contains four. The structure of an LSTM is shown in the figure below.
Before explaining the detailed structure of the LSTM, let us define the meaning of each symbol in the diagram. The symbols are the following:

The yellow boxes in the figure are network layers with activation functions, similar to those in a CNN. The pink circles denote pointwise operations; a single arrow denotes data flow; merging arrows denote vector concatenation; and a forking arrow denotes copying a vector.

3.2.1 The Core Idea of LSTM

The core of the LSTM is the cell state, represented by the horizontal line running through the cell.

The cell state acts like a conveyor belt: it runs straight through the whole cell with only a few branches, which ensures that information can flow unchanged through the entire RNN. The cell state is shown in the figure below:

The LSTM can remove information from, or add information to, the cell state through structures called gates.

Gates selectively decide which information is let through. A gate's structure is in fact very simple: it is a sigmoid layer combined with a pointwise multiplication, as shown in the figure below:

Since the sigmoid layer outputs values between 0 and 1, its output describes how much of each component is allowed through: 0 means "let nothing through" and 1 means "let everything through".
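A gate can be sketched in a couple of lines; the values below are purely illustrative.

```python
import torch

# A minimal sketch of a gate: a sigmoid layer produces values in (0, 1)
# that scale each component of a vector elementwise.
gate_logits = torch.tensor([-10.0, 0.0, 10.0])
gate = torch.sigmoid(gate_logits)           # ≈ [0.0, 0.5, 1.0]
information = torch.tensor([2.0, 2.0, 2.0])
passed = gate * information                 # pointwise multiplication
print(passed)                               # ≈ [0.0, 1.0, 2.0]: block, halve, pass
```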

An LSTM contains three such gates to control the cell state.

3.2.2 Understanding LSTM Step by Step

As mentioned above, an LSTM uses three gates to control the cell state, called the forget gate, the input gate, and the output gate. Let us go through them one by one.

The first step of the LSTM is to decide which information to discard from the cell state. This is handled by a sigmoid unit called the forget gate. It looks at $h_{t-1}$ and $x_t$ and outputs a vector of values between 0 and 1 indicating how much of each component of the cell state $C_{t-1}$ to keep: 0 means discard entirely, 1 means keep entirely. The forget gate is shown in the figure below:

The next step is to decide which new information to add to the cell state. This is done in two parts. First, $h_{t-1}$ and $x_t$ pass through a sigmoid layer called the input gate, which decides which values to update. Then $h_{t-1}$ and $x_t$ pass through a tanh layer to produce a vector of new candidate cell values $\tilde{C}_t$ that may be added to the state. These two steps are illustrated in the figure below:
Next, the old cell state $C_{t-1}$ is updated to the new cell state $C_t$. The update rule is: use the forget gate to drop part of the old cell state, and use the input gate to add part of the candidate values $\tilde{C}_t$; the result is the new cell state $C_t$. The update is shown in the figure below:
After the cell state has been updated, the LSTM decides which features of the state to output, based on $h_{t-1}$ and $x_t$. The input passes through a sigmoid layer called the output gate to obtain a filter; the cell state passes through a tanh layer to obtain a vector of values between -1 and 1; multiplying this vector by the output-gate filter gives the final output of the RNN unit. This step is shown in the figure below:
Taking a language model as an example again: when predicting the form of a verb, the network must infer whether the output verb should be singular or plural from whether the input subject was singular or plural.
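The three gates and the state update described above can be sketched as one manual LSTM step. This is an illustrative reimplementation under assumed toy sizes; in practice `torch.nn.LSTMCell` implements the same computation.

```python
import torch

# One LSTM step, following the gate equations described above.
# All sizes and weights here are illustrative assumptions.
torch.manual_seed(0)

n_in, n_hid = 3, 4

def linear(n_out):
    # one weight matrix acting on the concatenation [h_{t-1}, x_t], plus bias
    W = torch.randn(n_in + n_hid, n_out) * 0.1
    b = torch.zeros(n_out)
    return lambda v: v @ W + b

forget_gate, input_gate, candidate, output_gate = (
    linear(n_hid), linear(n_hid), linear(n_hid), linear(n_hid))

x_t = torch.randn(n_in)
h_prev = torch.zeros(n_hid)
C_prev = torch.zeros(n_hid)
hx = torch.cat([h_prev, x_t])                 # [h_{t-1}, x_t]

f_t = torch.sigmoid(forget_gate(hx))          # forget gate: what to keep of C_{t-1}
i_t = torch.sigmoid(input_gate(hx))           # input gate: what to update
C_tilde = torch.tanh(candidate(hx))           # candidate cell values
C_t = f_t * C_prev + i_t * C_tilde            # new cell state
o_t = torch.sigmoid(output_gate(hx))          # output gate: what to emit
h_t = o_t * torch.tanh(C_t)                   # new hidden state

print(h_t.shape, C_t.shape)
```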

4. Code Implementation

4.1 Importing the Required Packages

In addition to the deep learning framework PyTorch, you also need to install matplotlib, numpy, pandas, tushare, and a few other libraries. Installation is straightforward: just run pip3 install xxx (the library name).

import matplotlib.pyplot as plt
import numpy as np
import tushare as ts
import torch
from torch import nn
import datetime
import time

DAYS_FOR_TRAIN = 10

4.2 Defining the Model

class LSTM_Regression(nn.Module):
"""
        使用LSTM进行回归

        参数:
        - input_size: feature size
        - hidden_size: number of hidden units
        - output_size: number of output
        - num_layers: layers of LSTM to stack
"""
    def __init__(self, input_size, hidden_size, output_size=1, num_layers=2):
        super().__init__()

        self.lstm = nn.LSTM(input_size, hidden_size, num_layers)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, _x):
        x, _ = self.lstm(_x)        # x: (seq_len, batch, hidden_size)
        s, b, h = x.shape
        x = x.view(s * b, h)        # flatten so the linear layer sees 2-D input
        x = self.fc(x)
        x = x.view(s, b, -1)        # restore (seq_len, batch, output_size)
        return x
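As a quick sanity check of the model above (the class is repeated here so the snippet runs on its own; sizes are illustrative): `nn.LSTM` expects input shaped `(seq_len, batch, input_size)` by default, which is why the training code later reshapes each sample to `(-1, 1, DAYS_FOR_TRAIN)`.

```python
import torch
from torch import nn

class LSTM_Regression(nn.Module):
    def __init__(self, input_size, hidden_size, output_size=1, num_layers=2):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, _x):
        x, _ = self.lstm(_x)
        s, b, h = x.shape
        x = self.fc(x.view(s * b, h))
        return x.view(s, b, -1)

# A batch of 7 windows, batch size 1, 10 features per window: (7, 1, 10).
model = LSTM_Regression(input_size=10, hidden_size=8)
x = torch.randn(7, 1, 10)
y = model(x)
print(y.shape)  # torch.Size([7, 1, 1])
```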

4.3 Building the Dataset

def create_dataset(data, days_for_train=5) -> (np.array, np.array):
"""
        根据给定的序列data,生成数据集

        数据集分为输入和输出,每一个输入的长度为days_for_train,每一个输出的长度为1。
        也就是说用days_for_train天的数据,对应下一天的数据。

        若给定序列的长度为d,将输出长度为(d-days_for_train+1)个输入/输出对
"""
    dataset_x, dataset_y= [], []
    for i in range(len(data)-days_for_train):
        _x = data[i:(i+days_for_train)]
        dataset_x.append(_x)
        dataset_y.append(data[i+days_for_train])
    return (np.array(dataset_x), np.array(dataset_y))
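A quick demonstration of `create_dataset` on a toy series (the function is repeated so the snippet runs on its own):

```python
import numpy as np

def create_dataset(data, days_for_train=5):
    # Sliding windows of days_for_train values, each paired with the next value.
    dataset_x, dataset_y = [], []
    for i in range(len(data) - days_for_train):
        dataset_x.append(data[i:i + days_for_train])
        dataset_y.append(data[i + days_for_train])
    return np.array(dataset_x), np.array(dataset_y)

data = np.arange(8, dtype='float32')       # toy series: 0, 1, ..., 7
x, y = create_dataset(data, days_for_train=5)
print(x.shape, y.shape)   # (3, 5) (3,)
print(x[0], y[0])         # [0. 1. 2. 3. 4.] 5.0
```

A series of length 8 yields 8 - 5 = 3 input/output pairs, matching the count in the docstring above.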

4.4 Training the Model

You can choose the time range of data to download according to your needs; in the code below, for example, the start date is January 1, 2019. The lengths of the training and test sets can also be adjusted freely.

if __name__ == '__main__':
    t0 = time.time()
    data_close = ts.get_k_data('000001', start='2019-01-01', index=True)['close'].values
    data_close = data_close.astype('float32')
    plt.plot(data_close)
    plt.savefig('data.png', format='png', dpi=200)
    plt.close()

    max_value = np.max(data_close)
    min_value = np.min(data_close)
    data_close = (data_close - min_value) / (max_value - min_value)

    dataset_x, dataset_y = create_dataset(data_close, DAYS_FOR_TRAIN)

    train_size = int(len(dataset_x) * 0.7)

    train_x = dataset_x[:train_size]
    train_y = dataset_y[:train_size]

    train_x = train_x.reshape(-1, 1, DAYS_FOR_TRAIN)
    train_y = train_y.reshape(-1, 1, 1)

    train_x = torch.from_numpy(train_x)
    train_y = torch.from_numpy(train_y)

    model = LSTM_Regression(DAYS_FOR_TRAIN, 8, output_size=1, num_layers=2)

    loss_function = nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

    for i in range(1000):
        out = model(train_x)
        loss = loss_function(out, train_y)

        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        with open('log.txt', 'a+') as f:
            f.write('{} - {}\n'.format(i+1, loss.item()))
        if (i+1) % 1 == 0:
            print('Epoch: {}, Loss:{:.5f}'.format(i+1, loss.item()))

4.5 Testing and Saving the Results

    model = model.eval()

    dataset_x = dataset_x.reshape(-1, 1, DAYS_FOR_TRAIN)
    dataset_x = torch.from_numpy(dataset_x)

    pred_test = model(dataset_x)
    pred_test = pred_test.view(-1).data.numpy()
    pred_test = np.concatenate((np.zeros(DAYS_FOR_TRAIN), pred_test))
    assert len(pred_test) == len(data_close)

    plt.plot(pred_test, 'r', label='prediction')
    plt.plot(data_close, 'b', label='real')
    plt.plot((train_size, train_size), (0, 1), 'g--')
    plt.legend(loc='best')
    plt.savefig('result.png', format='png', dpi=200)
    plt.close()
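Note that the predictions above are in min-max-normalized units. To recover actual closing prices, one would invert the scaling with the same `min_value` and `max_value` computed during training; the values below are hypothetical, for illustration only.

```python
import numpy as np

# Invert the min-max normalization applied before training.
# min_value and max_value here are hypothetical stand-ins for the
# actual minimum and maximum of the downloaded closing-price series.
min_value, max_value = 2440.91, 3288.45
pred_norm = np.array([0.0, 0.5, 1.0], dtype='float32')
pred_price = pred_norm * (max_value - min_value) + min_value
print(pred_price)  # ≈ [2440.91 2864.68 3288.45]
```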

4.6 Experimental Results

After training finishes, the terminal shows output like the following:

The experimental results are saved as .png files in the code's root directory, as shown below (the line colors can be chosen according to personal preference).


5. Complete Code


import matplotlib.pyplot as plt
import numpy as np
import tushare as ts
import pandas as pd
import torch
from torch import nn
import datetime
import time

DAYS_FOR_TRAIN = 10

class LSTM_Regression(nn.Module):
"""
        使用LSTM进行回归

        参数:
        - input_size: feature size
        - hidden_size: number of hidden units
        - output_size: number of output
        - num_layers: layers of LSTM to stack
"""
    def __init__(self, input_size, hidden_size, output_size=1, num_layers=2):
        super().__init__()

        self.lstm = nn.LSTM(input_size, hidden_size, num_layers)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, _x):
        x, _ = self.lstm(_x)
        s, b, h = x.shape
        x = x.view(s*b, h)
        x = self.fc(x)
        x = x.view(s, b, -1)
        return x

def create_dataset(data, days_for_train=5) -> (np.array, np.array):
"""
        根据给定的序列data,生成数据集

        数据集分为输入和输出,每一个输入的长度为days_for_train,每一个输出的长度为1。
        也就是说用days_for_train天的数据,对应下一天的数据。

        若给定序列的长度为d,将输出长度为(d-days_for_train+1)个输入/输出对
"""
    dataset_x, dataset_y= [], []
    for i in range(len(data)-days_for_train):
        _x = data[i:(i+days_for_train)]
        dataset_x.append(_x)
        dataset_y.append(data[i+days_for_train])
    return (np.array(dataset_x), np.array(dataset_y))

if __name__ == '__main__':
    t0 = time.time()

    data_close = pd.read_csv('000001.csv')['close']  # assumes the file stores the 'close' series downloaded earlier

    data_close = data_close.astype('float32').values
    plt.plot(data_close)
    plt.savefig('data.png', format='png', dpi=200)
    plt.close()

    max_value = np.max(data_close)
    min_value = np.min(data_close)
    data_close = (data_close - min_value) / (max_value - min_value)

    dataset_x, dataset_y = create_dataset(data_close, DAYS_FOR_TRAIN)

    train_size = int(len(dataset_x) * 0.7)

    train_x = dataset_x[:train_size]
    train_y = dataset_y[:train_size]

    train_x = train_x.reshape(-1, 1, DAYS_FOR_TRAIN)
    train_y = train_y.reshape(-1, 1, 1)

    train_x = torch.from_numpy(train_x)
    train_y = torch.from_numpy(train_y)

    model = LSTM_Regression(DAYS_FOR_TRAIN, 8, output_size=1, num_layers=2)

    model_total = sum([param.nelement() for param in model.parameters()])
    print("Number of model_total parameter: %.8fM" % (model_total/1e6))

    train_loss = []
    loss_function = nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-2, betas=(0.9, 0.999), eps=1e-08, weight_decay=0)
    for i in range(200):
        out = model(train_x)
        loss = loss_function(out, train_y)
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        train_loss.append(loss.item())

        with open('log.txt', 'a+') as f:
            f.write('{} - {}\n'.format(i+1, loss.item()))
        if (i+1) % 1 == 0:
            print('Epoch: {}, Loss:{:.5f}'.format(i+1, loss.item()))

    plt.figure()
    plt.plot(train_loss, 'b', label='loss')
    plt.title("Train_Loss_Curve")
    plt.ylabel('train_loss')
    plt.xlabel('epoch_num')
    plt.savefig('loss.png', format='png', dpi=200)
    plt.close()

    t1 = time.time()
    T = t1 - t0
    print('The training time took %.2f' % (T / 60) + ' mins.')

    tt0 = time.asctime(time.localtime(t0))
    tt1 = time.asctime(time.localtime(t1))
    print('The starting time was ', tt0)
    print('The finishing time was ', tt1)

    model = model.eval()

    dataset_x = dataset_x.reshape(-1, 1, DAYS_FOR_TRAIN)
    dataset_x = torch.from_numpy(dataset_x)

    pred_test = model(dataset_x)

    pred_test = pred_test.view(-1).data.numpy()
    pred_test = np.concatenate((np.zeros(DAYS_FOR_TRAIN), pred_test))
    assert len(pred_test) == len(data_close)

    plt.plot(pred_test, 'r', label='prediction')
    plt.plot(data_close, 'b', label='real')
    plt.plot((train_size, train_size), (0, 1), 'g--')
    plt.legend(loc='best')
    plt.savefig('result.png', format='png', dpi=200)
    plt.close()

Original: https://blog.csdn.net/m0_37758063/article/details/117995469
Author: ZHW_AI课题组
Title: Time-Series Prediction with LSTM (Stock Forecasting)
