
Neural Networks for Algorithmic Trading — Correct Time Series Forecasting and Backtesting

I want to show different ways to normalize the data and to discuss overfitting in more detail.

In [1]:
import matplotlib.pylab as plt
import seaborn as sns
sns.despine()

from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.normalization import BatchNormalization
from keras.layers import Merge
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, CSVLogger, EarlyStopping
from keras.optimizers import RMSprop, Adam, SGD, Nadam
from keras.layers.advanced_activations import *
from keras.layers import Convolution1D, MaxPooling1D, AtrousConvolution1D
from keras.layers.recurrent import LSTM, GRU
from keras import regularizers

import theano
theano.config.compute_test_value = "ignore"

from sklearn.cross_validation import train_test_split

import pandas as pd
import numpy as np
Using TensorFlow backend.
/opt/conda/lib/python3.5/site-packages/sklearn/cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
  "This module will be removed in 0.20.", DeprecationWarning)

Data Preparation

Let's look at the time series of Kweichow Moutai (600519.SHA) closing prices from 2010 onwards.

The biggest problem in financial time-series analysis is that the series are not stationary: their statistical properties (mean, variance, extreme values) change over time, which can be checked with the Augmented Dickey–Fuller test. Because of this, we cannot use classic normalization methods such as min-max scaling.
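
For illustration (this check is not part of the original notebook), the Augmented Dickey–Fuller test is available in statsmodels. A minimal sketch, assuming `prices` holds a 1-D closing-price series like the one loaded below:

from statsmodels.tsa.stattools import adfuller

# Null hypothesis of the ADF test: the series has a unit root (non-stationary).
adf_stat, p_value = adfuller(prices)[:2]
print('ADF statistic: %.3f, p-value: %.3f' % (adf_stat, p_value))
# A large p-value (e.g. > 0.05) means we cannot reject non-stationarity.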

In our case I cheat a little by framing this as a classification problem. We do not need to predict the exact price, so the future expected value and variance are not that important to us — we only need to predict the direction of the move, up or down. That is why we can risk normalizing each 30-day time window by its own mean and standard deviation (z-score normalization): we assume these statistics do not change much within a single window, and they touch no information from the future.

X = [(np.array(x) - np.mean(x)) / np.std(x) for x in X]
In [2]:
def shuffle_in_unison(a, b):
    # courtesy http://stackoverflow.com/users/190280/josh-bleecher-snyder
    assert len(a) == len(b)
    shuffled_a = np.empty(a.shape, dtype=a.dtype)
    shuffled_b = np.empty(b.shape, dtype=b.dtype)
    permutation = np.random.permutation(len(a))
    for old_index, new_index in enumerate(permutation):
        shuffled_a[new_index] = a[old_index]
        shuffled_b[new_index] = b[old_index]
    return shuffled_a, shuffled_b

# Use the first 90% of the data for training and the last 10% for testing
def create_Xt_Yt(X, y, percentage=0.9):
    p = int(len(X) * percentage)
    X_train = X[0:p]
    Y_train = y[0:p]
     
    X_train, Y_train = shuffle_in_unison(X_train, Y_train)
 
    X_test = X[p:]
    Y_test = y[p:]

    return X_train, X_test, Y_train, Y_test
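
A side note on shuffle_in_unison: the element-by-element copy works, but NumPy fancy indexing does the same thing in one step by applying a single shared permutation to both arrays. A minimal equivalent sketch (assuming a and b are NumPy arrays of equal length):

def shuffle_in_unison_fast(a, b):
    # One shared permutation keeps the rows of a and b aligned.
    assert len(a) == len(b)
    idx = np.random.permutation(len(a))
    return a[idx], b[idx]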
In [3]:
# Training period start and end dates
start_date='2010-01-01'
end_date='2016-02-01'
#inst = D.instruments(start_date, end_date, market='CN_STOCK_A')
#print(inst)
instruments = ['600519.SHA']
hist = D.history_data(instruments, start_date, end_date, fields=['close'])

print(hist.head())

plt.plot(hist['date'], hist['close'])
plt.show()

# Extract the close-price column as a plain list (.ix is the legacy pandas indexer)
hist = hist.ix[:, 'close'].tolist()

WINDOW = 30
EMB_SIZE = 1
STEP = 1
FORECAST = 1

# Straightforward way for creating time windows
X, Y = [], []
for i in range(0, len(hist), STEP): 
    try:
        x_i = hist[i:i+WINDOW]
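        # the target close sits FORECAST+1 steps after the last element of the window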
        y_i = hist[i+WINDOW+FORECAST]  

        last_close = x_i[WINDOW-1]
        next_close = y_i

        if last_close < next_close:
            y_i = [1, 0]
        else:
            y_i = [0, 1] 

    except Exception as e:
        print(e)
        break

    X.append(x_i)
    Y.append(y_i)

X = [(np.array(x) - np.mean(x)) / np.std(x) for x in X] # comment it to remove normalization
X, Y = np.array(X), np.array(Y)

#X_train, X_test, Y_train, Y_test = create_Xt_Yt(X, Y)
# Note: train_test_split shuffles windows randomly across time, so the test set
# is not strictly "future" data; create_Xt_Yt above keeps the last 10% in time order.
X_train, X_test, Y_train, Y_test = train_test_split(X, Y)
        date       close  instrument
0 2010-01-04  782.114868  600519.SHA
1 2010-01-05  779.813721  600519.SHA
2 2010-01-06  767.479553  600519.SHA
3 2010-01-07  753.488586  600519.SHA
4 2010-01-08  745.572571  600519.SHA
list index out of range
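
One quick check worth adding here (my addition, not in the original run): the class balance of the labels. With up/down roughly balanced, a validation accuracy near 50% in the training below means the model is doing little better than chance:

# Y rows are one-hot: [1, 0] = up, [0, 1] = down.
print('up: %.3f, down: %.3f' % (Y[:, 0].mean(), Y[:, 1].mean()))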

Neural Network Architecture

As mentioned before, in this article I will only use a multilayer perceptron (MLP), because I want to show how easily neural networks overfit on financial data and how to prevent it. Extending these ideas to convolutional neural networks (CNN) and recurrent neural networks (RNN) is relatively easy; understanding the concepts is what matters. As before, we use Keras as the main framework for building the model.

I recommend using batch normalization (BatchNormalization) after every affine or convolutional layer, with ReLU as the activation function, since this has become something of an "industry standard": both help the network train faster. Another useful trick is lowering the learning rate during training, which Keras supports via the ReduceLROnPlateau() callback.

We use L2 regularization and Dropout to prevent overfitting.

In [4]:
model = Sequential()
model.add(Dense(64, input_dim=WINDOW,
                activity_regularizer=regularizers.l2(0.01)))
model.add(BatchNormalization())
model.add(LeakyReLU())
# Dropout to prevent overfitting
model.add(Dropout(0.5))
model.add(Dense(16, activity_regularizer=regularizers.l2(0.01)))
model.add(BatchNormalization())
model.add(LeakyReLU())
model.add(Dense(2))
model.add(Activation('softmax'))

opt = Nadam(lr=0.001)

reduce_lr = ReduceLROnPlateau(monitor='val_acc', factor=0.9, patience=25, min_lr=0.000001, verbose=1)
checkpointer = ModelCheckpoint(filepath="test.hdf5", verbose=1, save_best_only=True)
early_stopper = EarlyStopping(monitor='val_loss', min_delta=0.05, patience=20, verbose=0, mode='auto')
model.compile(optimizer=opt, 
              loss='categorical_crossentropy',
              metrics=['accuracy'])
In [5]:
# Train model
history = model.fit(X_train, Y_train, nb_epoch = 100, batch_size = 128, verbose=1, validation_data=(X_test, Y_test), shuffle=True, callbacks=[reduce_lr, checkpointer, early_stopper])
Train on 1084 samples, validate on 362 samples
Epoch 1/100
Epoch 00000: val_loss improved from inf to 50.72495, saving model to test.hdf5
1084/1084 [==============================] - 0s - loss: 81.5286 - acc: 0.4963 - val_loss: 50.7249 - val_acc: 0.5470
Epoch 2/100
1084/1084 [==============================] - 0s - loss: 68.7040 - acc: 0.5092 - val_loss: 41.2449 - val_acc: 0.5249
Epoch 3/100
1084/1084 [==============================] - 0s - loss: 57.5401 - acc: 0.4815 - val_loss: 33.9771 - val_acc: 0.5055
... [epochs 4-26 omitted: training loss falls steadily from ~50 to ~8.7 while val_acc stays between 0.46 and 0.50; val_loss improves and the model is checkpointed every epoch] ...
Epoch 27/100
Epoch 00026: reducing learning rate to 0.0009000000427477062.
1084/1084 [==============================] - 0s - loss: 8.1596 - acc: 0.5129 - val_loss: 5.4754 - val_acc: 0.4613
... [epochs 28-51 omitted: loss falls from ~7.8 to ~3.4; val_acc hovers around 0.46] ...
Epoch 52/100
Epoch 00051: reducing learning rate to 0.0008100000384729356.
1084/1084 [==============================] - 0s - loss: 3.2860 - acc: 0.5101 - val_loss: 2.7678 - val_acc: 0.4613
... [epochs 53-76 omitted: loss falls from ~3.2 to ~1.9; val_acc stays between 0.45 and 0.48] ...
Epoch 77/100
Epoch 00076: reducing learning rate to 0.0007290000503417104.
1084/1084 [==============================] - 0s - loss: 1.8805 - acc: 0.5341 - val_loss: 1.7913 - val_acc: 0.4862
... [epochs 78-99 omitted: loss falls from ~1.84 to ~1.37; val_acc stays between 0.46 and 0.49] ...
Epoch 100/100
1084/1084 [==============================] - 0s - loss: 1.3528 - acc: 0.5563 - val_loss: 1.3542 - val_acc: 0.5083
In [6]:
# Visualization of loss
plt.figure()
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='best')
plt.show()
In [7]:
# Visualization of Accuracy
plt.figure()
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='best')
plt.show()
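
Beyond the accuracy curve, a confusion matrix on the held-out windows shows whether the model actually separates the two classes or mostly predicts one of them. A small sketch (my addition), using the model, X_test and Y_test defined above:

from sklearn.metrics import confusion_matrix

pred = model.predict(X_test)
# Rows: true class (0 = up, 1 = down); columns: predicted class.
print(confusion_matrix(np.argmax(Y_test, axis=1), np.argmax(pred, axis=1)))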

Backtesting

The strategy under test is fairly simple: if the prediction is "up", hold a full position; otherwise hold no position.

In [8]:
# 1. Basic strategy parameters

# Backtest start date
start_date_2 = '2016-02-02'
# Backtest end date
end_date_2 = '2017-06-20'
# Benchmark for the strategy; here the CSI 300 index
benchmark = '000300.INDX'
# Instrument pool; here Kweichow Moutai
# instruments = ['000030.SZA']
# Starting capital
capital_base = 100000

# 2. Main strategy functions

# Initialize the virtual account state; runs only on the first trading day
def initialize(context):
    # Set commissions: 0.03% on buys, 0.13% on sells, minimum 5 CNY per order
    context.set_commission(PerOrder(buy_cost=0.0003, sell_cost=0.0013, min_cost=5))

# Strategy trading logic; runs once per trading day
def handle_data(context, data):
    # Strategy code goes here
    for instrument in instruments:
        # Convert the ticker string into the symbol object used by the BigQuant engine
        instrument = context.symbol(instrument)
        history = data.history(instrument, 'close', 30, '1d').tolist()
        # Normalize the last 30 closes the same way as the training windows
        norm_hist = [(np.array(history) - np.mean(history)) / np.std(history)]
        prediction = model.predict(np.asarray(norm_hist))[0]

        #print(prediction)

        if np.argmax(prediction) == 0 and data.can_trade(instrument):
            order_target_percent(instrument, 1)
        elif np.argmax(prediction) == 1 and data.can_trade(instrument):
            order_target_percent(instrument, 0)

# 3. Run the backtest

# Backtest API docs: https://bigquant.com/docs/module_trade.html
m = M.trade.v1(
    instruments=instruments,
    start_date=start_date_2,
    end_date=end_date_2,
    initialize=initialize,
    handle_data=handle_data,
    # Fill buy orders at the open price
    order_price_field_buy='open',
    # Fill sell orders at the open price
    order_price_field_sell='open',
    capital_base=capital_base,
    benchmark=benchmark,
)
[2017-06-24 11:34:51.748801] INFO: bigquant: backtest.v6 start ..
[2017-06-24 11:34:51.754018] INFO: bigquant: hit cache
  • Total return: 112.79%
  • Annualized return: 76.78%
  • Benchmark return: 22.25%
  • Alpha: 0.65
  • Beta: 0.59
  • Sharpe ratio: 3.15
  • Return volatility: 22.94%
  • Information ratio: 2.74
  • Maximum drawdown: 9.58%
[2017-06-24 11:34:52.668434] INFO: bigquant: backtest.v6 end [0.919634s].