A Study of a DQN-Based Single-Stock Timing Strategy

Strategy Sharing
Reinforcement Learning
dqn
rl

(iQuant) #1

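This post shares a timing strategy for a single stock based on a Deep Q Network (DQN).

1. Algorithm Overview

1.1 DQN and Q-Learning

This post mainly uses the Deep Q Network (DQN). DQN is a reinforcement learning method that combines Q-learning with deep neural networks.

Q-learning uses a table to record the Q-value of every action in every state, so the number of states it can record is finite. In financial markets, however, prices, trading volume and other data are continuous and can combine into an effectively unlimited number of states; a table recording all of them would be unimaginably large.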

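Instead of a table, DQN uses a deep neural network to predict Q-values and learns the optimal course of action by repeatedly updating the network. The structure is shown below:

[Figure: DQN1]

The simulated environment outputs the state and reward, and the neural network chooses an action based on the state. Each step forms a (state, reward, action, next_state) record that is stored in memory. Once memory holds a certain number of records, a sampling strategy draws records from it, and the model is trained and its parameters updated so that it chooses actions better.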
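
The memory described above is often implemented as a fixed-size buffer with uniform random sampling. The sketch below is one minimal way to do that in plain Python. The class name `ReplayMemory`, the default capacity, and the uniform sampling are assumptions for illustration; the post does not say which sampling strategy the module actually uses.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer of (state, reward, action, next_state) records."""

    def __init__(self, capacity=2000, seed=None):
        # A deque with maxlen drops the oldest record once capacity is reached
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def remember(self, state, reward, action, next_state):
        self.buffer.append((state, reward, action, next_state))

    def ready(self, batch_size):
        # Training starts only once more than batch_size records are stored
        return len(self.buffer) > batch_size

    def sample(self, batch_size):
        # Uniform sampling without replacement (an assumed strategy)
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)
```

Sampling at random rather than replaying records in order breaks up the strong correlation between consecutive trading days, which is the usual motivation for this kind of buffer.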

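1.2 Update Rule

In reinforcement learning, the agent's goal is to maximize the expected cumulative reward.
Let Q(s, a) denote the Q-value in the Q-table for taking action a in state s, and let s' denote the state that follows s. Then max Q(s', a') is the largest Q-value over all actions in the next state, and the target Q-value is
$$reward+\gamma\max_{a'}Q(s',a')$$
where γ is the discount factor. The Q-value we actually predict is Q(s, a), so the prediction error is the difference between the two. We therefore update the Q-values in the table with the following rule, where α is the learning rate:
$$Q(s,a) \leftarrow Q(s,a)+\alpha\left[reward+\gamma\max_{a'}Q(s',a')-Q(s,a)\right]$$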
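
The table update can be sketched in a few lines of plain Python. The function name `q_learning_update`, the dictionary-based table, and the numbers used are illustrative assumptions, not part of the original strategy.

```python
from collections import defaultdict


def q_learning_update(q_table, state, action, reward, next_state,
                      n_actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    # Largest Q-value over all actions in the next state
    best_next = max(q_table[(next_state, a)] for a in range(n_actions))
    target = reward + gamma * best_next
    # Move Q(s, a) a fraction alpha of the way toward the target
    q_table[(state, action)] += alpha * (target - q_table[(state, action)])
    return q_table[(state, action)]


q = defaultdict(float)            # unseen (state, action) pairs start at 0.0
q[("s1", 0)] = 2.0
new_value = q_learning_update(q, "s0", 1, reward=1.0, next_state="s1", n_actions=2)
```

With these numbers the target is 1.0 + 0.9 × 2.0 = 2.8, and Q(s0, 1) moves from 0.0 to 0.1 × 2.8 = 0.28.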

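In DQN, the network parameters are updated with the same logic as the Q-table. We train the model with the state as input and the target Q-value as the training label.
The model outputs the Q-values of all actions in state s, [q1, q2, q3, …], where q1, q2, q3 are the predicted Q-values of actions 1, 2, 3. Suppose we know the reward for taking action 2 in state s. We first obtain the model's prediction Q-value = [q1, q2, q3, …] and the largest Q-value in state s', max Q(s', a'). The Q-target for (s, 2) is then reward + gamma * max Q(s', a'). We substitute this Q-target for q2 in the original Q-value, giving [q1, q2-target, q3, …], and train the model on this record to update its parameters. In other words, although the network outputs the Q-values of all actions in state s, each record lets us update only one action's Q-value, so we make this substitution before training; the other entries equal the model's own predictions and contribute no error. This logic can be implemented in a few lines of code:

q_value=self.model.predict(state) # predicted Q-values for the current state
target=reward+self.gamma*np.amax(self.model.predict(new_state)[0]) # compute the Q-target
q_value[0][action]=target # replace the chosen action's Q-value with the Q-target
self.model.fit(state,q_value,epochs=1,verbose=0) # fit the model on this record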
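
To make the substitution concrete, here is a small worked example with made-up numbers. Plain Python lists stand in for the arrays returned by `model.predict`, and the helper name `build_training_target` is hypothetical.

```python
def build_training_target(q_value, next_q_value, action, reward, gamma):
    """Return a copy of q_value with the chosen action's entry replaced by the Q-target."""
    target = reward + gamma * max(next_q_value)
    updated = list(q_value)
    updated[action] = target
    return updated


q_value = [1.0, 2.0, 3.0]        # predicted Q-values for state s
next_q_value = [0.5, 4.0, 1.0]   # predicted Q-values for state s'
y = build_training_target(q_value, next_q_value, action=1, reward=1.0, gamma=0.9)
# Only index 1 changes: 1.0 + 0.9 * 4.0 = 4.6, the other entries keep the predictions
```

Because the untouched entries match what the network already predicts, a squared-error loss on `y` produces a gradient only through the chosen action's output.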

2. Data Selection

The example uses 000001.SZA, with daily bars from 2012-01-01 to 2016-01-01 for training and from 2016-01-01 to 2017-01-01 for testing.

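3. Strategy Idea

The way DQN works matches the logic of timing a single stock. Each time a new day of market data arrives, we compute a state from the raw data, and the neural network chooses to buy or sell based on that state. After the choice, the environment gives us feedback; we use this feedback to judge how good the chosen action was and update the network parameters so that a better action can be chosen next time.
After training, we feed in the test-set data, choose actions from the output Q-values, backtest those actions on the platform's trading engine, and evaluate the model from the backtest results.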

4. Strategy Research Workflow

The strategy code has four parts: building the environment, defining the agent, training, and testing.


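To use DQN, we first need to build an environment and define the state, action, reward, and transaction fees. Here one state consists of the day's OHLCV plus 7 commonly used factors, there are two actions (sell and buy), and the reward for an action is the change in total asset value.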
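
A minimal sketch of such an environment is shown below. It is not the code inside the DQN module, whose source is not shown in the post. The class name `SingleStockEnv`, the fixed lot size, the single flat `fee_rate`, and valuing the position at the close are all assumptions; the action encoding (0 = buy, 1 = sell) follows the backtest code later in this post.

```python
class SingleStockEnv:
    """Toy single-stock environment: action 0 = buy, action 1 = sell."""

    def __init__(self, states, closes, cash=100000.0, lot=100, fee_rate=0.0003):
        self.states = states      # one feature vector per day
        self.closes = closes      # closing price per day, used to value the position
        self.init_cash = cash
        self.lot = lot
        self.fee_rate = fee_rate  # assumed flat rate applied to both sides
        self.reset()

    def reset(self):
        self.t = 0
        self.cash = self.init_cash
        self.shares = 0
        return self.states[0]

    def total_value(self):
        return self.cash + self.shares * self.closes[self.t]

    def step(self, action):
        before = self.total_value()
        price = self.closes[self.t]
        cost = price * self.lot * (1 + self.fee_rate)
        if action == 0 and self.shares == 0 and self.cash >= cost:
            self.cash -= cost
            self.shares = self.lot
        elif action == 1 and self.shares > 0:
            self.cash += price * self.shares * (1 - self.fee_rate)
            self.shares = 0
        self.t += 1
        done = self.t == len(self.states) - 1
        reward = self.total_value() - before   # change in total asset value
        return self.states[self.t], reward, done
```

With a reward defined this way, a buy is rewarded when the next day's close is higher than the purchase price after fees, and a sell while flat or a buy while already holding is a no-op.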

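Next we define an agent, which mainly consists of three functions: replay, model, and choose_act. model defines the neural network structure; here we build a 3-layer fully connected network with 64, 32, and 8 neurons in the respective layers. replay trains the model, and choose_act selects an action from the model output.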
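
The post does not say how choose_act balances exploration and exploitation. A common choice is epsilon-greedy, sketched below as an assumption: with probability `epsilon` a random action is taken, otherwise the action with the largest predicted Q-value. The `epsilon` and `rng` parameters are hypothetical.

```python
import random


def choose_act(q_values, epsilon=0.1, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise take the argmax."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Index of the largest Q-value; ties resolve to the lowest index
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

During training `epsilon` is typically decayed toward a small value; at test time it would be set to 0 so that the action follows the Q-values directly.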

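During training, we fetch states one at a time, choose an action from the model output, and write the resulting record to memory. Once the number of records in memory exceeds the batch_size we set, we sample from memory to train the model.

During testing, we fetch states from the test set one at a time, feed them to the model to get predicted Q-values, and choose actions accordingly. Those actions are then backtested on the platform's trading engine, and the result serves as the basis for evaluating the model.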
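
The test step amounts to taking the argmax of the predicted Q-values for each day. The sketch below assumes a hypothetical helper `q_values_to_actions` and made-up Q-values; the `date` and `pre_action` field names and the encoding 0 = buy, 1 = sell are taken from the backtest code later in this post.

```python
def q_values_to_actions(dates, q_values_per_day):
    """Pick the argmax action for each day; 0 = buy, 1 = sell."""
    rows = []
    for date, q_values in zip(dates, q_values_per_day):
        pre_action = max(range(len(q_values)), key=lambda a: q_values[a])
        rows.append({"date": date, "pre_action": pre_action})
    return rows


rows = q_values_to_actions(
    ["2016-01-04", "2016-01-05"],
    [[0.8, 0.3], [0.1, 0.6]],   # made-up Q-values for illustration
)
```

If the predicted Q-values barely change from day to day, the argmax stays on the same action and the backtest ends up with a single trade.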

5. Backtest Results

6. Future Work

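This strategy is only a very simple implementation of DQN, and there is plenty of room for improvement and tuning, for example:

  • The strategy currently trades on daily bars; it could be changed to trade on minute bars. A number of research reports suggest that reinforcement learning performs better on intraday high-frequency data than on daily data. This post is only a demo implementation of a reinforcement learning algorithm in quantitative trading; research results for DQN on high-frequency data will follow.

  • Two neural networks can be defined in the model: one whose parameters are updated at every step and which outputs the current Q-value, and another whose parameters are updated only occasionally and which outputs the target Q-value. The target Q-value then stays fixed for a period of time, which reduces the correlation between the current and target Q-values to some extent and makes the algorithm more stable.

  • Three actions (sell/buy/hold) can be defined, so that a trade is not required every day. This is closer to real trading and also reduces the costs of frequent trading, such as transaction fees.

  • The state can be defined in different ways.

  • The depth and parameters of the neural network can be tuned further.

  • This is currently a timing strategy for a single stock; with some modification it could be turned into a stock-selection strategy.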
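
The two-network idea from the list above can be sketched independently of any deep learning library. In this sketch the weights are an ordinary Python object and the class name `TargetNetworkSync` and the `sync_every` parameter are assumptions for illustration.

```python
import copy


class TargetNetworkSync:
    """Keep a frozen copy of the online weights and refresh it every `sync_every` steps."""

    def __init__(self, online_weights, sync_every=100):
        self.online = online_weights
        self.target = copy.deepcopy(online_weights)
        self.sync_every = sync_every
        self.steps = 0

    def after_train_step(self):
        # Call once after each update of the online weights
        self.steps += 1
        if self.steps % self.sync_every == 0:
            self.target = copy.deepcopy(self.online)
            return True
        return False
```

With a Keras model, the same effect is usually achieved by copying weights from the online model to the target model using `get_weights()` and `set_weights()`, and computing the Q-target from the target model's predictions.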

Everyone is welcome to try it out and join the discussion.

    In [8]:
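    # This code was auto-generated by the visual strategy environment on 2019-08-30 11:30
    # This code cell can only be edited in visual mode. You can also copy the code, paste it into a new code cell or strategy, and then modify it.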
    
    
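    # Backtest engine: initialization function, runs only once
    def m2_initialize_bigquant_run(context):
        # Load the prediction data: it is passed in via options, and read_df loads it into memory (DataFrame)
    
        context.pre_act = context.options['data'].read_df()
    
        # The system already sets a default commission and slippage; to change the commission, use the function below
        context.set_commission(PerOrder(buy_cost=0.0003, sell_cost=0.0013, min_cost=5))
        # The settings below come from the platform template; handle_data in this strategy does not read them
        # Number of stocks to buy; here, the top 5 in the predicted ranking
        stock_count = 5
        # Weight per stock; this allocation gives higher-ranked stocks somewhat more capital, [0.339160, 0.213986, 0.169580, ..]
        context.stock_weights = T.norm([1 / math.log(i + 2) for i in range(0, stock_count)])
        # Maximum fraction of capital per stock
        context.max_cash_per_instrument = 0.2
        context.hold_days = 5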
    
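    # Backtest engine: daily data handler, runs once per day
    def m2_handle_data_bigquant_run(context, data):
        sid = symbol('000001.SZA')
        
        # Look up today's predicted action by date
        action = context.pre_act[context.pre_act['date'] == data.current_dt.strftime('%Y-%m-%d')]
        
        # Current position
        cur_position = context.portfolio.positions[sid].amount 
        
        if len(action['pre_action'])>0:
    
            # pre_action 0: buy 100 shares, but only when nothing is held
            if int(action['pre_action'])==0 and  cur_position == 0:
                context.order(sid, 100)
                
            # pre_action 1: sell the whole position, but only when something is held
            elif int(action['pre_action'])==1 and cur_position > 0:
                context.order_target_percent(sid, 0)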
    
    # Backtest engine: data preparation, runs only once
    def m2_prepare_bigquant_run(context):
        pass
    
    # Backtest engine: called once before each unit of time, i.e. once before each day's open.
    def m2_before_trading_start_bigquant_run(context, data):
        pass
    
    
    m7 = M.instruments.v2(
        start_date='2012-01-01',
        end_date='2016-01-01',
        market='CN_STOCK_A',
        instrument_list="""000001.SZA
    """,
        max_count=0
    )
    
    m8 = M.advanced_auto_labeler.v2(
        instruments=m7.data,
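        label_expr="""# Lines starting with # are comments
    # 0. One expression per line, executed in order; from the second one on, the label field can be used
    # 1. For available data fields, see https://bigquant.com/docs/develop/datasource/deprecated/history_data.html
    #   Add the benchmark_ prefix to use the corresponding benchmark data
    # 2. For available operators and functions, see `Expression engine <https://bigquant.com/docs/develop/bigexpr/usage.html>`_
    
    # Compute the return: the close 5 days ahead (sell price) divided by tomorrow's open (buy price)
    shift(close, -5) / shift(open, -1)
    
    # Outlier handling: clip at the 1% and 99% quantiles
    clip(label, all_quantile(label, 0.01), all_quantile(label, 0.99))
    
    # Map scores to classes; 20 classes are used here
    all_wbins(label, 20)
    
    # Filter out days locked at limit-up at a single price (set label to NaN; NaN labels are ignored in later processing and training)
    where(shift(high, -1) == shift(low, -1), NaN, label)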
    """,
        start_date='',
        end_date='',
        benchmark='000300.SHA',
        drop_na_label=True,
        cast_label_int=True
    )
    
    m11 = M.input_features.v1(
        features="""# Lines starting with # are comments
    # Multiple features, one per line; both base and derived features are allowed
    close_0/mean(close_0,5)
    close_0/mean(close_0,10)
    close_0/mean(close_0,20)
    close_0/open_0
    open_0/mean(close_0,5)
    open_0/mean(close_0,10)
    open_0/mean(close_0,20)"""
    )
    
    m9 = M.general_feature_extractor.v7(
        instruments=m7.data,
        features=m11.data,
        start_date='',
        end_date='',
        before_start_days=20
    )
    
    m10 = M.derived_feature_extractor.v3(
        input_data=m9.data,
        features=m11.data,
        date_col='date',
        instrument_col='instrument',
        drop_na=False,
        remove_extra_columns=False
    )
    
    m12 = M.join.v3(
        data1=m8.data,
        data2=m10.data,
        on='date,instrument',
        how='inner',
        sort=False
    )
    
    m14 = M.instruments.v2(
        start_date='2016-01-01',
        end_date='2017-01-01',
        market='CN_STOCK_A',
        instrument_list='000001.SZA',
        max_count=0
    )
    
    m19 = M.advanced_auto_labeler.v2(
        instruments=m14.data,
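        label_expr="""# Lines starting with # are comments
    # 0. One expression per line, executed in order; from the second one on, the label field can be used
    # 1. For available data fields, see https://bigquant.com/docs/develop/datasource/deprecated/history_data.html
    #   Add the benchmark_ prefix to use the corresponding benchmark data
    # 2. For available operators and functions, see `Expression engine <https://bigquant.com/docs/develop/bigexpr/usage.html>`_
    
    # Compute the return: the close 5 days ahead (sell price) divided by tomorrow's open (buy price)
    shift(close, -5) / shift(open, -1)
    
    # Outlier handling: clip at the 1% and 99% quantiles
    clip(label, all_quantile(label, 0.01), all_quantile(label, 0.99))
    
    # Map scores to classes; 20 classes are used here
    all_wbins(label, 20)
    
    # Filter out days locked at limit-up at a single price (set label to NaN; NaN labels are ignored in later processing and training)
    where(shift(high, -1) == shift(low, -1), NaN, label)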
    """,
        start_date='',
        end_date='',
        benchmark='000300.SHA',
        drop_na_label=True,
        cast_label_int=True
    )
    
    m17 = M.input_features.v1(
        features="""# Lines starting with # are comments
    # Multiple features, one per line; both base and derived features are allowed
    close_0/mean(close_0,5)
    close_0/mean(close_0,10)
    close_0/mean(close_0,20)
    close_0/open_0
    open_0/mean(close_0,5)
    open_0/mean(close_0,10)
    open_0/mean(close_0,20)
    """
    )
    
    m15 = M.general_feature_extractor.v7(
        instruments=m14.data,
        features=m17.data,
        start_date='',
        end_date='',
        before_start_days=20
    )
    
    m16 = M.derived_feature_extractor.v3(
        input_data=m15.data,
        features=m17.data,
        date_col='date',
        instrument_col='instrument',
        drop_na=False,
        remove_extra_columns=False
    )
    
    m18 = M.join.v3(
        data1=m19.data,
        data2=m16.data,
        on='date,instrument',
        how='inner',
        sort=False
    )
    
    m4 = M.dropnan.v1(
        input_data=m18.data
    )
    
    m1 = M.DQN_model.v9(
        data=m12.data,
        data_2=m4.data
    )
    
    m3 = M.dropnan.v1(
        input_data=m1.data_1
    )
    
    m2 = M.trade.v4(
        instruments=m14.data,
        options_data=m3.data,
        start_date='',
        end_date='',
        initialize=m2_initialize_bigquant_run,
        handle_data=m2_handle_data_bigquant_run,
        prepare=m2_prepare_bigquant_run,
        before_trading_start=m2_before_trading_start_bigquant_run,
        volume_limit=0.025,
        order_price_field_buy='open',
        order_price_field_sell='close',
        capital_base=100000,
        auto_cancel_non_tradable_orders=True,
        data_frequency='daily',
        price_type='后复权',
        product_type='股票',
        plot_charts=True,
        backtest_only=False,
        benchmark=''
    )
    
    episode: 0
    total_reward: 181854.07299804688
    episode: 1
    total_reward: 284361.70330810547
    episode: 2
    total_reward: 286217.8956298828
    episode: 3
    total_reward: 286217.8956298828
    episode: 4
    total_reward: 286217.8956298828
    
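    • Return: 2.17%
    • Annualized return: 2.24%
    • Benchmark return: -11.28%
    • Alpha: 0.07
    • Beta: 0.54
    • Sharpe ratio: 0.03
    • Win rate: 1.0
    • Profit/loss ratio: 0.0
    • Return volatility: 16.0%
    • Information ratio: 0.06
    • Max drawdown: 10.95%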
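
As a rough consistency check on the first two figures, the annualized return can be approximated by compounding the total return to a 252-trading-day year. Both the formula and the count of 244 trading days in the 2016 test window are assumptions here; the platform's exact convention is not stated in the post.

```python
def annualize(total_return, trading_days, days_per_year=252):
    """Compound a total return over `trading_days` to a one-year horizon."""
    return (1.0 + total_return) ** (days_per_year / trading_days) - 1.0


annualized = annualize(0.0217, trading_days=244)
print(f"{annualized:.2%}")
```

Under these assumptions the result is close to the reported 2.24%, so the two figures are at least mutually consistent.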

    (yjc6296) #4

    Is something wrong here? Why does the position never change? How can this be timing if the position never moves?


    (yjc6296) #5

    I retrained another one, and it also made only a single trade; the position never moves.


    (iQuant) #6

    Hello, we have received your question and will take a look.


    (iQuant) #7

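    You can check the prediction output of module m5; it should be predicting buy. Once a position has been opened, there is no further buying afterwards. In this example every predicted signal is bullish.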


    (tkyz) #8


    After changing the test-set date to 2019-01-01, I get an error.


    (kobe) #10

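    There was an earlier reinforcement learning example in the community. Do the two differ much in strategy logic and modeling approach? This strategy template looks somewhat more complex to me: it basically keeps training the model and keeps predicting, which fits the reinforcement learning setting fairly well.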


    (iQuant) #11

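    Hello, thank you for the feedback. We have upgraded the module; you can re-clone the strategy and run it, or directly use the upgraded module DQN_model.v5 in place of the original v4.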


    (tkyz) #12

    Does this example only buy and never sell?


    (大胡子) #13

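    This is probably related to the model hyperparameters. When training is insufficient, the predictions can come out as all buy or all sell. We suggest that users tune the parameters themselves.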


    (tkyz) #14


    Could this be improved? The number of features is over the limit.


    (yjc6296) #15

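    I haven't looked at the latest v5 yet. The original v4 has no adjustable parameters; if it did, I would certainly have tried tuning them.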


    (yjc6296) #16

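    Still not right. I looked at the values from that DQN module at the time: the q-values are there, but they never change, so after buying the position never moves. It looks as if buying one lot already hit the capital limit.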


    (user316) #18

    Hello, after cloning the strategy and clicking Run All, I get an error.

    Details below:

    [2019-08-29 10:50:20.019747] INFO: bigquant: 命中缓存
    [2019-08-29 10:50:20.022483] INFO: bigquant: join.v3 运行完成[0.069238s].
    [2019-08-29 10:50:20.026588] INFO: bigquant: dropnan.v1 开始运行..
    [2019-08-29 10:50:20.109467] INFO: bigquant: 命中缓存
    [2019-08-29 10:50:20.112029] INFO: bigquant: dropnan.v1 运行完成[0.085413s].
    [2019-08-29 10:50:20.115667] INFO: bigquant: dropnan.v1 开始运行..
    [2019-08-29 10:50:20.120002] ERROR: bigquant: module name: dropnan, module version: v1, trackeback: Traceback (most recent call last):
    TypeError: __init__() takes exactly 2 positional arguments (1 given)
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-3-e467a47c5fcf> in <module>()
        196 )
        197 
    --> 198 m3 = M.dropnan.v1(
        199 
        200 )
    
    TypeError: __init__() takes exactly 2 positional arguments (1 given)
    

    (user4035) #19

    The DQN module's source code is encapsulated and can't be viewed, right?


    (iQuant) #20

    Yes, that's right. We will gradually open up the source code for everyone to study and use.


    (iQuant) #21

    Got it, we'll check and get back to you shortly.


    (iQuant) #22

    Fixed. Please give it another try.


    (yangziriver) #23

    One question: why is auto-labeling also applied to the test-set data? Is this a special requirement of DQN?