
How do we compute stock sector returns for building model-training labels and model filters?

1. Labeling: how do we build labels from sector returns?

(1) First, use the instrument list and the feature list to extract the base factors that the sector-return calculation needs:

industry_sw_level2_0
market_cap_float_0
daily_return_0

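(2) Next, join the Shenwan level-2 industry table with the daily bar table for stocks.

Use the DataFrame groupby and merge operations to compute the float-cap-weighted sector return we need (板块流通收益率). Note: you can also compute the return directly, without using each stock's float market cap.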

block_data = basic_data.groupby(['name_SW2','date']).agg({'daily_return_0':'sum','market_cap_float_0':'sum'}).reset_index()

block_data.columns = ['name_SW2','date','daily_return_0_block_sum','market_cap_float_0_block_sum']

Merge the sector totals back onto the per-stock rows:

rst = pd.merge(basic_data[['date','name_SW2','instrument','daily_return_0','market_cap_float_0']],block_data,on=['date','name_SW2'],how='left').dropna()

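This gives the following column, where each row holds the stock's return weighted by its share of the sector's float market cap:

rst['板块流通收益率'] = rst['daily_return_0']*rst['market_cap_float_0']/rst['market_cap_float_0_block_sum']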
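The groupby and merge steps above can be tried on a toy table. This is a minimal sketch with made-up numbers, not real market data. The final `sector_weighted_return` column is an illustrative addition that is not in the original post: it sums the per-stock values within each sector, which is one way to get a single float-cap-weighted return per sector and date.

```python
import pandas as pd

# Toy data: two sectors on one date (numbers are invented for illustration).
basic_data = pd.DataFrame({
    "date": ["2022-01-04"] * 4,
    "instrument": ["A", "B", "C", "D"],
    "name_SW2": ["S1", "S1", "S2", "S2"],
    "daily_return_0": [0.02, -0.01, 0.03, 0.00],
    "market_cap_float_0": [100.0, 300.0, 50.0, 50.0],
})

# Sector totals per (sector, date), as in the post.
block_data = (
    basic_data.groupby(["name_SW2", "date"])
    .agg({"daily_return_0": "sum", "market_cap_float_0": "sum"})
    .reset_index()
)
block_data.columns = [
    "name_SW2", "date",
    "daily_return_0_block_sum", "market_cap_float_0_block_sum",
]

# Attach the sector totals to every stock row.
rst = pd.merge(
    basic_data[["date", "name_SW2", "instrument", "daily_return_0", "market_cap_float_0"]],
    block_data,
    on=["date", "name_SW2"],
    how="left",
).dropna()

# Per-stock value from the post: return weighted by the stock's share of sector float cap.
rst["板块流通收益率"] = (
    rst["daily_return_0"] * rst["market_cap_float_0"] / rst["market_cap_float_0_block_sum"]
)

# Illustrative extra (not in the post): one weighted return per sector and date.
rst["sector_weighted_return"] = (
    rst.groupby(["date", "name_SW2"])["板块流通收益率"].transform("sum")
)
print(rst[["instrument", "name_SW2", "板块流通收益率", "sector_weighted_return"]])
```

In this toy table, stock A contributes 0.02 × 100 / 400 = 0.005 and stock B contributes -0.01 × 300 / 400 = -0.0075, so sector S1 sums to -0.0025.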

(3) Finally, pass the merged table (m12) into the auto-labeling engine, and the labeling is done. Open modules m12 and m21 to inspect the labels we just built.
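The label expression in the embedded strategy uses forward-looking terms such as shift(high, -3) and shift(open, -1). The sketch below shows the same idea in plain pandas, assuming that shift(x, -n) means "the value n rows ahead for the same instrument". It covers only the raw price ratio, not the clipping, binning, or sector-return terms, and the prices are invented.

```python
import pandas as pd

# Toy daily bars for one instrument (prices are invented).
bars = pd.DataFrame({
    "date": pd.date_range("2022-01-03", periods=5, freq="B"),
    "instrument": ["A"] * 5,
    "open": [10.0, 10.0, 11.0, 12.0, 12.0],
    "high": [10.5, 11.0, 12.0, 13.0, 13.2],
}).sort_values(["instrument", "date"])

grouped = bars.groupby("instrument")

# High three rows ahead (assumed sell price) over open one row ahead (assumed buy price).
future_high = grouped["high"].shift(-3)
next_open = grouped["open"].shift(-1)
bars["raw_forward_return"] = future_high / next_open - 1

# Rows without enough future data end up as NaN and would be dropped before training.
print(bars[["date", "raw_forward_return"]])
```

Shifting inside groupby keeps one instrument's future prices from leaking into another instrument's rows.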

2. Data filtering: how do we use sector returns to filter our candidate stocks?

(1) The method first: join the computed sector-return data with our factor data, then apply the data-filter module. That is all this requirement takes.
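As a rough illustration of that join-then-filter flow outside the visual modules, here is a minimal pandas sketch. The inner join on date and instrument mirrors the join settings in the embedded strategy. The column `my_factor`, the sample values, and the simple "greater than zero" condition are hypothetical placeholders.

```python
import pandas as pd

# Hypothetical factor table: one row per (date, instrument).
factors = pd.DataFrame({
    "date": ["2022-01-04"] * 3,
    "instrument": ["A", "B", "C"],
    "my_factor": [0.7, 0.1, 0.4],  # placeholder factor values
})

# Hypothetical sector-return table; instrument C has no sector data.
sector = pd.DataFrame({
    "date": ["2022-01-04"] * 2,
    "instrument": ["A", "B"],
    "板块流通收益率": [0.004, -0.002],
})

# Inner join: rows without a match on both sides are dropped (C disappears).
merged = factors.merge(sector, on=["date", "instrument"], how="inner")

# A simple filter condition; the filter module would take an expression instead.
kept = merged[merged["板块流通收益率"] > 0]
print(kept)
```

Because the join is an inner join, any stock that lacks sector data on a given date drops out before the filter runs.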

(2) Then the rationale: why filter on sector return?

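[1] Chasing hot sectors: the stocks our model picks often have a high score yet are clearly not in a recent hot sector, so the picks may not be very profitable. In that case we can pull the model's picks toward the market's hot spots by restricting the universe to sectors with high recent returns, strong sector momentum, or the top 5 sectors by return across the whole market. Picking within that range may give a "strongest of the strong" result, possibly even a 1+1>2 effect.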

For example:

Keep stocks whose same-day sector return ranks in the top 5%:

(rank(板块流通收益率)>=0.95)

Keep stocks whose summed sector return over the last 5 days ranks in the top 10%:

&(rank(sum(板块流通收益率,5))>=0.9)

Keep stocks whose 3-day mean sector return is at or above the 5-day mean (a moving-average comparison):

&(mean(板块流通收益率,3)>=mean(板块流通收益率,5))

Keep stocks whose 5-day mean sector return is at or above the 10-day mean (a moving-average comparison):

&(mean(板块流通收益率,5)>=mean(板块流通收益率,10))
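To see what these expressions select, the sketch below re-creates the first three conditions in pandas. It assumes that rank(x) is an ascending percentile rank across all stocks on the same date and that sum(x, n) and mean(x, n) are rolling windows per instrument. The platform's exact semantics may differ, and the helper names `cs_rank`, `ts_sum`, and `ts_mean` are hypothetical. The data is invented.

```python
import pandas as pd

COL = "板块流通收益率"

def cs_rank(df, col):
    """Ascending percentile rank within each date (assumed meaning of rank)."""
    return df.groupby("date")[col].rank(pct=True)

def ts_sum(df, col, window):
    """Rolling sum per instrument (assumed meaning of sum(x, n))."""
    return df.groupby("instrument")[col].transform(lambda s: s.rolling(window).sum())

def ts_mean(df, col, window):
    """Rolling mean per instrument (assumed meaning of mean(x, n))."""
    return df.groupby("instrument")[col].transform(lambda s: s.rolling(window).mean())

# Toy panel: three instruments over six business days (invented values).
dates = pd.date_range("2022-01-03", periods=6, freq="B")
values = {
    "A": [0.01, 0.01, 0.01, 0.005, 0.005, 0.005],
    "B": [0.00, 0.00, 0.00, 0.02, 0.02, 0.02],
    "C": [0.00, 0.00, 0.00, -0.02, -0.02, -0.02],
}
df = pd.DataFrame(
    [(d, inst, v) for inst, vals in values.items() for d, v in zip(dates, vals)],
    columns=["date", "instrument", COL],
).sort_values(["instrument", "date"]).reset_index(drop=True)

df["rank_today"] = cs_rank(df, COL)
df["sum5"] = ts_sum(df, COL, 5)
df["rank_sum5"] = cs_rank(df, "sum5")
df["mean3"] = ts_mean(df, COL, 3)
df["mean5"] = ts_mean(df, COL, 5)

# Combine the conditions with &, as in the filter expressions above.
mask = (
    (df["rank_today"] >= 0.95)
    & (df["rank_sum5"] >= 0.9)
    & (df["mean3"] >= df["mean5"])
)
selected = df[mask]
print(selected[selected["date"] == dates[-1]][["date", "instrument"]])
```

Rows that lack a full rolling window come out as NaN and fail every comparison, so the earliest dates in a dataset never pass filters that use sum or mean.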

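[2] Pullbacks in hot sectors: some models, through their factor features and stock-picking style, lean toward oversold stocks with rebound potential. In that case we can layer the sector-level oversold effect on top by keeping stocks whose recent sector return ranks near the bottom. This makes the model's preference more targeted and improves the odds of picking former leaders, former high-flyers, oversold names, and strong stocks due for a rebound.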
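The pullback variant simply flips the rank condition to the low end. Below is a minimal sketch for a single date, assuming the same ascending percentile rank as the filter expressions above. The `sum5` values and the 0.15 cutoff are arbitrary illustrations, not values from the original strategy.

```python
import pandas as pd

# One date, ten instruments, with an invented 5-day summed sector return.
snapshot = pd.DataFrame({
    "instrument": [f"S{i}" for i in range(10)],
    "sum5": [-0.08, -0.05, -0.02, 0.0, 0.01, 0.02, 0.03, 0.04, 0.06, 0.09],
})

# Ascending percentile rank: the weakest recent performer gets the lowest rank.
snapshot["rank_sum5"] = snapshot["sum5"].rank(pct=True)

# Keep only the bottom of the ranking (cutoff chosen arbitrarily for the example).
laggards = snapshot[snapshot["rank_sum5"] <= 0.15]
print(laggards)
```

In filter-expression form this would be a condition along the lines of rank(sum(板块流通收益率,5)) below a chosen threshold, with the threshold tuned by backtest rather than fixed in advance.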

In [ ]:
# Take a look at the data in m12
m12.data.read()
In [ ]:
# Take a look at the label data in m21
m21.data.read()

    {"description":"实验创建于2017/8/26","graph":{"edges":[{"to_node_id":"-215:instruments","from_node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-8:data"},{"to_node_id":"-15559:instruments","from_node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-8:data"},{"to_node_id":"-419:instruments","from_node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-8:data"},{"to_node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-60:model","from_node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-43:model"},{"to_node_id":"-2918:input_1","from_node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-53:data"},{"to_node_id":"-4170:input_ds","from_node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-60:predictions"},{"to_node_id":"-2027:input_ds","from_node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-60:predictions"},{"to_node_id":"-231:instruments","from_node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-62:data"},{"to_node_id":"-250:instruments","from_node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-62:data"},{"to_node_id":"-1557:instruments","from_node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-62:data"},{"to_node_id":"-1564:instruments","from_node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-62:data"},{"to_node_id":"-2038:instruments","from_node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-62:data"},{"to_node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-43:training_ds","from_node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-84:data"},{"to_node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-43:test_ds","from_node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-84:data"},{"to_node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-60:data","from_node_id":"-86:data"},{"to_node_id":"-222:input_data","from_node_id":"-215:data"},{"to_node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-53:data2","from_node_id":"-222:data"},{"to_node_id":"-238:input_data","from_node_id":"-231:data"},{"to_node_id":"-2914:input_1","from_node_id":"-238:data"},{"to_node_id":"-1597:input_data","from_node_id":"-15566:data"},{"to_node_id":"-15566:data2","from_node_id":"-15559:data"},{"to_node_id":"287d
2cb0-f53c-4101-bdf8-104b137c8601-53:data1","from_node_id":"-15242:data"},{"to_node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-43:features","from_node_id":"-2125:data"},{"to_node_id":"-215:features","from_node_id":"-2125:data"},{"to_node_id":"-222:features","from_node_id":"-2125:data"},{"to_node_id":"-231:features","from_node_id":"-2125:data"},{"to_node_id":"-238:features","from_node_id":"-2125:data"},{"to_node_id":"-250:options_data","from_node_id":"-4170:sorted_data"},{"to_node_id":"-6473:input_1","from_node_id":"-6470:data_1"},{"to_node_id":"-1590:data1","from_node_id":"-6473:data"},{"to_node_id":"-6470:input_1","from_node_id":"-2914:data_1"},{"to_node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-84:input_data","from_node_id":"-2918:data_1"},{"to_node_id":"-419:features","from_node_id":"-406:data"},{"to_node_id":"-173:input_1","from_node_id":"-419:data"},{"to_node_id":"-15566:data1","from_node_id":"-173:data_1"},{"to_node_id":"-1564:features","from_node_id":"-1552:data"},{"to_node_id":"-1583:data2","from_node_id":"-1557:data"},{"to_node_id":"-1574:input_1","from_node_id":"-1564:data"},{"to_node_id":"-1583:data1","from_node_id":"-1574:data_1"},{"to_node_id":"-1590:data2","from_node_id":"-1583:data"},{"to_node_id":"-1603:input_data","from_node_id":"-1590:data"},{"to_node_id":"-15242:input_data","from_node_id":"-1597:data"},{"to_node_id":"-86:input_data","from_node_id":"-1603:data"},{"to_node_id":"-2038:options_data","from_node_id":"-2027:sorted_data"}],"nodes":[{"node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-8","module_id":"BigQuantSpace.instruments.instruments-v2","parameters":[{"name":"start_date","value":"2018-06-01","type":"Literal","bound_global_parameter":null},{"name":"end_date","value":"2022-01-01","type":"Literal","bound_global_parameter":null},{"name":"market","value":"CN_STOCK_A","type":"Literal","bound_global_parameter":null},{"name":"instrument_list","value":"","type":"Literal","bound_global_parameter":null},{"name":"max_count","value":"0","type":"Literal"
,"bound_global_parameter":null}],"input_ports":[{"name":"rolling_conf","node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-8"}],"output_ports":[{"name":"data","node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-8"}],"cacheable":true,"seq_num":1,"comment":"训练集","comment_collapsed":false},{"node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-43","module_id":"BigQuantSpace.stock_ranker_train.stock_ranker_train-v6","parameters":[{"name":"learning_algorithm","value":"排序","type":"Literal","bound_global_parameter":null},{"name":"number_of_leaves","value":"30","type":"Literal","bound_global_parameter":null},{"name":"minimum_docs_per_leaf","value":"1000","type":"Literal","bound_global_parameter":null},{"name":"number_of_trees","value":"50","type":"Literal","bound_global_parameter":null},{"name":"learning_rate","value":0.1,"type":"Literal","bound_global_parameter":null},{"name":"max_bins","value":1023,"type":"Literal","bound_global_parameter":null},{"name":"feature_fraction","value":1,"type":"Literal","bound_global_parameter":null},{"name":"data_row_fraction","value":1,"type":"Literal","bound_global_parameter":null},{"name":"plot_charts","value":"True","type":"Literal","bound_global_parameter":null},{"name":"ndcg_discount_base","value":1,"type":"Literal","bound_global_parameter":null},{"name":"m_lazy_run","value":"False","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"training_ds","node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-43"},{"name":"features","node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-43"},{"name":"test_ds","node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-43"},{"name":"base_model","node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-43"}],"output_ports":[{"name":"model","node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-43"},{"name":"feature_gains","node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-43"},{"name":"m_lazy_run","node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-43"}],"cacheable":true,"seq_num":6,"comment":"","comment_collapsed":true},{"node_id":"287d2c
b0-f53c-4101-bdf8-104b137c8601-53","module_id":"BigQuantSpace.join.join-v3","parameters":[{"name":"on","value":"date,instrument","type":"Literal","bound_global_parameter":null},{"name":"how","value":"inner","type":"Literal","bound_global_parameter":null},{"name":"sort","value":"False","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"data1","node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-53"},{"name":"data2","node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-53"}],"output_ports":[{"name":"data","node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-53"}],"cacheable":true,"seq_num":7,"comment":"连接因子数据和板块收益率数据","comment_collapsed":false},{"node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-60","module_id":"BigQuantSpace.stock_ranker_predict.stock_ranker_predict-v5","parameters":[{"name":"m_lazy_run","value":"False","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"model","node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-60"},{"name":"data","node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-60"}],"output_ports":[{"name":"predictions","node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-60"},{"name":"m_lazy_run","node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-60"}],"cacheable":true,"seq_num":8,"comment":"","comment_collapsed":true},{"node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-62","module_id":"BigQuantSpace.instruments.instruments-v2","parameters":[{"name":"start_date","value":"2022-01-01","type":"Literal","bound_global_parameter":"交易日期"},{"name":"end_date","value":"2022-06-22","type":"Literal","bound_global_parameter":"交易日期"},{"name":"market","value":"CN_STOCK_A","type":"Literal","bound_global_parameter":null},{"name":"instrument_list","value":"","type":"Literal","bound_global_parameter":null},{"name":"max_count","value":"0","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"rolling_conf","node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-62"}],"output_ports":[{"name":"data","node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-62"
}],"cacheable":true,"seq_num":9,"comment":"预测数据,用于回测和模拟","comment_collapsed":false},{"node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-84","module_id":"BigQuantSpace.dropnan.dropnan-v1","parameters":[],"input_ports":[{"name":"input_data","node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-84"}],"output_ports":[{"name":"data","node_id":"287d2cb0-f53c-4101-bdf8-104b137c8601-84"}],"cacheable":true,"seq_num":13,"comment":"","comment_collapsed":true},{"node_id":"-86","module_id":"BigQuantSpace.dropnan.dropnan-v1","parameters":[],"input_ports":[{"name":"input_data","node_id":"-86"}],"output_ports":[{"name":"data","node_id":"-86"}],"cacheable":true,"seq_num":14,"comment":"","comment_collapsed":true},{"node_id":"-215","module_id":"BigQuantSpace.general_feature_extractor.general_feature_extractor-v7","parameters":[{"name":"start_date","value":"","type":"Literal","bound_global_parameter":null},{"name":"end_date","value":"","type":"Literal","bound_global_parameter":null},{"name":"before_start_days","value":"90","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"instruments","node_id":"-215"},{"name":"features","node_id":"-215"}],"output_ports":[{"name":"data","node_id":"-215"}],"cacheable":true,"seq_num":15,"comment":"","comment_collapsed":true},{"node_id":"-222","module_id":"BigQuantSpace.derived_feature_extractor.derived_feature_extractor-v3","parameters":[{"name":"date_col","value":"date","type":"Literal","bound_global_parameter":null},{"name":"instrument_col","value":"instrument","type":"Literal","bound_global_parameter":null},{"name":"drop_na","value":"False","type":"Literal","bound_global_parameter":null},{"name":"remove_extra_columns","value":"False","type":"Literal","bound_global_parameter":null},{"name":"user_functions","value":"","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"input_data","node_id":"-222"},{"name":"features","node_id":"-222"}],"output_ports":[{"name":"data","node_id":"-222"}],"cacheable":true,"seq_num":16,"com
ment":"","comment_collapsed":true},{"node_id":"-231","module_id":"BigQuantSpace.general_feature_extractor.general_feature_extractor-v7","parameters":[{"name":"start_date","value":"","type":"Literal","bound_global_parameter":null},{"name":"end_date","value":"","type":"Literal","bound_global_parameter":null},{"name":"before_start_days","value":"90","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"instruments","node_id":"-231"},{"name":"features","node_id":"-231"}],"output_ports":[{"name":"data","node_id":"-231"}],"cacheable":true,"seq_num":17,"comment":"","comment_collapsed":true},{"node_id":"-238","module_id":"BigQuantSpace.derived_feature_extractor.derived_feature_extractor-v3","parameters":[{"name":"date_col","value":"date","type":"Literal","bound_global_parameter":null},{"name":"instrument_col","value":"instrument","type":"Literal","bound_global_parameter":null},{"name":"drop_na","value":"False","type":"Literal","bound_global_parameter":null},{"name":"remove_extra_columns","value":"False","type":"Literal","bound_global_parameter":null},{"name":"user_functions","value":"","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"input_data","node_id":"-238"},{"name":"features","node_id":"-238"}],"output_ports":[{"name":"data","node_id":"-238"}],"cacheable":true,"seq_num":18,"comment":"","comment_collapsed":true},{"node_id":"-15566","module_id":"BigQuantSpace.join.join-v3","parameters":[{"name":"on","value":"date,instrument","type":"Literal","bound_global_parameter":null},{"name":"how","value":"inner","type":"Literal","bound_global_parameter":null},{"name":"sort","value":"False","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"data1","node_id":"-15566"},{"name":"data2","node_id":"-15566"}],"output_ports":[{"name":"data","node_id":"-15566"}],"cacheable":true,"seq_num":12,"comment":"连接板块数据","comment_collapsed":false},{"node_id":"-15559","module_id":"BigQuantSpace.use_datasource.use_datasource-v1","paramete
rs":[{"name":"datasource_id","value":"bar1d_CN_STOCK_A","type":"Literal","bound_global_parameter":null},{"name":"start_date","value":"","type":"Literal","bound_global_parameter":null},{"name":"end_date","value":"","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"instruments","node_id":"-15559"},{"name":"features","node_id":"-15559"}],"output_ports":[{"name":"data","node_id":"-15559"}],"cacheable":true,"seq_num":20,"comment":"日线数据-标注","comment_collapsed":false},{"node_id":"-15242","module_id":"BigQuantSpace.auto_labeler_on_datasource.auto_labeler_on_datasource-v1","parameters":[{"name":"label_expr","value":"\n\n# 计算收益:3日最高价(作为卖出价格) 除以明日开盘价(作为买入价格) / 未来3日 的板块流通收益率\n-1*(shift(high, -3)/shift(open, -1))/shift(板块流通收益率, -3) -1\n# 极值处理:用1%和99%分位的值做clip\nclip(label, all_quantile(label, 0.01), all_quantile(label, 0.99))\n\n# 将分数映射到分类,这里使用20个分类\nall_wbins(label, 20)\n\n# 过滤掉一字涨停的情况 (设置label为NaN,在后续处理和训练中会忽略NaN的label)\nwhere(shift(high, -1) == shift(low, -1), NaN, label)","type":"Literal","bound_global_parameter":null},{"name":"drop_na_label","value":"True","type":"Literal","bound_global_parameter":null},{"name":"cast_label_int","value":"True","type":"Literal","bound_global_parameter":null},{"name":"date_col","value":"date","type":"Literal","bound_global_parameter":null},{"name":"instrument_col","value":"instrument","type":"Literal","bound_global_parameter":null},{"name":"user_functions","value":"{}","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"input_data","node_id":"-15242"}],"output_ports":[{"name":"data","node_id":"-15242"}],"cacheable":true,"seq_num":21,"comment":"标注板块收益率","comment_collapsed":false},{"node_id":"-2125","module_id":"BigQuantSpace.input_features.input_features-v1","parameters":[{"name":"features","value":"# #号开始的表示注释\n# 
多个特征,每行一个,可以包含基础特征和衍生特征\n\ncorrelation(log(volume_0),return_0,10)\n\n","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"features_ds","node_id":"-2125"}],"output_ports":[{"name":"data","node_id":"-2125"}],"cacheable":true,"seq_num":2,"comment":"","comment_collapsed":true},{"node_id":"-4170","module_id":"BigQuantSpace.sort.sort-v5","parameters":[{"name":"sort_by","value":"score","type":"Literal","bound_global_parameter":null},{"name":"group_by","value":"date","type":"Literal","bound_global_parameter":null},{"name":"keep_columns","value":"--","type":"Literal","bound_global_parameter":null},{"name":"ascending","value":"False","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"input_ds","node_id":"-4170"},{"name":"sort_by_ds","node_id":"-4170"}],"output_ports":[{"name":"sorted_data","node_id":"-4170"}],"cacheable":true,"seq_num":23,"comment":"降序","comment_collapsed":true},{"node_id":"-6470","module_id":"BigQuantSpace.filtet_st_stock.filtet_st_stock-v7","parameters":[],"input_ports":[{"name":"input_1","node_id":"-6470"}],"output_ports":[{"name":"data_1","node_id":"-6470"}],"cacheable":true,"seq_num":27,"comment":"","comment_collapsed":true},{"node_id":"-6473","module_id":"BigQuantSpace.filter_delist_stocks.filter_delist_stocks-v3","parameters":[],"input_ports":[{"name":"input_1","node_id":"-6473"}],"output_ports":[{"name":"data","node_id":"-6473"}],"cacheable":true,"seq_num":28,"comment":"","comment_collapsed":true},{"node_id":"-2914","module_id":"BigQuantSpace.filter_stockcode.filter_stockcode-v2","parameters":[{"name":"start","value":"688","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"input_1","node_id":"-2914"}],"output_ports":[{"name":"data_1","node_id":"-2914"}],"cacheable":true,"seq_num":26,"comment":"","comment_collapsed":true},{"node_id":"-2918","module_id":"BigQuantSpace.filtet_st_stock.filtet_st_stock-v7","parameters":[],"input_ports":[{"name":"input_1","node_id":"-2918"}],"output_ports":[{
"name":"data_1","node_id":"-2918"}],"cacheable":true,"seq_num":29,"comment":"","comment_collapsed":true},{"node_id":"-406","module_id":"BigQuantSpace.input_features.input_features-v1","parameters":[{"name":"features","value":"industry_sw_level2_0\nmarket_cap_float_0\ndaily_return_0\n","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"features_ds","node_id":"-406"}],"output_ports":[{"name":"data","node_id":"-406"}],"cacheable":true,"seq_num":31,"comment":"","comment_collapsed":true},{"node_id":"-419","module_id":"BigQuantSpace.general_feature_extractor.general_feature_extractor-v7","parameters":[{"name":"start_date","value":"","type":"Literal","bound_global_parameter":null},{"name":"end_date","value":"","type":"Literal","bound_global_parameter":null},{"name":"before_start_days","value":90,"type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"instruments","node_id":"-419"},{"name":"features","node_id":"-419"}],"output_ports":[{"name":"data","node_id":"-419"}],"cacheable":true,"seq_num":33,"comment":"","comment_collapsed":true},{"node_id":"-173","module_id":"BigQuantSpace.cached.cached-v3","parameters":[{"name":"run","value":"# Python 代码入口函数,input_1/2/3 对应三个输入端,data_1/2/3 对应三个输出端\ndef bigquant_run(input_1, input_2, input_3, SW_type):\n basic_data = input_1.read()\n start_date = str(basic_data.date.min())\n end_date = str(basic_data.date.max())\n SW_data = DataSource('basic_info_IndustrySw').read()\n SW_data['code'] = SW_data.code.astype('int')\n\n SW_data_2014_2 = SW_data[(SW_data.version==2014)&(SW_data.industry_sw_level==2)][['code','name']].rename(columns={'code':'industry_sw_level2_0','name':'name_SW2'})\n\n SW_data_2021_2 = SW_data[(SW_data.version==2021)&(SW_data.industry_sw_level==2)][['code','name']].rename(columns={'code':'industry_sw_level2_0','name':'name_SW2'})\n\n basic_data['daily_return_0'] = basic_data['daily_return_0']-1\n\n if end_date < '2021-12-13':\n\n basic_data = 
basic_data.merge(SW_data_2014_2,how='left',on='industry_sw_level2_0')\n\n else:\n basic_data_2014 = basic_data[basic_data.date < '2021-12-13']\n\n basic_data_2014 = basic_data_2014.merge(SW_data_2014_2,how='left',on='industry_sw_level2_0')\n\n basic_data_2021 = basic_data[basic_data.date >= '2021-12-13']\n\n basic_data_2021 = basic_data_2021.merge(SW_data_2021_2,how='left',on='industry_sw_level2_0')\n\n basic_data = pd.concat([basic_data_2014,basic_data_2021])\n \n block_data = basic_data.groupby(['name_SW2','date']).agg({'daily_return_0':'sum','market_cap_float_0':'sum'}).reset_index()\n block_data.columns = ['name_SW2','date','daily_return_0_block_sum','market_cap_float_0_block_sum']\n rst = pd.merge(basic_data[['date','name_SW2','instrument','daily_return_0','market_cap_float_0']],block_data,on=['date','name_SW2'],how='left').dropna()\n \n rst['板块流通收益率'] = rst['daily_return_0']*rst['market_cap_float_0']/rst['market_cap_float_0_block_sum']\n data_1 = DataSource.write_df(rst)\n return Outputs(data_1=data_1)\n","type":"Literal","bound_global_parameter":null},{"name":"post_run","value":"# 后处理函数,可选。输入是主函数的输出,可以在这里对数据做处理,或者返回更友好的outputs数据格式。此函数输出不会被缓存。\ndef bigquant_run(outputs):\n return outputs\n","type":"Literal","bound_global_parameter":null},{"name":"input_ports","value":"","type":"Literal","bound_global_parameter":null},{"name":"params","value":"{\n 
'SW_type':'name_SW2'\n}","type":"Literal","bound_global_parameter":null},{"name":"output_ports","value":"data_1,data_2","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"input_1","node_id":"-173"},{"name":"input_2","node_id":"-173"},{"name":"input_3","node_id":"-173"}],"output_ports":[{"name":"data_1","node_id":"-173"},{"name":"data_2","node_id":"-173"},{"name":"data_3","node_id":"-173"}],"cacheable":true,"seq_num":34,"comment":"申万二级收益","comment_collapsed":false},{"node_id":"-250","module_id":"BigQuantSpace.trade.trade-v4","parameters":[{"name":"start_date","value":"","type":"Literal","bound_global_parameter":null},{"name":"end_date","value":"","type":"Literal","bound_global_parameter":null},{"name":"initialize","value":"# 回测引擎:初始化函数,只执行一次\ndef bigquant_run(context):\n # 加载预测数据\n context.ranker_prediction = context.options['data'].read_df()\n\n # 系统已经设置了默认的交易手续费和滑点,要修改手续费可使用如下函数\n context.set_commission(PerOrder(buy_cost=0.0003, sell_cost=0.0013, min_cost=5))\n # 预测数据,通过options传入进来,使用 read_df 函数,加载到内存 (DataFrame)\n # 设置买入的股票数量,这里买入预测股票列表排名靠前的5只\n stock_count = 5\n # 每只的股票的权重,如下的权重分配会使得靠前的股票分配多一点的资金,[0.339160, 0.213986, 0.169580, ..]\n context.stock_weights = T.norm([1 / math.log(i + 2) for i in range(0, stock_count)])\n # 设置每只股票占用的最大资金比例\n context.max_cash_per_instrument = 1\n context.options['hold_days'] = 0\n","type":"Literal","bound_global_parameter":null},{"name":"handle_data","value":"# 回测引擎:每日数据处理函数,每天执行一次\ndef bigquant_run(context, data):\n # 按日期过滤得到今日的预测数据\n ranker_prediction = context.ranker_prediction[\n context.ranker_prediction.date == data.current_dt.strftime('%Y-%m-%d')]\n\n # 1. 
资金分配\n # 平均持仓时间是hold_days,每日都将买入股票,每日预期使用 1/hold_days 的资金\n # 实际操作中,会存在一定的买入误差,所以在前hold_days天,等量使用资金;之后,尽量使用剩余资金(这里设置最多用等量的1.5倍)\n is_staging = context.trading_day_index < context.options['hold_days'] # 是否在建仓期间(前 hold_days 天)\n cash_avg = context.portfolio.portfolio_value / context.options['hold_days']\n cash_for_buy = min(context.portfolio.cash, (1 if is_staging else 1.5) * cash_avg)\n cash_for_sell = cash_avg - (context.portfolio.cash - cash_for_buy)\n positions = {e.symbol: p.amount * p.last_sale_price\n for e, p in context.portfolio.positions.items()}\n\n # 2. 生成卖出订单:hold_days天之后才开始卖出;对持仓的股票,按机器学习算法预测的排序末位淘汰\n if not is_staging and cash_for_sell > 0:\n equities = {e.symbol: e for e, p in context.portfolio.positions.items()}\n instruments = list(reversed(list(ranker_prediction.instrument[ranker_prediction.instrument.apply(\n lambda x: x in equities)])))\n\n for instrument in instruments:\n context.order_target(context.symbol(instrument), 0)\n cash_for_sell -= positions[instrument]\n if cash_for_sell <= 0:\n break\n\n # 3. 
生成买入订单:按机器学习算法预测的排序,买入前面的stock_count只股票\n buy_cash_weights = context.stock_weights\n buy_instruments = list(ranker_prediction.instrument[:len(buy_cash_weights)])\n max_cash_per_instrument = context.portfolio.portfolio_value * context.max_cash_per_instrument\n for i, instrument in enumerate(buy_instruments):\n cash = cash_for_buy * buy_cash_weights[i]\n if cash > max_cash_per_instrument - positions.get(instrument, 0):\n # 确保股票持仓量不会超过每次股票最大的占用资金量\n cash = max_cash_per_instrument - positions.get(instrument, 0)\n if cash > 0:\n context.order_value(context.symbol(instrument), cash)\n","type":"Literal","bound_global_parameter":null},{"name":"prepare","value":"# 回测引擎:准备数据,只执行一次\ndef bigquant_run(context):\n pass\n","type":"Literal","bound_global_parameter":null},{"name":"before_trading_start","value":"","type":"Literal","bound_global_parameter":null},{"name":"volume_limit","value":0.025,"type":"Literal","bound_global_parameter":null},{"name":"order_price_field_buy","value":"open","type":"Literal","bound_global_parameter":null},{"name":"order_price_field_sell","value":"close","type":"Literal","bound_global_parameter":null},{"name":"capital_base","value":1000000,"type":"Literal","bound_global_parameter":null},{"name":"auto_cancel_non_tradable_orders","value":"True","type":"Literal","bound_global_parameter":null},{"name":"data_frequency","value":"daily","type":"Literal","bound_global_parameter":null},{"name":"price_type","value":"真实价格","type":"Literal","bound_global_parameter":null},{"name":"product_type","value":"股票","type":"Literal","bound_global_parameter":null},{"name":"plot_charts","value":"True","type":"Literal","bound_global_parameter":null},{"name":"backtest_only","value":"False","type":"Literal","bound_global_parameter":null},{"name":"benchmark","value":"000300.SHA","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"instruments","node_id":"-250"},{"name":"options_data","node_id":"-250"},{"name":"history_ds","node_id":"-250"},{"name":"benchmark_d
s","node_id":"-250"},{"name":"trading_calendar","node_id":"-250"}],"output_ports":[{"name":"raw_perf","node_id":"-250"}],"cacheable":false,"seq_num":10,"comment":"","comment_collapsed":true},{"node_id":"-1552","module_id":"BigQuantSpace.input_features.input_features-v1","parameters":[{"name":"features","value":"industry_sw_level2_0\nmarket_cap_float_0\ndaily_return_0\n","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"features_ds","node_id":"-1552"}],"output_ports":[{"name":"data","node_id":"-1552"}],"cacheable":true,"seq_num":4,"comment":"","comment_collapsed":true},{"node_id":"-1557","module_id":"BigQuantSpace.use_datasource.use_datasource-v1","parameters":[{"name":"datasource_id","value":"bar1d_CN_STOCK_A","type":"Literal","bound_global_parameter":null},{"name":"start_date","value":"","type":"Literal","bound_global_parameter":null},{"name":"end_date","value":"","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"instruments","node_id":"-1557"},{"name":"features","node_id":"-1557"}],"output_ports":[{"name":"data","node_id":"-1557"}],"cacheable":true,"seq_num":5,"comment":"日线数据-标注","comment_collapsed":false},{"node_id":"-1564","module_id":"BigQuantSpace.general_feature_extractor.general_feature_extractor-v7","parameters":[{"name":"start_date","value":"","type":"Literal","bound_global_parameter":null},{"name":"end_date","value":"","type":"Literal","bound_global_parameter":null},{"name":"before_start_days","value":90,"type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"instruments","node_id":"-1564"},{"name":"features","node_id":"-1564"}],"output_ports":[{"name":"data","node_id":"-1564"}],"cacheable":true,"seq_num":11,"comment":"","comment_collapsed":true},{"node_id":"-1574","module_id":"BigQuantSpace.cached.cached-v3","parameters":[{"name":"run","value":"# Python 代码入口函数,input_1/2/3 对应三个输入端,data_1/2/3 对应三个输出端\ndef bigquant_run(input_1, input_2, input_3, SW_type):\n basic_data = input_1.read()\n 
start_date = str(basic_data.date.min())\n end_date = str(basic_data.date.max())\n SW_data = DataSource('basic_info_IndustrySw').read()\n SW_data['code'] = SW_data.code.astype('int')\n\n SW_data_2014_2 = SW_data[(SW_data.version==2014)&(SW_data.industry_sw_level==2)][['code','name']].rename(columns={'code':'industry_sw_level2_0','name':'name_SW2'})\n\n SW_data_2021_2 = SW_data[(SW_data.version==2021)&(SW_data.industry_sw_level==2)][['code','name']].rename(columns={'code':'industry_sw_level2_0','name':'name_SW2'})\n\n basic_data['daily_return_0'] = basic_data['daily_return_0']-1\n\n if end_date < '2021-12-13':\n\n basic_data = basic_data.merge(SW_data_2014_2,how='left',on='industry_sw_level2_0')\n\n else:\n basic_data_2014 = basic_data[basic_data.date < '2021-12-13']\n\n basic_data_2014 = basic_data_2014.merge(SW_data_2014_2,how='left',on='industry_sw_level2_0')\n\n basic_data_2021 = basic_data[basic_data.date >= '2021-12-13']\n\n basic_data_2021 = basic_data_2021.merge(SW_data_2021_2,how='left',on='industry_sw_level2_0')\n\n basic_data = pd.concat([basic_data_2014,basic_data_2021])\n \n block_data = basic_data.groupby(['name_SW2','date']).agg({'daily_return_0':'sum','market_cap_float_0':'sum'}).reset_index()\n block_data.columns = ['name_SW2','date','daily_return_0_block_sum','market_cap_float_0_block_sum']\n rst = pd.merge(basic_data[['date','name_SW2','instrument','daily_return_0','market_cap_float_0']],block_data,on=['date','name_SW2'],how='left').dropna()\n \n rst['板块流通收益率'] = rst['daily_return_0']*rst['market_cap_float_0']/rst['market_cap_float_0_block_sum']\n data_1 = DataSource.write_df(rst)\n return Outputs(data_1=data_1)\n","type":"Literal","bound_global_parameter":null},{"name":"post_run","value":"# 后处理函数,可选。输入是主函数的输出,可以在这里对数据做处理,或者返回更友好的outputs数据格式。此函数输出不会被缓存。\ndef bigquant_run(outputs):\n return 
outputs\n","type":"Literal","bound_global_parameter":null},{"name":"input_ports","value":"","type":"Literal","bound_global_parameter":null},{"name":"params","value":"{\n 'SW_type':'name_SW2'\n}","type":"Literal","bound_global_parameter":null},{"name":"output_ports","value":"data_1,data_2","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"input_1","node_id":"-1574"},{"name":"input_2","node_id":"-1574"},{"name":"input_3","node_id":"-1574"}],"output_ports":[{"name":"data_1","node_id":"-1574"},{"name":"data_2","node_id":"-1574"},{"name":"data_3","node_id":"-1574"}],"cacheable":true,"seq_num":19,"comment":"申万二级收益","comment_collapsed":false},{"node_id":"-1583","module_id":"BigQuantSpace.join.join-v3","parameters":[{"name":"on","value":"date,instrument","type":"Literal","bound_global_parameter":null},{"name":"how","value":"inner","type":"Literal","bound_global_parameter":null},{"name":"sort","value":"False","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"data1","node_id":"-1583"},{"name":"data2","node_id":"-1583"}],"output_ports":[{"name":"data","node_id":"-1583"}],"cacheable":true,"seq_num":22,"comment":"连接板块数据","comment_collapsed":false},{"node_id":"-1590","module_id":"BigQuantSpace.join.join-v3","parameters":[{"name":"on","value":"date,instrument","type":"Literal","bound_global_parameter":null},{"name":"how","value":"inner","type":"Literal","bound_global_parameter":null},{"name":"sort","value":"False","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"data1","node_id":"-1590"},{"name":"data2","node_id":"-1590"}],"output_ports":[{"name":"data","node_id":"-1590"}],"cacheable":true,"seq_num":24,"comment":"连接因子数据和板块数据","comment_collapsed":false},{"node_id":"-1597","module_id":"BigQuantSpace.filter.filter-v3","parameters":[{"name":"expr","value":"rank(板块流通收益率)>0.95","type":"Literal","bound_global_parameter":null},{"name":"output_left_data","value":"False","type":"Literal","bound_global_parameter":null}],"i
nput_ports":[{"name":"input_data","node_id":"-1597"}],"output_ports":[{"name":"data","node_id":"-1597"},{"name":"left_data","node_id":"-1597"}],"cacheable":true,"seq_num":25,"comment":"过滤板块收益率","comment_collapsed":false},{"node_id":"-1603","module_id":"BigQuantSpace.filter.filter-v3","parameters":[{"name":"expr","value":"rank(板块流通收益率)>0.95","type":"Literal","bound_global_parameter":null},{"name":"output_left_data","value":"False","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"input_data","node_id":"-1603"}],"output_ports":[{"name":"data","node_id":"-1603"},{"name":"left_data","node_id":"-1603"}],"cacheable":true,"seq_num":30,"comment":"过滤板块流通收益率","comment_collapsed":false},{"node_id":"-2027","module_id":"BigQuantSpace.sort.sort-v5","parameters":[{"name":"sort_by","value":"score","type":"Literal","bound_global_parameter":null},{"name":"group_by","value":"date","type":"Literal","bound_global_parameter":null},{"name":"keep_columns","value":"--","type":"Literal","bound_global_parameter":null},{"name":"ascending","value":"True","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"input_ds","node_id":"-2027"},{"name":"sort_by_ds","node_id":"-2027"}],"output_ports":[{"name":"sorted_data","node_id":"-2027"}],"cacheable":true,"seq_num":3,"comment":"降序","comment_collapsed":true},{"node_id":"-2038","module_id":"BigQuantSpace.trade.trade-v4","parameters":[{"name":"start_date","value":"","type":"Literal","bound_global_parameter":null},{"name":"end_date","value":"","type":"Literal","bound_global_parameter":null},{"name":"initialize","value":"# 回测引擎:初始化函数,只执行一次\ndef bigquant_run(context):\n # 加载预测数据\n context.ranker_prediction = context.options['data'].read_df()\n\n # 系统已经设置了默认的交易手续费和滑点,要修改手续费可使用如下函数\n context.set_commission(PerOrder(buy_cost=0.0003, sell_cost=0.0013, min_cost=5))\n # 预测数据,通过options传入进来,使用 read_df 函数,加载到内存 (DataFrame)\n # 设置买入的股票数量,这里买入预测股票列表排名靠前的5只\n stock_count = 5\n # 每只的股票的权重,如下的权重分配会使得靠前的股票分配多一点的资金,[0.339160, 
0.213986, 0.169580, ..]\n context.stock_weights = T.norm([1 / math.log(i + 2) for i in range(0, stock_count)])\n # 设置每只股票占用的最大资金比例\n context.max_cash_per_instrument = 1\n context.options['hold_days'] = 0\n","type":"Literal","bound_global_parameter":null},{"name":"handle_data","value":"# 回测引擎:每日数据处理函数,每天执行一次\ndef bigquant_run(context, data):\n # 按日期过滤得到今日的预测数据\n ranker_prediction = context.ranker_prediction[\n context.ranker_prediction.date == data.current_dt.strftime('%Y-%m-%d')]\n\n # 1. 资金分配\n # 平均持仓时间是hold_days,每日都将买入股票,每日预期使用 1/hold_days 的资金\n # 实际操作中,会存在一定的买入误差,所以在前hold_days天,等量使用资金;之后,尽量使用剩余资金(这里设置最多用等量的1.5倍)\n is_staging = context.trading_day_index < context.options['hold_days'] # 是否在建仓期间(前 hold_days 天)\n cash_avg = context.portfolio.portfolio_value / context.options['hold_days']\n cash_for_buy = min(context.portfolio.cash, (1 if is_staging else 1.5) * cash_avg)\n cash_for_sell = cash_avg - (context.portfolio.cash - cash_for_buy)\n positions = {e.symbol: p.amount * p.last_sale_price\n for e, p in context.portfolio.positions.items()}\n\n # 2. 生成卖出订单:hold_days天之后才开始卖出;对持仓的股票,按机器学习算法预测的排序末位淘汰\n if not is_staging and cash_for_sell > 0:\n equities = {e.symbol: e for e, p in context.portfolio.positions.items()}\n instruments = list(reversed(list(ranker_prediction.instrument[ranker_prediction.instrument.apply(\n lambda x: x in equities)])))\n\n for instrument in instruments:\n context.order_target(context.symbol(instrument), 0)\n cash_for_sell -= positions[instrument]\n if cash_for_sell <= 0:\n break\n\n # 3. 
生成买入订单:按机器学习算法预测的排序,买入前面的stock_count只股票\n buy_cash_weights = context.stock_weights\n buy_instruments = list(ranker_prediction.instrument[:len(buy_cash_weights)])\n max_cash_per_instrument = context.portfolio.portfolio_value * context.max_cash_per_instrument\n for i, instrument in enumerate(buy_instruments):\n cash = cash_for_buy * buy_cash_weights[i]\n if cash > max_cash_per_instrument - positions.get(instrument, 0):\n # 确保股票持仓量不会超过每次股票最大的占用资金量\n cash = max_cash_per_instrument - positions.get(instrument, 0)\n if cash > 0:\n context.order_value(context.symbol(instrument), cash)\n","type":"Literal","bound_global_parameter":null},{"name":"prepare","value":"# 回测引擎:准备数据,只执行一次\ndef bigquant_run(context):\n pass\n","type":"Literal","bound_global_parameter":null},{"name":"before_trading_start","value":"","type":"Literal","bound_global_parameter":null},{"name":"volume_limit","value":0.025,"type":"Literal","bound_global_parameter":null},{"name":"order_price_field_buy","value":"open","type":"Literal","bound_global_parameter":null},{"name":"order_price_field_sell","value":"close","type":"Literal","bound_global_parameter":null},{"name":"capital_base","value":1000000,"type":"Literal","bound_global_parameter":null},{"name":"auto_cancel_non_tradable_orders","value":"True","type":"Literal","bound_global_parameter":null},{"name":"data_frequency","value":"daily","type":"Literal","bound_global_parameter":null},{"name":"price_type","value":"真实价格","type":"Literal","bound_global_parameter":null},{"name":"product_type","value":"股票","type":"Literal","bound_global_parameter":null},{"name":"plot_charts","value":"True","type":"Literal","bound_global_parameter":null},{"name":"backtest_only","value":"False","type":"Literal","bound_global_parameter":null},{"name":"benchmark","value":"000300.SHA","type":"Literal","bound_global_parameter":null}],"input_ports":[{"name":"instruments","node_id":"-2038"},{"name":"options_data","node_id":"-2038"},{"name":"history_ds","node_id":"-2038"},{"name":"benchmar
k_ds","node_id":"-2038"},{"name":"trading_calendar","node_id":"-2038"}],"output_ports":[{"name":"raw_perf","node_id":"-2038"}],"cacheable":false,"seq_num":32,"comment":"","comment_collapsed":true}],"node_layout":"<node_postions><node_position Node='287d2cb0-f53c-4101-bdf8-104b137c8601-8' Position='-115,-189,200,200'/><node_position Node='287d2cb0-f53c-4101-bdf8-104b137c8601-43' Position='574,634,200,200'/><node_position Node='287d2cb0-f53c-4101-bdf8-104b137c8601-53' Position='-14,319,200,200'/><node_position Node='287d2cb0-f53c-4101-bdf8-104b137c8601-60' Position='909,705,200,200'/><node_position Node='287d2cb0-f53c-4101-bdf8-104b137c8601-62' Position='1391,394,200,200'/><node_position Node='287d2cb0-f53c-4101-bdf8-104b137c8601-84' Position='333,583,200,200'/><node_position Node='-86' Position='1056,625,200,200'/><node_position Node='-215' Position='381,188,200,200'/><node_position Node='-222' Position='385,280,200,200'/><node_position Node='-231' Position='1044,175,200,200'/><node_position Node='-238' Position='1038,232,200,200'/><node_position Node='-15566' Position='-332,217,200,200'/><node_position Node='-15559' Position='-174,128,200,200'/><node_position Node='-15242' Position='-468,443,200,200'/><node_position Node='-2125' Position='678,-45,200,200'/><node_position Node='-4170' Position='918,789,200,200'/><node_position Node='-6470' Position='1053,409,200,200'/><node_position Node='-6473' Position='1053,464,200,200'/><node_position Node='-2914' Position='1052,352,200,200'/><node_position Node='-2918' Position='21,468,200,200'/><node_position Node='-406' Position='-474,-39,200,200'/><node_position Node='-419' Position='-490,34,200,200'/><node_position Node='-173' Position='-484,129,200,200'/><node_position Node='-250' Position='1017,926,200,200'/><node_position Node='-1552' Position='1795,301,200,200'/><node_position Node='-1557' Position='1828,199,200,200'/><node_position Node='-1564' Position='1783,366,200,200'/><node_position Node='-1574' 
Position='1732,448,200,200'/><node_position Node='-1583' Position='1733,531,200,200'/><node_position Node='-1590' Position='1659,641,200,200'/><node_position Node='-1597' Position='-462,323,200,200'/><node_position Node='-1603' Position='1053,531,200,200'/><node_position Node='-2027' Position='1286,809,200,200'/><node_position Node='-2038' Position='1384,946,200,200'/></node_postions>"},"nodes_readonly":false,"studio_version":"v2"}
    In [ ]:
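    # This code was auto-generated by the visual strategy environment on 2022-09-09 14:05.
    # This code cell can only be edited in visual mode. You can also copy the code into a new code cell or strategy and modify it there.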
    
    
    # Python code entry function: input_1/2/3 map to the three input ports, data_1/2/3 to the three output ports
    def m34_run_bigquant_run(input_1, input_2, input_3, SW_type):
        # Per-stock daily rows with industry_sw_level2_0, market_cap_float_0 and daily_return_0
        basic_data = input_1.read()
        start_date = str(basic_data.date.min())
        end_date = str(basic_data.date.max())
        # Shenwan (SW) industry table; cast the industry code to int so it matches the factor column
        SW_data = DataSource('basic_info_IndustrySw').read()
        SW_data['code'] = SW_data.code.astype('int')
    
        # Level-2 industry code -> name lookups for the 2014 and 2021 SW classification versions
        SW_data_2014_2 = SW_data[(SW_data.version==2014)&(SW_data.industry_sw_level==2)][['code','name']].rename(columns={'code':'industry_sw_level2_0','name':'name_SW2'})
        SW_data_2021_2 = SW_data[(SW_data.version==2021)&(SW_data.industry_sw_level==2)][['code','name']].rename(columns={'code':'industry_sw_level2_0','name':'name_SW2'})
    
        # daily_return_0 is treated as a price ratio here; subtracting 1 converts it into a simple return
        basic_data['daily_return_0'] = basic_data['daily_return_0']-1
    
        # Use the 2014 classification before 2021-12-13 and the 2021 classification from that date on
        if end_date < '2021-12-13':
            basic_data = basic_data.merge(SW_data_2014_2,how='left',on='industry_sw_level2_0')
        else:
            basic_data_2014 = basic_data[basic_data.date < '2021-12-13']
            basic_data_2014 = basic_data_2014.merge(SW_data_2014_2,how='left',on='industry_sw_level2_0')
            basic_data_2021 = basic_data[basic_data.date >= '2021-12-13']
            basic_data_2021 = basic_data_2021.merge(SW_data_2021_2,how='left',on='industry_sw_level2_0')
            basic_data = pd.concat([basic_data_2014,basic_data_2021])
    
        # Per sector and day: sum of member returns and sum of member float market caps
        block_data = basic_data.groupby(['name_SW2','date']).agg({'daily_return_0':'sum','market_cap_float_0':'sum'}).reset_index()
        block_data.columns = ['name_SW2','date','daily_return_0_block_sum','market_cap_float_0_block_sum']
        # Attach the sector totals to every stock row and drop rows with missing values
        rst = pd.merge(basic_data[['date','name_SW2','instrument','daily_return_0','market_cap_float_0']],block_data,on=['date','name_SW2'],how='left').dropna()
    
        # Each stock's float-cap-weighted contribution to its sector's return
        rst['板块流通收益率'] = rst['daily_return_0']*rst['market_cap_float_0']/rst['market_cap_float_0_block_sum']
        data_1 = DataSource.write_df(rst)
        return Outputs(data_1=data_1)
    
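    # Post-processing function, optional. Its input is the main function's output; process the data here or return a friendlier outputs format. The output of this function is not cached.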
    def m34_post_run_bigquant_run(outputs):
        return outputs
    
    # Python code entry function: input_1/2/3 map to the three input ports, data_1/2/3 to the three output ports
    def m19_run_bigquant_run(input_1, input_2, input_3, SW_type):
        # Per-stock daily rows with industry_sw_level2_0, market_cap_float_0 and daily_return_0
        basic_data = input_1.read()
        start_date = str(basic_data.date.min())
        end_date = str(basic_data.date.max())
        # Shenwan (SW) industry table; cast the industry code to int so it matches the factor column
        SW_data = DataSource('basic_info_IndustrySw').read()
        SW_data['code'] = SW_data.code.astype('int')
    
        # Level-2 industry code -> name lookups for the 2014 and 2021 SW classification versions
        SW_data_2014_2 = SW_data[(SW_data.version==2014)&(SW_data.industry_sw_level==2)][['code','name']].rename(columns={'code':'industry_sw_level2_0','name':'name_SW2'})
        SW_data_2021_2 = SW_data[(SW_data.version==2021)&(SW_data.industry_sw_level==2)][['code','name']].rename(columns={'code':'industry_sw_level2_0','name':'name_SW2'})
    
        # daily_return_0 is treated as a price ratio here; subtracting 1 converts it into a simple return
        basic_data['daily_return_0'] = basic_data['daily_return_0']-1
    
        # Use the 2014 classification before 2021-12-13 and the 2021 classification from that date on
        if end_date < '2021-12-13':
            basic_data = basic_data.merge(SW_data_2014_2,how='left',on='industry_sw_level2_0')
        else:
            basic_data_2014 = basic_data[basic_data.date < '2021-12-13']
            basic_data_2014 = basic_data_2014.merge(SW_data_2014_2,how='left',on='industry_sw_level2_0')
            basic_data_2021 = basic_data[basic_data.date >= '2021-12-13']
            basic_data_2021 = basic_data_2021.merge(SW_data_2021_2,how='left',on='industry_sw_level2_0')
            basic_data = pd.concat([basic_data_2014,basic_data_2021])
    
        # Per sector and day: sum of member returns and sum of member float market caps
        block_data = basic_data.groupby(['name_SW2','date']).agg({'daily_return_0':'sum','market_cap_float_0':'sum'}).reset_index()
        block_data.columns = ['name_SW2','date','daily_return_0_block_sum','market_cap_float_0_block_sum']
        # Attach the sector totals to every stock row and drop rows with missing values
        rst = pd.merge(basic_data[['date','name_SW2','instrument','daily_return_0','market_cap_float_0']],block_data,on=['date','name_SW2'],how='left').dropna()
    
        # Each stock's float-cap-weighted contribution to its sector's return
        rst['板块流通收益率'] = rst['daily_return_0']*rst['market_cap_float_0']/rst['market_cap_float_0_block_sum']
        data_1 = DataSource.write_df(rst)
        return Outputs(data_1=data_1)
    
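    # Post-processing function, optional. Its input is the main function's output; process the data here or return a friendlier outputs format. The output of this function is not cached.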
    def m19_post_run_bigquant_run(outputs):
        return outputs
    
    # Backtest engine: initialization function, runs only once
    def m10_initialize_bigquant_run(context):
        # Load the prediction data passed in via options; read_df loads it into memory as a DataFrame
        context.ranker_prediction = context.options['data'].read_df()
    
        # The system already sets default commission and slippage; use the call below to change the commission
        context.set_commission(PerOrder(buy_cost=0.0003, sell_cost=0.0013, min_cost=5))
        # Number of stocks to buy: the top 5 in the ranked prediction list
        stock_count = 5
        # Per-stock weights; this scheme gives higher-ranked stocks more capital: [0.339160, 0.213986, 0.169580, ..]
        context.stock_weights = T.norm([1 / math.log(i + 2) for i in range(0, stock_count)])
        # Maximum fraction of capital a single stock may take
        context.max_cash_per_instrument = 1
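        # NOTE: handle_data divides portfolio_value by hold_days, so hold_days = 0 is a division by zero there
        context.options['hold_days'] = 0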
    
    # Backtest engine: daily data handler, runs once per day
    def m10_handle_data_bigquant_run(context, data):
        # Filter by date to get today's predictions
        ranker_prediction = context.ranker_prediction[
            context.ranker_prediction.date == data.current_dt.strftime('%Y-%m-%d')]
    
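        # 1. Capital allocation
        # The average holding period is hold_days and stocks are bought every day, so each day is expected to use 1/hold_days of the capital
        # In practice buys are inexact, so equal amounts are used during the first hold_days days; afterwards, use as much of the remaining cash as possible (capped here at 1.5x the equal amount)
        is_staging = context.trading_day_index < context.options['hold_days'] # whether still in the position-building period (first hold_days days)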
        cash_avg = context.portfolio.portfolio_value / context.options['hold_days']
        cash_for_buy = min(context.portfolio.cash, (1 if is_staging else 1.5) * cash_avg)
        cash_for_sell = cash_avg - (context.portfolio.cash - cash_for_buy)
        positions = {e.symbol: p.amount * p.last_sale_price
                     for e, p in context.portfolio.positions.items()}
    
        # 2. Generate sell orders: selling starts only after hold_days days; held stocks are eliminated from the bottom of the model's predicted ranking
        if not is_staging and cash_for_sell > 0:
            equities = {e.symbol: e for e, p in context.portfolio.positions.items()}
            instruments = list(reversed(list(ranker_prediction.instrument[ranker_prediction.instrument.apply(
                    lambda x: x in equities)])))
    
            for instrument in instruments:
                context.order_target(context.symbol(instrument), 0)
                cash_for_sell -= positions[instrument]
                if cash_for_sell <= 0:
                    break
    
        # 3. Generate buy orders: following the model's predicted ranking, buy the top stock_count stocks
        buy_cash_weights = context.stock_weights
        buy_instruments = list(ranker_prediction.instrument[:len(buy_cash_weights)])
        max_cash_per_instrument = context.portfolio.portfolio_value * context.max_cash_per_instrument
        for i, instrument in enumerate(buy_instruments):
            cash = cash_for_buy * buy_cash_weights[i]
            if cash > max_cash_per_instrument - positions.get(instrument, 0):
                # Make sure a stock's position does not exceed the per-stock maximum capital
                cash = max_cash_per_instrument - positions.get(instrument, 0)
            if cash > 0:
                context.order_value(context.symbol(instrument), cash)
    
    # Backtest engine: data preparation, runs only once
    def m10_prepare_bigquant_run(context):
        pass
    
    # Backtest engine: initialization function, runs only once
    def m32_initialize_bigquant_run(context):
        # Load the prediction data passed in via options; read_df loads it into memory as a DataFrame
        context.ranker_prediction = context.options['data'].read_df()
    
        # The system already sets default commission and slippage; use the call below to change the commission
        context.set_commission(PerOrder(buy_cost=0.0003, sell_cost=0.0013, min_cost=5))
        # Number of stocks to buy: the top 5 in the ranked prediction list
        stock_count = 5
        # Per-stock weights; this scheme gives higher-ranked stocks more capital: [0.339160, 0.213986, 0.169580, ..]
        context.stock_weights = T.norm([1 / math.log(i + 2) for i in range(0, stock_count)])
        # Maximum fraction of capital a single stock may take
        context.max_cash_per_instrument = 1
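        # NOTE: handle_data divides portfolio_value by hold_days, so hold_days = 0 is a division by zero there
        context.options['hold_days'] = 0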
    
    # Backtest engine: daily data handler, runs once per day
    def m32_handle_data_bigquant_run(context, data):
        # Filter by date to get today's predictions
        ranker_prediction = context.ranker_prediction[
            context.ranker_prediction.date == data.current_dt.strftime('%Y-%m-%d')]
    
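        # 1. Capital allocation
        # The average holding period is hold_days and stocks are bought every day, so each day is expected to use 1/hold_days of the capital
        # In practice buys are inexact, so equal amounts are used during the first hold_days days; afterwards, use as much of the remaining cash as possible (capped here at 1.5x the equal amount)
        is_staging = context.trading_day_index < context.options['hold_days'] # whether still in the position-building period (first hold_days days)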
        cash_avg = context.portfolio.portfolio_value / context.options['hold_days']
        cash_for_buy = min(context.portfolio.cash, (1 if is_staging else 1.5) * cash_avg)
        cash_for_sell = cash_avg - (context.portfolio.cash - cash_for_buy)
        positions = {e.symbol: p.amount * p.last_sale_price
                     for e, p in context.portfolio.positions.items()}
    
        # 2. Generate sell orders: selling starts only after hold_days days; held stocks are eliminated from the bottom of the model's predicted ranking
        if not is_staging and cash_for_sell > 0:
            equities = {e.symbol: e for e, p in context.portfolio.positions.items()}
            instruments = list(reversed(list(ranker_prediction.instrument[ranker_prediction.instrument.apply(
                    lambda x: x in equities)])))
    
            for instrument in instruments:
                context.order_target(context.symbol(instrument), 0)
                cash_for_sell -= positions[instrument]
                if cash_for_sell <= 0:
                    break
    
        # 3. Generate buy orders: following the model's predicted ranking, buy the top stock_count stocks
        buy_cash_weights = context.stock_weights
        buy_instruments = list(ranker_prediction.instrument[:len(buy_cash_weights)])
        max_cash_per_instrument = context.portfolio.portfolio_value * context.max_cash_per_instrument
        for i, instrument in enumerate(buy_instruments):
            cash = cash_for_buy * buy_cash_weights[i]
            if cash > max_cash_per_instrument - positions.get(instrument, 0):
                # Make sure a stock's position does not exceed the per-stock maximum capital
                cash = max_cash_per_instrument - positions.get(instrument, 0)
            if cash > 0:
                context.order_value(context.symbol(instrument), cash)
    
    # Backtest engine: data preparation, runs only once
    def m32_prepare_bigquant_run(context):
        pass
    
    
    m1 = M.instruments.v2(
        start_date='2018-06-01',
        end_date='2022-01-01',
        market='CN_STOCK_A',
        instrument_list='',
        max_count=0
    )
    
    m20 = M.use_datasource.v1(
        instruments=m1.data,
        datasource_id='bar1d_CN_STOCK_A',
        start_date='',
        end_date=''
    )
    
    m9 = M.instruments.v2(
        start_date=T.live_run_param('trading_date', '2022-01-01'),
        end_date=T.live_run_param('trading_date', '2022-06-22'),
        market='CN_STOCK_A',
        instrument_list='',
        max_count=0
    )
    
    m5 = M.use_datasource.v1(
        instruments=m9.data,
        datasource_id='bar1d_CN_STOCK_A',
        start_date='',
        end_date=''
    )
    
    m2 = M.input_features.v1(
        features="""# Lines starting with # are comments
    # Multiple features, one per line; both basic and derived features are allowed
    
    correlation(log(volume_0),return_0,10)
    
    """
    )
    
    m15 = M.general_feature_extractor.v7(
        instruments=m1.data,
        features=m2.data,
        start_date='',
        end_date='',
        before_start_days=90
    )
    
    m16 = M.derived_feature_extractor.v3(
        input_data=m15.data,
        features=m2.data,
        date_col='date',
        instrument_col='instrument',
        drop_na=False,
        remove_extra_columns=False
    )
    
    m17 = M.general_feature_extractor.v7(
        instruments=m9.data,
        features=m2.data,
        start_date='',
        end_date='',
        before_start_days=90
    )
    
    m18 = M.derived_feature_extractor.v3(
        input_data=m17.data,
        features=m2.data,
        date_col='date',
        instrument_col='instrument',
        drop_na=False,
        remove_extra_columns=False
    )
    
    m26 = M.filter_stockcode.v2(
        input_1=m18.data,
        start='688'
    )
    
    m27 = M.filtet_st_stock.v7(
        input_1=m26.data_1
    )
    
    m28 = M.filter_delist_stocks.v3(
        input_1=m27.data_1
    )
    
    m31 = M.input_features.v1(
        features="""industry_sw_level2_0
    market_cap_float_0
    daily_return_0
    """
    )
    
    m33 = M.general_feature_extractor.v7(
        instruments=m1.data,
        features=m31.data,
        start_date='',
        end_date='',
        before_start_days=90
    )
    
    m34 = M.cached.v3(
        input_1=m33.data,
        run=m34_run_bigquant_run,
        post_run=m34_post_run_bigquant_run,
        input_ports='',
        params="""{
        'SW_type':'name_SW2'
    }""",
        output_ports='data_1,data_2'
    )
    
    m12 = M.join.v3(
        data1=m34.data_1,
        data2=m20.data,
        on='date,instrument',
        how='inner',
        sort=False
    )
    
    m25 = M.filter.v3(
        input_data=m12.data,
        expr='rank(板块流通收益率)>0.95',
        output_left_data=False
    )
    
    m21 = M.auto_labeler_on_datasource.v1(
        input_data=m25.data,
        label_expr="""
    
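    # Compute the return: the high 3 days ahead (as the sell price) divided by tomorrow's open (as the buy price), divided by the sector float-weighted return 3 days ahead
    -1*(shift(high, -3)/shift(open, -1))/shift(板块流通收益率, -3) -1
    # Outlier handling: clip at the 1% and 99% quantiles
    clip(label, all_quantile(label, 0.01), all_quantile(label, 0.99))
    
    # Map the score to classes; 20 classes are used here
    all_wbins(label, 20)
    
    # Filter out one-price limit-up days (set label to NaN; NaN labels are ignored in later processing and training)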
    where(shift(high, -1) == shift(low, -1), NaN, label)""",
        drop_na_label=True,
        cast_label_int=True,
        date_col='date',
        instrument_col='instrument',
        user_functions={}
    )
    
    m7 = M.join.v3(
        data1=m21.data,
        data2=m16.data,
        on='date,instrument',
        how='inner',
        sort=False
    )
    
    m29 = M.filtet_st_stock.v7(
        input_1=m7.data
    )
    
    m13 = M.dropnan.v1(
        input_data=m29.data_1
    )
    
    m6 = M.stock_ranker_train.v6(
        training_ds=m13.data,
        features=m2.data,
        test_ds=m13.data,
        learning_algorithm='排序',
        number_of_leaves=30,
        minimum_docs_per_leaf=1000,
        number_of_trees=50,
        learning_rate=0.1,
        max_bins=1023,
        feature_fraction=1,
        data_row_fraction=1,
        plot_charts=True,
        ndcg_discount_base=1,
        m_lazy_run=False
    )
    
    m4 = M.input_features.v1(
        features="""industry_sw_level2_0
    market_cap_float_0
    daily_return_0
    """
    )
    
    m11 = M.general_feature_extractor.v7(
        instruments=m9.data,
        features=m4.data,
        start_date='',
        end_date='',
        before_start_days=90
    )
    
    m19 = M.cached.v3(
        input_1=m11.data,
        run=m19_run_bigquant_run,
        post_run=m19_post_run_bigquant_run,
        input_ports='',
        params="""{
        'SW_type':'name_SW2'
    }""",
        output_ports='data_1,data_2'
    )
    
    m22 = M.join.v3(
        data1=m19.data_1,
        data2=m5.data,
        on='date,instrument',
        how='inner',
        sort=False
    )
    
    m24 = M.join.v3(
        data1=m28.data,
        data2=m22.data,
        on='date,instrument',
        how='inner',
        sort=False
    )
    
    m30 = M.filter.v3(
        input_data=m24.data,
        expr='rank(板块流通收益率)>0.95',
        output_left_data=False
    )
    
    m14 = M.dropnan.v1(
        input_data=m30.data
    )
    
    m8 = M.stock_ranker_predict.v5(
        model=m6.model,
        data=m14.data,
        m_lazy_run=False
    )
    
    m23 = M.sort.v5(
        input_ds=m8.predictions,
        sort_by='score',
        group_by='date',
        keep_columns='--',
        ascending=False
    )
    
    m10 = M.trade.v4(
        instruments=m9.data,
        options_data=m23.sorted_data,
        start_date='',
        end_date='',
        initialize=m10_initialize_bigquant_run,
        handle_data=m10_handle_data_bigquant_run,
        prepare=m10_prepare_bigquant_run,
        volume_limit=0.025,
        order_price_field_buy='open',
        order_price_field_sell='close',
        capital_base=1000000,
        auto_cancel_non_tradable_orders=True,
        data_frequency='daily',
        price_type='真实价格',
        product_type='股票',
        plot_charts=True,
        backtest_only=False,
        benchmark='000300.SHA'
    )
    
    m3 = M.sort.v5(
        input_ds=m8.predictions,
        sort_by='score',
        group_by='date',
        keep_columns='--',
        ascending=True
    )
    
    m32 = M.trade.v4(
        instruments=m9.data,
        options_data=m3.sorted_data,
        start_date='',
        end_date='',
        initialize=m32_initialize_bigquant_run,
        handle_data=m32_handle_data_bigquant_run,
        prepare=m32_prepare_bigquant_run,
        volume_limit=0.025,
        order_price_field_buy='open',
        order_price_field_sell='close',
        capital_base=1000000,
        auto_cancel_non_tradable_orders=True,
        data_frequency='daily',
        price_type='真实价格',
        product_type='股票',
        plot_charts=True,
        backtest_only=False,
        benchmark='000300.SHA'
    )
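
The custom modules above store, for every stock, its float-cap-weighted contribution to its sector in the `板块流通收益率` column. Summing those contributions within each sector and day gives one return per sector. The following is a minimal sketch with plain pandas, assuming the same column names as the modules above (`date`, `name_SW2`, `daily_return_0`, `market_cap_float_0`). The helper name `sector_float_weighted_return`, the output column `sector_return`, and the toy data are hypothetical and not part of the original strategy.

```python
import pandas as pd


def sector_float_weighted_return(df, sector_col="name_SW2"):
    """Return one float-cap-weighted return per (date, sector).

    Assumes daily_return_0 is already a simple return (ratio minus 1).
    """
    # Total float market cap of each sector on each day, aligned to the stock rows
    cap_sum = df.groupby(["date", sector_col])["market_cap_float_0"].transform("sum")
    # Per-stock contribution: return times the stock's share of sector float cap
    weighted = df.assign(
        _contrib=df["daily_return_0"] * df["market_cap_float_0"] / cap_sum
    )
    # Sum the contributions to get the sector-level return
    out = weighted.groupby(["date", sector_col], as_index=False)["_contrib"].sum()
    return out.rename(columns={"_contrib": "sector_return"})


# Hypothetical toy data: two stocks in sector A, one stock in sector B
toy = pd.DataFrame({
    "date": ["2022-01-04"] * 3,
    "name_SW2": ["A", "A", "B"],
    "instrument": ["s1", "s2", "s3"],
    "daily_return_0": [0.10, -0.02, 0.03],
    "market_cap_float_0": [100.0, 300.0, 50.0],
})
result = sector_float_weighted_return(toy)
print(result)
```

For sector A the result is (0.10 × 100 − 0.02 × 300) / 400 = 0.01, while sector B, with a single member, simply keeps that stock's return of 0.03. Ranking on a sector-level value like this gives every stock in a sector the same rank input, whereas ranking on the per-stock contribution also reflects each stock's own return and weight.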