    In [1]:
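    # This code was auto-generated by the visual strategy environment on 2023-11-21 20:54
    # This code cell can only be edited in visual mode. You can also copy the code into a new code cell or strategy and modify it there.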
     
    # Explicitly import the BigQuant SDK modules
    from bigdatasource.api import DataSource
    from bigdata.api.datareader import D
    from biglearning.api import M
    from biglearning.api import tools as T
    from biglearning.module2.common.data import Outputs
     
    import pandas as pd
    import numpy as np
    import math
    import warnings
    import datetime
     
    from zipline.finance.commission import PerOrder
    from zipline.api import get_open_orders
    from zipline.api import symbol
     
    from bigtrader.sdk import *
    from bigtrader.utils.my_collections import NumPyDeque
    from bigtrader.constant import OrderType
    from bigtrader.constant import Direction
    
    # <aistudiograph>
    
    # @param(id="m8", name="initialize")
    # Trading engine: initialization function, runs only once
    def m8_initialize_bigquant_run(context):
        # context.ranker_prediction = context.options['data'].read_bdb(as_type=pd.DataFrame)
        # The system already sets a default commission and slippage; use the call below to override the commission
        context.set_commission(PerOrder(buy_cost=0.0003, sell_cost=0.0013, min_cost=5))
        # Prediction data is passed in through the module's data input and accessed as context.data (see handle_data)
        # Number of stocks to buy: the top 5 of the predicted ranking
        stock_count = 5
        # Per-stock weights; this scheme gives higher-ranked stocks more capital: [0.339160, 0.213986, 0.169580, ..]
        context.stock_weights = T.norm([1 / math.log(i + 2) for i in range(0, stock_count)])
        # Maximum fraction of capital that a single stock may occupy
        context.max_cash_per_instrument = 0.2
        context.options['hold_days'] = 5
    
    # @param(id="m8", name="before_trading_start")
    # Trading engine: called once before the open of each time unit.
    def m8_before_trading_start_bigquant_run(context, data):
        # Pre-market processing, e.g. subscribing to market data
        pass
    
    # @param(id="m8", name="handle_tick")
    # Trading engine: tick handler, runs once per tick
    def m8_handle_tick_bigquant_run(context, tick):
        pass
    
    # @param(id="m8", name="handle_data")
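    # Backtest engine: daily data handler, runs once per day
    def m8_handle_data_bigquant_run(context, data):
        # Filter by date to get today's predictions
        ranker_prediction = context.data[context.data.date == data.current_dt.strftime('%Y-%m-%d')]
    
        # 1. Capital allocation
        # The average holding period is hold_days and stocks are bought every day, so each day is expected to use 1/hold_days of the capital
        # In practice fills deviate somewhat, so an equal share is used during the first hold_days days; afterwards as much of the remaining cash as possible is used (capped here at 1.5x the equal share)
        is_staging = context.trading_day_index < context.options['hold_days']  # whether still in the position-building period (first hold_days days)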
        cash_avg = context.portfolio.portfolio_value / context.options['hold_days']
        cash_for_buy = min(context.portfolio.cash, (1 if is_staging else 1.5) * cash_avg)
        cash_for_sell = cash_avg - (context.portfolio.cash - cash_for_buy)
        positions = {e: p.amount * p.last_sale_price
                     for e, p in context.portfolio.positions.items()}
    
        # 2. Generate sell orders: selling starts only after hold_days days; among held stocks, those ranked lowest by the model are sold first
        if not is_staging and cash_for_sell > 0:
            equities = {e: e for e, p in context.portfolio.positions.items()}
            instruments = list(reversed(list(ranker_prediction.instrument[ranker_prediction.instrument.apply(
                    lambda x: x in equities)])))
    
            for instrument in instruments:
                context.order_target(context.symbol(instrument), 0)
                cash_for_sell -= positions[instrument]
                if cash_for_sell <= 0:
                    break
    
        # 3. Generate buy orders: buy the top stock_count stocks in the model's predicted ranking
        buy_cash_weights = context.stock_weights
        buy_instruments = list(ranker_prediction.instrument[:len(buy_cash_weights)])
        max_cash_per_instrument = context.portfolio.portfolio_value * context.max_cash_per_instrument
        for i, instrument in enumerate(buy_instruments):
            cash = cash_for_buy * buy_cash_weights[i]
            if cash > max_cash_per_instrument - positions.get(instrument, 0):
                # Ensure the position does not exceed the maximum capital allowed per stock
                cash = max_cash_per_instrument - positions.get(instrument, 0)
            if cash > 0:
                context.order_value(context.symbol(instrument), cash)
    
    # @param(id="m8", name="handle_trade")
    # Trading engine: trade (fill) report handler, runs once per fill
    def m8_handle_trade_bigquant_run(context, trade):
        pass
    
    # @param(id="m8", name="handle_order")
    # Trading engine: order report handler, runs once per order status change
    def m8_handle_order_bigquant_run(context, order):
        pass
    
    # @param(id="m8", name="after_trading")
    # Trading engine: after-market handler, runs once per day after the close
    def m8_after_trading_bigquant_run(context, data):
        pass
    
    
    # @module(position="-453,-383", comment='1. Set the prediction target, e.g. predict based on the forward 5-day return', comment_collapsed=False)
    m1 = M.input_features_dai.v6(
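    sql="""/*
    Use DAI SQL to generate label data for the quantitative model. The label reflects the return over the next 5 days, discretized into 20 bins, each covering a range of returns. The model is therefore trained to predict a return bucket rather than an exact return value.
    
    1. Compute the forward 5-day return together with its 1% and 99% quantiles.
    2. Clip the forward 5-day return to the range between the 1% and 99% quantiles.
    3. Discretize the clipped return into 20 bins and use the bin as the label.
    4. Keep only rows with no null values that are not limit-up/limit-down (next-day high differs from next-day low), and return date, instrument, and label for model training.
    */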
    
    SELECT
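        -- Forward 5-day return: the close 5 days ahead divided by the next day's open. m_lead fetches a column's value a given number of days in the future.
        -- _future_return is an intermediate name; aliased columns starting with _ are not returned in the final result
        m_lead(close, 5) / m_lead(open, 1) AS _future_return,
    
        -- 1% quantile of the forward 5-day return, computed with the quantile function c_quantile_cont.
        c_quantile_cont(_future_return, 0.01) AS _future_return_1pct,
    
        -- 99% quantile of the forward 5-day return, again computed with c_quantile_cont.
        c_quantile_cont(_future_return, 0.99) AS _future_return_99pct,
    
        -- Clip the forward 5-day return: values between the 1% and 99% quantiles are kept, values outside that range are set to the boundary value.
        clip(_future_return, _future_return_1pct, _future_return_99pct) AS _clipped_return,
    
        -- Split the clipped forward 5-day return into 20 bins, each covering a range. cbins discretizes the data into bins.
        cbins(_clipped_return, 20) AS _binned_return,
    
        -- Use the discretized value as the label; this is the prediction target.
        _binned_return AS label,
    
        -- Date: one row per stock per day
        date,
    
        -- Stock code identifying each stock
        instrument
    -- Read from the cn_stock_bar1d table, which stores daily bar data for stocks
    FROM cn_stock_bar1d
    QUALIFY
        COLUMNS(*) IS NOT NULL
        -- Label is not null, and not limit-up/limit-down (next-day high differs from next-day low)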
        AND m_lead(high, 1) != m_lead(low, 1)
    ORDER BY date, instrument"""
    )
    
    # @module(position="-110,-380", comment='2. Write factors', comment_collapsed=False)
    m2 = M.input_features_dai.v6(
    sql="""-- Use DAI SQL to fetch data and build factors; the query below is a reference example
    -- DAI SQL syntax: https://bigquant.com/wiki/doc/dai-PLSbc1SbZX#h-sql%E5%85%A5%E9%97%A8%E6%95%99%E7%A8%8B
    
    SELECT
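        -- Enter factor expressions here
        -- DAI SQL operators/functions: https://bigquant.com/wiki/doc/dai-PLSbc1SbZX#h-%E5%87%BD%E6%95%B0
        -- Data & fields: data documentation https://bigquant.com/data/home
        -- Example: rank within the cross-section at each date (see the c_pct_rank expressions below)
        -- _return_0 is an intermediate variable; names starting with an underscore are intermediate values and do not appear in the final factors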
        close_0 / m_lag(close_0, 1) AS _return_0,
        close_0 / m_lag(close_0, 6) AS return_5,
        close_0 / m_lag(close_0, 11) AS return_10,
        close_0 / m_lag(close_0, 21) AS return_20,
        m_avg(amount_0, 5) AS avg_amount_5,
        m_avg(amount_0, 20) AS avg_amount_20,
        m_avg(amount_0, 10) AS _avg_amount_10,
        c_pct_rank(amount_0) / c_pct_rank(avg_amount_5),
        c_pct_rank(avg_amount_5) / c_pct_rank(_avg_amount_10),
        c_pct_rank(_return_0) AS rank_return_0,
        c_pct_rank(return_5) AS rank_return_5,
        c_pct_rank(return_10) AS rank_return_10,
        rank_return_0 / rank_return_5,
        rank_return_5 / rank_return_10,
    
        price_limit_status,
        IF(price_limit_status=3, 1, 0) AS f1,
    
        -- Date and stock code
        date,
        instrument
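    -- Precomputed factors and data: cn_stock_factors https://bigquant.com/data/datasources/cn_stock_factors
    FROM cn_stock_factors
    WHERE
        -- Exclude ST stocks
        st_status = 0
        -- Exclude suspended stocks
        AND suspended = 0
        -- Exclude stocks listed on the Beijing Stock Exchange
        AND list_sector < 4
    -- Filters on window-function results must go in QUALIFY
    QUALIFY
        -- Drop rows containing null values
        COLUMNS(*) IS NOT NULL
        AND m_lag(price_limit_status, 1) == 3
    
    -- Order by date and stock code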
    ORDER BY date, instrument
    """
    )
    
    # @module(position="-107,-119", comment='6. Prediction data: set the prediction period; bind to the trading date when paper trading is enabled', comment_collapsed=False)
    m6 = M.extract_data_dai.v7(
        sql=m2.data,
        start_date='2021-01-01',
        start_date_bound_to_trading_date=True,
        end_date='2021-12-31',
        end_date_bound_to_trading_date=True,
        before_start_days=90,
        debug=False
    )
    
    # @module(position="-453,-242", comment='3. Join factor and label data', comment_collapsed=False)
    m3 = M.sql_join_2.v1(
        sql1=m1.data,
        sql2=m2.data,
        sql_join="""WITH
    sql1 AS (
        {sql1}
    ),
    sql2 AS (
        {sql2}
    )
    
    SELECT * FROM sql1 JOIN sql2 USING (date, instrument)
    ORDER BY date, instrument
    """
    )
    
    # @module(position="-451,-120", comment='4. Training data: set the training start and end dates', comment_collapsed=False)
    m4 = M.extract_data_dai.v7(
        sql=m3.data,
        start_date='2018-01-01',
        start_date_bound_to_trading_date=False,
        end_date='2020-12-31',
        end_date_bound_to_trading_date=False,
        before_start_days=90,
        debug=False
    )
    
    # @module(position="-447,28", comment='5. Train with the StockRanker algorithm', comment_collapsed=False)
    m5 = M.stock_ranker_dai_train.v2(
        data=m4.data,
        validation_ds=m4.data,
        learning_algorithm='排序',
        number_of_leaves=30,
        min_docs_per_leaf=200,
        number_of_trees=10,
        learning_rate=0.1,
        max_bins=1023,
        feature_fraction=1,
        data_row_fraction=1,
        plot_charts=True,
        ndcg_discount_base=1
    )
    
    # @module(position="-227,153", comment='7. Predict', comment_collapsed=False)
    m14 = M.stock_ranker_dai_predict.v3(
        model=m5.model,
        data=m6.data
    )
    
    # @module(position="-129,293", comment='8. Trading: set the handler functions, etc.', comment_collapsed=False)
    m8 = M.bigtrader.v7(
        data=m14.predictions,
        start_date='',
        end_date='',
        initialize=m8_initialize_bigquant_run,
        before_trading_start=m8_before_trading_start_bigquant_run,
        handle_tick=m8_handle_tick_bigquant_run,
        handle_data=m8_handle_data_bigquant_run,
        handle_trade=m8_handle_trade_bigquant_run,
        handle_order=m8_handle_order_bigquant_run,
        after_trading=m8_after_trading_bigquant_run,
        capital_base=1000000,
        frequency='daily',
        product_type='股票',
        before_start_days=0,
        volume_limit=1,
        order_price_field_buy='open',
        order_price_field_sell='close',
        benchmark='000300.SH',
        plot_charts=True,
        disable_cache=True,
        debug=False,
        backtest_only=False
    )
    # </aistudiograph>
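The label query in module m1 computes a forward return, clips it to its 1% and 99% quantiles, and discretizes the result into 20 bins. Those steps can be sketched in plain Python for a single batch of returns. This is an illustration only: `quantile_cont`, `equal_width_bins`, and `make_labels` are hypothetical helpers, and equal-width binning is an assumption, since the source does not say how `cbins` places its bin edges or over which rows `c_quantile_cont` is evaluated.

```python
import math


def quantile_cont(values, q):
    """Linearly interpolated quantile of a non-empty list, with 0 <= q <= 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def clip(x, low, high):
    """Limit x to the closed interval [low, high]."""
    return max(low, min(x, high))


def equal_width_bins(values, n_bins):
    """Map each value to a bin index in [0, n_bins - 1] (assumed binning rule)."""
    low, high = min(values), max(values)
    if high == low:
        return [0 for _ in values]
    width = (high - low) / n_bins
    return [min(int((v - low) / width), n_bins - 1) for v in values]


def make_labels(future_returns, n_bins=20):
    """Clip returns to their 1%/99% quantiles, then discretize into n_bins."""
    q01 = quantile_cont(future_returns, 0.01)
    q99 = quantile_cont(future_returns, 0.99)
    clipped = [clip(r, q01, q99) for r in future_returns]
    return equal_width_bins(clipped, n_bins)


# Made-up forward returns, i.e. close(t+5) / open(t+1), from 0.90 to 1.10
returns = [0.90 + 0.002 * i for i in range(101)]
labels = make_labels(returns)
print(labels[:5], labels[-5:])
```

Higher bins correspond to higher forward returns, which is what lets a ranking model treat the bin index as a relevance grade.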
    
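The capital-allocation arithmetic in `m8_initialize_bigquant_run` and `m8_handle_data_bigquant_run` can be checked outside the trading engine. The sketch below assumes that `T.norm` scales the weights so they sum to 1, which is consistent with the values quoted in the code comment (0.339160, 0.213986, ...). The function names `rank_weights`, `daily_budget`, and `capped_order_value` are hypothetical and the account figures are made up.

```python
import math


def rank_weights(stock_count):
    """Weights proportional to 1 / ln(rank + 2), scaled to sum to 1 (assumed T.norm behaviour)."""
    raw = [1 / math.log(i + 2) for i in range(stock_count)]
    total = sum(raw)
    return [w / total for w in raw]


def daily_budget(portfolio_value, cash, hold_days, is_staging):
    """Return (cash_for_buy, cash_for_sell) using the same formulas as handle_data."""
    cash_avg = portfolio_value / hold_days
    cash_for_buy = min(cash, (1 if is_staging else 1.5) * cash_avg)
    cash_for_sell = cash_avg - (cash - cash_for_buy)
    return cash_for_buy, cash_for_sell


def capped_order_value(cash_for_buy, weight, portfolio_value, max_fraction, held_value=0.0):
    """Order value for one stock, limited by the per-instrument cap; never negative."""
    room = portfolio_value * max_fraction - held_value
    return max(0.0, min(cash_for_buy * weight, room))


weights = rank_weights(5)
# Made-up account state after the position-building period
buy, sell = daily_budget(1_000_000, 300_000, hold_days=5, is_staging=False)
top_order = capped_order_value(buy, weights[0], 1_000_000, max_fraction=0.2)
print([round(w, 6) for w in weights], buy, sell, round(top_order, 2))
```

With these inputs the whole 300,000 of cash is available for buying, and the sell target equals one day's equal share (200,000), so the lowest-ranked holdings are sold until that amount is freed.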
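    • Return: 51.61%
    • Annualized return: 51.35%
    • Benchmark return: -6.21%
    • Alpha: 0.58
    • Beta: 0.45
    • Sharpe ratio: 2.23
    • Win rate: 0.82
    • Profit/loss ratio: 10.15
    • Return volatility: 18.8%
    • Information ratio: 0.16
    • Max drawdown: 12.37%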
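For readers who want to reproduce figures such as the maximum drawdown or the Sharpe ratio from an equity curve, the sketch below uses common textbook definitions. The platform's exact conventions (risk-free rate, annualization factor, sampling frequency) are not stated in the source, so `max_drawdown` and `annualized_sharpe` are hypothetical helpers whose defaults are assumptions, and the equity curve is made up.

```python
import math
import statistics


def max_drawdown(equity_curve):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def annualized_sharpe(daily_returns, risk_free_daily=0.0, periods_per_year=252):
    """Mean excess return over its sample standard deviation, scaled to a year."""
    excess = [r - risk_free_daily for r in daily_returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods_per_year)


# Made-up equity curve, not taken from the backtest above
equity = [100.0, 110.0, 99.0, 104.5, 120.0]
daily = [equity[i] / equity[i - 1] - 1 for i in range(1, len(equity))]
print(max_drawdown(equity), annualized_sharpe(daily))
```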
    In [5]:
    m5.validation_ndcg.read()
    
    Out[5]:
       NDCG@1  NDCG@2  NDCG@3  NDCG@4  NDCG@5  NDCG@6  NDCG@7  NDCG@8  NDCG@9  NDCG@10
    0  0.4034  0.4311  0.4500  0.4637  0.4812  0.5005  0.5182  0.5360  0.5537  0.5714
    1  0.5307  0.5274  0.5314  0.5443  0.5585  0.5756  0.5906  0.6060  0.6195  0.6316
    2  0.5718  0.5610  0.5588  0.5699  0.5800  0.5936  0.6104  0.6233  0.6358  0.6483
    3  0.5989  0.5688  0.5710  0.5783  0.5926  0.6064  0.6225  0.6337  0.6467  0.6592
    4  0.6211  0.5833  0.5811  0.5855  0.5982  0.6122  0.6256  0.6402  0.6532  0.6660
    5  0.6318  0.5943  0.5917  0.5968  0.6043  0.6198  0.6302  0.6450  0.6579  0.6704
    6  0.6416  0.6058  0.5984  0.6028  0.6106  0.6247  0.6373  0.6500  0.6629  0.6752
    7  0.6527  0.6135  0.6047  0.6053  0.6148  0.6287  0.6398  0.6539  0.6658  0.6782
    8  0.6603  0.6180  0.6055  0.6095  0.6164  0.6306  0.6417  0.6537  0.6667  0.6793
    9  0.6704  0.6292  0.6158  0.6143  0.6217  0.6346  0.6451  0.6580  0.6699  0.6815
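The table reports NDCG@k for k = 1 to 10. In this graph `validation_ds` is the same dataset as the training data (`m4.data`), so the scores are in-sample. As a reminder of what the metric measures, here is a minimal sketch of the widely used formulation with gain 2^rel - 1 and a log2 position discount. Whether StockRanker uses exactly this gain and discount (for example, how `ndcg_discount_base` enters) is not stated in the source, so treat `dcg_at_k` and `ndcg_at_k` as hypothetical helpers.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k items, in the given order."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG@k divided by the DCG@k of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Made-up relevance grades (e.g. label bins), listed in the model's predicted order
predicted_order = [3, 2, 3, 0, 1, 2]
for k in (1, 3, 5):
    print(k, round(ndcg_at_k(predicted_order, k), 4))
```

A value of 1.0 means the top k predictions are ordered exactly as the labels would order them; lower values mean higher-label stocks were ranked further down.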
    In [6]:
    %%chart
    from _
    