Can > or < not be used in feature data?


(yilong10) #1

For example, close_0>close_1


(lu0817) #2

Why not use a number that is greater than or less than 0 as the label? Isn't that the same thing?


(yilong10) #3

Do you mean close_0-close_1>0? I tried it. No error was shown, but it had no effect either.


(iQuant) #4

Yes, you can. Written this way, it is effectively a boolean feature.


(lu0817) #5

Set n=close_0-close_1 and just use n.


(神龙斗士) #6

With subtraction, high-priced and low-priced stocks are not on a common scale. Consider using close_0 / close_1 instead.
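
To make the scale point concrete, here is a minimal sketch in plain pandas (not the BigQuant expression engine). The two stocks and their prices are made up for illustration; both rise by 1% from yesterday to today.

```python
import pandas as pd

# Hypothetical closing prices for two stocks; both rose 1% since yesterday.
prices = pd.DataFrame(
    {
        "close_1": [10.0, 500.0],  # yesterday's close
        "close_0": [10.1, 505.0],  # today's close
    },
    index=["low_priced", "high_priced"],
)

# The raw difference depends on the price level of each stock ...
diff = prices["close_0"] - prices["close_1"]
# ... while the ratio is (up to floating-point rounding) the same for both.
ratio = prices["close_0"] / prices["close_1"]

print(diff)
print(ratio)
```

The difference is about 0.1 for one stock and 5 for the other even though the move is identical in percentage terms, whereas the ratio is about 1.01 for both.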


(yilong10) #7

If I want the feature to be "rose for the last 3 days in a row", can I express it as close_0>close_1>close_2>close_3?


(iQuant) #8
close_0>close_1>close_2>close_3

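A feature like this can be extracted and used in model training, but that is not the same as expecting the machine to pick out stocks with this pattern.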
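
As a side note on the syntax: in plain Python, a chained comparison such as `a > b > c` on pandas Series expands to `(a > b) and (b > c)` and raises a `ValueError`, because the truth value of a Series is ambiguous. How the platform's expression engine treats the chained form is not something this sketch can confirm. The version below is an assumption-laden illustration in plain pandas, with made-up prices for a single stock, that combines pairwise comparisons with `&`.

```python
import pandas as pd

# Hypothetical closing prices for one stock, oldest first.
close = pd.Series([10.0, 10.2, 10.5, 10.9, 10.7, 10.8])

close_0 = close           # today's close
close_1 = close.shift(1)  # close 1 day ago
close_2 = close.shift(2)  # close 2 days ago
close_3 = close.shift(3)  # close 3 days ago

# Combine pairwise comparisons with & instead of chaining them.
# Comparisons against NaN (the first rows after shifting) evaluate to False.
up_3_days = (
    (close_0 > close_1) & (close_1 > close_2) & (close_2 > close_3)
).astype(int)

print(up_3_days.tolist())  # → [0, 0, 0, 1, 0, 0]
```

Only the fourth day has three consecutive rises behind it, so it is the only row flagged with 1.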


(yilong10) #9

Why does this gap arise? Or, put differently, is there any way to reduce it?


(神龙斗士) #10

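If this feature is really effective, the algorithm will generally make good use of it, and its contribution in the final model will also be high.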

Below is a very simple correlation analysis between the factor and returns. The data shows a low correlation (-0.0069482).

    In [33]:
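    # This code was auto-generated by the visual strategy environment on 2018-04-22 19:29
    # This code cell can only be edited in visual mode. You can also copy the code, paste it into a new code cell or strategy, and then modify it.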
    
    
    m1 = M.instruments.v2(
        start_date='2010-01-01',
        end_date='2015-01-01',
        market='CN_STOCK_A',
        instrument_list='',
        max_count=0
    )
    
    m3 = M.input_features.v1(
        features="""# Lines starting with # are comments
    # Multiple features, one per line; basic and derived features are both allowed
    shift(close_0, -5) / shift(open_0, -1) - 1
    where((close_0 > close_1) & (close_1 > close_2) & (close_2 > close_3), 1, 0)
    """
    )
    
    m4 = M.general_feature_extractor.v6(
        instruments=m1.data,
        features=m3.data,
        start_date='',
        end_date='',
        before_start_days=0
    )
    
    m5 = M.derived_feature_extractor.v2(
        input_data=m4.data,
        features=m3.data,
        date_col='date',
        instrument_col='instrument',
        user_functions={}
    )
    
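    # Python entry function; input_1/2/3 map to the three input ports, data_1/2/3 to the three output ports
    def m6_run_bigquant_run(input_1, input_2, input_3):
        # Load the extracted features and drop rows with missing values
        df = input_1.read_df()
        df = df.dropna()
        # The feature expressions from input_features, used here as column names
        fields = input_2.read_pickle()
        # np.corrcoef treats each row as one variable, hence the transpose
        result = np.corrcoef(df[fields].values.transpose())
        data_1 = DataSource.write_pickle(result)
        return Outputs(data_1=data_1, data_2=None, data_3=None)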
    
    m6 = M.cached.v3(
        input_1=m5.data,
        input_2=m3.data,
        run=m6_run_bigquant_run
    )
    
    # Python entry function; input_1/2/3 map to the three input ports, data_1/2/3 to the three output ports
    def m7_run_bigquant_run(input_1, input_2, input_3):
        # Print the correlation matrix produced by m6
        print(input_1.read_pickle())
        return Outputs(data_1=None, data_2=None, data_3=None)
    
    m7 = M.cached.v3(
        input_1=m6.data_1,
        run=m7_run_bigquant_run,
        m_cached=False
    )
    
    [2018-04-22 19:28:27.594433] INFO: bigquant: instruments.v2 started running..
    [2018-04-22 19:28:27.617732] INFO: bigquant: cache hit
    [2018-04-22 19:28:27.618954] INFO: bigquant: instruments.v2 finished running [0.024569s].
    [2018-04-22 19:28:27.624114] INFO: bigquant: input_features.v1 started running..
    [2018-04-22 19:28:27.627246] INFO: bigquant: cache hit
    [2018-04-22 19:28:27.628227] INFO: bigquant: input_features.v1 finished running [0.004116s].
    [2018-04-22 19:28:27.637351] INFO: bigquant: general_feature_extractor.v6 started running..
    [2018-04-22 19:28:27.639497] INFO: bigquant: cache hit
    [2018-04-22 19:28:27.640439] INFO: bigquant: general_feature_extractor.v6 finished running [0.003092s].
    [2018-04-22 19:28:27.647720] INFO: bigquant: derived_feature_extractor.v2 started running..
    [2018-04-22 19:28:27.650273] INFO: bigquant: cache hit
    [2018-04-22 19:28:27.651331] INFO: bigquant: derived_feature_extractor.v2 finished running [0.003578s].
    [2018-04-22 19:28:27.661262] INFO: bigquant: cached.v3 started running..
    [2018-04-22 19:28:31.614095] INFO: bigquant: cached.v3 finished running [3.952809s].
    [2018-04-22 19:28:31.622849] INFO: bigquant: cached.v3 started running..
    [[ 1.        -0.0069482]
     [-0.0069482  1.       ]]
    [2018-04-22 19:28:31.625617] INFO: bigquant: cached.v3 finished running [0.002767s].
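
For readers who want to see how the 2x2 output is read without access to the platform, here is a small standalone sketch of the same `np.corrcoef` call. The data is synthetic and generated independently on purpose (an assumption for illustration only, not market data), so the off-diagonal entry should land near zero, much like the value reported above.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

n = 10_000
# Synthetic stand-ins: a 0/1 flag and a return series that does not depend on it.
flag = rng.integers(0, 2, size=n).astype(float)
forward_return = rng.normal(loc=0.0, scale=0.02, size=n)

# np.corrcoef treats each row as one variable, hence the stacked (2, n) input.
corr = np.corrcoef(np.vstack([forward_return, flag]))

print(corr)        # 2x2 matrix: diagonal is 1, off-diagonal is the correlation
print(corr[0, 1])  # correlation between the return and the flag
```

The diagonal entries are each variable's correlation with itself, and the two off-diagonal entries are equal and give the correlation between the return column and the 0/1 feature.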
    

(yilong10) #11

I see now, thanks!