Factor analysis error: raw factor value coverage is


(侯) #1

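Feature: `-1*delta(((close_0-low_0)-(high_0-close_0))/(high_0-low_0),1)`. The derived feature extractor returns data for this feature without problems, but passing the same feature to factor analysis raises the following error: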

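[2020-08-04 13:29:42.658947] ERROR: 因子分析: 原始因子值覆盖率为 0.4201010369582558,小于 50%
(In English: "Factor analysis: raw factor value coverage is 0.4201010369582558, below 50%")

[2020-08-04 13:29:42.660126] ERROR: moduleinvoker: module name: factorlens, module version: v1, trackeback: Traceback (most recent call last): Exception: 原始因子值 NaN 值比例超过 50%,无法进行后续指标计算
(In English: "The share of NaN values in the raw factor exceeds 50%; subsequent metrics cannot be computed")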


(adhaha111) #2

Could you share your strategy?
Judging by the error, the factor value could not be computed for more than 50% of the samples.
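
For reference, here is a minimal sketch of what the error appears to measure. It assumes that "coverage" means the fraction of non-NaN factor values, which is what the log message suggests but is not confirmed from the factorlens source. The `coverage` helper, the `factor` column name, and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd


def coverage(values: pd.Series) -> float:
    """Return the fraction of entries that are not NaN."""
    return float(values.notna().mean())


# Hypothetical toy data: 3 of the 5 factor values are missing.
df = pd.DataFrame({"factor": [0.1, np.nan, np.nan, -0.3, np.nan]})

cov = coverage(df["factor"])
print(f"coverage = {cov:.2f}")

# Assumed check, mirroring the 50% threshold named in the error message.
if cov < 0.5:
    print("coverage below 50%: factor analysis would refuse to continue")
```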


(侯) #3

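    In [12]:
    # This code was generated automatically by the visual strategy environment on 2020-08-04 14:05
    # This code cell can only be edited in visual mode. You can also copy the code, paste it into a new code cell or strategy, and then modify it.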
    
    
    m1 = M.input_features.v1(
        features="""#std(mean(volume_0, 5),5)
    #std(mean(volume_0, 10),10)
    (mean(close_0,3)+mean(close_0,6)+mean(close_0,12)+mean(close_0,24))/(4*close_0)
    -1*delta(((close_0-low_0)-(high_0-close_0))/(high_0-low_0),1)"""
    )
    
    m2 = M.factorlens.v1(
        features=m1.data,
        title='因子分析: {factor_name}',
        start_date='2019-01-01',
        end_date='2020-8-03',
        rebalance_period=1,
        stock_pool='全市场',
        quantile_count=3,
        commission_rate=0.0016,
        returns_calculation_method='累乘',
        drop_price_limit_stocks=True,
        drop_st_stocks=True,
        drop_new_stocks=True,
        normalization=True,
        neutralization=['行业', '市值'],
        metrics=['因子表现概览', 'IC分析']
    )
    
    m7 = M.instruments.v2(
        start_date='2019-01-01',
        end_date='2019-03-01',
        market='CN_STOCK_A',
        instrument_list='',
        max_count=0
    )
    
    m8 = M.general_feature_extractor.v7(
        instruments=m7.data,
        features=m1.data,
        start_date='',
        end_date='',
        before_start_days=90
    )
    
    m9 = M.derived_feature_extractor.v3(
        input_data=m8.data,
        features=m1.data,
        date_col='date',
        instrument_col='instrument',
        drop_na=False,
        remove_extra_columns=False,
        user_functions={}
    )
    
    ---------------------------------------------------------------------------
    Exception                                 Traceback (most recent call last)
    <ipython-input-12-ab1c5354021e> in <module>()
         25     normalization=True,
         26     neutralization=['行业', '市值'],
    ---> 27     metrics=['因子表现概览', 'IC分析']
         28 )
    
    Exception: 原始因子值 NaN 值比例超过 50%,无法进行后续指标计算

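(adhaha111) #4

Hello, a first look suggests that the factor `-1*delta(((close_0-low_0)-(high_0-close_0))/(high_0-low_0),1)` is NaN for every stock on 2020-02-03. This may be caused by the data; you can exclude that day from the analysis.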
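
One way to locate such dates is to compute the share of NaN factor values per day. The sketch below assumes the factor values are available as a long-format pandas table with `date`, `instrument`, and `factor` columns; these column names, the instrument codes, and the toy data are hypothetical and not taken from the strategy above.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format factor table: one row per (date, instrument).
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2020-01-23", "2020-01-23", "2020-02-03", "2020-02-03"]
    ),
    "instrument": ["A", "B", "A", "B"],
    "factor": [0.2, -0.1, np.nan, np.nan],
})

# Share of NaN factor values on each date.
nan_ratio = df["factor"].isna().groupby(df["date"]).mean()

# Dates on which every instrument is missing the factor.
all_nan_dates = nan_ratio[nan_ratio == 1.0].index

print(nan_ratio)
print(list(all_nan_dates.strftime("%Y-%m-%d")))
```

Sorting `nan_ratio` in descending order would also surface days that are mostly, but not entirely, missing.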


(侯) #5

OK, thank you.


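(adhaha111) #6

Hello, the data here is actually fine. 2020-02-03 was the first trading day after the New Year (Spring Festival) break, and because of this year's epidemic a large number of stocks were locked at limit-down for the whole session. For those stocks high equals low, so the denominator `high_0 - low_0` of your factor is 0. You can add a tiny value such as 0.00001 to the denominator to avoid the error.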
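
The effect of padding the denominator can be illustrated with a small pandas sketch. It only reproduces the inner ratio of the factor, without the `delta` step. The `bars` table and its prices are hypothetical, and `EPS` is the 0.00001 suggested above.

```python
import numpy as np
import pandas as pd

EPS = 0.00001  # small constant added to the denominator

# Hypothetical bars: the second row is a day locked at one price
# (high == low == close), as on a one-price limit-down day.
bars = pd.DataFrame({
    "high":  [10.5, 9.0],
    "low":   [9.8, 9.0],
    "close": [10.0, 9.0],
})

numerator = (bars["close"] - bars["low"]) - (bars["high"] - bars["close"])
spread = bars["high"] - bars["low"]

raw = numerator / spread              # 0/0 on the flat bar gives NaN
guarded = numerator / (spread + EPS)  # 0/EPS on the flat bar gives 0.0

print(raw.tolist())
print(guarded.tolist())
```

In the feature list, the corresponding expression would be `-1*delta(((close_0-low_0)-(high_0-close_0))/(high_0-low_0+0.00001),1)`. The padding shifts the ordinary values only slightly, while flat bars evaluate to 0 instead of NaN.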