Raw factor values are more than 50% NaN; subsequent metrics cannot be computed

Strategy Sharing

(cash01) #1

I searched for a long time and could not find a single NaN…

In [150]:
df=m8.data.read()
#nan_rows = df[df.isnull().T.any().T]
#nan_rows
df[df['volume2']==0]
Out[150]:
date instrument volume2 volume_0 volume2/volume_0
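
Filtering on `volume2 == 0` does not reveal missing values. A minimal sketch of a more direct check follows. It uses a small made-up frame as a stand-in for `m8.data.read()`; the column names follow the thread, but every value is hypothetical. In pandas, dividing a float by zero gives `inf` rather than NaN, so the sketch counts both.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for m8.data.read(); the values are illustrative only.
df = pd.DataFrame({
    "date": pd.to_datetime(["2017-01-03", "2017-01-03", "2017-01-04", "2017-01-04"]),
    "instrument": ["000001.SZA", "000002.SZA", "000001.SZA", "000002.SZA"],
    "volume2": [1200.0, 800.0, np.nan, 500.0],
    "volume_0": [9000.0, 0.0, 7000.0, 4000.0],
})
df["a1"] = df["volume2"] / df["volume_0"]

# Share of NaN values in each column.
nan_ratio = df.isna().mean()

# Number of +/-inf values in each numeric column (e.g. from division by zero).
numeric = df.select_dtypes(include="number")
inf_count = np.isinf(numeric).sum()

# Rows that contain at least one NaN.
nan_rows = df[df.isna().any(axis=1)]

print(nan_ratio)
print(inf_count)
print(nan_rows)
```

If these counts are all zero on the real data, the NaN values are probably introduced later, inside the module that consumes the data, rather than in the frame being inspected.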

    In [157]:
    # This code was auto-generated by the visual strategy environment on 2020-08-11 21:21.
    # This code cell can only be edited in visual mode. You can also copy the code into a new code cell or strategy and modify it there.
    
    
    # Python entry function: input_1/2/3 are the three input ports, data_1/2/3 are the three output ports
    def m6_run_bigquant_run(input_1, input_2, input_3):
        df = input_1.read()
        # Split each bar timestamp into a time-of-day string and a midnight-normalized date
        df['Time'] = df['date'].dt.strftime('%H:%M:%S')
        df['Date'] = df['date'].dt.normalize()
        # sort_values returns a new frame, so the result must be assigned back
        df = df.sort_values(['instrument', 'Date', 'Time'])
        # Keep only the bar stamped 14:30:00 and expose its volume as volume2
        df2 = df.loc[df['Time'] == '14:30:00', ['Date', 'instrument', 'volume']]
        df2 = df2.rename(columns={'Date': 'date', 'volume': 'volume2'})
        data_1 = DataSource.write_df(df2)
        data_2 = DataSource.write_pickle(df2)
        return Outputs(data_1=data_1, data_2=data_2, data_3=None)
    
    
    # Post-processing function (optional). Its input is the main function's output; use it to process the data or return a friendlier outputs format. The output of this function is not cached.
    def m6_post_run_bigquant_run(outputs):
        return outputs
    
    
    m2 = M.instruments.v2(
        start_date='2013-01-01',
        end_date='2018-02-01',
        market='CN_STOCK_A',
        instrument_list='',
        max_count=0
    )
    
    m3 = M.input_features.v1(
        features="""
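    # Lines starting with # are comments; a comment must be on its own line
    # Multiple features, one per line; basic and derived features are allowed; features must be platform features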
    """
    )
    
    m1 = M.use_datasource.v1(
        instruments=m2.data,
        features=m3.data,
        datasource_id='bar30m_CN_STOCK_A',
        start_date='',
        end_date=''
    )
    
    m6 = M.cached.v3(
        input_1=m1.data,
        run=m6_run_bigquant_run,
        post_run=m6_post_run_bigquant_run,
        input_ports='',
        params='{}',
        output_ports=''
    )
    
    m14 = M.filter.v3(
        input_data=m6.data_1,
        expr='volume2>0',
        output_left_data=False
    )
    
    m7 = M.input_features.v1(
        features="""
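    # Lines starting with # are comments; a comment must be on its own line
    # Multiple features, one per line; basic and derived features are allowed; features must be platform features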
    volume_0"""
    )
    
    m9 = M.general_feature_extractor.v7(
        instruments=m2.data,
        features=m7.data,
        start_date='',
        end_date='',
        before_start_days=90
    )
    
    m10 = M.data_join.v3(
        input_1=m14.data,
        input_2=m9.data,
        on='date,instrument',
        how='inner',
        sort=False
    )
    
    m11 = M.input_features.v1(
        features="""
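    # Lines starting with # are comments; a comment must be on its own line
    # Multiple features, one per line; basic and derived features are allowed; features must be platform features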
    a1=volume2/volume_0"""
    )
    
    m5 = M.derived_feature_extractor.v3(
        input_data=m10.data,
        features=m11.data,
        date_col='date',
        instrument_col='instrument',
        drop_na=False,
        remove_extra_columns=False,
        user_functions={}
    )
    
    m8 = M.dropnan.v2(
        input_data=m5.data
    )
    
    m15 = M.input_features.v1(
        features="""
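    # Lines starting with # are comments; a comment must be on its own line
    # Multiple features, one per line; basic and derived features are allowed; features must be platform features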
    a1"""
    )
    
    m4 = M.factorlens.v1(
        features=m15.data,
        user_factor_data=m8.data,
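        title='因子分析: {factor_name}',  # "Factor analysis: {factor_name}"
        start_date='2013-03-26',
        end_date='2017-05-31',
        rebalance_period=5,
        stock_pool='中证500',  # CSI 500
        quantile_count=5,
        commission_rate=0.0016,
        returns_calculation_method='累乘',  # compounded (cumulative product)
        benchmark='无',  # none
        drop_price_limit_stocks=True,
        drop_st_stocks=True,
        drop_new_stocks=True,
        normalization=True,
        neutralization=['行业', '市值'],  # industry, market cap
        # metrics, in order: factor performance overview, factor distribution, factor distribution by industry, factor distribution by market cap, IC analysis, buy-signal overlap analysis, factor valuation analysis, factor crowding analysis, stocks with the largest/smallest factor values, expression factor values, multi-factor correlation analysis
        metrics=['因子表现概览', '因子分布', '因子行业分布', '因子市值分布', 'IC分析', '买入信号重合分析', '因子估值分析', '因子拥挤度分析', '因子值最大/最小股票', '表达式因子值', '多因子相关性分析']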
    )
    
    ---------------------------------------------------------------------------
    Exception                                 Traceback (most recent call last)
    <ipython-input-157-44504b126527> in <module>()
        132     normalization=True,
        133     neutralization=['行业', '市值'],
    --> 134     metrics=['因子表现概览', '因子分布', '因子行业分布', '因子市值分布', 'IC分析', '买入信号重合分析', '因子估值分析', '因子拥挤度分析', '因子值最大/最小股票', '表达式因子值', '多因子相关性分析']
        135 )
    
    Exception: 原始因子值 NaN 值比例超过 50%,无法进行后续指标计算
    (Translation: raw factor values are more than 50% NaN; subsequent metrics cannot be computed)

(adhaha111) #2

Hello, let me look into the cause first.


(cash01) #3

Have you found the cause?


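(adhaha111) #4

Hello. The factor analysis module internally also fetches each stock's limit-up status and then merges it with the factor data. That merge can produce many NaN values.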
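
How a merge can introduce NaN values can be reproduced outside the platform. The sketch below is not the factor analysis module's actual code. It assumes the module left-merges the user's factor data onto a larger (date, instrument) table; the `status` frame, its `price_limit_status` column, and all values are hypothetical.

```python
import pandas as pd

dates = pd.to_datetime(["2017-01-03", "2017-01-04", "2017-01-05"])
instruments = ["000001.SZA", "000002.SZA"]

# Hypothetical daily table that covers every (date, instrument) pair.
status = pd.MultiIndex.from_product(
    [dates, instruments], names=["date", "instrument"]
).to_frame(index=False)
status["price_limit_status"] = 2  # placeholder value

# User factor data that covers only some of those pairs.
factor = pd.DataFrame({
    "date": dates[[0, 2]],
    "instrument": ["000001.SZA", "000001.SZA"],
    "a1": [0.13, 0.21],
})

# Pairs present in status but absent from factor get NaN in a1.
merged = status.merge(factor, on=["date", "instrument"], how="left")
nan_ratio = merged["a1"].isna().mean()
print(f"NaN ratio of a1 after the merge: {nan_ratio:.0%}")
```

Here the factor frame itself contains no NaN, yet four of the six merged rows do, which is consistent with not finding any NaN when inspecting the input data.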


(cash01) #5

Other factors presumably go through the same merge, so why do they work fine? And how do I fix this one? Thanks.


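(adhaha111) #6

Hello. Your main problem is that you joined 30-minute data with daily data. The 30-minute table has no data for some stocks on some days, so the output of the m10 join module (an inner join) is also missing those (date, instrument) samples. As a result, the merge inside the factor analysis module produces many NaN values.

I suggest using a right join in the m10 module and then forward-filling or backward-filling the NaN values.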
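
The suggestion can be sketched in plain pandas. This is an illustration under assumptions, not the platform's join module: the two frames and their values are made up, and the fill is done per instrument so that one stock's values never leak into another's.

```python
import pandas as pd

dates = pd.to_datetime(["2017-01-03", "2017-01-04", "2017-01-05"])

# 14:30 bar volume; 2017-01-04 is missing from the 30-minute table.
bars = pd.DataFrame({
    "date": dates[[0, 2]],
    "instrument": ["000001.SZA", "000001.SZA"],
    "volume2": [1200.0, 900.0],
})

# Daily table with every trading day.
daily = pd.DataFrame({
    "date": dates,
    "instrument": ["000001.SZA"] * 3,
    "volume_0": [9000.0, 8000.0, 7000.0],
})

# A right join keeps every daily row, leaving NaN where no bar exists.
joined = bars.merge(daily, on=["date", "instrument"], how="right")
joined = joined.sort_values(["instrument", "date"])

# Forward-fill within each instrument, then backward-fill any leading gaps.
joined["volume2"] = joined.groupby("instrument")["volume2"].ffill()
joined["volume2"] = joined.groupby("instrument")["volume2"].bfill()

joined["a1"] = joined["volume2"] / joined["volume_0"]
print(joined)
```

Filling carries a stale volume into days that had no bar, so whether that is acceptable depends on what the factor is meant to measure.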


(cash01) #7

The data is wrong... The daily volume is negative and the 30-minute volume is abnormal... Could you check? This is not a rare occurrence.


(cash01) #8

Comparing the summed volume from the 30-minute table against the daily volume factor shows a large discrepancy!
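
A comparison like this one could be done roughly as follows. The sketch uses made-up numbers, assumes both tables report volume in the same unit, and picks a 1% tolerance arbitrarily; `volume_sum` and `rel_diff` are names introduced here for illustration.

```python
import pandas as pd

# Hypothetical 30-minute bars.
bars = pd.DataFrame({
    "date": pd.to_datetime([
        "2017-01-03 10:00", "2017-01-03 10:30",
        "2017-01-04 10:00", "2017-01-04 10:30",
    ]),
    "instrument": ["000001.SZA"] * 4,
    "volume": [500.0, 700.0, 400.0, 300.0],
})

# Hypothetical daily volumes, one of them negative.
daily = pd.DataFrame({
    "date": pd.to_datetime(["2017-01-03", "2017-01-04"]),
    "instrument": ["000001.SZA"] * 2,
    "volume_0": [1200.0, -50.0],
})

# Sum the intraday bars per calendar day and instrument.
intraday_sum = (
    bars.assign(date=bars["date"].dt.normalize())
        .groupby(["date", "instrument"], as_index=False)["volume"].sum()
        .rename(columns={"volume": "volume_sum"})
)

# Flag negative daily volumes and days where the two sources disagree.
check = daily.merge(intraday_sum, on=["date", "instrument"], how="outer")
check["rel_diff"] = (check["volume_sum"] - check["volume_0"]).abs() / check["volume_0"].abs()
suspect = check[(check["volume_0"] < 0) | (check["rel_diff"] > 0.01)]
print(suspect)
```

Listing the flagged (date, instrument) pairs makes it easier to report specific bad records than a general statement that the totals differ.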


(adhaha111) #9

Hello, the data has been fixed.


(cash01) #10

The data for 2019-06-14 still has problems; please take a look when you have time.