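Analysing and visualising stock data on BigQuant is really convenient: you only need to write two lines of code, and everything else is assembled from building-block modules.
In trading we often wonder: if a stock has already risen several days in a row (say three), how likely is it to keep rising? (Admittedly this is retail-investor thinking, since it looks at only one dimension.) Besides the probability that a stock on a winning streak rises again, we also need the share of rising stocks across the whole market as a baseline; without that comparison the analysis would be amateurish.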
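- Build the following model
  - Feature input
    - cond = (close_0 / close_1 > 1.01) & (close_1 / close_2 > 1.01) & (close_2 / close_3 > 1.01)
      True if the stock rose by more than 1% on each of the last three days, otherwise False
    - label = where(shift(return_0, -1) > 1.01, 1, 0)
      1 means the stock rises by more than 1% tomorrow; 0 means it does not (a fall, a flat day, or a gain of 1% or less)
  - Compute with a custom module, mainly using pandas groupby
    g1 = df['label'].groupby(pd.Grouper(freq='M')).apply(lambda x: x.sum() / len(x))
    g2 = df['label'][df['cond']].groupby(pd.Grouper(freq='M')).apply(lambda x: x.sum() / len(x))
    g1 is the monthly share of all rows with label 1; g2 is the same share restricted to rows where cond is True
  - Finally, use the plot DataFrame module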
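- Result: at first glance, three straight up days do not imply a further rise; the outcome appears to depend on overall market strength
- Improvements:
  - Look at a longer time period
  - Try different parameters, such as the number of consecutive days
  - Bring in other dimensions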
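The feature definitions above, together with the "different parameters" improvement, can be sketched in plain pandas outside the platform. This is a minimal sketch, not the platform's implementation: the function `add_streak_features`, the long-format layout (`instrument`, `date`, `close`) and the toy prices are all hypothetical, and it assumes the platform evaluates `shift` separately for each instrument.

```python
import pandas as pd


def add_streak_features(df, n_days=3, threshold=1.01):
    """Add `cond` and `label` to a long-format price table (hypothetical layout).

    Expects the columns instrument, date and close.
    """
    df = df.sort_values(["instrument", "date"]).copy()
    by_stock = df["instrument"]

    # Daily ratio close_0 / close_1, computed within each stock.
    ratio = df["close"] / df.groupby("instrument")["close"].shift(1)

    # Smallest daily ratio over the last n_days, again per stock.
    streak_min = ratio.groupby(by_stock).transform(
        lambda s: s.rolling(n_days).min()
    )
    # Next day's ratio, per stock.
    next_ratio = ratio.groupby(by_stock).shift(-1)

    df["cond"] = streak_min > threshold          # every day in the streak rose > 1%
    df["label"] = (next_ratio > threshold).astype(int)  # next day rose > 1%

    # Drop rows lacking a full look-back window or a next-day value,
    # mirroring the drop_na=True setting used in the experiment.
    return df[streak_min.notna() & next_ratio.notna()]


# Toy prices for two made-up stocks (illustrative values only).
dates = pd.date_range("2019-01-01", periods=6, freq="D")
prices = pd.DataFrame({
    "instrument": ["A"] * 6 + ["B"] * 6,
    "date": list(dates) * 2,
    "close": [100, 102, 104.5, 107, 109.5, 109,
              50, 50, 50.2, 50, 51, 51.6],
})
features = add_streak_features(prices, n_days=3, threshold=1.01)
print(features[["instrument", "date", "cond", "label"]])
```

Changing `n_days` or `threshold` re-runs the same experiment with a different streak length or a different definition of "up".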
In [52]:
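# This code was auto-generated by the visual strategy environment on 2019-03-13 17:16
# This cell can only be edited in visual mode. You can also copy the code into a new code cell or strategy and modify it there.
# Note: pd, DataSource, Outputs and M are assumed to be preloaded by the BigQuant notebook environment.
# Python entry function: input_1/2/3 map to the three input ports, data_1/2/3 to the three output ports
def m2_run_bigquant_run(input_1):
    # Sample code follows; write your own code here
    df = input_1.read()
    df.set_index('date', inplace=True)
    # Monthly share of rows with label == 1 across the whole market
    g1 = df['label'].groupby(pd.Grouper(freq='M')).apply(lambda x: x.sum() / len(x))
    # Same share, restricted to rows where cond is True (three straight up days)
    g2 = df['label'][df['cond']].groupby(pd.Grouper(freq='M')).apply(lambda x: x.sum() / len(x))
    ds = DataSource.write_df(pd.DataFrame({'Market-wide up ratio': g1, 'Up ratio after 3 straight up days': g2}))
    return Outputs(data_1=ds)

# Optional post-processing function. Its input is the output of the main function; use it to process the data or to return a friendlier outputs format. The output of this function is not cached.
def m2_post_run_bigquant_run(outputs):
    return outputs
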
m1 = M.instruments.v2(
    start_date='2016-01-01',
    end_date='2019-03-01',
    market='CN_STOCK_A',
    instrument_list='',
    max_count=0
)

m3 = M.input_features.v1(
    features="""# Lines starting with # are comments
# One feature per line; base features and derived features are both allowed
cond = (close_0 / close_1 > 1.01) & (close_1 / close_2 > 1.01) & (close_2 / close_3 > 1.01)
label = where(shift(return_0, -1) > 1.01, 1, 0)
"""
)

m15 = M.general_feature_extractor.v7(
    instruments=m1.data,
    features=m3.data,
    start_date='',
    end_date='',
    before_start_days=0
)

m16 = M.derived_feature_extractor.v3(
    input_data=m15.data,
    features=m3.data,
    date_col='date',
    instrument_col='instrument',
    drop_na=True,
    remove_extra_columns=False
)

m2 = M.cached.v3(
    input_1=m16.data,
    run=m2_run_bigquant_run,
    post_run=m2_post_run_bigquant_run,
    input_ports='',
    params='{}',
    output_ports=''
)

m4 = M.plot_dataframe.v1(
    input_data=m2.data_1,
    title='',
    chart_type='column',
    x='',
    y='',
    options={
        'chart': {
            'height': 400
        }
    },
    candlestick=False,
    pane_1='',
    pane_2='',
    pane_3='',
    pane_4=''
)
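The aggregation inside the custom module can be tried offline on a small table. The sketch below is an illustration under assumptions, not the platform code: the toy rows are made up, the column names `all_stocks_up_ratio` and `after_streak_up_ratio` are hypothetical, and it groups by `to_period("M")` instead of `pd.Grouper(freq='M')` so that it does not depend on which month-end alias the installed pandas version accepts. Because `label` is 0 or 1, `x.sum() / len(x)` is simply the group mean.

```python
import pandas as pd

# Toy stand-in for the feature table: one row per stock-day (made-up values).
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2019-01-02", "2019-01-03", "2019-01-04", "2019-01-07",
        "2019-02-01", "2019-02-04", "2019-02-05", "2019-02-06",
    ]),
    "cond": [True, False, True, False, True, True, False, False],
    "label": [1, 0, 0, 1, 1, 1, 0, 0],
}).set_index("date")

# Calendar month of each row, used as the grouping key.
month = df.index.to_period("M")

# Baseline: share of all rows with label == 1 in each month.
all_rate = df["label"].groupby(month).mean()

# Conditional: the same share among rows where cond is True.
streak_rows = df.loc[df["cond"], "label"]
cond_rate = streak_rows.groupby(streak_rows.index.to_period("M")).mean()

summary = pd.DataFrame({
    "all_stocks_up_ratio": all_rate,
    "after_streak_up_ratio": cond_rate,
})
print(summary)
```

A month in which no row satisfies `cond` shows up as NaN in the conditional column, which is worth keeping in mind when reading the chart.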
Log: 23 entries; error log: 0 entries
2019-03-13 17:14:56.423313 INFO: bigquant: instruments.v2 started.
2019-03-13 17:14:56.429720 INFO: bigquant: cache hit
2019-03-13 17:14:56.431378 INFO: bigquant: instruments.v2 finished [0.008065s]
2019-03-13 17:14:56.434499 INFO: bigquant: input_features.v1 started.
2019-03-13 17:14:56.439364 INFO: bigquant: cache hit
2019-03-13 17:14:56.441131 INFO: bigquant: input_features.v1 finished [0.006617s]
2019-03-13 17:14:56.452588 INFO: bigquant: general_feature_extractor.v7 started.
2019-03-13 17:14:57.818612 INFO: base feature extraction: year 2016, feature rows=64154
2019-03-13 17:15:00.428573 INFO: base feature extraction: year 2017, feature rows=74323
2019-03-13 17:15:02.978060 INFO: base feature extraction: year 2018, feature rows=81698
2019-03-13 17:15:03.878210 INFO: base feature extraction: year 2019, feature rows=13542
2019-03-13 17:15:03.936392 INFO: base feature extraction: total rows: 233718
2019-03-13 17:15:03.940612 INFO: bigquant: general_feature_extractor.v7 finished [7.488016s]
2019-03-13 17:15:03.944955 INFO: bigquant: derived_feature_extractor.v3 started.
2019-03-13 17:15:05.083051 INFO: general_feature_extractor: extraction finished cond = (close_0 / close_1 > 1.01) & (close_1 / close_2 > 1.01) & (close_2 / close_3 > 1.01), 0.009
2019-03-13 17:15:05.654309 INFO: general_feature_extractor: extraction finished label = where(shift(return_0, -1) > 1.01, 1, 0), 0.564
2019-03-13 17:15:06.075270 INFO: general_feature_extractor: /y_2016, 64154
2019-03-13 17:15:06.814127 INFO: general_feature_extractor: /y_2017, 74323
2019-03-13 17:15:07.814837 INFO: general_feature_extractor: /y_2018, 81698
2019-03-13 17:15:08.673295 INFO: general_feature_extractor: /y_2019, 13542
2019-03-13 17:15:08.855577 INFO: bigquant: derived_feature_extractor.v3 finished [4.910603s]
2019-03-13 17:15:08.893505 INFO: bigquant: cached.v3 started.
2019-03-13 17:15:10.962748 INFO: bigquant: cached.v3 finished [2.069224s]
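Once the monthly table from the custom module is available, the two series can also be compared numerically rather than only by eye on the column chart. The sketch below is one possible way to do that; the numbers are placeholders rather than results from this experiment, and the column names are hypothetical.

```python
import pandas as pd

# Placeholder monthly ratios (NOT results from the experiment).
monthly = pd.DataFrame(
    {
        "all_stocks_up_ratio": [0.20, 0.30, 0.10, 0.25],
        "after_streak_up_ratio": [0.25, 0.28, 0.15, 0.25],
    },
    index=pd.period_range("2018-01", periods=4, freq="M"),
)

# Lift: how much the conditional ratio exceeds the market-wide baseline.
monthly["lift"] = monthly["after_streak_up_ratio"] - monthly["all_stocks_up_ratio"]

# Number and share of months in which the streak condition beat the baseline.
months_ahead = int((monthly["lift"] > 0).sum())
share_ahead = months_ahead / len(monthly)

print(monthly)
print(f"streak beat the baseline in {months_ahead} of {len(monthly)} months")
```

If the lift hovers around zero and changes sign from month to month, that would be consistent with the tentative reading that a three-day streak alone carries little information beyond overall market strength.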