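Author: bigquant
Reading time: 5 minutes
Published by BigQuant Quant Academy (宽客学院). Difficulty: ☆☆☆
Introduction: This article shows how to use bigexpr expressions to build factors from the 101 alphas published by WorldQuant, and how to test those factors.
Background
WorldQuant's paper "101 Formulaic Alphas" gives explicit formulas for 101 alpha factors. Unlike traditional approaches, these alphas were constructed through data mining, and reportedly 80% of them still work and are used in live trading.
On the BigQuant strategy research platform, you can build factors and label data quickly with expressions, so there is no need to hand-write lengthy code.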
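Introduction to expressions
In machine learning and deep learning, a factor (also called a feature) is a central concept, and feature selection is the key to developing AI algorithms. A simple base factor, such as the return over the last 5 days, $close\_0/close\_5-1$, is easy to build. A factor such as the correlation between daily returns and volume over the last 5 days is harder and would normally require a substantial amount of code. This is why we designed the bigexpr expression engine.
bigexpr is an expression engine developed by BigQuant. By writing a short expression you can apply a wide range of computations to your data without writing code.
bigexpr is used widely on the platform. M.advanced_auto_labeler and M.derived_feature_extractor are both driven by bigexpr, so you can define labeling targets and extract features with expressions alone.
The factor mentioned above, the correlation between daily returns and volume over the last 5 days, can be defined as:
$$correlation(close\_0/shift(close\_0,1)-1,volume\_0,5)$$
Here $correlation$ computes the correlation coefficient, $close\_0$ is the day's closing price, $shift(close\_0,1)$ is the previous day's closing price, and $volume\_0$ is the day's volume. The factor is built from the expression alone, with no lengthy code required.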
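To make the expression above concrete, here is a minimal pure-Python sketch of what it computes for a single stock. It is not bigexpr's implementation: the helper names `pearson` and `return_volume_correlation` are hypothetical, and the handling of incomplete windows (NaN) is an assumption.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return math.nan  # correlation is undefined for a constant series
    return cov / math.sqrt(var_x * var_y)


def return_volume_correlation(close, volume, window=5):
    """Rolling correlation of daily return and volume for one stock.

    Mirrors correlation(close_0/shift(close_0,1)-1, volume_0, 5).
    Days without a full window of returns get NaN (an assumed convention).
    """
    # Daily return close_t / close_{t-1} - 1; undefined on the first day.
    returns = [math.nan] + [close[t] / close[t - 1] - 1 for t in range(1, len(close))]
    factor = []
    for t in range(len(close)):
        start = t - window + 1
        if start < 1:  # the window would include the undefined first return
            factor.append(math.nan)
        else:
            factor.append(pearson(returns[start:t + 1], volume[start:t + 1]))
    return factor


close = [10.0, 10.2, 10.1, 10.4, 10.3, 10.6, 10.5]
volume = [1000, 1500, 900, 1800, 950, 2000, 1000]
factor = return_volume_correlation(close, volume)
```

In this toy series, volume is high on up days and low on down days, so the two defined factor values are positive.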
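Function reference
The expression engine provides a number of simple functions. Some of them are explained here:
- Functions fall into two broad groups, cross-sectional and time-series; most time-series function names start with $ts\_$
- Most function names are self-explanatory
- $abs(x)$ and $log(x)$ return the absolute value and the natural logarithm of $x$
- $rank(x)$ is the ascending rank of a stock's $x$ value in the cross-section, normalized to the closed interval [0,1]
- $delay(x,d)$ is the value of $x$ $d$ days ago
- $delta(x,d)$ is the latest value of $x$ minus its value $d$ days ago
- $correlation(x,y,d)$ and $covariance(x,y,d)$ are the Pearson correlation coefficient and the covariance of $x$ and $y$ over a time window of length $d$
- The behavior of $ts\_min(x,d)$, $ts\_max(x,d)$, $ts\_argmax(x,d)$, $ts\_argmin(x,d)$, $ts\_rank(x,d)$, $sum(x,d)$, $stddev(x,d)$ and similar functions can be inferred from their names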
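The definitions above can be illustrated with a small pure-Python sketch. These are not the bigexpr implementations. In particular, the scaling used for `rank` (position divided by n - 1, ties broken by input order) and the NaN padding for missing history are assumptions made for illustration.

```python
import math


def rank(values):
    """Cross-sectional ascending rank scaled to [0, 1].

    Assumed scaling: position / (n - 1); ties are broken by input order.
    """
    n = len(values)
    if n == 1:
        return [0.0]
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    for position, i in enumerate(order):
        ranks[i] = position / (n - 1)
    return ranks


def delay(series, d):
    """Value of the series d days ago; NaN where there is no history."""
    return [series[t - d] if t >= d else math.nan for t in range(len(series))]


def delta(series, d):
    """Latest value minus the value d days ago."""
    return [x - prev for x, prev in zip(series, delay(series, d))]


def ts_max(series, d):
    """Maximum over the trailing window of length d."""
    return [max(series[t - d + 1:t + 1]) if t >= d - 1 else math.nan
            for t in range(len(series))]


close = [10.0, 10.5, 10.2, 11.0]          # one stock over four days
cross_section = [3.0, 1.0, 2.0]           # three stocks on one day

ranked = rank(cross_section)              # [1.0, 0.0, 0.5]
lagged = delay(close, 1)                  # [nan, 10.0, 10.5, 10.2]
change = delta(close, 1)                  # [nan, ~0.5, ~-0.3, ~0.8]
rolling_high = ts_max(close, 2)           # [nan, 10.5, 10.5, 11.0]
```

Note the difference in axes: `rank` works across stocks on one day, while `delay`, `delta` and `ts_max` work along the history of one stock.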
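Factor reference
The BigQuant platform provides more than 2,000 built-in factors, covering basic information, price and volume, valuation, financial statements, technical indicators and more. A few are listed here as examples.
Basic information factors
- list_days # Days since listing
- list_board_0 # Listing board
- company_found_date_0 # Days since the company was founded
- industry_sw_level1_0 # Shenwan level-1 industry classification
- st_status_0 # ST status
- in_sse50_0 # Whether the stock is an SSE 50 index constituent
- in_csi300_0 # Whether the stock is a CSI 300 index constituent
Price and volume factors
- open_0 # Opening price on the day
- open_1 # Opening price on the previous day
- close_0 # Closing price on the day
- high_0 # Highest price on the day
- low_0 # Lowest price on the day
- volume_0 # Trading volume on the day
- amount_0 # Turnover (traded value) on the day
- adjust_factor_0 # Price adjustment factor
Valuation factors
- market_cap_0 # Total market capitalization
- rank_market_cap_0 # Rank of total market capitalization
- pe_ttm_0 # Price-to-earnings ratio (TTM)
- rank_pe_ttm_0 # Ascending percentile rank of the price-to-earnings ratio (TTM)
- pe_lyr_0 # Price-to-earnings ratio (LYR)
- pb_lf_0 # Price-to-book ratio (LF)
- ps_ttm_0 # Price-to-sales ratio (TTM)
Financial statement factors
- fs_net_profit_0 # Net profit attributable to shareholders of the parent company
- fs_net_profit_yoy_0 # Year-over-year growth of net profit attributable to shareholders of the parent company
- fs_net_profit_qoq_0 # Quarter-over-quarter growth of net profit attributable to shareholders of the parent company
- fs_roe_0 # Return on equity
- fs_roa_0 # Return on assets
- fs_gross_profit_margin_0 # Gross profit margin
- fs_net_profit_margin_0 # Net profit margin
- fs_eps_0 # Earnings per share
- fs_bps_0 # Book value per share
- fs_cash_ratio_0 # Cash ratio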
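Data labeling
Like factor construction, data labeling is a very important part of a machine learning workflow. For more detail, see the Custom Labeling (自定义标注) documentation.
Before expressions were available, labeling was done mainly with fast_auto_labeler. Since expressions were introduced, it is done mainly with advanced_auto_labeler. The overall idea and content of the labeling are captured in label_expr, which is a list.
The example code is as follows.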
conf.label_expr = [
    # Compute the relative return
    'shift(close, -5) / shift(open, -1) - shift(benchmark_close, -5) / shift(benchmark_open, -1)',
    # Outlier handling: clip at the 1% and 99% quantiles
    'clip(label, all_quantile(label, 0.01), all_quantile(label, 0.99))',
    # Map the scores to classes; 20 classes are used here
    'all_wbins(label, 20)',
    # Filter out one-price limit-up days (set label to NaN; NaN labels are ignored in later processing and training)
    'where(shift(high, -1) == shift(low, -1), NaN, label)'
]
m1 = M.advanced_auto_labeler.v1(
    instruments=conf.instruments, start_date=conf.start_date, end_date=conf.split_date,
    label_expr=conf.label_expr, benchmark='000300.SHA')
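The example code works as follows:
- label_expr is a list whose four elements define the labeling steps; for details, see the Expression Engine (表达式引擎) documentation
- The relative return over a future period is computed as the raw basis for the label; a bigexpr expression makes this step quick to write
- The clip and all_quantile functions handle outliers
- The raw values are discretized; either equal-width or equal-frequency binning can be used, and each has its own strengths and weaknesses
- The where function filters out samples from one-price limit-up days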
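The clipping, binning and filtering steps above can be sketched in pure Python. This is an illustration, not the bigexpr implementation. The helper names are hypothetical, and the quantile interpolation and zero-based bin indices are assumptions. The sketch also contrasts equal-width with equal-frequency binning on the same data.

```python
import math


def quantile(values, q):
    """Quantile with linear interpolation between order statistics (assumed convention)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def clip(values, lower, upper):
    """Limit every value to the interval [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


def equal_width_bins(values, bins):
    """Bin index 0..bins-1 after splitting [min, max] into equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def equal_frequency_bins(values, bins):
    """Bin index 0..bins-1 with roughly the same number of samples per bin."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    result = [0] * n
    for position, i in enumerate(order):
        result[i] = position * bins // n
    return result


def mask_one_price_days(labels, next_high, next_low):
    """Set the label to NaN where the next day's high equals its low."""
    return [math.nan if h == l else y for y, h, l in zip(labels, next_high, next_low)]


# Hypothetical raw relative returns for eight samples, with two outliers.
raw = [-0.30, -0.02, -0.01, 0.0, 0.01, 0.02, 0.03, 0.50]
lower, upper = quantile(raw, 0.01), quantile(raw, 0.99)
clipped = clip(raw, lower, upper)

width_bins = equal_width_bins(clipped, 4)      # [0, 1, 1, 1, 1, 1, 1, 3]
freq_bins = equal_frequency_bins(clipped, 4)   # [0, 0, 1, 1, 2, 2, 3, 3]

next_high = [10.1, 10.2, 11.0, 10.4, 10.5, 10.6, 10.7, 10.8]
next_low = [9.9, 10.0, 11.0, 10.1, 10.2, 10.3, 10.4, 10.5]
labels = mask_one_price_days(width_bins, next_high, next_low)
```

With equal-width bins, the two remaining extreme values occupy the outer bins and most samples crowd into one bin. Equal-frequency bins spread the samples evenly but discard the size of the gaps between them.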
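Single-factor test
Here we use the factor 'shift(close_0,15) / close_0' as an example to show how to run a single-factor test and develop an AI strategy based on a single factor.
(Addendum: if you would rather not write code, see the reply by Yoga 会飞的鱼 on the third floor of this thread.)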
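As a rough illustration of what the example factor measures, the sketch below evaluates it on a toy price series in pure Python. The `shift` helper is a hypothetical stand-in for the bigexpr function. It follows the usage in this article, where a positive offset looks back and a negative offset (as in the label expressions) looks ahead. The NaN padding is an assumption.

```python
import math


def shift(series, d):
    """Value d rows earlier (d > 0) or -d rows later (d < 0); NaN when out of range."""
    n = len(series)
    return [series[t - d] if 0 <= t - d < n else math.nan for t in range(n)]


def price_ratio_factor(close, lag=15):
    """shift(close_0, lag) / close_0: the close `lag` days ago over today's close."""
    return [past / now for past, now in zip(shift(close, lag), close)]


# Toy series: the close rises by 1 each day from 100 to 119.
close = [100.0 + i for i in range(20)]
factor = price_ratio_factor(close)

# Forward-looking shift, as used in label expressions such as shift(close, -5).
future_close = shift(close, -5)
```

A factor value below 1 means the price has risen over the last 15 trading days, and a value above 1 means it has fallen.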
import math

import numpy as np

## Basic configuration
class conf:
    start_date = '2014-01-01'
    end_date = '2017-07-17'
    split_date = '2015-01-01'
    instruments = D.instruments(start_date, end_date)
    hold_days = 5
    features = ['shift(close_0,15) / close_0']
    # Data labeling
    label_expr = [
        # Relative return over a future period (hold_days)
        'shift(close, -5) / shift(open, -1) - shift(benchmark_close, -5) / shift(benchmark_open, -1)',
        # Outlier handling: clip at the 1% and 99% quantiles
        'clip(label, all_quantile(label, 0.01), all_quantile(label, 0.99))',
        # Map the scores to 20 classes using equal-width binning
        'all_wbins(label, 20)',
        # Filter out one-price limit-up days (set label to NaN; NaN labels are ignored in later processing and training)
        'where(shift(high, -1) == shift(low, -1), NaN, label)'
    ]

## Backtest https://bigquant.com/docs/#/develop?id=%E5%9B%9E%E6%B5%8B%E6%9C%BA%E5%88%B6
# Backtest engine: prepare data; runs only once
def prepare(context):
    # context.start_date / end_date: in a backtest these are the arguments passed to the trader;
    # in live trading the system replaces them with the live dates
    instruments = D.instruments()
    ## Predict on out-of-sample data
    n0 = M.general_feature_extractor.v5(
        instruments=instruments,
        start_date=context.start_date, end_date=context.end_date,
        features=conf.features)
    n1 = M.derived_feature_extractor.v1(
        data=n0.data,
        features=conf.features)
    n2 = M.transform.v2(data=n1.data, transforms=None, drop_null=True)
    n3 = M.stock_ranker_predict.v5(model=context.options['model'], data=n2.data)
    context.instruments = n3.instruments
    context.options['predictions'] = n3.predictions

# Backtest engine: initialization function; runs only once
def initialize(context):
    # Load the predictions, passed in through options, into memory as a DataFrame with read_df
    context.ranker_prediction = context.options['predictions'].read_df()
    # The system sets default commission and slippage; use the call below to change the commission
    context.set_commission(PerOrder(buy_cost=0.0003, sell_cost=0.0013, min_cost=5))
    # Number of stocks to buy: the top stock_count stocks in the predicted ranking
    stock_count = 3
    # Weight of each stock; this scheme gives higher-ranked stocks more capital,
    # e.g. with 5 stocks: [0.339160, 0.213986, 0.169580, ..]
    context.stock_weights = T.norm([1 / math.log(i + 2) for i in range(0, stock_count)])
    # Maximum fraction of capital that a single stock may occupy
    context.max_cash_per_instrument = 0.2

# Backtest engine: daily data handler; runs once per day
def handle_data(context, data):
    # Filter the predictions down to today's date
    ranker_prediction = context.ranker_prediction[
        context.ranker_prediction.date == data.current_dt.strftime('%Y-%m-%d')]

    # 1. Cash allocation
    # The average holding period is hold_days and stocks are bought every day, so about 1/hold_days of the capital is used per day
    # In practice purchases are not exact, so during the first hold_days days an equal share is used;
    # afterwards, remaining cash is used as far as possible (capped here at 1.5 times the equal share)
    is_staging = context.trading_day_index < context.options['hold_days']  # still building positions (first hold_days days)?
    cash_avg = context.portfolio.portfolio_value / context.options['hold_days']
    cash_for_buy = min(context.portfolio.cash, (1 if is_staging else 1.5) * cash_avg)
    cash_for_sell = cash_avg - (context.portfolio.cash - cash_for_buy)
    positions = {e.symbol: p.amount * p.last_sale_price
                 for e, p in context.perf_tracker.position_tracker.positions.items()}

    # 2. Generate sell orders: selling starts only after hold_days days; held stocks are dropped from the bottom of the StockRanker ranking
    if not is_staging and cash_for_sell > 0:
        equities = {e.symbol: e for e, p in context.perf_tracker.position_tracker.positions.items()}
        instruments = list(reversed(list(ranker_prediction.instrument[ranker_prediction.instrument.apply(
            lambda x: x in equities and not context.has_unfinished_sell_order(equities[x]))])))
        # print('rank order for sell %s' % instruments)
        for instrument in instruments:
            context.order_target(context.symbol(instrument), 0)
            cash_for_sell -= positions[instrument]
            if cash_for_sell <= 0:
                break

    # 3. Generate buy orders: buy the top stock_count stocks in the StockRanker ranking
    buy_cash_weights = context.stock_weights
    buy_instruments = list(ranker_prediction.instrument[:len(buy_cash_weights)])
    max_cash_per_instrument = context.portfolio.portfolio_value * context.max_cash_per_instrument
    for i, instrument in enumerate(buy_instruments):
        cash = cash_for_buy * buy_cash_weights[i]
        if cash > max_cash_per_instrument - positions.get(instrument, 0):
            # Make sure the position does not exceed the maximum capital allowed per stock
            cash = max_cash_per_instrument - positions.get(instrument, 0)
        if cash > 0:
            price = data.current(context.symbol(instrument), 'price')
            lots = int(cash/price/100)
            context.order_lots(context.symbol(instrument), lots)

## Train the model on the training set
# Data labeling
m1 = M.advanced_auto_labeler.v1(
    instruments=conf.instruments, start_date=conf.start_date, end_date=conf.split_date,
    label_expr=conf.label_expr, benchmark='000300.SHA', cast_label_int=True)
# Extract base features
m2_1 = M.general_feature_extractor.v5(
    instruments=D.instruments(),
    start_date=conf.start_date, end_date=conf.split_date,
    features=conf.features)
# Extract derived features
m2_2 = M.derived_feature_extractor.v1(
    data=m2_1.data,
    features=conf.features)
# Feature transformation
m3 = M.transform.v2(data=m2_2.data, transforms=None, drop_null=True)
# Join labels and features
m4 = M.join.v2(data1=m1.data, data2=m3.data, on=['date', 'instrument'], sort=False)
# Train the model
m5 = M.stock_ranker_train.v4(training_ds=m4.data, features=conf.features)

## Backtest on the test set
m6 = M.trade.v3(
    instruments=None,
    start_date=conf.split_date,
    end_date=conf.end_date,
    prepare=prepare,
    initialize=initialize,
    handle_data=handle_data,
    order_price_field_buy='open',
    order_price_field_sell='close',
    capital_base=50001,
    benchmark='000300.SHA',
    options={'hold_days': conf.hold_days, 'model': m5.model_id},
    m_deps=np.random.rand()
)
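The position-sizing arithmetic in the strategy can be checked in isolation. The sketch below is a standalone approximation, not platform code. `norm` is a hypothetical stand-in that assumes T.norm rescales the weights to sum to 1. That assumption is consistent with the weights quoted in the code comment (0.339160, 0.213986, 0.169580), which this sketch reproduces with five stocks. `cash_for_buy` repeats the daily budget rule from handle_data.

```python
import math


def norm(weights):
    """Rescale weights so that they sum to 1 (assumed behavior of T.norm)."""
    total = sum(weights)
    return [w / total for w in weights]


def stock_weights(stock_count):
    """Weights proportional to 1 / log(i + 2), so higher-ranked stocks get more cash."""
    return norm([1 / math.log(i + 2) for i in range(stock_count)])


def cash_for_buy(portfolio_value, cash, trading_day_index, hold_days):
    """Daily buying budget: one equal share while building positions, up to 1.5 shares afterwards."""
    is_staging = trading_day_index < hold_days
    cash_avg = portfolio_value / hold_days
    return min(cash, (1 if is_staging else 1.5) * cash_avg)


weights_5 = stock_weights(5)   # starts with about 0.339160, 0.213986, 0.169580
weights_3 = stock_weights(3)   # about 0.469279, 0.296082, 0.234639

budget_day_0 = cash_for_buy(100000, 100000, trading_day_index=0, hold_days=5)
budget_day_10 = cash_for_buy(100000, 50000, trading_day_index=10, hold_days=5)
```

With stock_count = 3, as in the listing, the top-ranked stock would receive about 47% of the daily budget before the per-stock cap of 20% of portfolio value is applied.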
List of the 101 Alphas
Alpha_1 'where(mean(amount_0,20)<volume_0,((-1*ts_rank(abs(delta(close_0,7)),60))*sign(delta(close_0,7))),-1)'
Alpha_2 'rank(ts_argmax(signedpower(where(close_0/shift(close_0,1)-1<0,std(close_0/shift(close_0,1)-1<0,20),close_0),2),5))-0.5'
Alpha_3 '-1*correlation(rank(delta(log(volume_0),2)),rank(((close_0-open_0)/open_0)),6)'
Alpha_4 '-1*correlation(rank(open_0),rank(volume_0),10)'
Alpha_5 '-1*ts_rank(rank(low_0),9)'
Alpha_6 'rank((open_0-(sum(amount_0/volume_0*adjust_factor_0,10)/10)))*(-1*abs(rank((close_0-amount_0/volume_0*adjust_factor_0))))'
Alpha_7 '-1*correlation(open_0,volume_0,10)'
Alpha_8 'where(mean(amount_0,20)<volume_0,((-1*ts_rank(abs(delta(close_0,7)),60))*sign(delta(close_0,7))),-1)'
Alpha_9 '(-1*rank(((sum(open_0,5)*sum(close_0/shift(close_0,1)-1,5))-delay((sum(open_0,5)*sum(close_0/shift(close_0,1)-1,5)),10))))'
Alpha_10 'where(0<ts_min(delta(close_0,1),5),delta(close_0,1),where(ts_max(delta(close_0,1),5)<0,delta(close_0,1),-1*delta(close_0,1)))'
Alpha_11 'rank(where(0<ts_min(delta(close_0,1),4),delta(close_0,1),where(ts_max(delta(close_0,1),4)<0,delta(close_0,1),-1*delta(close_0,1))))'
Alpha_12 '(rank(ts_max((amount_0/volume_0*adjust_factor_0-close_0),3))+rank(ts_min((amount_0/volume_0*adjust_factor_0-close_0),3)))*rank(delta(volume_0,3))'
Alpha_13 'sign(delta(volume_0,1))*(-1*delta(close_0,1))'
Alpha_14 '-1*rank(covariance(rank(close_0),rank(volume_0),5))'
Alpha_15 '(-1*rank(delta(close_0/shift(close_0,1)-1,3)))*correlation(open_0,volume_0,10)'
Alpha_16 '-1*sum(rank(correlation(rank(high_0),rank(volume_0),3)),3)'
Alpha_17 '-1*rank(covariance(rank(high_0),rank(volume_0),5))'
Alpha_18 '((-1*rank(ts_rank(close_0,10)))*rank(delta(delta(close_0,1),1)))*rank(ts_rank((volume_0/mean(amount_0,20)),5))'
Alpha_19 '-1*rank(((std(abs((close_0-open_0)),5)+(close_0-open_0))+correlation(close_0,open_0,10)))'
Alpha_20 '(-1*sign(((close_0-delay(close_0,7))+delta(close_0,7))))*(1+rank((1+sum(close_0/shift(close_0,1)-1,250))))'
Alpha_21 '((-1*rank((open_0-delay(high_0,1))))*rank((open_0-delay(close_0,1))))*rank((open_0-delay(low_0,1)))'
Alpha_22 'where(sum(close_0,8)/8+stddev(close_0,8)<sum(close_0,2)/2,-1,where(mean(close_0,2)<mean(close_0,8)-std(close_0,8),1,where((1<volume_0/mean(amount_0,20)) |(volume_0/mean(amount_0,20)==1),1,-1)))'
Alpha_23 '-1*(delta(correlation(high_0,volume_0,5),5)*rank(std(close_0,20)))'
Alpha_24 'where(sum(high_0,20)/20<high_0,-1*delta(high_2,0),0)'
Alpha_25 'where((delta(mean(close_0,100),100)/delay(close_0,100)<0.05) |(delta(mean(close_0,100),100)/delay(close_0,100)==0.05) ,-1*(close_0-ts_min(close_0,100)),-1*delta(close_0,2))'
Alpha_26 'rank(-1*(close_0/shift(close_0,1)-1)*mean(amount_0,20)*amount_0/volume_0*adjust_factor_0*(high_0-close_0))'
Alpha_27 '-1*ts_max(correlation(ts_rank(volume_0,5),ts_rank(high_0,5),5),3)'
Alpha_28 'where(0.5<rank((sum(correlation(rank(volume_0),rank(amount_0/volume_0*adjust_factor_0),6),2)/2.0)),-1,1)'
Alpha_29 'scale(correlation(mean(amount_0,20),low_0,5)+(high_0+low_0)*0.5-close_0)'
Alpha_30 'min(product(rank(rank(scale(log(sum(ts_min(rank(rank((-1*rank(delta((close_0-1),5))))),2),1))))),1),5)+ts_rank(delay((-1*shift(close_0,1)/close_0-1),6),5)'
Alpha_31 '((1.0-rank(((sign((close_0-delay(close_0,1)))+sign((delay(close_0,1)-delay(close_0,2)))) +sign((delay(close_0,2)-delay(close_0,3))))))*sum(volume_0,5))/sum(volume_0,20)'
Alpha_32 '(rank(rank(rank(decay_linear((-1*rank(rank(delta(close_0,10)))),10))))+rank((-1*delta(close_0,3))))+sign(scale(correlation(mean(amount_0,20),low_0,12)))'
Alpha_33 'scale(((sum(close_0,7)/7)-close_0))+20*scale(correlation(amount_0/volume_0*adjust_factor_0,delay(close_0,5),230))'
Alpha_34 'rank((-1*((1-(open_0/close_0)))))'
Alpha_35 'rank(((1-rank((std(close_0/shift(close_0,1),2)/stddev(close_0/shift(close_0,1)-1,5))))+(1-rank(delta(close_0,1)))))'
Alpha_36 'ts_rank(volume_0,32)*(1-ts_rank(((close_0+high_0)-low_0),16))*(1-ts_rank(close_0/shift(close_0,1)-1,32))'
Alpha_37 '((((2.21*rank(correlation((close_0-open_0),delay(volume_0,1),15)))+(0.7*rank((open_0-close_0))))+(0.73*rank(ts_rank(delay((-1*close_0/shift(close_0,1)-1),6),5))))+rank(abs(correlation(amount_0/volume_0*adjust_factor_0,mean(amount_0,20),6))))+(0.6*rank((((sum(close_0,200)/200)-open_0)*(close_0-open_0))))'
Alpha_38 'rank(correlation(delay((open_0-close_0),1),close_0,200))+rank((open_0-close_0))'
Alpha_39 '(-1*rank(ts_rank(close_0,10)))*rank((close_0/open_0))'
Alpha_40 '((-1*rank((delta(close_0,7)*(1-rank(decay_linear((volume_0/mean(amount_0,20)),9))))))*(1 +rank(sum(close_0/shift(close_0,1),250))))'
Alpha_41 '((-1*rank(std(high_0,10)))*correlation(high_0,volume_0,10))'
Alpha_42 '(((high_0*low_0)**0.5)-amount_0/volume_0*adjust_factor_0)'
Alpha_43 '(rank((amount_0/volume_0*adjust_factor_0-close_0))/rank((amount_0/volume_0*adjust_factor_0+close_0)))'
Alpha_44 '(ts_rank((volume_0/mean(amount_0,20)),20)*ts_rank((-1*delta(close_0,7)),8))'
Alpha_45 '(-1*correlation(high_0,rank(volume_0),5))'
Alpha_46 '(-1*((rank((sum(delay(close_0,5),20)/20))*correlation(close_0,volume_0,2))*rank(correlation(sum(close_0,5),sum(close_0,20),2))))'
Alpha_47 'where((0.25<(((delay(close_0,20)-delay(close_0,10))/10)-((delay(close_0,10)-close_0)/10))),-1,where(((((delay(close_0,20)-delay(close_0,10))/10)-((delay(close_0,10)-close_0)/10))<0),1,((-1*1)*(close_0-delay(close_0,1)))))'
Alpha_48 '(((rank((1/close_0))*volume_0)/mean(amount_0,20))*((high_0*rank((high_0-close_0)))/(sum(high_0,5) /5)))-rank((amount_0/volume_0*adjust_factor_0-delay(amount_0/volume_0*adjust_factor_0,5)))'
Alpha_49 '((correlation(delta(close_0,1),delta(delay(close_0,1),1),250)*delta(close_0,1))/close_0)/group_mean(industry_sw_level1_0,((correlation(delta(close_0,1),delta(delay(close_0,1),1),250)*delta(close_0,1))/close_0))/sum(((delta(close_0,1)/delay(close_0,1))**2),250)'
Alpha_50 'where(((((delay(close_0,20)-delay(close_0,10))/10)-((delay(close_0,10)-close_0)/10))<(-1*0.1)),1,(close_0-delay(close_0,1))*(-1))'
Alpha_51 '(-1*ts_max(rank(correlation(rank(volume_0),rank(amount_0/volume_0*adjust_factor_0),5)),5))'
Alpha_52 'where((((delay(close_0,20)-delay(close_0,10))/10)-((delay(close_0,10)-close_0)/10))<(-1*0.05),1,-1*(close_0-delay(close_0,1)))'
Alpha_53 '(((-1*ts_min(low_0,5))+delay(ts_min(low_0,5),5))*rank(((sum(close_0/shift(close_0,1),240)-sum(close_0/shift(close_0,1),20))/220)))*ts_rank(volume_0,5)'
Alpha_54 '(-1*delta((((close_0-low_0)-(high_0-close_0))/(close_0-low_0)),9))'
Alpha_55 '((-1*((low_0-close_0)*(open_0**5)))/((low_0-high_0)*(close_0** 5)))'
Alpha_56 '-1*correlation(rank(((close_0-ts_min(low_0,12))/(ts_max(high_0,12)-ts_min(low_0,12)))),rank(volume_0),6)'
Alpha_57 '0-1*(1*(rank((sum(close_0/shift(close_0,1)-1,10)/sum(sum(close_0/shift(close_0,1)-1,2),3)))*rank(((close_0/shift(close_0,1)-1)*market_cap_0))))'
Alpha_58 '(0-(1*((close_0-amount_0/volume_0*adjust_factor_0)/decay_linear(rank(ts_argmax(close_0,30)),2))))'
Alpha_59 '(-1*ts_rank(decay_linear(correlation( amount_0/volume_0*adjust_factor_0/group_mean(industry_sw_level1_0,amount_0/volume_0*adjust_factor_0),volume_0,4),8),5))'
Alpha_60 '(0-(1*((2*scale(rank(((((close_0-low_0)-(high_0-close_0))/(high_0-low_0))*volume_0))))-scale(rank(ts_argmax(close_0,10))))))'
Alpha_61 '(rank((amount_0/volume_0*adjust_factor_0-ts_min(amount_0/volume_0*adjust_factor_0,16)))<rank(correlation(amount_0/volume_0*adjust_factor_0,mean(amount_0,180),18)))'
Alpha_62 '(rank(correlation(amount_0/volume_0*adjust_factor_0,sum(mean(amount_0,20),22),10))<rank(((rank(open_0)+rank(open_0))<(rank(((high_0+low_0)/2))+rank(high_0)))))*-1'
Alpha_63 '((rank(decay_linear(delta(close_0/group_mean(industry_sw_level1_0,close_0),2),8))-rank(decay_linear(correlation(((amount_0/volume_0*adjust_factor_0*0.318108)+(open_0*(1-0.318108))),sum(mean(amount_0,180),37),14),12)))*-1)'
Alpha_64 '((rank(correlation(sum(((open_0*0.178404)+(low_0*(1-0.178404))),13),sum(mean(amount_0,20),13),17))<rank(delta(((((high_0+low_0)/2)*0.178404)+(amount_0/volume_0*adjust_factor_0*(1-0.178404))),4)))*-1)'
Alpha_65 '((rank(correlation(((open_0*0.00817205)+(amount_0/volume_0*adjust_factor_0*(1-0.00817205))),sum(mean(amount_0,60),9),6))<rank((open_0-ts_min(open_0,14))))*-1)'
Alpha_66 '((rank(decay_linear(delta(amount_0/volume_0*adjust_factor_0,4),7))+ts_rank(decay_linear(((((low_0* 0.96633)+(low_0*(1-0.96633)))-amount_0/volume_0*adjust_factor_0)/(open_0-((high_0+low_0)/2))),11),7))*-1)'
Alpha_67 '((rank((high_0-ts_min(high_0,2)))**rank(correlation( amount_0/volume_0*adjust_factor_0 /group_mean(industry_sw_level1_0,amount_0/volume_0*adjust_factor_0),mean(amount_0,20)/group_mean(industry_sw_level1_0,mean(amount_0,20)),6)))*-1)'
Alpha_68 '((ts_rank(correlation(rank(high_0),rank(mean(amount_0,15)),9),14)<rank(delta(((close_0*0.518371)+(low_0*(1-0.518371))),1.06157)))*-1)'
Alpha_69 '((rank(ts_max(delta(amount_0/volume_0*adjust_factor_0/group_mean(industry_sw_level1_0,amount_0/volume_0*adjust_factor_0),3),5))**ts_rank(correlation(((close_0*0.490655)+(amount_0/volume_0*adjust_factor_0*(1-0.490655))),mean(amount_0,20),5),9))*-1)'
Alpha_70 '((rank(delta(amount_0/volume_0*adjust_factor_0,1))**ts_rank(correlation( close_0/group_mean(industry_sw_level1_0,close_0),mean(amount_0,50),18),18))*-1)'
Alpha_71 'max(ts_rank(decay_linear(correlation(ts_rank(close_0,3),ts_rank(mean(amount_0,180),12),18),4),16),ts_rank(decay_linear((rank(((low_0+open_0)-(amount_0/volume_0*adjust_factor_0 +amount_0/volume_0*adjust_factor_0)))**2),16 ),4))'
Alpha_72 '(rank(decay_linear(correlation(((high_0+low_0)/2),mean(amount_0,40),9),10)) /rank(decay_linear(correlation(ts_rank(amount_0/volume_0*adjust_factor_0,4),ts_rank(volume_0,19),7),3)))'
Alpha_73 '(max(rank(decay_linear(delta(amount_0/volume_0*adjust_factor_0,5),3)),ts_rank(decay_linear(((delta(((open_0* 0.147155)+(low_0*(1-0.147155))),2 ) /((open_0* 0.147155)+(low_0*(1-0.147155))))*-1),3),17))*-1)'
Alpha_74 '(rank(correlation(close_0,sum(mean(amount_0,30),37),15))<rank(correlation(rank(high_0*0.0261661+amount_0/volume_0*adjust_factor_0*(1-0.0261661)),rank(volume_0),11)))*-1'
Alpha_75 'rank(correlation(amount_0/volume_0*adjust_factor_0,volume_0,4 ))<rank(correlation(rank(low_0),rank(mean(amount_0,50)),12))'
Alpha_76 'max(rank(decay_linear(delta(amount_0/volume_0*adjust_factor_0,1),12)),ts_rank(decay_linear(ts_rank(correlation( low_0/group_mean(industry_sw_level1_0,low_0),mean(amount_0,81),8 ),20),17),19))*-1'
Alpha_77 'min(rank(decay_linear(((((high_0+low_0)/2)+high_0)-(amount_0/volume_0*adjust_factor_0+high_0)),20 )),rank(decay_linear(correlation(((high_0+low_0)/2),mean(amount_0,40),3),6)))'
Alpha_78 'rank(correlation(sum(((low_0*0.352233)+(amount_0/volume_0*adjust_factor_0*(1-0.352233))),20),sum(mean(amount_0,20),20),7))**rank(correlation(rank(amount_0/volume_0*adjust_factor_0),rank(volume_0),6))'
Alpha_79 'rank(delta((close_0*0.60733+open_0*(1-0.60733))/ group_mean(industry_sw_level1_0,(close_0*0.60733+open_0*(1-0.60733))),1))<rank(correlation(ts_rank(amount_0/volume_0*adjust_factor_0,4),ts_rank(mean(amount_0,150),9),115))'
Alpha_80 '(rank(sign(delta((open_0*0.868128+high_0*(1-0.868128))/group_mean(industry_sw_level1_0,(open_0*0.868128+high_0*(1-0.868128))),4)))**ts_rank(correlation(high_0,mean(amount_0,10),5),6))*-1'
Alpha_81 '(rank(log(product(rank((rank(correlation(amount_0/volume_0*adjust_factor_0,sum(mean(amount_0,10),50),8))**4)),15)))<rank(correlation(rank(amount_0/volume_0*adjust_factor_0),rank(volume_0),5)))*-1'
Alpha_82 'min(rank(decay_linear(delta(open_0,1.46063),15)),ts_rank(decay_linear(correlation( volume_0/group_mean(industry_sw_level1_0,volume_0),((open_0*0.634196) +(open_0*(1-0.634196))),17),7),13))*-1'
Alpha_83 '(rank(delay(((high_0-low_0)/(sum(close_0,5)/5)),2))*rank(rank(volume_0)))/(((high_0-low_0)/(sum(close_0,5)/5))/(amount_0/volume_0*adjust_factor_0-close_0))'
Alpha_84 'signedpower(ts_rank((amount_0/volume_0*adjust_factor_0-ts_max(amount_0/volume_0*adjust_factor_0,15)),20),delta(close_0,5))'
Alpha_85 'rank(correlation(((high_0*0.876703)+(close_0*(1-0.876703))),mean(amount_0,30),10))**rank(correlation(ts_rank(((high_0+low_0)/2),4),ts_rank(volume_0,10),7))'
Alpha_86 '(ts_rank(correlation(close_0,sum(mean(amount_0,20),15),6),20)<rank(((open_0+close_0)-(amount_0/volume_0*adjust_factor_0+open_0))))*-1'
Alpha_87 'max(rank(decay_linear(delta(((close_0*0.369701)+(amount_0/volume_0*adjust_factor_0*(1-0.369701))),2),3)),ts_rank(decay_linear(abs(correlation( mean(amount_0,81) /group_mean(industry_sw_level1_0,mean(amount_0,81)) ,close_0,14)),5),14))*-1'
Alpha_88 'min(rank(decay_linear(((rank(open_0)+rank(low_0))-(rank(high_0)+rank(close_0))),8)),ts_rank(decay_linear(correlation(ts_rank(close_0,8),ts_rank(mean(amount_0,60),21),8),7),3))'
Alpha_89 'ts_rank(decay_linear(correlation(((low_0*0.967285)+(low_0*(1-0.967285))),mean(amount_0,10),7),6),4)-ts_rank(decay_linear(delta( amount_0/volume_0*adjust_factor_0/group_mean(industry_sw_level1_0,amount_0/volume_0*adjust_factor_0),3),10),15)'
Alpha_90 '(rank((close_0-ts_max(close_0,5)))**ts_rank(correlation(mean(amount_0,40)/group_mean(industry_sw_level1_0,mean(amount_0,40)),low_0,5),3))*-1'
Alpha_91 '(ts_rank(decay_linear(decay_linear(correlation(close_0/group_mean(industry_sw_level1_0,close_0),volume_0,10),16),4),5)-rank(decay_linear(correlation(amount_0/volume_0*adjust_factor_0,mean(amount_0,30),4),3)))*-1'
Alpha_92 'min(ts_rank(decay_linear(((((high_0+low_0)/2)+close_0)<(low_0+open_0)),15),19),ts_rank(decay_linear(correlation(rank(low_0),rank(mean(amount_0,30)),8),7),7))'
Alpha_93 'ts_rank(decay_linear(correlation((amount_0/volume_0*adjust_factor_0)/group_mean(industry_sw_level1_0,amount_0/volume_0*adjust_factor_0) ,mean(amount_0,81),17),20),8)/rank(decay_linear(delta(((close_0*0.524434)+(amount_0/volume_0*adjust_factor_0*(1-0.524434))),3),16))'
Alpha_94 '(rank((amount_0/volume_0*adjust_factor_0-ts_min(amount_0/volume_0*adjust_factor_0,12)))**ts_rank(correlation(ts_rank(amount_0/volume_0*adjust_factor_0,20),ts_rank(mean(amount_0,60),4),18),3))*-1'
Alpha_95 'rank((open_0-ts_min(open_0,12)))<ts_rank((rank(correlation(sum(((high_0+low_0)/ 2),19),sum(mean(amount_0,40),19),13))**5),12)'
Alpha_96 'max(ts_rank(decay_linear(correlation(rank(amount_0/volume_0*adjust_factor_0),rank(volume_0),4),4),8),ts_rank(decay_linear(ts_argmax(correlation(ts_rank(close_0,7),ts_rank(mean(amount_0,60),4),4),13),14),13))*-1'
Alpha_97 '(rank(decay_linear(delta(((low_0*0.721001)+(amount_0/volume_0*adjust_factor_0*(1-0.721001)))/group_mean(industry_sw_level1_0,(low_0*0.721001)+(amount_0/volume_0*adjust_factor_0*(1-0.721001))),3),20)) -ts_rank(decay_linear(ts_rank(correlation(ts_rank(low_0,8),ts_rank(mean(amount_0,60),17),5),16),16),7))*-1'
Alpha_98 'rank(decay_linear(correlation(amount_0/volume_0*adjust_factor_0,sum(mean(amount_0,5),26),5),7))-rank(decay_linear(ts_rank(ts_argmin(correlation(rank(open_0),rank(mean(amount_0,15)),21),9),7),8))'
Alpha_99 '(rank(correlation(sum(((high_0+low_0)/2),20),sum(mean(amount_0,60),20),9)) <rank(correlation(low_0,volume_0,6)))*-1'
Alpha_100 '-1*(((1.5*scale(rank(((((close_0-low_0)-(high_0-close_0))/(high_0-low_0))*volume_0))/group_mean(industry_sw_level2_0,rank(((((close_0-low_0)-(high_0-close_0))/(high_0-low_0))*volume_0)))))-scale((correlation(close_0,rank(mean(amount_0,20)),5)-rank(ts_argmin(close_0,30)))/group_mean(industry_sw_level2_0,(correlation(close_0,rank(mean(amount_0,20)),5)-rank(ts_argmin(close_0,30))))))*(volume_0/mean(amount_0,20)))'
Alpha_101 '(close_0-open_0)/((high_0-low_0)+0.001)'
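This section lists the 101 alphas published by WorldQuant and their expressions. If you are interested, you can experiment with the code from the single-factor test; the only change needed is to swap in a different factor expression. We hope you will develop consistently profitable strategies and discover new alphas.
Note: some factors may be Boolean, taking only the values 1 or -1. Passing such a factor on its own to StockRanker may raise an error and cause model training to fail.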
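As a small worked example of swapping in a factor, the sketch below keeps two expressions from the list in a dictionary, selects one as the feature list, and evaluates Alpha_101 by hand for a single daily bar. The `alphas` dictionary and the `alpha_101` helper are hypothetical names used for illustration. On the platform, the selected expression would take the place of conf.features in the single-factor test code.

```python
def alpha_101(open_, high, low, close):
    """Alpha_101 for one daily bar: (close - open) / ((high - low) + 0.001)."""
    return (close - open_) / ((high - low) + 0.001)


# A hypothetical subset of the list, keyed by alpha name.
alphas = {
    'Alpha_4': '-1*correlation(rank(open_0),rank(volume_0),10)',
    'Alpha_101': '(close_0-open_0)/((high_0-low_0)+0.001)',
}

# The single-factor test takes a list with one expression.
features = [alphas['Alpha_101']]

# One ordinary bar and one bar where every price is the same.
normal_bar = alpha_101(open_=10.0, high=10.5, low=9.9, close=10.3)
flat_bar = alpha_101(open_=10.0, high=10.0, low=10.0, close=10.0)
```

The constant 0.001 in the denominator keeps the expression defined on bars where the high equals the low. For the flat bar the factor is 0 instead of a division by zero.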
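Summary: with the methods above, you can quickly build factors and label data using expressions on the strategy research platform.
This article is published by BigQuant Quant Academy (宽客学院). Copyright belongs to BigQuant; please credit the source when reposting.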