声明
通常,此分析是基于历史数据,而对历史风险暴露的估计可能会影响未来的风险暴露。 因此,计算因子风险暴露是不够的。 你必须对风险暴露保持信心,并明白对风险暴露的建模是否合理。
运用多因子模型计算因子风险暴露
我们可以运用多因子模型分析一个组合中风险和收益的来源,多因子模型对收益的分解如下:
$$R_i=a_i+b_{i1}F_1+b_{i2}F_2+...+b_{iK}F_K+\epsilon_i$$
通过对历史收益率进行建模,我们可以分析出收益中有多少是来自因子收益率,有多少来自资产特质波动($\epsilon$)。 我们也可以研究投资组合所面临的风险来源,即投资组合的因子暴露。
在风险分析中,我们经常对主动回报(相对于基准的回报)和主动风险(主动回报的标准差,也称为跟踪误差或跟踪风险)进行建模。
例如,我们可计算到一个因子对主动风险的边际贡献——FMCAR。 对于因子$j$,表示为:
$$FMCAR_j =\frac{ b_j\sum_{i=1}^Kb_icov(F_j,F_i)}{(Active \ Risk)^2}$$
$b_j$表示组合对因子$j$的风险暴露,$b_i$表示组合对因子$i$的风险暴露,$K$表示一共$K$个因子。$FMCAR_j$这项指标这告诉我们,假设其他条件不变,暴露在因子$j$下我们增加了多少风险。
本文尝试运用Fama-French三因子模型来演示一下因子风险暴露分析,欢迎 克隆。
克隆策略
In [13]:
# 导入要用到的模块
import numpy as np
import statsmodels.api as sm
import scipy.stats as stats
from statsmodels import regression
import matplotlib.pyplot as plt
import pandas as pd
In [14]:
start_date = '2016-01-01'
end_date = '2017-04-10'
raw_data = D.features(D.instruments(),start_date,end_date,fields=
['market_cap_0','rank_market_cap_0','pb_lf_0','rank_pb_lf_0','daily_return_0'])
# 每日股票数量
stock_count = raw_data[['date','instrument']].groupby('date').count()
result=raw_data.merge(stock_count.reset_index('date'), on=['date'], how='outer')
result = result.rename(columns={'instrument_x':'instrument','instrument_y':'stock_count'})
In [17]:
def is_smallest(x):
if x.rank_market_cap_0 < 600/x.stock_count:
return True
else:
return False
result['smallest'] = result.apply(is_smallest,axis=1)
def is_biggest(x):
if x.rank_market_cap_0 > 1-600/x.stock_count:
return True
else:
return False
result['biggest'] = result.apply(is_biggest,axis=1)
def is_lowpb(x):
if x.rank_pb_lf_0 < 600/x.stock_count:
return True
else:
return False
result['lowpb'] = result.apply(is_lowpb,axis=1)
def is_highpb(x):
if x.rank_pb_lf_0 > 1-600/x.stock_count:
return True
else:
return False
result['highpb'] = result.apply(is_highpb,axis=1)
result = result.set_index('date')
In [18]:
# 计算每日因子收益率,因子收益的定义,以市值因子举例,市值因子收益率=biggest组的平均收益率-smallest组的平均收益率
R_biggest = result[result.biggest]['daily_return_0'].groupby(level=0).mean()
R_smallest = result[result.smallest]['daily_return_0'].groupby(level=0).mean()
R_highpb = result[result.highpb]['daily_return_0'].groupby(level=0).mean()
R_lowpb = result[result.lowpb]['daily_return_0'].groupby(level=0).mean()
# 市值因子和市净率因子收益率
SMB = R_smallest - R_biggest
HML = R_lowpb - R_highpb
In [19]:
# 因子累计收益率并绘图
SMB_CUM = np.cumprod(SMB+1)
HML_CUM = np.cumprod(HML+1)
plt.plot(SMB_CUM.index, SMB_CUM.values,)
plt.plot(HML_CUM.index, HML_CUM.values)
plt.ylabel('Cumulative Return')
plt.legend(['SMB Portfolio Returns', 'HML Portfolio Returns']);
In [20]:
# 我们以5只股票的组合(portfolio)举例
instruments = D.instruments()[:5]
Stock_matrix = D.history_data(instruments,start_date,end_date,fields=['close'])
Stock_matrix = pd.pivot_table(Stock_matrix,values='close',index=['date'],columns=['instrument'])
portfolio = Stock_matrix.pct_change()[1:]
# 组合的每日收益率(等权重组合)
R = np.mean(portfolio, axis=1)
# 基准收益率
bench = D.history_data('000300.SHA',start_date, end_date, fields=['close']).set_index('date')['close'].pct_change()[1:]
# 主动收益率
active = R - bench
# 建立一个常数项,为下文回归做准备
constant = pd.TimeSeries(np.ones(len(active.index)), index=active.index)
df = pd.DataFrame({'R': active,
'F1': SMB,
'F2': HML,
'Constant': constant})
# 删除含有缺失值的行
df = df.dropna()
In [21]:
# 线性回归并获取回归系数
b1, b2 = regression.linear_model.OLS(df['R'], df[['F1', 'F2']]).fit().params
# 因子对于主动收益的敏感性(即因子暴露)
print('Sensitivities of active returns to factors:\nSMB: %f\nHML: %f' % (b1, b2))
In [22]:
# 计算因子风险贡献
F1 = df['F1']
F2 = df['F2']
cov = np.cov(F1, F2)
ar_squared = (active.std())**2
fmcar1 = (b1*(b2*cov[0,1] + b1*cov[0,0]))/ar_squared
fmcar2 = (b2*(b1*cov[0,1] + b2*cov[1,1]))/ar_squared
print('SMB Risk Contribution:', fmcar1)
print('HML Risk Contribution:', fmcar2)
In [23]:
# 计算滚动的beta
model = pd.stats.ols.MovingOLS(y = df['R'], x=df[['F1', 'F2']],
window_type='rolling',
window=100)
rolling_parameter_estimates = model.beta
rolling_parameter_estimates.plot()
plt.title('Computed Betas');
plt.legend(['F1 Beta', 'F2 Beta', 'Intercept']);
In [24]:
# 去除有缺省值的日期,从有实际有效值的日期开始
# 计算方差协方差
covariances = pd.rolling_cov(df[['F1', 'F2']], window=100)[99:]
# 计算主动风险
active_risk_squared = pd.rolling_std(active, window = 100)[99:]**2
# 计算beta
betas = rolling_parameter_estimates[['F1', 'F2']]
# 新建一个空的dataframe
FMCAR = pd.DataFrame(index=betas.index, columns=betas.columns)
# 每个因子循环
for factor in betas.columns:
# 每一天循环
for t in betas.index:
# 求beta与协方差之积的和,见公式
s = np.sum(betas.loc[t] * covariances[t][factor])
# 获取beta
b = betas.loc[t][factor]
# 获取主动风险
AR = active_risk_squared.loc[t]
# 估计当天的FMCAR
FMCAR[factor][t] = b * s / AR
In [25]:
# 因子对于主动收益风险的边际贡献
plt.plot(FMCAR['F1'].index, FMCAR['F1'].values)
plt.plot(FMCAR['F2'].index, FMCAR['F2'].values)
plt.ylabel('Marginal Contribution to Active Risk Squared')
plt.legend(['F1 FMCAR', 'F2 FMCAR'])
In [28]:
from statsmodels.stats.stattools import jarque_bera
_, pvalue1, _, _ = jarque_bera(FMCAR['F1'].dropna().values)
_, pvalue2, _, _ = jarque_bera(FMCAR['F2'].dropna().values)
print('p-value F1_FMCAR is normally distributed', pvalue1)
print('p-value F2_FMCAR is normally distributed', pvalue2)