Finding candlestick patterns in historical stock data that resemble the current CSI 300

Strategy Sharing

(Arthas) #1

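I can't make sense of the CSI 300's recent moves (not that I understood them before, either). I had read about the similar-candlestick feature that Huatai Securities launched, so I tried searching historical stock data for candlestick patterns that resemble how the CSI 300 is behaving now.
I found 3 matches for the 30-day window and 4 for the 100-day window.
Just for fun, don't take it seriously 🙂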

30-day matches

100-day matches


Code for drawing the candlestick charts

In [1]:
import matplotlib.pyplot as plt
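try:
    from matplotlib.finance import candlestick_ohlc  # older matplotlib
except ImportError:
    from mpl_finance import candlestick_ohlc  # matplotlib.finance was split out into the mpl_finance package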
In [7]:
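# Matches on the past 30 days: first plot the CSI 300 window itself
aa = '000300.SHA'
bb = '2017-07-18'
df = D.history_data(instruments=aa, fields=['open', 'low', 'high', 'close', 'amount'])
df = df[df.amount>0]  # drop days with no turnover
df = df[df.date<=bb]  # end the window at bb even if newer data is available
df.reset_index(drop=True, inplace=True)
back_len = 30
df0 = df.iloc[-back_len:].copy()  # copy, so the assignments below do not write into a slice of df
# candlestick_ohlc needs numeric x values, so replace the dates with bar numbers 0..n-1
df0['date'] = range(len(df0))
df0['date'] = df0['date'].astype(float)
f, ax1 = plt.subplots(figsize=(6,2))
candlestick_ohlc(ax=ax1, quotes=df0[['date', 'open', 'high', 'low', 'close']].values, colorup='r', colordown='g')
plt.xlim(-1, back_len)
plt.title(aa+'  '+bb)
plt.show()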


# Stocks whose 30 days up to the listed date resembled the window above
a = ['000811.SZA', '000733.SZA', '000988.SZA']
b = ['2015-12-02', '2015-04-03', '2013-03-04']
for aa, bb in zip(a, b):
    df = D.history_data(instruments=aa, fields=['open', 'low', 'high', 'close', 'amount'])
    df = df[df.amount>0]  # drop days with no turnover
    df.dropna(inplace=True)
    df.reset_index(drop=True, inplace=True)
    back_len = 30
    forward_len = 10
    indicator_len = len(df[df.date<=bb])  # number of bars up to and including the match date
    # back_len bars ending at the match date, plus the forward_len bars that followed
    df0 = df.iloc[indicator_len-back_len:indicator_len+forward_len].copy()
    df0['date'] = range(len(df0))
    df0['date'] = df0['date'].astype(float)
    f, ax1 = plt.subplots(figsize=(8,2))
    candlestick_ohlc(ax=ax1, quotes=df0[['date', 'open', 'high', 'low', 'close']].values, colorup='r', colordown='g')
    ax1.axvline(back_len-0.5)  # bars right of this line are what happened after the match
    plt.xlim(-1, 42)
    plt.title(aa+'  '+bb)
    plt.show()
In [8]:
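# Matches on the past 100 days: first plot the CSI 300 window itself
aa = '000300.SHA'
bb = '2017-07-18'
df = D.history_data(instruments=aa, fields=['open', 'low', 'high', 'close', 'amount'])
df = df[df.amount>0]  # drop days with no turnover
df = df[df.date<=bb]  # end the window at bb even if newer data is available
df.reset_index(drop=True, inplace=True)
back_len = 100
df0 = df.iloc[-back_len:].copy()  # copy, so the assignments below do not write into a slice of df
# candlestick_ohlc needs numeric x values, so replace the dates with bar numbers 0..n-1
df0['date'] = range(len(df0))
df0['date'] = df0['date'].astype(float)
f, ax1 = plt.subplots(figsize=(9,3))
candlestick_ohlc(ax=ax1, quotes=df0[['date', 'open', 'high', 'low', 'close']].values, colorup='r', colordown='g')
plt.xlim(-1, back_len)
plt.title(aa+'  '+bb)
plt.show()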


# Stocks whose 100 days up to the listed date resembled the window above
a = ['000429.SZA', '000989.SZA', '600303.SHA', '000982.SZA']
b = ['2013-02-20', '2012-03-16', '2011-03-10', '2013-02-25']
for aa, bb in zip(a, b):
    df = D.history_data(instruments=aa, fields=['open', 'low', 'high', 'close', 'amount'])
    df = df[df.amount>0]  # drop days with no turnover
    df.dropna(inplace=True)
    df.reset_index(drop=True, inplace=True)
    back_len = 100
    forward_len = 30
    indicator_len = len(df[df.date<=bb])  # number of bars up to and including the match date
    # back_len bars ending at the match date, plus the forward_len bars that followed
    df0 = df.iloc[indicator_len-back_len:indicator_len+forward_len].copy()
    df0['date'] = range(len(df0))
    df0['date'] = df0['date'].astype(float)
    f, ax1 = plt.subplots(figsize=(12,3))
    candlestick_ohlc(ax=ax1, quotes=df0[['date', 'open', 'high', 'low', 'close']].values, colorup='r', colordown='g')
    ax1.axvline(back_len-0.5)  # bars right of this line are what happened after the match
    plt.xlim(-1, 132)
    plt.title(aa+'  '+bb)
    plt.show()

(神龙斗士) #2

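Interesting research.

The code quality could be a bit better, though:

  • Start with the variable names: cut down on names like a, b, aa, bb and use more meaningful ones
  • Wrap blocks of code in functions and give them descriptive names (that say what they do)
  • Use M.cached for caching, so that others can reproduce the results quickly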
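
As a rough illustration of the second suggestion, the window arithmetic that every loop in the post repeats could live in one small function. This is only a sketch: `slice_window` and its parameter names are made up here, and it assumes the dates are a sorted list of ISO `YYYY-MM-DD` strings rather than a DataFrame column. Unlike the original slicing, it also clamps the window to the available data instead of wrapping around when a stock has fewer than `back_len` bars before the match date.

```python
from bisect import bisect_right


def slice_window(dates, signal_date, back_len, forward_len=0):
    """Return (start, stop) row positions for a chart window.

    The window covers back_len bars up to and including signal_date,
    plus forward_len bars after it. Hypothetical helper, not from the post.
    """
    # ISO 'YYYY-MM-DD' strings sort chronologically, so bisect works on them.
    # signal_pos is the number of bars dated on or before signal_date,
    # the same quantity the post computes as indicator_len.
    signal_pos = bisect_right(dates, signal_date)
    start = max(signal_pos - back_len, 0)             # clamp at the first bar
    stop = min(signal_pos + forward_len, len(dates))  # clamp at the last bar
    return start, stop


trading_days = ['2020-01-02', '2020-01-03', '2020-01-06', '2020-01-07', '2020-01-08']
start, stop = slice_window(trading_days, '2020-01-06', back_len=2, forward_len=1)
print(trading_days[start:stop])
```

With a DataFrame like the one in the post, passing the date column as a list of strings and then taking `df.iloc[start:stop]` should select the same rows as the slicing inside the loops, provided the column is sorted by date.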

(Lipson) #3

Hi, could you share how you go about identifying similar candlesticks?
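
The thread does not say how the matches were found. One common way to approach the question, shown here purely as an assumption and not as the original poster's method, is to slide a window of the same length over a stock's closing prices and rank the windows by Pearson correlation with the target. Correlation ignores price level and scale, so an index near 3,700 can be compared with a stock near 10. The names `pearson` and `most_similar_windows` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def most_similar_windows(target_closes, history_closes, top_n=3):
    """Return the top_n (score, start_position) pairs, best match first."""
    width = len(target_closes)
    scores = []
    for start in range(len(history_closes) - width + 1):
        window = history_closes[start:start + width]
        scores.append((pearson(target_closes, window), start))
    scores.sort(reverse=True)  # highest correlation first
    return scores[:top_n]


target = [1, 2, 3, 2, 1]                        # toy "current" pattern
history = [5, 5, 5, 10, 20, 30, 20, 10, 7, 7]   # toy price history
best_score, best_start = most_similar_windows(target, history, top_n=1)[0]
print(best_start, round(best_score, 6))
```

Comparing closes alone ignores the open, high and low of each bar; a fuller version might correlate each of the four series and average the scores, or compare daily returns instead of prices.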