Introduction to NumPy, Part 2

In [18]:
import numpy as np
In [20]:
'''View the help documentation for a function'''
np.info(np.sum) 
 sum(*args, **kwargs)

Sum of array elements over a given axis.

Parameters
----------
a : array_like
    Elements to sum.
axis : None or int or tuple of ints, optional
    Axis or axes along which a sum is performed.  The default,
    axis=None, will sum all of the elements of the input array.  If
    axis is negative it counts from the last to the first axis.

    .. versionadded:: 1.7.0

    If axis is a tuple of ints, a sum is performed on all of the axes
    specified in the tuple instead of a single axis or all the axes as
    before.
dtype : dtype, optional
    The type of the returned array and of the accumulator in which the
    elements are summed.  The dtype of `a` is used by default unless `a`
    has an integer dtype of less precision than the default platform
    integer.  In that case, if `a` is signed then the platform integer
    is used while if `a` is unsigned then an unsigned integer of the
    same precision as the platform integer is used.
out : ndarray, optional
    Alternative output array in which to place the result. It must have
    the same shape as the expected output, but the type of the output
    values will be cast if necessary.
keepdims : bool, optional
    If this is set to True, the axes which are reduced are left
    in the result as dimensions with size one. With this option,
    the result will broadcast correctly against the input array.

    If the default value is passed, then `keepdims` will not be
    passed through to the `sum` method of sub-classes of
    `ndarray`, however any non-default value will be.  If the
    sub-class' method does not implement `keepdims` any
    exceptions will be raised.
initial : scalar, optional
    Starting value for the sum. See `~numpy.ufunc.reduce` for details.

    .. versionadded:: 1.15.0

where : array_like of bool, optional
    Elements to include in the sum. See `~numpy.ufunc.reduce` for details.

    .. versionadded:: 1.17.0

Returns
-------
sum_along_axis : ndarray
    An array with the same shape as `a`, with the specified
    axis removed.   If `a` is a 0-d array, or if `axis` is None, a scalar
    is returned.  If an output array is specified, a reference to
    `out` is returned.

See Also
--------
ndarray.sum : Equivalent method.

add.reduce : Equivalent functionality of `add`.

cumsum : Cumulative sum of array elements.

trapz : Integration of array values using the composite trapezoidal rule.

mean, average

Notes
-----
Arithmetic is modular when using integer types, and no error is
raised on overflow.

The sum of an empty array is the neutral element 0:

>>> np.sum([])
0.0

For floating point numbers the numerical precision of sum (and
``np.add.reduce``) is in general limited by directly adding each number
individually to the result causing rounding errors in every step.
However, often numpy will use a  numerically better approach (partial
pairwise summation) leading to improved precision in many use-cases.
This improved precision is always provided when no ``axis`` is given.
When ``axis`` is given, it will depend on which axis is summed.
Technically, to provide the best speed possible, the improved precision
is only used when the summation is along the fast axis in memory.
Note that the exact precision may vary depending on other parameters.
In contrast to NumPy, Python's ``math.fsum`` function uses a slower but
more precise approach to summation.
Especially when summing a large number of lower precision floating point
numbers, such as ``float32``, numerical errors can become significant.
In such cases it can be advisable to use `dtype="float64"` to use a higher
precision for the output.

Examples
--------
>>> np.sum([0.5, 1.5])
2.0
>>> np.sum([0.5, 0.7, 0.2, 1.5], dtype=np.int32)
1
>>> np.sum([[0, 1], [0, 5]])
6
>>> np.sum([[0, 1], [0, 5]], axis=0)
array([0, 6])
>>> np.sum([[0, 1], [0, 5]], axis=1)
array([1, 5])
>>> np.sum([[0, 1], [np.nan, 5]], where=[False, True], axis=1)
array([1., 5.])

If the accumulator is too small, overflow occurs:

>>> np.ones(128, dtype=np.int8).sum(dtype=np.int8)
-128

You can also start the sum with a value other than zero:

>>> np.sum([10], initial=5)
15
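
The `axis` and `keepdims` parameters described in the help text above can be illustrated with a minimal sketch. The array `m` and the variable names are made up for illustration only.

```python
import numpy as np

# Hypothetical 2x2 example array
m = np.array([[1.0, 2.0], [3.0, 4.0]])

row_sums = np.sum(m, axis=1)                      # shape (2,)
row_sums_kept = np.sum(m, axis=1, keepdims=True)  # shape (2, 1)

# With keepdims=True the reduced axis is kept with size one, so the
# result broadcasts against the input and each row can be normalised.
row_fractions = m / row_sums_kept
print(row_sums.shape, row_sums_kept.shape)
print(row_fractions.sum(axis=1))
```

Without `keepdims=True`, dividing `m` by `row_sums` would broadcast along the wrong axis here, which is why the option is often convenient for row-wise normalisation.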

2. Examples of common computations

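  • Compute sin and log10, remove duplicates, negate, take the mean of non-NaN elements, sort
  • Compute the covariance matrix and the Pearson correlation coefficient
  • Check for NaN and infinite values
  • Positions of the maximum and minimum values
  • Splitting with array_split
  • Time conversion and time-interval calculation
  • Cumulative sum and cumulative product (commonly used to compute a net-value curve)
  • Polynomial fitting and root finding
  • Other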
In [1]:
"""Compute sin and log10, remove duplicates, negate, take the mean of non-NaN elements, and sort"""
a = np.array([1,2,4.0,1.0])
b = np.array([3.2,2.4,5.0,10.1])
print(np.sin(a))  # sine
print(np.log10(b))  # base-10 logarithm
print(np.unique(a)) # unique values
print(np.negative(a)) # negation
print(np.nanmean(a)) # mean, ignoring NaN values
print(np.sort(a)) # ascending sort
[ 0.84147098  0.90929743 -0.7568025   0.84147098]
[0.50514998 0.38021124 0.69897    1.00432137]
[1. 2. 4.]
[-1. -2. -4. -1.]
2.0
[1. 1. 2. 4.]
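
The array in the cell above contains no NaN, so `np.nanmean` gives the same result there as `np.mean`. A small sketch with a made-up array that does contain a NaN shows where the two differ:

```python
import numpy as np

# Hypothetical array with one missing value
a = np.array([1.0, 2.0, np.nan, 4.0])

plain_mean = np.mean(a)     # NaN propagates through the mean
nan_mean = np.nanmean(a)    # NaN is skipped: (1 + 2 + 4) / 3
print(plain_mean, nan_mean)
```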
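In [22]:
"""Compute the covariance and the Pearson product-moment correlation coefficient of two arrays"""
a = np.array([1,2,4.0,1.0])
b = np.array([3.2,2.4,5.0,10.1])
print(np.cov(a,b)) # covariance matrix
print(np.corrcoef(a,b)) # Pearson product-moment correlation coefficient matrix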
[[ 2.     -1.1   ]
 [-1.1    11.9625]]
[[ 1.         -0.22488822]
 [-0.22488822  1.        ]]
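
The two matrices above are related: each correlation coefficient is the covariance divided by the product of the two standard deviations. The following sketch, reusing the same input values, rebuilds the correlation matrix from the covariance matrix as a check (the name `corr_manual` is illustrative):

```python
import numpy as np

a = np.array([1, 2, 4.0, 1.0])
b = np.array([3.2, 2.4, 5.0, 10.1])

cov = np.cov(a, b)                      # 2x2 covariance matrix
std = np.sqrt(np.diag(cov))             # standard deviations of a and b
corr_manual = cov / np.outer(std, std)  # normalise each entry by std_i * std_j

print(corr_manual)
print(np.allclose(corr_manual, np.corrcoef(a, b)))
```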
In [7]:
"""Compute the Pearson correlation coefficients among several arrays"""
a = np.array([1,2,3,4,5,6,7,8,9,10])
b = np.array([2,4,1,5,1,3,6,2,7,0])
c = np.array([0,3,2,1,4,7,1,9,6,2])
x = np.vstack((a,b,c))
print(np.corrcoef(x))
[[1.         0.10233683 0.47840854]
 [0.10233683 1.         0.0242104 ]
 [0.47840854 0.0242104  1.        ]]
In [24]:
"""Check for NaN and infinite values"""
print(np.isnan([0, np.nan]))
print(np.isinf([0, np.inf, 1, -np.inf]))  # np.inf is used because the np.Inf alias was removed in NumPy 2.0
[False  True]
[False  True False  True]
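
In practice these checks are often used to filter data. As a minimal sketch, assuming a made-up array `data`, `np.isfinite` combines both tests and can be used as a boolean mask:

```python
import numpy as np

# Hypothetical data containing NaN and infinities
data = np.array([1.0, np.nan, np.inf, -np.inf, 2.5])

mask = np.isfinite(data)   # True only where the value is neither NaN nor +/-inf
clean = data[mask]         # keep only the finite values
print(mask)
print(clean)
```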
In [25]:
'''Positions of the maximum and minimum values'''
print('Index of maximum',np.argmax([1,2,3]))
print('Index of minimum',np.argmin([1,2,3]))
Index of maximum 2
Index of minimum 0
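
For multi-dimensional input, `np.argmax` returns an index into the flattened array unless `axis` is given. A short sketch with a made-up 2-D array shows both behaviours, using `np.unravel_index` to recover the row and column:

```python
import numpy as np

# Hypothetical 2x3 array
m = np.array([[3, 9, 1],
              [7, 2, 8]])

flat_index = np.argmax(m)                          # index into the flattened array
row, col = np.unravel_index(flat_index, m.shape)   # convert to (row, column)
col_argmax = np.argmax(m, axis=0)                  # row index of the maximum in each column
print(flat_index, (row, col), col_argmax)
```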
In [21]:
'''Splitting with array_split'''
x = np.array([10,20,30,40,50,60,70,80])
print(np.array_split(x,2)) # 2 equal parts
print(np.array_split(x,4)) # 4 equal parts
print(np.array_split(x,[2,3,4,6])) # split at indices 2, 3, 4, 6: elements [1~2, 3, 4, 5~6, 7~8]
[array([10, 20, 30, 40]), array([50, 60, 70, 80])]
[array([10, 20]), array([30, 40]), array([50, 60]), array([70, 80])]
[array([10, 20]), array([30]), array([40]), array([50, 60]), array([70, 80])]
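
Both integer splits above happen to divide the array evenly. The sketch below shows what `np.array_split` does when the length is not divisible by the number of parts, and contrasts it with `np.split`, which raises an error in that case:

```python
import numpy as np

x = np.array([10, 20, 30, 40, 50, 60, 70, 80])

# 8 elements into 3 parts: the first parts get the extra elements
parts = np.array_split(x, 3)
sizes = [len(p) for p in parts]
print(parts)

# np.split requires an exact division and raises ValueError otherwise
try:
    np.split(x, 3)
    split_failed = False
except ValueError:
    split_failed = True
print(split_failed)
```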
In [22]:
'''Converting strings to datetimes'''
print(np.datetime64('2019-01-01')) # year-month-day
print(np.datetime64('2019-01-01T11:00:00')) # year-month-day hour:minute:second
print(np.datetime64('2019-01-01 11:00:00.500')) # year-month-day hour:minute:second.millisecond

'''Difference between two datetimes'''
time1 = np.datetime64('2019-01-01')
time2 = np.datetime64('2019-03-01')
print(int((time2 - time1)/np.timedelta64(1,'D'))) # number of days between two datetime64 values
print(int((time2 - time1)/np.timedelta64(1,'h'))) # number of hours between two datetime64 values
print(int((time2 - time1)/np.timedelta64(1,'m'))) # number of minutes between two datetime64 values
print(int((time2 - time1)/np.timedelta64(1,'s'))) # number of seconds between two datetime64 values
2019-01-01
2019-01-01T11:00:00
2019-01-01T11:00:00.500
59
1416
84960
5097600
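
Beyond differences, `datetime64` values can also be used to build date ranges and count weekdays. This is a minimal sketch with made-up dates; note that `np.busday_count` only excludes weekends by default and knows nothing about public holidays unless they are passed in explicitly.

```python
import numpy as np

start = np.datetime64('2019-01-01')
end = np.datetime64('2019-01-08')

# Daily date range; like np.arange on numbers, the end date is excluded
days = np.arange(start, end)

# Number of Monday-to-Friday days in [start, end)
business_days = np.busday_count(start, end)
print(days)
print(business_days)
```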
In [23]:
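'''Cumulative sum and cumulative product
   Cumulative product of (daily strategy returns + 1) = strategy net-value curve
   Strategy cumulative-return curve = strategy net-value curve - 1
'''
daily_returns = np.array([0, -0.1, 0.1, -0.1, 0.1])
net_value = np.cumprod(1 + daily_returns)
cumulative_returns = net_value - 1
print('Strategy net value:', net_value)
print('Strategy cumulative returns:', cumulative_returns)
Strategy net value: [1.     0.9    0.99   0.891  0.9801]
Strategy cumulative returns: [ 0.     -0.1    -0.01   -0.109  -0.0199]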
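
The cell above uses only the cumulative product. As a sketch of the cumulative sum, the same net-value curve can be obtained by summing log returns and exponentiating. The variable names here are illustrative, and the log-return route is an equivalent alternative rather than the method used above.

```python
import numpy as np

daily_returns = np.array([0, -0.1, 0.1, -0.1, 0.1])

# Plain cumulative sum of a sequence
running_total = np.cumsum([1, 2, 3, 4])

# Net value via cumulative product of (1 + r)
net_value = np.cumprod(1 + daily_returns)

# Same curve via cumulative sum of log returns: exp(sum(log(1 + r)))
net_value_from_logs = np.exp(np.cumsum(np.log1p(daily_returns)))

print(running_total)
print(np.allclose(net_value, net_value_from_logs))
```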
In [29]:
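'''Polynomial fitting'''
x = np.arange(0, 3)  # sample x values: 0, 1, 2
y = x**2 + x + 1  # function values at those points
z1 = np.polyfit(x, y, 2) # fit a degree-2 polynomial (the degree can be changed); returns coefficients from highest to lowest degree
p1 = np.poly1d(z1) # build a polynomial object from the coefficients
print(p1)  # display the polynomial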
   2
1 x + 1 x + 1
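
Once fitted, the polynomial can be evaluated at new points. A minimal sketch, repeating the fit above under the illustrative names `coeffs` and `p`:

```python
import numpy as np

x = np.arange(0, 3)
y = x**2 + x + 1

coeffs = np.polyfit(x, y, 2)   # approximately [1, 1, 1], highest degree first
p = np.poly1d(coeffs)          # callable polynomial object

value_at_3 = p(3)                        # evaluate by calling the object
same_value = np.polyval(coeffs, 3)       # or evaluate directly from the coefficients
print(value_at_3, same_value)
```

Because the fit is computed in floating point, the result is close to 13 but may not be exactly 13, so comparisons are best done with a tolerance.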
In [30]:
'''Solving a polynomial equation
   Find the roots of the polynomial equation 1 * x**2 - 2 * x**1 - 3 = 0
'''
np.roots([1,-2,-3]) 
Out[30]:
array([ 3., -1.])
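
A quick way to check the roots is to substitute them back into the polynomial, or to rebuild the coefficients from the roots. A small sketch (variable names are illustrative):

```python
import numpy as np

coeffs = [1, -2, -3]                    # x**2 - 2*x - 3
roots = np.roots(coeffs)

residuals = np.polyval(coeffs, roots)   # should be (numerically) zero at each root
rebuilt = np.poly(roots)                # coefficients of the monic polynomial with these roots
print(residuals)
print(rebuilt)
```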
In [31]:
'''np.where(condition, x1, x2): conditional fill'''
a = np.array([0, -0.1, 0.1, -0.1, 0.1])
np.where(a>0, 1,-1)
Out[31]:
array([-1, -1,  1, -1,  1])
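
Two related uses of `np.where` are sketched below with the same input array: calling it with only a condition returns the indices where the condition holds, and the fill values can themselves be arrays.

```python
import numpy as np

a = np.array([0, -0.1, 0.1, -0.1, 0.1])

# Condition only: returns a tuple of index arrays, one per dimension
positive_idx = np.where(a > 0)[0]

# Array-valued fill: keep positive entries, replace the rest with 0.0
clipped = np.where(a > 0, a, 0.0)
print(positive_idx)
print(clipped)
```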

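Other methods

  • np.gradient: compute the gradient
  • np.percentile: compute percentiles (q given as a percentage, 0 to 100)
  • np.quantile: compute quantiles (q given as a fraction, 0 to 1)
  • np.rad2deg(np.pi/2): convert radians to degrees
  • np.random.rand or np.random.randint: generate random numbers
  • np.hsplit, np.hstack: split and join arrays horizontally
  • np.vsplit, np.vstack: split and join arrays vertically
  • B = np.array([[1,-2,1],[0,2,-8],[-4,5,9]]); b = np.array([0,8,-9]); np.linalg.solve(B,b): solve a system of linear equations
  • np.save, np.load: write and read .npy files
  • np.savez, np.load: write and read .npz files
  • np.savetxt, np.loadtxt: write and read CSV (text) files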
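
A few of the functions listed above can be sketched together as follows. The sample arrays, variable names, and file names are made up for illustration, and the files are written to a temporary directory so that nothing is left behind.

```python
import os
import tempfile

import numpy as np

# Solve the linear system B @ x = b
B = np.array([[1, -2, 1], [0, 2, -8], [-4, 5, 9]])
b = np.array([0, 8, -9])
solution = np.linalg.solve(B, b)

# Percentile takes q in [0, 100]; quantile takes q in [0, 1]
data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
p25 = np.percentile(data, 25)
q25 = np.quantile(data, 0.25)

# Numerical gradient: one-sided at the ends, central differences inside
grad = np.gradient(np.array([1.0, 4.0, 9.0, 16.0]))

# Horizontal and vertical stacking
h = np.hstack(([1, 2], [3, 4]))
v = np.vstack(([1, 2], [3, 4]))

# Round-trip through .npy, .npz and CSV files
with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, "data.npy")
    np.save(npy_path, data)
    loaded_npy = np.load(npy_path)

    npz_path = os.path.join(tmp, "data.npz")
    np.savez(npz_path, data=data)
    with np.load(npz_path) as archive:
        loaded_npz = archive["data"]

    csv_path = os.path.join(tmp, "data.csv")
    np.savetxt(csv_path, data, delimiter=",")
    loaded_csv = np.loadtxt(csv_path, delimiter=",")

print(solution, p25, q25, grad)
```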