Python NumPy Library

Posted on 21-07-2020, updated on 02-02-2023, filed under Python

11.1 Why Use NumPy

11.1.1 The Inefficient Python for Loop

[Example] Compute the reciprocals of about one million numbers

def compute_reciprocals(values):
    res = []
    for value in values:      # for each element, Python must check its type and look up the function that fits that type
        res.append(1/value)
    return res


values = list(range(1, 1000000))
%timeit compute_reciprocals(values)

186 ms ± 33.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit: an IPython magic command that measures execution time (it runs the statement many times and reports the mean)

import numpy as np

values = np.arange(1, 1000000)
%timeit 1/values

5.48 ms ± 340 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

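For the same computation, NumPy is roughly 34 times faster than the Python loop in this measurement (186 ms vs. 5.48 ms), a qualitative leap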
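Outside IPython, where %timeit is not available, the same comparison can be sketched with the standard-library timeit module. The array size and the repeat count below are arbitrary choices for illustration, and the timings will differ from machine to machine.

```python
import timeit

import numpy as np


def compute_reciprocals(values):
    """Pure-Python loop: one division and one append per element."""
    res = []
    for value in values:
        res.append(1 / value)
    return res


n = 100_000  # smaller than the example above so the script finishes quickly
py_values = list(range(1, n))
np_values = np.arange(1, n)

# Total seconds for 5 runs of each version
t_loop = timeit.timeit(lambda: compute_reciprocals(py_values), number=5)
t_numpy = timeit.timeit(lambda: 1 / np_values, number=5)

print(f"loop:  {t_loop:.4f} s")
print(f"numpy: {t_numpy:.4f} s")

# Both approaches produce the same values
same = np.allclose(compute_reciprocals(py_values), 1 / np_values)
print("same result:", same)
```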

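11.1.2 Why NumPy Is So Efficient

NumPy's core is written in C

1. Compiled language vs. interpreted language

C code is compiled to machine code ahead of time, so it runs faster than Python code, which is interpreted at run time

2. Contiguous single-type storage vs. scattered mixed-type storage

(1) All elements of a NumPy array must share one data type (for example, all floats), whereas a Python list can hold elements of any type

(2) The data of a NumPy array is stored contiguously in memory, whereas the objects referenced by a Python list are scattered across memory

This layout is a better fit for efficient low-level processing, such as CPU caching and vectorized instructions

3. Multithreading vs. the thread lock

CPython has a global interpreter lock (GIL), so threads cannot run Python bytecode truly in parallel; C code is not bound by this restriction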

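11.1.3 When to Use NumPy

When data processing leads you to write a Python for loop for a vector or matrix operation, consider NumPy first

For example:

1. The dot product of two vectors

2. Matrix multiplication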
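As a small illustration of the first case, the sketch below computes one dot product twice: once with an explicit loop and once with np.dot. The helper name dot_loop is made up for this example.

```python
import numpy as np


def dot_loop(a, b):
    """Dot product with an explicit Python loop."""
    total = 0
    for x, y in zip(a, b):
        total += x * y
    return total


a = [1, 2, 3]
b = [4, 5, 6]

loop_result = dot_loop(a, b)                       # 1*4 + 2*5 + 3*6
numpy_result = np.dot(np.array(a), np.array(b))    # same computation, vectorized

print(loop_result, numpy_result)  # 32 32
```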

11.2 Creating NumPy Arrays

11.2.1 Creating from a List

import numpy as np

x = np.array([1, 2, 3, 4, 5])
print(x)

[1 2 3 4 5]

print(type(x))
print(x.shape)

<class 'numpy.ndarray'>
(5,)

Setting the array's data type

x = np.array([1, 2, 3, 4, 5], dtype="float32")
print(x)
print(type(x[0]))

[1. 2. 3. 4. 5.]
<class 'numpy.float32'>

Two-dimensional arrays

x = np.array([[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]])
print(x)
print(x.shape)

[[1 2 3]
 [4 5 6]
 [7 8 9]]
(3, 3)

11.2.2 Creating Arrays from Scratch

(1) Create an array of length 5 filled with zeros

np.zeros(5, dtype=int)

array([0, 0, 0, 0, 0])

(2) Create a 2x4 floating-point array filled with ones

np.ones((2, 4), dtype=float)

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.]])

(3) Create a 3x5 array filled with 8.8

np.full((3, 5), 8.8)

array([[8.8, 8.8, 8.8, 8.8, 8.8],
       [8.8, 8.8, 8.8, 8.8, 8.8],
       [8.8, 8.8, 8.8, 8.8, 8.8]])

(4) Create a 3x3 identity matrix

np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

(5) Create a linear sequence that starts at 1 and stops before 15, with a step of 2

np.arange(1, 15, 2)

array([ 1,  3,  5,  7,  9, 11, 13])

(6) Create an array of 4 elements evenly spaced from 0 to 1, both endpoints included

np.linspace(0, 1, 4)

array([0.        , 0.33333333, 0.66666667, 1.        ])

(7) Create an array of 10 elements forming a geometric sequence from 10^0 to 10^9

np.logspace(0, 9, 10)

array([1.e+00, 1.e+01, 1.e+02, 1.e+03, 1.e+04, 1.e+05, 1.e+06, 1.e+07,
       1.e+08, 1.e+09])

(8) Create a 3x3 array of random numbers uniformly distributed in [0, 1)

np.random.random((3,3))

array([[0.57664546, 0.02649211, 0.02776287],
       [0.13697427, 0.26319293, 0.18207141],
       [0.41400812, 0.03992104, 0.6775525 ]])

(9) Create a 3x3 array of normally distributed random numbers with mean 0 and standard deviation 1

np.random.normal(0, 1, (3,3))

array([[ 1.58543187, -0.01831794,  0.86629919],
       [ 0.7911387 , -0.71228317,  1.49750223],
       [-0.45986603, -0.53562974, -0.72991252]])

(10) Create a 3x3 array of random integers in [0, 10)

np.random.randint(0, 10, (3,3))

array([[5, 1, 8],
       [8, 8, 2],
       [2, 7, 0]])

(11) Random permutation

x = np.array([10, 20, 30, 40])
np.random.permutation(x)       # returns a new array

array([40, 20, 10, 30])

print(x)
np.random.shuffle(x)          # shuffles the original array in place
print(x)

[10 20 30 40]
[30 20 10 40]

(12) Random sampling

Sampling with a given shape

x = np.arange(10, 25, dtype=float)
x

array([10., 11., 12., 13., 14., 15., 16., 17., 18., 19., 20., 21., 22.,
       23., 24.])

np.random.choice(x, size=(4, 3))

array([[24., 18., 11.],
       [24., 11., 22.],
       [11., 16., 23.],
       [16., 20., 19.]])

import numpy as np
np.random.choice(10, 10)

array([6, 5, 0, 4, 2, 7, 2, 6, 6, 3])

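choice(a, size=None, replace=True, p=None)

a is a 1-D array-like or an int
size is the shape of the output; it can be an int (1-D result) or a tuple (for example, a 2-D result)
replace=True samples with replacement, so the same element may be drawn more than once
replace=False samples without replacement, so no element is drawn twice
p gives the probability of drawing each element of a; if omitted, all elements are equally likely
If a is an int, the samples are drawn from np.arange(a); for example, a=5 draws from [0, 1, 2, 3, 4]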
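The effect of replace and p can be checked with a short sketch. The seed, sizes, and probabilities below are arbitrary values chosen for illustration.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the sketch is repeatable

# replace=False: every drawn element is distinct
no_repeat = np.random.choice(10, size=5, replace=False)
print(no_repeat)
print(len(set(no_repeat.tolist())) == 5)   # True

# p: an element with probability 0 is never drawn
weighted = np.random.choice([1, 2, 3], size=20, p=[0.0, 0.5, 0.5])
print(np.any(weighted == 1))               # False
```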

x = np.arange(5).reshape(1, 5)
x.sum(axis=1, keepdims=True)

array([[0, 1, 2, 3, 4]])






array([[10]])

Sampling by probability

x = x.flatten()
np.random.choice(x, size=(4, 3), p=x/np.sum(x))

array([[4, 4, 4],
       [2, 3, 4],
       [2, 1, 1],
       [3, 4, 2]])

11.3 Properties of NumPy Arrays

11.3.1 Array Attributes

x = np.random.randint(10, size=(3, 4))
x

array([[1, 6, 8, 9],
       [7, 4, 2, 7],
       [5, 9, 2, 1]])

1. Array shape: shape

x.shape

(3, 4)

2. Number of dimensions: ndim

x.ndim

2

y = np.arange(10)
y

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

y.ndim

1

3. Number of elements: size

x.size

12

4. Data type: dtype

x.dtype

dtype('int32')

11.3.2 Array Indexing

1. Indexing one-dimensional arrays

x1 = np.arange(10)
x1

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

x1[0]

0

x1[5]

5

x1[-1]

9

2. Indexing multidimensional arrays, using a two-dimensional array as the example

x2 = np.random.randint(0, 20, (2,3))
x2

array([[ 9,  7, 14],
       [ 2,  8,  3]])

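x2[0, 0]

9

x2[0][0]

9

Note: a NumPy array has a fixed data type; when a floating-point value is assigned into an integer array, its fractional part is truncated (rounded toward zero)

x2[0, 0] = 1.618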

x2

array([[ 1,  7, 14],
       [ 2,  8,  3]])
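A negative value shows that the conversion truncates toward zero rather than rounding down. This is a minimal sketch with made-up values; converting with astype(float) first is one way to keep the fractional part.

```python
import numpy as np

a = np.array([5, 5, 5])   # integer dtype
a[0] = 1.618              # stored as 1
a[1] = -1.618             # stored as -1 (rounding down would give -2)
print(a)                  # [ 1 -1  5]

# To keep the fractional part, convert to a floating-point dtype first
b = a.astype(float)
b[2] = 1.618
print(b)
```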

11.3.3 Array Slicing

1. One-dimensional arrays: the same as lists

x1 = np.arange(10)
x1

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

x1[:3]

array([0, 1, 2])

x1[3:]

array([3, 4, 5, 6, 7, 8, 9])

x1[::-1]

array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])

2. Multidimensional arrays, using a two-dimensional array as the example

x2 = np.random.randint(20, size=(3,4))
x2

array([[15, 11, 12, 18],
       [ 8, 12, 10,  8],
       [19, 13, 19, 13]])

x2[:2, :3]             # first two rows, first three columns

array([[15, 11, 12],
       [ 8, 12, 10]])

x2[:2, 0:3:2]          # first two rows; every other column among the first three

array([[15, 12],
       [ 8, 10]])

x2[::-1, ::-1]

array([[13, 19, 13, 19],
       [ 8, 10, 12,  8],
       [18, 12, 11, 15]])

3. Getting rows and columns of an array

x3 = np.random.randint(20, size=(3,4))
x3

array([[18, 14, 16, 18],
       [11, 14, 13, 15],
       [ 9,  8,  9, 12]])

x3[1, :]   # row 1 (counting from 0)

array([11, 14, 13, 15])

x3[1]        # shorthand for row 1

array([11, 14, 13, 15])

x3[:, 2]  # column 2 (counting from 0)

array([16, 13,  9])

4. Slices are views, not copies

x4 = np.random.randint(20, size=(3,4))
x4

array([[ 9, 18,  2,  3],
       [ 8, 10,  2, 14],
       [ 0,  6, 18, 14]])

x5 = x4[:2, :2]
x5

array([[ 9, 18],
       [ 8, 10]])

Note: modifying an element of the view also modifies the original array

x5[0, 0] = 0
x5

array([[ 0, 18],
       [ 8, 10]])

x4

array([[ 0, 18,  2,  3],
       [ 8, 10,  2, 14],
       [ 0,  6, 18, 14]])

The safe way to modify a slice: copy

x4 = np.random.randint(20, size=(3,4))
x4

array([[15, 15,  4, 10],
       [ 3, 15, 12, 15],
       [ 5,  5, 16,  8]])

x6 = x4[:2, :2].copy()
x6

array([[15, 15],
       [ 3, 15]])

x6[0, 0] = 0
x6

array([[ 0, 15],
       [ 3, 15]])

x4

array([[15, 15,  4, 10],
       [ 3, 15, 12, 15],
       [ 5,  5, 16,  8]])
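Whether two arrays share data can also be checked directly with np.shares_memory instead of by modifying one and inspecting the other. The array and values below are made up for illustration.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

view = a[:2, :2]          # basic slice: shares a's data
copy = a[:2, :2].copy()   # independent data

print(np.shares_memory(a, view))   # True
print(np.shares_memory(a, copy))   # False

view[0, 0] = 100          # visible through a
copy[0, 1] = -1           # not visible through a
print(a[0, 0], a[0, 1])   # 100 1
```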

11.3.4 Reshaping Arrays

x5 = np.random.randint(0, 10, 12)
x5

array([8, 7, 0, 9, 6, 0, 6, 2, 7, 4, 5, 3])

numpy.random.randint(low, high=None, size=None, dtype=int)

x5.shape

(12,)

x6 = x5.reshape(3, 4)
x6

array([[8, 7, 0, 9],
       [6, 0, 6, 2],
       [7, 4, 5, 3]])

Note: reshape returns a view whenever possible (as it does here), not a copy

x6[0, 0] = 0
x5

array([0, 7, 0, 9, 6, 0, 6, 2, 7, 4, 5, 3])

Converting a 1-D vector to a row vector

x7 = x5.reshape(1, x5.shape[0])
x7

array([[0, 7, 0, 9, 6, 0, 6, 2, 7, 4, 5, 3]])

x8 = x5[np.newaxis, :]
x8

array([[0, 7, 0, 9, 6, 0, 6, 2, 7, 4, 5, 3]])

x8 = x5[np.newaxis, :]
x8
x8 = x8[np.newaxis, :]
x8
x8 = x8[np.newaxis, :]
x8
x8.T
# each use of np.newaxis adds one more axis

array([[0, 7, 0, 9, 6, 0, 6, 2, 7, 4, 5, 3]])


array([[[0, 7, 0, 9, 6, 0, 6, 2, 7, 4, 5, 3]]])


array([[[[0, 7, 0, 9, 6, 0, 6, 2, 7, 4, 5, 3]]]])


array([[[[0]]],

       [[[7]]],

       [[[0]]],

       [[[9]]],

       [[[6]]],

       [[[0]]],

       [[[6]]],

       [[[2]]],

       [[[7]]],

       [[[4]]],

       [[[5]]],

       [[[3]]]])

Converting a 1-D vector to a column vector

x7 = x5.reshape(x5.shape[0], 1)
x7

array([[0],
       [7],
       [0],
       [9],
       [6],
       [0],
       [6],
       [2],
       [7],
       [4],
       [5],
       [3]])

x8 = x5[:, np.newaxis]
x8

array([[0],
       [7],
       [0],
       [9],
       [6],
       [0],
       [6],
       [2],
       [7],
       [4],
       [5],
       [3]])

Converting a multidimensional array to a 1-D vector

x6 = np.random.randint(0, 10, (3, 4))
x6

array([[9, 8, 1, 4],
       [5, 2, 4, 9],
       [0, 5, 2, 0]])

flatten returns a copy

x9 = x6.flatten()
x9

array([9, 8, 1, 4, 5, 2, 4, 9, 0, 5, 2, 0])

x9[0] = 0
x6

array([[9, 8, 1, 4],
       [5, 2, 4, 9],
       [0, 5, 2, 0]])

ravel returns a view (whenever possible)

x10 = x6.ravel()
x10

array([9, 8, 1, 4, 5, 2, 4, 9, 0, 5, 2, 0])

x10[0] = 0
x6

array([[0, 8, 1, 4],
       [5, 2, 4, 9],
       [0, 5, 2, 0]])

reshape returns a view (whenever possible)

x11 = x6.reshape(-1)
x11

array([0, 8, 1, 4, 5, 2, 4, 9, 0, 5, 2, 0])

x11[0] = 10
x6

array([[10,  8,  1,  4],
       [ 5,  2,  4,  9],
       [ 0,  5,  2,  0]])
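reshape and ravel cannot always return a view. A minimal sketch, using a transposed array as an assumed example of non-contiguous memory, shows a case where reshape has to copy.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
t = a.T                      # transposed view; its memory is not C-contiguous

flat_view = a.reshape(-1)    # contiguous input: a view
flat_copy = t.reshape(-1)    # non-contiguous input: NumPy has to copy

print(np.shares_memory(a, flat_view))     # True
print(np.shares_memory(a, flat_copy))     # False
print(np.shares_memory(a, a.flatten()))   # False, flatten always copies
```

When code depends on the result being independent of the original, calling .copy() explicitly avoids relying on which case applies.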

11.3.5 Concatenating Arrays

x1 = np.array([[1, 2, 3],
              [4, 5, 6]])
x2 = np.array([[7, 8, 9],
              [0, 1, 2]])

1. Horizontal concatenation (the result is not a view)

x3 = np.hstack([x1, x2])
x3

array([[1, 2, 3, 7, 8, 9],
       [4, 5, 6, 0, 1, 2]])

x3[0][0] = 0
x1

array([[1, 2, 3],
       [4, 5, 6]])

x4 = np.c_[x1, x2]
x4

array([[1, 2, 3, 7, 8, 9],
       [4, 5, 6, 0, 1, 2]])

x4[0][0] = 0
x1

array([[1, 2, 3],
       [4, 5, 6]])

2. Vertical concatenation (the result is not a view)

x1 = np.array([[1, 2, 3],
              [4, 5, 6]])
x2 = np.array([[7, 8, 9],
              [0, 1, 2]])

x5 = np.vstack([x1, x2])
x5

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9],
       [0, 1, 2]])

x6 = np.r_[x1, x2]
x6

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9],
       [0, 1, 2]])
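Both directions can also be written with np.concatenate and an explicit axis. The sketch below reuses the two arrays from this section and checks that the results agree with hstack and vstack.

```python
import numpy as np

x1 = np.array([[1, 2, 3],
               [4, 5, 6]])
x2 = np.array([[7, 8, 9],
               [0, 1, 2]])

# axis=1 joins along columns (like hstack); axis=0 joins along rows (like vstack)
h = np.concatenate([x1, x2], axis=1)
v = np.concatenate([x1, x2], axis=0)

print(np.array_equal(h, np.hstack([x1, x2])))   # True
print(np.array_equal(v, np.vstack([x1, x2])))   # True
```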

11.3.6 Splitting Arrays

1. Using split

x6 = np.arange(10)
x6

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

x1, x2, x3 = np.split(x6, [2, 7])
print(x1, x2, x3)

[0 1] [2 3 4 5 6] [7 8 9]

2. Using hsplit

x7 = np.arange(1, 26).reshape(5, 5)
x7

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25]])

left, middle, right = np.hsplit(x7, [2,4])
print("left:\n", left)            # columns 0 to 1
print("middle:\n", middle)        # columns 2 to 3
print("right:\n", right)          # column 4

left:
 [[ 1  2]
 [ 6  7]
 [11 12]
 [16 17]
 [21 22]]
middle:
 [[ 3  4]
 [ 8  9]
 [13 14]
 [18 19]
 [23 24]]
right:
 [[ 5]
 [10]
 [15]
 [20]
 [25]]

3. Using vsplit

x7 = np.arange(1, 26).reshape(5, 5)
x7

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25]])

upper, middle, lower = np.vsplit(x7, [2,4])
print("upper:\n", upper)         # rows 0 to 1
print("middle:\n", middle)       # rows 2 to 3
print("lower:\n", lower)         # row 4

upper:
 [[ 1  2  3  4  5]
 [ 6  7  8  9 10]]
middle:
 [[11 12 13 14 15]
 [16 17 18 19 20]]
lower:
 [[21 22 23 24 25]]

upper, middle1, middle2, lower = np.vsplit(x7, [1,3,4])
print("upper:\n", upper)         # row 0
print("middle1:\n", middle1)       # rows 1 to 2
print("middle2:\n", middle2)       # row 3
print("lower:\n", lower)         # row 4

upper:
 [[1 2 3 4 5]]
middle1:
 [[ 6  7  8  9 10]
 [11 12 13 14 15]]
middle2:
 [[16 17 18 19 20]]
lower:
 [[21 22 23 24 25]]
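The examples above pass a list of split positions. When an integer number of parts is passed instead, np.split requires an even division; np.array_split is a related function that tolerates uneven parts. A minimal sketch with an arbitrary 10-element array:

```python
import numpy as np

x = np.arange(10)

# np.split(x, 3) raises ValueError: 10 elements cannot be divided into 3 equal parts
try:
    np.split(x, 3)
    even_split_ok = True
except ValueError:
    even_split_ok = False
print(even_split_ok)           # False

parts = np.array_split(x, 3)   # part sizes 4, 3, 3
for p in parts:
    print(p)
```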

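11.4 The Four Major NumPy Operations

11.4.1 Vectorized Operations

1. Arithmetic with a scalar (addition, subtraction, multiplication, division, and so on)

x1 = np.arange(1,6)
x1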

array([1, 2, 3, 4, 5])

print("x1+5", x1+5)
print("x1-5", x1-5)
print("x1*5", x1*5)
print("x1/5", x1/5)

x1+5 [ 6  7  8  9 10]
x1-5 [-4 -3 -2 -1  0]
x1*5 [ 5 10 15 20 25]
x1/5 [0.2 0.4 0.6 0.8 1. ]

print("-x1", -x1)
print("x1**2", x1**2)
print("x1//2", x1//2)
print("x1%2", x1%2)

-x1 [-1 -2 -3 -4 -5]
x1**2 [ 1  4  9 16 25]
x1//2 [0 1 1 2 2]
x1%2 [1 0 1 0 1]

2. Absolute value, trigonometric functions, exponentials, logarithms

(1) Absolute value

x2 = np.array([1, -1, 2, -2, 0])
x2

array([ 1, -1,  2, -2,  0])

abs(x2)

array([1, 1, 2, 2, 0])

np.abs(x2)

array([1, 1, 2, 2, 0])

(2) Trigonometric functions

theta = np.linspace(0, np.pi, 3)
theta

array([0.        , 1.57079633, 3.14159265])

print("sin(theta)", np.sin(theta))
print("cos(theta)", np.cos(theta))
print("tan(theta)", np.tan(theta))

sin(theta) [0.0000000e+00 1.0000000e+00 1.2246468e-16]
cos(theta) [ 1.000000e+00  6.123234e-17 -1.000000e+00]
tan(theta) [ 0.00000000e+00  1.63312394e+16 -1.22464680e-16]

x = [1, 0, -1]
print("arcsin(x)", np.arcsin(x))
print("arccos(x)", np.arccos(x))
print("arctan(x)", np.arctan(x))

arcsin(x) [ 1.57079633  0.         -1.57079633]
arccos(x) [0.         1.57079633 3.14159265]
arctan(x) [ 0.78539816  0.         -0.78539816]

(3) Exponentials

x = np.arange(3)
x

array([0, 1, 2])

np.exp(x)

array([1.        , 2.71828183, 7.3890561 ])

(4) Logarithms

x = np.array([1, 2, 4, 8, 10])
print("ln(x)", np.log(x))
print("log2(x)", np.log2(x))
print("log10(x)", np.log10(x))

ln(x) [0.         0.69314718 1.38629436 2.07944154 2.30258509]
log2(x) [0.         1.         2.         3.         3.32192809]
log10(x) [0.         0.30103    0.60205999 0.90308999 1.        ]

3. Operations between two arrays

x1 = np.arange(1,6)
x1

array([1, 2, 3, 4, 5])

x2 = np.arange(6,11)
x2

array([ 6,  7,  8,  9, 10])

print("x1+x2:", x1+x2)
print("x1-x2:", x1-x2)
print("x1*x2:", x1*x2)
print("x1/x2:", x1/x2)

x1+x2: [ 7  9 11 13 15]
x1-x2: [-5 -5 -5 -5 -5]
x1*x2: [ 6 14 24 36 50]
x1/x2: [0.16666667 0.28571429 0.375      0.44444444 0.5       ]

11.4.2 Matrix Operations

x = np.arange(9).reshape(3, 3)
x

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

Matrix transpose

y = x.T
y

array([[0, 3, 6],
       [1, 4, 7],
       [2, 5, 8]])

Matrix multiplication

x = np.array([[1, 0],
             [1, 1]])
y = np.array([[0, 1],
             [1, 1]])

x.dot(y)

array([[0, 1],
       [1, 2]])

np.dot(x, y)

array([[0, 1],
       [1, 2]])

y.dot(x)

array([[1, 1],
       [2, 1]])

np.dot(y, x)

array([[1, 1],
       [2, 1]])

Note how this differs from x*y

x*y     # element-wise multiplication

array([[0, 0],
       [1, 1]])
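The @ operator and np.matmul give the same matrix product as dot for 2-D arrays. The sketch below reuses the two matrices from this section and also contrasts the result with the element-wise product.

```python
import numpy as np

x = np.array([[1, 0],
              [1, 1]])
y = np.array([[0, 1],
              [1, 1]])

print(x @ y)                                   # matrix product, same as x.dot(y)
print(np.array_equal(x @ y, x.dot(y)))         # True
print(np.array_equal(x @ y, np.matmul(x, y)))  # True
print(np.array_equal(x @ y, x * y))            # False: x * y is element-wise
```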

11.4.3 Broadcasting

x = np.arange(3).reshape(1, 3)
x

array([[0, 1, 2]])

x+5

array([[5, 6, 7]])

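Rule

If the shapes of two arrays do not match along some dimension,

the array whose size along that dimension is 1 is stretched along it to match the shape of the other array. Dimensions whose sizes differ and are both larger than 1 cannot be broadcast.

x1 = np.ones((3,3))
x1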

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

x2 = np.arange(3).reshape(1, 3)
x2

array([[0, 1, 2]])

x1+x2

array([[1., 2., 3.],
       [1., 2., 3.],
       [1., 2., 3.]])

x3 = np.logspace(1, 10, 10, base=2).reshape(2, 5)
x3

array([[   2.,    4.,    8.,   16.,   32.],
       [  64.,  128.,  256.,  512., 1024.]])

x4 = np.array([[1, 2, 4, 8, 16]])
x4

array([[ 1,  2,  4,  8, 16]])

x3/x4

array([[ 2.,  2.,  2.,  2.,  2.],
       [64., 64., 64., 64., 64.]])

x5 = np.arange(3).reshape(3, 1)
x5

array([[0],
       [1],
       [2]])

x6 = np.arange(3).reshape(1, 3)
x6

array([[0, 1, 2]])

x5+x6

array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])
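The broadcast shape can be inspected without doing any arithmetic, and incompatible shapes raise a ValueError. The shapes below are arbitrary examples chosen to show one compatible and one incompatible pair.

```python
import numpy as np

a = np.ones((3, 1))
b = np.arange(4).reshape(1, 4)

# (3, 1) and (1, 4): each size-1 dimension is stretched, giving (3, 4)
print(np.broadcast(a, b).shape)   # (3, 4)
print((a + b).shape)              # (3, 4)

# (3, 2) and (3,): the trailing dimensions are 2 and 3, and neither is 1
try:
    np.ones((3, 2)) + np.arange(3)
    compatible = True
except ValueError:
    compatible = False
print(compatible)                 # False
```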

11.4.4 Comparison Operations and Masks

1. Comparison operations

x1 = np.random.randint(100, size=(10,10))
x1

array([[64, 24, 50, 91, 67, 48,  9, 79, 49,  6],
       [32, 42, 76, 59, 90, 99, 14, 77, 72, 24],
       [16, 21, 56, 70, 26,  4, 71, 26, 30, 23],
       [38, 73, 89,  3, 84, 22, 11, 56, 73, 87],
       [43, 24, 93, 91, 44, 22, 82, 57, 35, 57],
       [89, 38, 39, 16, 70, 23, 91, 29, 34, 60],
       [40, 13, 62, 98, 91, 26,  0, 68, 17, 98],
       [94, 74,  8, 47, 97, 45, 97, 97, 24, 46],
       [19, 41,  4,  2, 76,  3, 47, 76, 99, 56],
       [71, 17, 16, 71, 76, 40, 97, 60, 11, 58]])

x1 > 50

array([[ True, False, False,  True,  True, False, False,  True, False,
        False],
       [False, False,  True,  True,  True,  True, False,  True,  True,
        False],
       [False, False,  True,  True, False, False,  True, False, False,
        False],
       [False,  True,  True, False,  True, False, False,  True,  True,
         True],
       [False, False,  True,  True, False, False,  True,  True, False,
         True],
       [ True, False, False, False,  True, False,  True, False, False,
         True],
       [False, False,  True,  True,  True, False, False,  True, False,
         True],
       [ True,  True, False, False,  True, False,  True,  True, False,
        False],
       [False, False, False, False,  True, False, False,  True,  True,
         True],
       [ True, False, False,  True,  True, False,  True,  True, False,
         True]])

2. Working with boolean arrays

x2 = np.random.randint(10, size=(3, 4))
x2

array([[5, 7, 2, 8],
       [7, 3, 3, 9],
       [9, 0, 9, 1]])

print(x2 > 5)
np.sum(x2 > 5)

[[False  True False  True]
 [ True False False  True]
 [ True False  True False]]





6

np.all(x2 > 0)

False

np.any(x2 == 6)

False

np.all(x2 < 9, axis=1)   # check each row

array([ True, False, False])

x2

array([[5, 7, 2, 8],
       [7, 3, 3, 9],
       [9, 0, 9, 1]])

(x2 < 9) & (x2 > 5)

array([[False,  True, False,  True],
       [ True, False, False, False],
       [False, False, False, False]])

np.sum((x2 < 9) & (x2 > 5))

3

3. Using a boolean array as a mask

x2

array([[5, 7, 2, 8],
       [7, 3, 3, 9],
       [9, 0, 9, 1]])

x2 > 5

array([[False,  True, False,  True],
       [ True, False, False,  True],
       [ True, False,  True, False]])

x2[x2 > 5]

array([7, 8, 7, 9, 9, 9])
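A mask can also be used on the left side of an assignment, and np.where builds a new array from a mask. The sketch below reuses the x2 values shown above; the replacement value 5 and the 1/0 labels are arbitrary choices for illustration.

```python
import numpy as np

x2 = np.array([[5, 7, 2, 8],
               [7, 3, 3, 9],
               [9, 0, 9, 1]])

mask = x2 > 5

# Assigning through a mask changes only the selected elements
clipped = x2.copy()
clipped[mask] = 5
print(clipped)

# np.where picks from two alternatives element by element
labels = np.where(mask, 1, 0)
print(labels)
```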

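11.4.5 Fancy Indexing

1. One-dimensional arrays

x = np.random.randint(100, size=10)
x

array([75, 41,  0, 77, 78, 82,  2, 79, 93,  7])

Note: the result has the same shape as the index array ind, and each element is the element of x at the corresponding index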

ind = [2, 6, 9]
x[ind]
ind = [6, 6, 2]
x[ind]

array([0, 2, 7])






array([2, 2, 0])


ind = np.array([[1, 0],
               [2, 3]])
x[ind]

array([[41, 75],
       [ 0, 77]])

2. Multidimensional arrays

x = np.arange(12).reshape(3, 4)
x

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

row = np.array([0, 1, 2])
x[row,:] 
row = np.array([1, 1, 0])
x[row,:]

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])






array([[4, 5, 6, 7],
       [4, 5, 6, 7],
       [0, 1, 2, 3]])

col = np.array([1, 3, 0])
x[:, col] 
col = np.array([0, 0, 0])
x[:, col]

array([[ 1,  3,  0],
       [ 5,  7,  4],
       [ 9, 11,  8]])






array([[0, 0, 0],
       [4, 4, 4],
       [8, 8, 8]])

row = np.array([0, 1, 2])
col = np.array([1, 3, 0])
x[row, col]               # selects x[0, 1], x[1, 3], x[2, 0]

array([1, 7, 8])

row[:, np.newaxis]       # column vector

array([[0],
       [1],
       [2]])

x[row[:, np.newaxis], col]    # broadcasting

array([[ 1,  3,  0],
       [ 5,  7,  4],
       [ 9, 11,  8]])
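np.ix_ is a helper that builds the broadcastable index arrays for this kind of row-by-column selection. A minimal sketch, reusing the row and col arrays from above:

```python
import numpy as np

x = np.arange(12).reshape(3, 4)
row = np.array([0, 1, 2])
col = np.array([1, 3, 0])

# np.ix_ returns index arrays of shapes (3, 1) and (1, 3) that broadcast against each other
block = x[np.ix_(row, col)]
print(block)
print(np.array_equal(block, x[row[:, np.newaxis], col]))   # True
```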

11.5 Other Common NumPy Functions

11.5.1 Sorting

x = np.random.randint(20, 50, size=10)
x

array([39, 21, 45, 46, 27, 32, 42, 25, 44, 32])

Produce a new sorted array

np.sort(x)

array([21, 25, 27, 32, 32, 39, 42, 44, 45, 46])

x

array([39, 21, 45, 46, 27, 32, 42, 25, 44, 32])

Sort the original array in place

x.sort()
x

array([21, 25, 27, 32, 32, 39, 42, 44, 45, 46])

Get the indices that would sort the array

x = np.random.randint(20, 50, size=10)
x

array([32, 43, 44, 25, 21, 40, 33, 45, 23, 21])

i = np.argsort(x)
i

array([4, 9, 8, 3, 0, 6, 5, 1, 2, 7], dtype=int64)
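The index array returned by argsort can be fed back into the array as a fancy index. The sketch below reuses the values shown above and also picks out the three largest elements; the choice of three is arbitrary.

```python
import numpy as np

x = np.array([32, 43, 44, 25, 21, 40, 33, 45, 23, 21])

i = np.argsort(x)
print(x[i])                                  # the sorted values
print(np.array_equal(x[i], np.sort(x)))      # True

# Indices of the three largest values, largest first
top3 = np.argsort(x)[::-1][:3]
print(x[top3])                               # [45 44 43]
```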

11.5.2 Maximum and Minimum

x = np.random.randint(20, 50, size=10)
x

array([26, 44, 27, 20, 33, 20, 24, 47, 41, 48])

print("max:", np.max(x))
print("min:", np.min(x))

max: 48
min: 20

print("max_index:", np.argmax(x))
print("min_index:", np.argmin(x))

max_index: 9
min_index: 3

11.5.3 Sum and Product

x = np.arange(1,6)
x

array([1, 2, 3, 4, 5])

x.sum()

15

np.sum(x)

15

x1 = np.arange(6).reshape(2,3)
x1

array([[0, 1, 2],
       [3, 4, 5]])

Sum along each row

np.sum(x1, axis=1)

array([ 3, 12])

Sum along each column

np.sum(x1, axis=0)

array([3, 5, 7])
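The axis argument names the dimension that is collapsed by the sum. Adding keepdims=True keeps that dimension with size 1, which lets the result broadcast against the original array. A minimal sketch that divides each row by its own sum:

```python
import numpy as np

x1 = np.arange(6).reshape(2, 3)

row_sums = np.sum(x1, axis=1)                       # shape (2,)
row_sums_2d = np.sum(x1, axis=1, keepdims=True)     # shape (2, 1)
print(row_sums.shape, row_sums_2d.shape)

# The (2, 1) result broadcasts against the (2, 3) input,
# so each row can be divided by its own sum
fractions = x1 / row_sums_2d
print(fractions.sum(axis=1))    # each row now sums to 1 (up to rounding)
```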

Sum of all elements

np.sum(x1)

15

Product

x

array([1, 2, 3, 4, 5])

x.prod()

120

np.prod(x)

120

11.5.4 Median, Mean, Variance, Standard Deviation

x = np.random.normal(0, 1, size=10000)

import matplotlib.pyplot as plt

plt.hist(x, bins=50)
plt.show()

(array([  2.,   0.,   0.,   0.,   3.,   1.,   9.,   8.,  10.,  15.,  37.,
         62.,  53.,  77., 124., 169., 209., 308., 354., 460., 488., 532.,
        612., 583., 655., 653., 638., 588., 553., 491., 481., 387., 347.,
        269., 235., 182., 129.,  90.,  64.,  40.,  32.,  20.,   7.,   7.,
          6.,   0.,   6.,   1.,   1.,   2.]),
 array([-4.13996285, -3.97711225, -3.81426166, -3.65141106, -3.48856047,
        -3.32570987, -3.16285928, -3.00000868, -2.83715809, -2.6743075 ,
        -2.5114569 , -2.34860631, -2.18575571, -2.02290512, -1.86005452,
        -1.69720393, -1.53435333, -1.37150274, -1.20865214, -1.04580155,
        -0.88295095, -0.72010036, -0.55724976, -0.39439917, -0.23154858,
        -0.06869798,  0.09415261,  0.25700321,  0.4198538 ,  0.5827044 ,
         0.74555499,  0.90840559,  1.07125618,  1.23410678,  1.39695737,
         1.55980797,  1.72265856,  1.88550916,  2.04835975,  2.21121035,
         2.37406094,  2.53691153,  2.69976213,  2.86261272,  3.02546332,
         3.18831391,  3.35116451,  3.5140151 ,  3.6768657 ,  3.83971629,
         4.00256689]),
 <a list of 50 Patch objects>)




<Figure size 640x480 with 1 Axes>

Median

np.median(x)

-0.010584626095505473

Mean

x.mean()

-0.0018366166547129652

np.mean(x)

-0.0018366166547129652

Variance

x.var()

0.984963218594082

np.var(x)

0.984963218594082

Standard deviation

x.std()

0.9924531316863694

np.std(x)

0.9924531316863694
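By default, var and std divide by the number of elements n (the population formulas). Passing ddof=1 divides by n - 1 instead, which gives the sample variance. The small data set below is made up so that the result is easy to check by hand.

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = x.size

pop_var = np.var(x)             # divides by n (the default, ddof=0)
sample_var = np.var(x, ddof=1)  # divides by n - 1

print(pop_var)                                         # 4.0
print(np.isclose(sample_var, pop_var * n / (n - 1)))   # True
print(np.isclose(np.std(x), np.sqrt(pop_var)))         # True
```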