# Data Mining: Learning NumPy

### What is NumPy

NumPy is an open-source numerical computing extension for Python. It can store and process large matrices (data of any dimension) far more efficiently than Python's own nested list structure, which can also be used to represent a matrix.

NumPy provides an N-dimensional array type, the ndarray, which describes a collection of "items" of the same type.

```python
import numpy as np
import random
import time

# Build a large list of random floats
python_list = []

for i in range(100000000):
    python_list.append(random.random())

ndarray_list = np.array(python_list)
len(ndarray_list)

# Sum the native Python list
t1 = time.time()
a = sum(python_list)
t2 = time.time()
d1 = t2 - t1
print(d1)  # 0.7309620380401611

# Sum the ndarray
t3 = time.time()
b = np.sum(ndarray_list)
t4 = time.time()
d2 = t4 - t3
print(d2)  # 0.12980318069458008
```

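NumPy advantages:

1) Storage layout

ndarray: elements of one type, less general, data stored contiguously in memory

list: elements of any type, very general, stores references to objects scattered across the heap

2) Parallel-style operations

ndarray supports vectorized operations

3) Implementation language

C, with the GIL released

In summary:

1. Memory block layout

2. ndarray supports vectorized (parallel-style) operations

3. NumPy's core is written in C, and many of its internal loops release the GIL (the global interpreter lock, which effectively lets only one thread run Python code at a time)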
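As a small illustration of the first two points, the sketch below contrasts element-wise arithmetic on a list with the vectorized form on an ndarray, then inspects how the array is stored. The variable names are illustrative and do not come from the original article.

```python
import numpy as np

values = [1.0, 2.0, 3.0, 4.0]

# Plain Python list: element-wise arithmetic needs an explicit loop
doubled_list = [v * 2 for v in values]

# ndarray: the operation is written once and applied to every element
arr = np.array(values)
doubled_arr = arr * 2

print(doubled_list)          # [2.0, 4.0, 6.0, 8.0]
print(doubled_arr.tolist())  # [2.0, 4.0, 6.0, 8.0]

# Every element has the same dtype and the data sits in one contiguous block
print(arr.dtype)                  # float64
print(arr.itemsize)               # 8 (bytes per element)
print(arr.flags["C_CONTIGUOUS"])  # True
```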

```python
import numpy as np

# Specify the dtype when creating the array (1): pass a NumPy type
t = np.array([1.1, 2.2, 3.3], dtype=np.float32)
# Specify the dtype when creating the array (2): pass the type name as a string
tt = np.array([1.1, 2.2, 3.3], dtype="float32")
```

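### Basic operations

1) Generating arrays of 0s and 1s

np.zeros(shape)

np.ones(shape)

2) Generating from an existing array

np.array() and np.copy(): deep copy

np.asarray(): shallow copy (no copy is made when the input is already an ndarray)

3) Generating arrays over a fixed range

np.linspace(0, 10, 100)

[0, 10], evenly spaced

np.arange(a, b, c)

range(a, b, c)

[a, b), where c is the step

4) Generating random arrays

1) Uniform distribution

2) Normal distribution

σ (standard deviation): amplitude, degree of fluctuation, concentration, stability, dispersion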

1. Generating arrays of 0s and 1s
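A minimal sketch of `np.zeros` and `np.ones`; the shapes and the `int32` dtype are chosen only for illustration.

```python
import numpy as np

zeros = np.zeros((3, 4))               # 3 rows, 4 columns, filled with 0.0
ones = np.ones((2, 3), dtype="int32")  # the dtype can be given explicitly

print(zeros.shape, zeros.dtype)  # (3, 4) float64
print(ones.tolist())             # [[1, 1, 1], [1, 1, 1]]
```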

2. Generating from an existing array

```python
import numpy as np

# Method 1: np.array()
score = np.array([[80, 89, 86, 67, 79],
                  [94, 92, 93, 67, 64],
                  [86, 85, 83, 67, 80]])

# Method 2: np.copy()
ttt = np.copy(score)

# Method 3: np.asarray()
tttt = np.asarray(ttt)
```

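np.array() and np.copy() make a deep copy: the new array owns its own data.

np.asarray() makes a shallow copy: when the input is already an ndarray and no dtype change is requested, it returns the same object, so a change made through one name is visible through the other.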
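The difference can be checked by modifying the source array after copying. This is a small sketch with made-up values, not the author's code.

```python
import numpy as np

score = np.array([[80, 89, 86], [94, 92, 93]])

deep = np.array(score)       # new array with its own data
also_deep = np.copy(score)   # same effect as np.array(score)
shallow = np.asarray(score)  # no copy: the very same object

score[0, 0] = 0  # modify the source array

print(deep[0, 0])        # 80, unaffected
print(also_deep[0, 0])   # 80, unaffected
print(shallow[0, 0])     # 0, follows the change
print(shallow is score)  # True
```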

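3. Generating arrays over a fixed range

np.linspace(0, 10, 100)

Returns 100 evenly spaced numbers over the closed interval [0, 10]

np.arange(a, b, c)

Returns an array over the half-open interval [a, b) with step c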
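A short sketch of the two interval conventions, using small sizes so the result is easy to read:

```python
import numpy as np

a = np.linspace(0, 10, 5)  # 5 points, both endpoints included
b = np.arange(0, 10, 2)    # step 2, the end value 10 is excluded

print(a.tolist())  # [0.0, 2.5, 5.0, 7.5, 10.0]
print(b.tolist())  # [0, 2, 4, 6, 8]
```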

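4. Generating random arrays (the distribution can be inspected with a histogram)

1) Uniform distribution

2) Normal distribution

σ (standard deviation): amplitude, degree of fluctuation, concentration, stability, dispersion

1. Uniform distribution: every value in the range is equally likely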
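A minimal sketch of drawing from both distributions with NumPy's `Generator` API. The seed, the sample size, and the parameters (`low`, `high`, `loc`, `scale`) are assumptions chosen for illustration; `np.histogram` stands in for a plotted histogram.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed fixed only to make the run repeatable

# Uniform distribution on [low, high): every value is equally likely
uniform = rng.uniform(low=-1, high=1, size=100000)

# Normal distribution with mean loc and standard deviation scale (σ)
normal = rng.normal(loc=1.75, scale=0.1, size=100000)

print(uniform.min(), uniform.max())  # both inside [-1, 1)
print(normal.mean(), normal.std())   # close to 1.75 and 0.1

# Bin counts of the uniform sample: each of the 10 bins holds roughly 10000 values
counts, edges = np.histogram(uniform, bins=10)
print(counts)
```

A smaller σ concentrates the normal sample more tightly around the mean; a larger σ spreads it out.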

```python
import numpy as np


def type_change():
    """Change the dtype of an ndarray."""
    # Method 1: astype("float32")
    arr3 = np.array(
        [[1, 2, 3], [4, 5, 6]]
    )  # shape (2, 3)
    arr4 = arr3.astype("float32")  # convert int to float
    print(arr3.dtype)  # int32 here; the default integer type is platform-dependent
    print(arr4.dtype)  # float32

    # Method 2: serialize to raw bytes with tobytes()
    # (tobytes() replaces the deprecated tostring())
    arr5 = arr3.tobytes()  # e.g. b'\x01\x00\x00\x00...'
    print(arr5)


if __name__ == '__main__':
    # dtype and shape
    type_change()
```


```python
import numpy as np


def demo():
    """Logical operations: boolean indexing."""
    temp = np.array([[1, 2, 3, 4], [3, 4, 5, 6]])
    # Element-wise test: True where the element is greater than 5, False otherwise
    print(temp > 5)

    # Select the elements that are greater than or equal to 5
    print(temp[temp >= 5])  # [5 6]

    # Set every element that is greater than or equal to 5 to 100
    temp[temp >= 5] = 100
    print(temp)


if __name__ == '__main__':
    # Logical operations: boolean indexing
    demo()
```

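General-purpose test functions

np.all(boolean array): True only if every element is True

np.any(boolean array): True if at least one element is True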
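A short sketch of both functions applied to the same small array used in the boolean-indexing demo:

```python
import numpy as np

temp = np.array([[1, 2, 3, 4], [3, 4, 5, 6]])

print(np.all(temp > 0))   # True: every element is greater than 0
print(np.all(temp > 5))   # False: only one element is greater than 5
print(np.any(temp > 5))   # True: at least one element (6) is greater than 5
print(np.any(temp > 10))  # False: no element is greater than 10
```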

```python
import numpy as np


def demo():
    """Logical operations: the ternary operator np.where()."""
    temp = np.array([[1, 2, 3, 4], [3, 4, 5, 6]])
    # np.where(condition, value where True, value where False)
    print(np.where(temp > 4, 100, -100))  # 100 where the element is greater than 4, otherwise -100

    # Output:
    # [[-100 -100 -100 -100]
    #  [-100 -100  100  100]]


if __name__ == '__main__':
    # Logical operations: ternary operator
    demo()
```

```python
import numpy as np


def demo():
    """Statistical operations."""
    temp = np.array([[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8]])
    print(temp.max(axis=0))  # [5 6 7 8], the maximum of each column
    print(temp.max(axis=1))  # [4 6 8], the maximum of each row

    print(np.argmax(temp, axis=1))  # [3 3 3], index of the maximum in each row
    print(np.argmin(temp, axis=1))  # [0 0 0], index of the minimum in each row


if __name__ == '__main__':
    # Statistical operations
    demo()
```

1. Operations between an array and a number

```python
import numpy as np


def demo():
    """Operations between an array and a number."""
    temp = np.array([[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8]])
    print(temp + 10)  # add 10 to every element
    print(temp * 10)  # multiply every element by 10


if __name__ == '__main__':
    # Operations between an array and a number
    demo()
```

2. Operations between arrays (the shapes must satisfy the broadcasting rules)
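Under NumPy's broadcasting rules, shapes are compared from the trailing dimension backwards, and two dimensions are compatible when they are equal or one of them is 1. The sketch below shows two compatible cases and one incompatible case; the arrays are made up for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])       # shape (2, 3)
row = np.array([10, 20, 30])    # shape (3,)
col = np.array([[100], [200]])  # shape (2, 1)

print((a + row).tolist())  # [[11, 22, 33], [14, 25, 36]]
print((a + col).tolist())  # [[101, 102, 103], [204, 205, 206]]

# Shapes (2, 3) and (2,) are not compatible: the trailing dimensions 3 and 2 differ
try:
    a + np.array([1, 2])
    broadcast_ok = True
except ValueError as err:
    broadcast_ok = False
    print("broadcast failed:", err)
```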

1. Storing a matrix

```python
import numpy as np


def demo():
    """Two ways to store a matrix."""
    # Option 1: store the matrix in an ndarray
    data = np.array([[80, 86],
                     [82, 80],
                     [85, 78],
                     [90, 90],
                     [86, 82],
                     [82, 90],
                     [78, 80],
                     [92, 94]])
    print(type(data))  # <class 'numpy.ndarray'>

    # Option 2: store the matrix in a matrix object
    # (np.asmatrix; its short alias np.mat is not available in NumPy 2.x)
    data_mat = np.asmatrix([[80, 86],
                            [82, 80],
                            [85, 78],
                            [90, 90],
                            [86, 82],
                            [82, 90],
                            [78, 80],
                            [92, 94]])
    print(type(data_mat))  # <class 'numpy.matrix'>


if __name__ == '__main__':
    # Storing a matrix in an ndarray or a matrix
    demo()
```

2. Matrix multiplication

(m, n) * (n, l) = (m, l)

For example, if A has shape (2, 3) and B has shape (3, 2):

A * B has shape (2, 2)
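A small sketch of the shape rule with made-up matrices. Note that for ndarrays the `*` operator is element-wise, so the matrix product is written with `np.matmul` (or `@`); `*` means matrix multiplication only for `np.matrix` objects.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])  # shape (2, 3)
B = np.array([[1, 2],
              [3, 4],
              [5, 6]])     # shape (3, 2)

C = np.matmul(A, B)  # same as A @ B, or np.dot(A, B) for 2-D arrays
print(C.shape)     # (2, 2)
print(C.tolist())  # [[22, 28], [49, 64]]
```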

Contents of test.csv (empty fields are missing values):

```
id,value1,value2,value3
1,123,1.4,23
2,110,,18
3,,2.1,19
```

demo:

```python
import numpy as np


def fill_nan_by_column_mean():
    """Handle missing values: replace each NaN with the mean of its column."""
    # skip_header=1 skips the "id,value1,..." header row,
    # which would otherwise be read as a row of NaN
    t = np.genfromtxt(r"F:\linear\test.csv", delimiter=",", skip_header=1)
    for i in range(t.shape[1]):  # go column by column; t.shape[1] is the number of columns
        # Count the NaN values in this column
        nan_num = np.count_nonzero(np.isnan(t[:, i]))
        if nan_num > 0:
            now_col = t[:, i]
            # Sum of the non-NaN values
            now_col_not_nan = now_col[~np.isnan(now_col)].sum()
            # Mean = sum / number of non-NaN values
            now_col_mean = now_col_not_nan / (t.shape[0] - nan_num)
            # Fill the NaN positions in now_col
            now_col[np.isnan(now_col)] = now_col_mean
            # Write the column back into t
            t[:, i] = now_col
    print(t)
    return t


if __name__ == '__main__':
    # Handling missing values: fill with the column mean
    fill_nan_by_column_mean()
```

Original: https://www.cnblogs.com/ftl1012/p/10561952.html
Author: 小a玖拾柒
Title: Data Mining: Learning NumPy

