Python: The Most Complete Study Notes on the NumPy and Pandas Libraries

July 16, 2023, 8:53 AM • Artificial Intelligence • 51 views

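Contents

– NumPy
+ Attributes
+ Creating arrays
+ Transforming arrays
+ Array operations
+ Random number functions
+ Statistical functions
+ Matrix operations and eigenvalues
+ Sorting
+ Note
– Pandas
+ Creating a DataFrame
+ DataFrame operations
+ Writing a DataFrame to Excel
+ Modifying Excel formatting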

NumPy

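Attributes

import numpy as np
arr = np.array([10, 20, 20])

arr.ndim       # number of dimensions (axes)
arr.shape      # length of each axis, here (3,)
arr.size       # total number of elements
arr.dtype      # element type
arr.itemsize   # size of one element in bytes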
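To make the attributes above concrete, here is a minimal sketch on a small two-dimensional array. The values and the `int32` element type are chosen only for illustration.

```python
import numpy as np

# A 2 x 3 integer array (values chosen only for illustration).
arr = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)

ndim = arr.ndim          # 2: number of axes
shape = arr.shape        # (2, 3): length of each axis
size = arr.size          # 6: total number of elements
itemsize = arr.itemsize  # 4: bytes per int32 element
nbytes = arr.nbytes      # 24: size * itemsize

print(ndim, shape, size, arr.dtype, itemsize, nbytes)
```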

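Creating arrays

# Build an array from a Python list or tuple, optionally fixing the element type
x = np.array([1, 2, 3], dtype=np.float32)

np.arange(n)             # integers from 0 to n-1
np.arange(10)
# -> array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

np.ones(shape)           # array of ones with the given shape
np.ones((3,2))
# -> array([[1., 1.],
#           [1., 1.],
#           [1., 1.]])

np.zeros(shape)          # array of zeros
np.full(shape,val)       # array in which every element is val
np.eye(n)                # n x n identity matrix
np.ones_like(a)          # ones, with the same shape as a
np.zeros_like(a)         # zeros, with the same shape as a
np.full_like(a,val)      # val everywhere, with the same shape as a

np.linspace(1,10,4)      # 4 evenly spaced values from 1 to 10, endpoints included
np.concatenate((a,b))    # join arrays a and b along an existing axis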
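The creation functions above can be tried out together in a short sketch. The shapes and fill values here are arbitrary examples, not taken from the original notes.

```python
import numpy as np

evens = np.arange(0, 10, 2)      # start, stop (exclusive), step
grid = np.linspace(1, 10, 4)     # 4 evenly spaced points, both ends included
filled = np.full((2, 2), 7)      # 2 x 2 array filled with 7
identity = np.eye(3)             # 3 x 3 identity matrix
like = np.zeros_like(filled)     # zeros with the shape and dtype of `filled`
joined = np.concatenate((evens, np.array([10, 12])))  # join along axis 0

print(evens)    # [0 2 4 6 8]
print(grid)     # [ 1.  4.  7. 10.]
print(joined)   # [ 0  2  4  6  8 10 12]
```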

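Transforming arrays

a = np.ones((2,3,4),  dtype=np.int32)

a.reshape(shape)       # returns an array with the new shape; a keeps its shape
a.resize(shape)        # like reshape, but changes a in place
a.swapaxes(ax1,ax2)    # returns a with the two given axes exchanged
a.flatten()            # returns a 1-D copy; a keeps its shape

new_a  = a.astype(new_type)   # new array with a different element type

ls = a.tolist()        # convert to a (nested) Python list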
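The difference between the methods that return a new array and `resize`, which works in place, is easy to miss. A minimal sketch with a six-element array chosen for illustration:

```python
import numpy as np

a = np.arange(6)                  # array([0, 1, 2, 3, 4, 5])

b = a.reshape((2, 3))             # new array object; `a` keeps shape (6,)
flat = b.flatten()                # always a copy
flat[0] = 99                      # so this does not touch `b`

swapped = b.swapaxes(0, 1)        # shape (3, 2); for 2-D input this is the transpose
as_float = b.astype(np.float64)   # copy with a new element type
nested = b.tolist()               # plain nested Python lists

c = np.arange(6)
c.resize((3, 2))                  # in place: `c` itself now has shape (3, 2)

print(a.shape, b.shape, c.shape)  # (6,) (2, 3) (3, 2)
```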

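Array operations

# Unary functions, applied element by element
np.abs(x), np.fabs(x)                # absolute value (fabs always returns floats)
np.sqrt(x)                           # square root
np.square(x)                         # square
np.log(x), np.log10(x), np.log2(x)   # natural, base-10 and base-2 logarithm
np.ceil(x), np.floor(x)              # round up / round down
np.rint(x)                           # round to the nearest integer
np.modf(x)                           # fractional and integer parts, as two arrays
np.cos(x), np.cosh(x)
np.sin(x), np.sinh(x)
np.tan(x), np.tanh(x)

np.exp(x)                            # e raised to the power of each element
np.sign(x)                           # 1, 0 or -1 according to the sign

# Binary functions, applied element by element
x + y, x - y, x * y, x / y, x ** y
np.maximum(x,y), np.fmax(x,y)        # element-wise maximum (fmax ignores NaN)
np.minimum(x,y), np.fmin(x,y)        # element-wise minimum (fmin ignores NaN)

np.mod(x,y)                          # element-wise remainder
np.copysign(x,y)                     # magnitude of x with the sign of y
x > y, x < y, x >= y, x <= y, x == y, x != y   # comparisons, return boolean arrays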
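A few of these functions behave in ways that are easier to see on actual numbers. The sketch below uses two small arrays invented for illustration, one of them containing a NaN to show how `np.maximum` and `np.fmax` differ.

```python
import numpy as np

x = np.array([-1.5, 0.0, 2.25])
y = np.array([1.0, np.nan, 2.0])

frac, whole = np.modf(x)              # fractional and integer parts, both keep the sign
signs = np.sign(x)                    # [-1.  0.  1.]
rounded = np.rint(x)                  # nearest integer, halves go to the even neighbour
larger = np.maximum(x, y)             # NaN propagates into the result
larger_ignoring_nan = np.fmax(x, y)   # NaN is ignored when the other value is a number
mask = x > 0                          # comparisons return a boolean array

print(frac, whole)            # [-0.5   0.    0.25] [-1.  0.  2.]
print(larger)                 # [1.    nan 2.25]
print(larger_ignoring_nan)    # [1.   0.   2.25]
```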

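Random number functions

# All of the functions below live in the np.random submodule

np.random.rand(d0,d1,..,dn)            # uniform floats in [0, 1) with shape (d0, ..., dn)
np.random.randn(d0,d1,..,dn)           # standard normal floats with shape (d0, ..., dn)
np.random.randint(low[,high,size])     # random integers in [low, high)
np.random.seed(s)                      # set the random seed

np.random.shuffle(a)                   # shuffle a in place along its first axis
np.random.permutation(a)               # return a shuffled copy; a is unchanged
np.random.choice(a[,size,replace,p])   # sample from 1-D array a, optionally with probabilities p

np.random.uniform(low,high,size)       # uniform distribution
np.random.normal(loc,scale,size)       # normal distribution with mean loc, standard deviation scale
np.random.poisson(lam,size)            # Poisson distribution with rate lam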
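As a minimal sketch of how these functions fit together, the snippet below fixes the seed and then draws a few samples. The seed value, shapes, and probabilities are arbitrary choices for illustration.

```python
import numpy as np

np.random.seed(0)                       # fix the seed so runs are repeatable

u = np.random.rand(2, 3)                # uniform on [0, 1), shape (2, 3)
n = np.random.randn(2, 3)               # standard normal, shape (2, 3)
k = np.random.randint(0, 10, size=5)    # integers in [0, 10); high is exclusive

a = np.arange(5)
p = np.random.permutation(a)            # shuffled copy
unchanged = a.tolist()                  # `a` is still [0, 1, 2, 3, 4] at this point
np.random.shuffle(a)                    # shuffles `a` itself and returns None

# Sampling with replacement and explicit probabilities (must sum to 1).
picks = np.random.choice([10, 20, 30], size=4, replace=True, p=[0.5, 0.3, 0.2])
```

`permutation` is the safer choice when the original order is still needed later, since `shuffle` overwrites it.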

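Statistical functions

np.sum(a, axis=None)                   # sum over the given axis (all elements if None)
np.mean(a, axis=None)                  # arithmetic mean
np.average(a,axis=None,weights=None)   # weighted average
np.std(a, axis=None)                   # standard deviation
np.var(a, axis=None)                   # variance
np.min(a), np.max(a)                   # smallest / largest element
np.argmin(a), np.argmax(a)             # index of the smallest / largest element in the flattened array
np.unravel_index(index, shape)         # convert a flattened index into a multi-dimensional index
np.ptp(a)                              # difference between the largest and smallest element
np.median(a)                           # median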
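The `axis` argument and the pairing of `np.argmax` with `np.unravel_index` are the two parts of this list that usually need an example. A minimal sketch on a 2 x 3 array made up for illustration:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

total = np.sum(a)                  # 21, over all elements
col_sums = np.sum(a, axis=0)       # [5 7 9], collapses the rows
row_means = np.mean(a, axis=1)     # [2. 5.], collapses the columns
weighted = np.average(a, axis=1, weights=[1, 1, 2])   # [2.25 5.25]
spread = np.ptp(a)                 # 5, max minus min

flat_index = np.argmax(a)          # 5, an index into the flattened array
position = np.unravel_index(flat_index, a.shape)      # row 1, column 2

print(col_sums, row_means, weighted, position)
```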

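Matrix operations and eigenvalues

import numpy as np
c = np.dot(a,b)      # dot product (matrix product for 2-D arrays) of existing arrays a and b
c = np.cross(a,b)    # cross product of two vectors

a = [[2,-1,0,0,0],
     [-1,2,-1,0,0],
     [0,-1,2,-1,0],
     [0,0,-1,2,-1],
     [0,0,0,-1,1]]
a = np.array(a)
# eig returns the eigenvalues and a matrix whose COLUMNS are the eigenvectors
eigenvalue, featurevector = np.linalg.eig(a)

# Sort the eigenvalues in ascending order and reorder the eigenvector columns to match
idx = eigenvalue.argsort()
eigenvalue = eigenvalue[idx]
featurevector = featurevector[:, idx]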
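`np.linalg.eig` returns the eigenvectors as the columns of its second result, so when the eigenvalues are sorted the same permutation has to be applied to the columns, not the rows. The sketch below checks this on a 2 x 2 matrix chosen for illustration, whose eigenvalues are 1 and 3.

```python
import numpy as np

a = np.array([[2.0, -1.0],
              [-1.0, 2.0]])    # small symmetric matrix with eigenvalues 1 and 3

eigenvalue, featurevector = np.linalg.eig(a)

idx = eigenvalue.argsort()              # ascending order of the eigenvalues
eigenvalue = eigenvalue[idx]
featurevector = featurevector[:, idx]   # reorder columns so they still match

# Each column v must satisfy a @ v == lambda * v.
for lam, v in zip(eigenvalue, featurevector.T):
    assert np.allclose(a @ v, lam * v)

print(eigenvalue)   # approximately [1. 3.]
```

For symmetric matrices such as this one, `np.linalg.eigh` is an alternative that already returns the eigenvalues in ascending order.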

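Sorting

# Reorder the rows of data by the values in column 2
data = data[data[:,2].argsort()]

# Sort the elements within each row
ele = np.sort(ele,axis=1)

# Sort the elements within each column
ele = np.sort(ele,axis=0)

# Reorder the rows of ele by the values in column 0
ele = ele[ele[:,0].argsort()]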
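Sorting rows by a key column keeps each row intact, whereas `np.sort` with an `axis` argument sorts every row or column independently and so breaks the rows apart. A minimal sketch on a 3 x 3 array invented for illustration:

```python
import numpy as np

data = np.array([[3, 9, 2],
                 [1, 8, 7],
                 [2, 7, 5]])

by_last_col = data[data[:, 2].argsort()]   # whole rows reordered by column 2
within_rows = np.sort(data, axis=1)        # each row sorted on its own
within_cols = np.sort(data, axis=0)        # each column sorted on its own

print(by_last_col)
# [[3 9 2]
#  [2 7 5]
#  [1 8 7]]
```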

Note:

These NumPy notes do not yet cover reading matrices from files, further matrix operations, or linear algebra in any depth.

Pandas

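The Pandas library can be thought of as an upgraded dictionary. It is commonly used for data analysis and processing, and DataFrame data can be saved to Excel.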

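Creating a DataFrame

import pandas as pd
import numpy as np

# From a dictionary: keys become column names
df = pd.DataFrame({'EIGHT': ['ARE', 'YOU', 'OK?']})

# From a NumPy array, with explicit row and column labels
# (a single dtype is used: a compound 'int32,float32' dtype yields a structured array, not a plain 2 x 3 table)
values = np.zeros((2,3), dtype='float32')
index = ['x', 'y']
columns = ['a','b','c']
df = pd.DataFrame(data=values, index=index, columns=columns)

# From a list of rows; ls is an existing list with one 8-item record per row
columns = ['name_and_student_id','breakfast','lunch','dinner','dorm_building','dorm_room','floor','name']
df = pd.DataFrame(data=ls, columns=columns)

# From a dictionary whose keys become row labels; dic2 is an existing dict of 3-item lists
columns = ['a','b','c']
df = pd.DataFrame.from_dict(dic2, orient='index',columns = columns)

# From an Excel file
data = pd.read_excel('0.xlsx')
# Main parameters (the squeeze parameter of older versions was removed in pandas 2.0):
# pd.read_excel(io, sheet_name=0, header=0, names=None, index_col=None, usecols=None, dtype=None, ...)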
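The snippets above rely on variables that the notes do not define. As a minimal runnable sketch, the example below builds two of the DataFrames with made-up data: the contents of `dic2` are hypothetical and only stand in for whatever dictionary the original author used.

```python
import numpy as np
import pandas as pd

# From a NumPy array with explicit row and column labels.
values = np.zeros((2, 3), dtype="float32")
df = pd.DataFrame(data=values, index=["x", "y"], columns=["a", "b", "c"])

# Hypothetical dictionary: with orient="index" each key becomes a row label.
dic2 = {"row1": [1, 2, 3], "row2": [4, 5, 6]}
df2 = pd.DataFrame.from_dict(dic2, orient="index", columns=["a", "b", "c"])

print(df2)
#       a  b  c
# row1  1  2  3
# row2  4  5  6
```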

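DataFrame operations

# Add or overwrite a row
df.loc[10] = [1,2,3]     # by label; a new row is appended if label 10 does not exist
df.loc[''] = [1,2,3]     # the label can also be a string (here an empty one)
df.iloc[0] = [1,2,3]     # by position; the row must already exist

# Convert to a NumPy array
n = df.values
n = df.to_numpy()        # replaces df.as_matrix(), which was removed in pandas 1.0
n = np.array(df)

# Select data
c = data.loc[1]          # row with label 1
b = data.iloc[:,3]       # fourth column, by position
d = data.iloc[1,3]       # single value at row position 1, column position 3

df[column_name]                  # one column
df[column_name][index_name]      # one value; df.loc[index_name, column_name] avoids chained indexing

# Rename a row label (index labels are the default target of rename)
df.rename({'index_name':'new_index_name'},inplace=True)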
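The difference between label-based `loc` and position-based `iloc` is clearest on a frame whose index is not 0, 1, 2. A minimal sketch, with a small DataFrame made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]}, index=["x", "y"])

df.loc["z"] = [7, 8, 9]       # label not present yet: appends a new row
df.iloc[0] = [10, 20, 30]     # position 0: overwrites the existing first row

row = df.loc["y"]             # one row as a Series, selected by label
col = df.iloc[:, 2]           # third column as a Series, selected by position
cell = df.loc["y", "b"]       # single value, without chained df["b"]["y"]

arr = df.to_numpy()           # the values as a NumPy array
renamed = df.rename({"x": "first"})   # maps index labels by default, returns a copy

print(df)
#     a   b   c
# x  10  20  30
# y   2   4   6
# z   7   8   9
```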

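Writing a DataFrame to Excel

import pandas as pd
x = pd.DataFrame(data)
# Write to .xlsx: the xlwt engine needed for the old .xls format was dropped in pandas 2.0
x.to_excel('data.xlsx',sheet_name='data')

# Several writes into one workbook; the with-block saves the file on exit
with pd.ExcelWriter('0.xlsx',engine="openpyxl") as writer:
    # Same sheet twice: once offset by 10 rows and 10 columns, once at the top-left corner
    df.to_excel(writer,index=True,index_label = '222',startrow = 10,startcol= 10)
    df.to_excel(writer,index=True,index_label = '222',startrow = 0,startcol= 0)
    # One additional sheet per DataFrame
    mon1.to_excel(excel_writer=writer,sheet_name='201901')
    mon2.to_excel(excel_writer=writer,sheet_name='201902')

# Without a with-block, close the writer explicitly
writer = pd.ExcelWriter("C:/Users/wlt/Desktop/XXX.xlsx")
mon1.to_excel(excel_writer=writer,sheet_name='201901')
writer.close()   # close() saves the file; ExcelWriter.save() was removed in pandas 2.0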

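pandas to_excel signature

df.to_excel('0.xlsx',index=True,index_label = '222')
DataFrame.to_excel(excel_writer, sheet_name='Sheet1', na_rep='', float_format=None, columns=None, header=True, index=True, index_label=None, startrow=0, startcol=0, engine=None, merge_cells=True, encoding=None, inf_rep='inf', verbose=True, freeze_panes=None)

excel_writer: a file path or an existing ExcelWriter.
sheet_name: name of the worksheet that will contain the DataFrame.
na_rep: representation of missing data.
float_format: optional format string for floating-point numbers.
columns: the columns to write.
header: write out the column names. If a list of strings is given, it is treated as aliases for the column names.
index: write the row index.
index_label: column label for the index column(s). If not specified, and header and index are both True, the index names are used. A sequence should be given if the DataFrame uses a MultiIndex.
startrow: default 0. Row of the upper-left cell where the DataFrame is written.
startcol: default 0. Column of the upper-left cell where the DataFrame is written.
engine: optional; the writing engine to use, openpyxl or xlsxwriter.
merge_cells: Boolean, default True. Writes MultiIndex and hierarchical rows as merged cells.
encoding: optional; encoding of the resulting Excel file. Only needed for xlwt. Removed in pandas 2.0.
inf_rep: optional, default 'inf'. Representation of infinity.
verbose: Boolean, default True. Shows more information in the error log. Removed in pandas 2.0.
freeze_panes: optional; a tuple giving the bottommost row and rightmost column to freeze.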
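The effect of `startrow`, `startcol` and `index_label` can be checked by writing a small frame and reading the cells back. The sketch below assumes that the openpyxl package is installed; the file name `demo.xlsx`, the sheet name and the data are hypothetical.

```python
import os
import tempfile

import openpyxl
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["x", "y"])

# Hypothetical file name inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")

with pd.ExcelWriter(path, engine="openpyxl") as writer:
    # Offsets are zero-based: startrow=2, startcol=1 puts the table's corner at cell B3.
    df.to_excel(writer, sheet_name="data", index_label="key", startrow=2, startcol=1)

ws = openpyxl.load_workbook(path)["data"]
top_left = ws["B3"].value       # the index label
first_header = ws["C3"].value   # first column name
first_value = ws["C4"].value    # first data value

print(top_left, first_header, first_value)   # key a 1
```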

Modifying Excel formatting

import numpy as np
import pandas as pd
import openpyxl

data= pd.DataFrame(data=np.random.randn(6,3),columns=["a",'b','c'])

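filename = 'test.xlsx'
writer = pd.ExcelWriter(filename,engine='openpyxl')
data.to_excel(writer, sheet_name='Sheet1',index=False)
worksheet = writer.sheets['Sheet1']

# Freeze the header row (everything above cell A2)
worksheet.freeze_panes = 'A2'

# Alternative: freeze the header row and the first column; this overrides the setting above
worksheet.freeze_panes = 'B2'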

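for i in worksheet.columns:
    # Style the header cell of each column: background colour and font
    i[0].fill = openpyxl.styles.PatternFill("solid", fgColor="FF9933")
    i[0].font = openpyxl.styles.Font(name='Calibri',size=11,bold=True,
                                     italic=False,vertAlign=None,underline='none',strike=False,color='FF000000')

    # Number format for the data cells of the column. A cell has a single format,
    # so pick one; the alternatives are left as comments.
    for cell in i[1:]:
        # cell.number_format = '#,##0_-'               # thousands separator
        # cell.number_format = '0.00%;-0.00%'          # percentage
        cell.number_format = '$#,##0.00;-$#,##0.00'    # currency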

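# Find the zero-based position of column 'c' in the header row
qtyindex = None
for row in worksheet.rows:
    for j in row:
        if j.value =='c':
            qtyindex = j.col_idx-1
    break   # only the first (header) row needs to be scanned

# Highlight every data row whose value in column 'c' is positive
a = 0
for row in worksheet.rows:
    a +=1
    # "is not None" rather than a truth test, so that position 0 also counts
    if qtyindex is not None and a>1:
        if row[qtyindex].value > 0 :
            for cell in row:
                cell.fill = openpyxl.styles.PatternFill("solid", fgColor="FFB6C1")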

# Set the width of every column; get_column_letter converts a column index to A, B, ...
from openpyxl.utils import get_column_letter
for i in range(1,data.shape[1]+1):
    writer.sheets['Sheet1'].column_dimensions[get_column_letter(i)].width = 16

writer.close()   # saves the workbook; ExcelWriter.save() was removed in pandas 2.0
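Turning a 1-based column index into Excel column letters is a bijective base-26 conversion, which is easy to get wrong at the multiples of 26 (column 26 is Z and column 27 is AA). Below is a minimal pure-Python sketch; the helper name `column_letter` is hypothetical, and openpyxl provides its own `openpyxl.utils.get_column_letter` for the same purpose.

```python
def column_letter(index: int) -> str:
    """Convert a 1-based column index to Excel letters (1 -> A, 27 -> AA)."""
    if index < 1:
        raise ValueError("column index must be >= 1")
    letters = ""
    while index > 0:
        # Shift to 0-based before dividing so that 26 maps to 'Z' with no carry.
        index, remainder = divmod(index - 1, 26)
        letters = chr(65 + remainder) + letters
    return letters


print([column_letter(i) for i in (1, 26, 27, 52, 702, 703)])
# ['A', 'Z', 'AA', 'AZ', 'ZZ', 'AAA']
```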

Original: https://blog.csdn.net/Lzn_nzL/article/details/124184841
Author: Lzn_nzL
Title: Python——最全的Numpy Pandas库的学习笔记
