pandas
Common pandas data types
Series: one-dimensional data, a labeled array
DataFrame: two-dimensional data, a container of Series
import pandas as pd
t = pd.Series([1, 2, 31, 12, 3, 4])
print(t)
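To make the "container of Series" description concrete, here is a minimal sketch. The column names `x` and `y` are made up for illustration. It shows that selecting one column of a DataFrame gives back a Series that shares the DataFrame's row labels.

```python
import pandas as pd

# Hypothetical data; the column names "x" and "y" are illustrative only.
df = pd.DataFrame({"x": [1, 2, 3], "y": [10.0, 20.0, 30.0]}, index=list("abc"))

col = df["x"]                        # selecting a single column yields a Series
print(type(col).__name__)            # class name of the selected column
print(col.index.equals(df.index))    # the column carries the same row labels
```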
1. Creating a Series
import pandas as pd
t = pd.Series([1, 32, 31, 132, 32, 34])
print(t)
print("*" * 30)
t1 = pd.Series([1, 223, 23, 24, 43], index=list("abcde"))
print(t1)
print("*" * 30)
temp_dict = {"name": "xiaohong", "age": 30, "tel": 10086}
t3 = pd.Series(temp_dict)
print(t3)
print(t3.iloc[0])  # first element by position; t3[0] on a string-labeled index is deprecated in recent pandas
print("*" * 30)
print(t3[0:2])
print("*" * 30)
print(t3.index)
print("*" * 30)
print(len(t3.index))
print("*" * 30)
print(t3.values, type(t3))
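A Series can be read either by label or by position. The sketch below reuses the dictionary from the example above and spells out both styles with `.loc` and `.iloc`, which avoids any ambiguity when integer keys meet a non-integer index. The variable names are illustrative.

```python
import pandas as pd

s = pd.Series({"name": "xiaohong", "age": 30, "tel": 10086})

by_label = s["age"]            # label-based lookup
by_label_loc = s.loc["age"]    # the same lookup, made explicit
by_position = s.iloc[1]        # position-based lookup (second element)
subset = s[["name", "tel"]]    # a list of labels returns a smaller Series

print(by_label, by_label_loc, by_position)
print(subset)
```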
Reading external data with pandas
1. Reading a CSV file
import pandas as pd
df = pd.read_csv(r"D:\桌面\Python\project01\pandas\Affairs.csv")  # raw string so backslashes are not treated as escapes
print(df)
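Since the CSV file above lives on the author's machine, the following sketch reads CSV text from an in-memory buffer so it can be run anywhere. The column names and rows are invented for illustration; `pd.read_csv` accepts a file path or any file-like object.

```python
import io

import pandas as pd

# Made-up CSV content standing in for a file on disk.
csv_text = "name,age\nAlice,30\nBob,25\n"

df = pd.read_csv(io.StringIO(csv_text))  # the first line becomes the column header
print(df.head())     # first rows of the table
print(df.dtypes)     # column types inferred by pandas
```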
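Reading from a database with pandas
Main entry point: pd.read_sql(sql, con=db_conn)
sql: the SQL query to run against tables that exist in the database
con: a database connection, for example one established with pymysql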
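A minimal sketch of `pd.read_sql` follows. As an assumption made to keep it runnable without a MySQL server, it swaps pymysql for the standard-library `sqlite3` module and an in-memory database. The `users` table and its rows are hypothetical.

```python
import sqlite3

import pandas as pd

# Assumption: an in-memory SQLite database stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")  # hypothetical table
conn.executemany("INSERT INTO users VALUES (?, ?)", [("Alice", 30), ("Bob", 25)])
conn.commit()

# The query result comes back as a DataFrame, one column per selected field.
df = pd.read_sql("SELECT name, age FROM users ORDER BY age", con=conn)
conn.close()
print(df)
```

With a real MySQL database the call has the same shape; only the object passed as `con` changes.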
DataFrame
Basics
import pandas as pd
import numpy as np
t1=pd.DataFrame(np.arange(12).reshape(3,4),index=["a","b","c"],columns=list('wxyz'))
print(t1)
From a dictionary of lists
d1={"name":["孙悟空","猪八戒","沙和尚"],"age":[500,520,250],"tel":[123456,456134,654321]}
t2=pd.DataFrame(d1)
print(t2)
From a list of dictionaries
d2=[{"name":"孙悟空","age":500,"tel":12345},{"name":"猪八戒","age":520,"tel":456134},{"name":"沙和尚","age":250,"tel":654321}]
t3=pd.DataFrame(d2)
print(t3)
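Sorting
df = df.sort_values(by='人气', ascending=False)
by names the column (or columns) to sort on; ascending defaults to True, which sorts in ascending order, so pass ascending=False for descending order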
df = pd.DataFrame({
    "name": ["成龙", "孙悟空", "猪八戒", "沙和尚", "唐僧", "百龙霸"],
    "人气": [10250, 12560, 18630, 11881, 1800, 12888],  # popularity
    "年龄": [140, 80, 120, 90, 125, 116],  # age
    "是否已婚": ["是", "否", "否", "否", "否", "否"],  # married: 是 = yes, 否 = no
})
print(df)
df1=df.sort_values(by='人气',ascending=False)
print(df1)
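`sort_values` also accepts a list of columns, with one sort direction per column. The sketch below uses invented `team` and `score` columns to show this, and to show that the call returns a new DataFrame rather than modifying the original.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"team": ["b", "a", "b", "a"], "score": [3, 1, 7, 5]})

# Sort by team ascending, then by score descending within each team.
ordered = df.sort_values(by=["team", "score"], ascending=[True, False])
print(ordered)

# The original frame keeps its row order; sort_values returned a copy.
print(df)
```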
Indexing
print(df[:2])
print(df["年龄"])
print(df[(25<df["年龄"])&(df["年龄"]<100)])
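The last line above filters rows with a boolean mask. The sketch below, on made-up data with an English `age` column, builds the mask step by step. Each comparison needs its own parentheses because `&` and `|` bind more tightly than `<` and `>` in Python.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"name": ["a", "b", "c", "d"], "age": [20, 80, 120, 90]})

# Element-wise AND of two comparisons; the result is a boolean Series.
mask = (df["age"] > 25) & (df["age"] < 100)
print(mask.tolist())

inside = df[mask]                                     # rows where the mask is True
outside = df[(df["age"] <= 25) | (df["age"] >= 100)]  # element-wise OR
print(inside)
print(outside)
```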
loc and iloc
t1=pd.DataFrame(np.arange(12).reshape(3,4),index=list("abc"),columns=["W","X","Y","Z"])
loc[]: selects rows (and optionally columns) by label; iloc[]: selects rows (and optionally columns) by integer position
import pandas as pd
import numpy as np
t1=pd.DataFrame(np.arange(15).reshape(3,5),index=list("abc"),columns=["V","W","X","Y","Z"])
print(t1)
print(t1.loc["a","Z"])
print(t1.loc["a"])
print(t1.loc[:"b"])
print(t1.loc["a":"c",["W","Z"]])
print(t1.iloc[:,[2,1]])
print(t1.iloc[1:,:2])
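One difference worth checking in the slices above: a `loc` slice includes its end label, while an `iloc` slice excludes its end position, like ordinary Python slicing. A minimal sketch on the same 3x5 frame, with illustrative variable names:

```python
import numpy as np
import pandas as pd

t = pd.DataFrame(np.arange(15).reshape(3, 5), index=list("abc"), columns=list("VWXYZ"))

by_label = t.loc["a":"b"]     # label slice: both "a" and "b" are included
by_position = t.iloc[0:1]     # position slice: stop is excluded, so only row 0

print(by_label)
print(by_position)
```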
Discretizing strings
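One common way to discretize a string column is to turn it into 0/1 indicator columns, one per category. The sketch below assumes each cell holds comma-separated category names. The genre values are invented, and `Series.str.get_dummies` is only one of several ways to do this.

```python
import pandas as pd

# Hypothetical data: each entry lists one or more categories.
genres = pd.Series(["Action,Comedy", "Drama", "Comedy,Drama"])

# Split on the separator and build one indicator column per distinct category.
dummies = genres.str.get_dummies(sep=",")
print(dummies)

# Column sums then count how often each category occurs.
print(dummies.sum())
```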
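Merging data
1) join: by default, combines rows that share the same row index. It is a left join, so the result keeps the row index of the DataFrame that join is called on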
import numpy as np
import pandas as pd
t1=pd.DataFrame(np.arange(10).reshape(2,5),index=list("AB"),columns=list("VWXYZ"))
print(t1)
print("--"*20)
t2=pd.DataFrame(np.arange(12).reshape(3,4),index=list("ABC"))
print(t2)
print("--"*20)
t3=t1.join(t2)
print(t3)
print("--"*20)
t4=t2.join(t1)
print(t4)
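The two join calls above differ in which frame's index survives. The sketch below rebuilds the same two frames under the illustrative names `left` and `right` and checks the resulting shapes. It also shows the `how="outer"` option, which keeps the union of both indexes.

```python
import numpy as np
import pandas as pd

left = pd.DataFrame(np.arange(10).reshape(2, 5), index=list("AB"), columns=list("VWXYZ"))
right = pd.DataFrame(np.arange(12).reshape(3, 4), index=list("ABC"))

a = left.join(right)               # keeps left's index: rows A and B
b = right.join(left)               # keeps right's index: rows A, B, C; row C is NaN in V..Z
c = left.join(right, how="outer")  # union of both indexes

print(a.shape, b.shape, c.shape)
print(b.loc["C"])
```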
2) merge: combines data on the specified column(s) using a chosen join method (inner, outer, left, or right)
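A minimal sketch of `pd.merge` on a shared column follows. The `key`, `x`, and `y` columns are invented for illustration. The `how` argument selects the join method, and the default is an inner join.

```python
import pandas as pd

# Hypothetical frames sharing a "key" column.
left = pd.DataFrame({"key": ["a", "b", "c"], "x": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "y": [20, 30, 40]})

inner = pd.merge(left, right, on="key")                  # only keys present in both frames
outer = pd.merge(left, right, on="key", how="outer")     # all keys, NaN where a side is missing
left_only = pd.merge(left, right, on="key", how="left")  # every row of left, matched where possible

print(inner)
print(outer)
print(left_only)
```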
Original: https://blog.csdn.net/m0_62497122/article/details/126996503
Author: 胖胖龙打代码
Title: 数据分析3-pandas (Data Analysis 3: pandas)