Pandas Basic Usage 1

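This article walks through the basics of pandas, drawing mainly on the Chinese-language pandas documentation:
1. pandas can build a Series directly from a list or a NumPy array: pd.Series()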

import pandas as pd
import numpy as np

# Use a name other than `list` so the built-in list() is not shadowed
values = [11, 23, 35, 35, 64, 58]
s = pd.Series(values)
print(s)

num = np.array([12.3,34,56,88])
ss = pd.Series(num)
print(ss)
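
A Series also carries an index alongside its values. As a minimal sketch (the labels 'a' to 'd' and the name `labeled` are illustrative and not part of the original example), passing `index=` replaces the default 0..n-1 integer labels:

```python
import numpy as np
import pandas as pd

# Hypothetical labels, chosen only for illustration.
labeled = pd.Series(np.array([12.3, 34, 56, 88]), index=['a', 'b', 'c', 'd'])

print(labeled)
print(labeled['c'])   # look a value up by its label
print(labeled.dtype)  # float64, inherited from the NumPy array
```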

2. Generate a date index: pd.date_range('20220718', periods=10)
import pandas as pd

dates = pd.date_range('20220718', periods=10)
print(dates)
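
The call above relies on the default daily frequency. The sketch below, under the assumption that the same start date is used, shows two other common ways to describe a range: giving both `start` and `end`, and passing `freq='B'` for business days. The variable names are illustrative.

```python
import pandas as pd

daily = pd.date_range('20220718', periods=10)               # default frequency is daily
same = pd.date_range(start='2022-07-18', end='2022-07-27')  # both endpoints are included
business = pd.date_range('20220718', periods=6, freq='B')   # business days skip the weekend

print(daily.equals(same))
print(business)
```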

3. Create a DataFrame from a NumPy array, with a datetime index and column labels:
import pandas as pd
import numpy as np

dates = pd.date_range('20220718', periods=10)
df = pd.DataFrame(np.random.randn(10, 6), index=dates, columns=list('ABCDEF'))
print(df)

4. Create a DataFrame from a dictionary:
import pandas as pd
import numpy as np

zd = {'A': 1.,
      'B': pd.Timestamp('20130102'),
      'C': pd.Series(1, index=list(range(4)), dtype='float32'),
      'D': np.array([3] * 4, dtype='int32'),
      'E': pd.Categorical(["test", "train", "test", "train"]),
      'F': 'foo'}
df2 = pd.DataFrame(zd)
print(df2.dtypes)
print(df2)
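
In a dictionary like the one above, scalar values are repeated to match the length set by the array-like values, and each column keeps its own dtype. A reduced sketch of that behaviour (the name `frame` and the subset of columns are chosen for illustration):

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame({
    'A': 1.0,                                                  # scalar, repeated for every row
    'C': pd.Series(1, index=list(range(4)), dtype='float32'),  # sets the row index 0..3
    'D': np.array([3] * 4, dtype='int32'),
    'F': 'foo',                                                # scalar, repeated for every row
})

print(frame.shape)
print(frame['F'].tolist())
print(frame.dtypes)
```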

5. View the first and last rows of a DataFrame: df.head(), df.tail()
import pandas as pd
import numpy as np

dates = pd.date_range('20220718', periods=10)
df = pd.DataFrame(np.random.randn(10, 5), index=dates, columns=list('ABCDE'))

print(df.head(2))

print(df.tail(4))
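
When the row count is omitted, `head()` and `tail()` return 5 rows. A negative count to `head()` returns everything except the last rows. A small sketch on the same kind of frame:

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20220718', periods=10)
df = pd.DataFrame(np.random.randn(10, 5), index=dates, columns=list('ABCDE'))

print(len(df.head()))     # 5 rows when n is omitted
print(len(df.tail()))     # 5 rows when n is omitted
print(len(df.head(-2)))   # negative n: all rows except the last 2
```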

6. Show the index: df.index
import pandas as pd
import numpy as np

dates = pd.date_range('20220718', periods=10)
df = pd.DataFrame(np.random.randn(10, 5), index=dates, columns=list('ABCDE'))
print(df)
print('\nIndex:', df.index)

7. Show the column names: df.columns
import pandas as pd
import numpy as np

dates = pd.date_range('20220718', periods=10)
df = pd.DataFrame(np.random.randn(10, 5), index=dates, columns=list('ABCDE'))
print(df)
print('\nColumns:', df.columns)

8. Convert to a NumPy array: df.to_numpy()
The output of DataFrame.to_numpy() contains only the values; the row index and column labels are not included.
import pandas as pd
import numpy as np

dates = pd.date_range('20220718', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
print(df.to_numpy())
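
To make the point about labels concrete, the sketch below checks that the result is a plain `ndarray` and that the labels stay available on the DataFrame, so the frame can be rebuilt from the array plus `df.index` and `df.columns`. The names `arr` and `rebuilt` are illustrative.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20220718', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))

arr = df.to_numpy()
print(type(arr), arr.shape)  # a plain ndarray with 6 rows and 4 columns

# The labels were not copied into the array, but they still live on the DataFrame.
rebuilt = pd.DataFrame(arr, index=df.index, columns=df.columns)
print(rebuilt.equals(df))
```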

9. Summary statistics: df.describe()
describe() gives a quick statistical summary of the data:
import pandas as pd
import numpy as np

dates = pd.date_range('20220718', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
print(df)
print(df.describe())
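
Because the frame above holds random numbers, its summary changes on every run. A sketch with fixed values (chosen here only for illustration) makes the rows of the summary easier to read:

```python
import pandas as pd

df = pd.DataFrame({'A': [1.0, 2.0, 3.0, 4.0]})
summary = df.describe()

print(summary.index.tolist())    # the statistics reported for a numeric column
print(summary.loc['mean', 'A'])  # individual statistics can be read with .loc
print(summary.loc['50%', 'A'])   # the 50% row is the median
```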

10. Transpose the data: df.T
import pandas as pd
import numpy as np

dates = pd.date_range('20220718', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
print(df)
print(df.T)

11. Sort by axis: df.sort_index(axis=1, ascending=False)
import pandas as pd
import numpy as np

dates = pd.date_range('20220718', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
print(df)
print(df.sort_index(axis=1,ascending=False))
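
`sort_index` orders by labels, not by the data: `axis=1` reorders the columns by name and `axis=0` reorders the rows by index. The sketch below contrasts the two and, as an addition not covered in the original, shows `sort_values` for ordering by the contents of a column.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20220718', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))

by_cols = df.sort_index(axis=1, ascending=False)  # column labels: D, C, B, A
by_rows = df.sort_index(axis=0, ascending=False)  # latest date first
by_value = df.sort_values(by='B')                 # rows ordered by the values in column B

print(by_cols.columns.tolist())
print(by_rows.index[0])
print(by_value['B'])
```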

12. Selecting data:
Selecting a single column yields a Series; df['A'] is equivalent to df.A:
import pandas as pd
import numpy as np

dates = pd.date_range('20220718', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
print(df['B'])
print(df.A)
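
The equivalence holds only when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute. A short sketch with made-up column names ('my col' and 'count' are chosen to trigger the two cases):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'my col': [3, 4], 'count': [5, 6]})

print(df['A'].equals(df.A))   # the two forms agree for a simple name
print(df['my col'].tolist())  # brackets work for any label, including ones with spaces
print(callable(df.count))     # df.count is the DataFrame method, not the column
print(df['count'].tolist())   # brackets still reach the column
```

For that reason the bracket form is the safer default in scripts.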

Slice rows with [ ]: df['20220718':'20220721']
import pandas as pd
import numpy as np

dates = pd.date_range('20220718', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
print(df[0:3])
print(df['20220718':'20220721'])

Select one row by label:
import pandas as pd
import numpy as np

dates = pd.date_range('20220718', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
print(df)
print(df.loc[dates[0]])
print(df.loc['20220718'])

Slice by label; the end point is included for both rows and columns:
import pandas as pd
import numpy as np

dates = pd.date_range('20220718', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
print(df)
print(df.loc['20220718':'20220721', ['A', 'C']])
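
The inclusive end point is what separates label slicing from positional slicing, which follows the usual Python end-exclusive rule. A minimal sketch of the difference on the same kind of frame:

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20220718', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))

by_label = df.loc['20220718':'20220721']  # 4 rows: the 21st is included
by_position = df.iloc[0:3]                # 3 rows: position 3 is excluded
col_slice = df.loc[:, 'A':'C']            # column label slices include 'C' too

print(len(by_label), len(by_position))
print(col_slice.columns.tolist())
```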

Get a scalar value (a single cell):
import pandas as pd
import numpy as np

dates = pd.date_range('20220718', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
print(df)
print(df.loc[dates[0], 'A'])
print(df.at[dates[0], 'A'])

Select by integer position:
import pandas as pd
import numpy as np

dates = pd.date_range('20220718', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
print(df)
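print(df.iloc[3])               # one row by position, returned as a Series
print(df.iloc[1:3, 2:3])        # row and column slices (end-exclusive)
print(df.iloc[[1, 3], [2, 3]])  # lists of row and column positions
print(df.iloc[2, 3])            # a single scalar value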

Fast access to a scalar (iat gives the same result as iloc for a single cell):
import pandas as pd
import numpy as np

dates = pd.date_range('20220718', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
print(df)

print(df.iloc[2,3])
print(df.iat[2,3])

Boolean indexing
Select rows using the values of a single column:
import pandas as pd
import numpy as np

dates = pd.date_range('20220718', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
print(df)

print(df[df.A > 0])

Select values from the whole DataFrame where a condition holds (cells that fail the condition become NaN):
import pandas as pd
import numpy as np

dates = pd.date_range('20220718', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
print(df)

print(df[df > 0])
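
Conditions on several columns can be combined with `&` (and) and `|` (or); each condition needs its own parentheses. The sketch below uses fixed values, chosen only for illustration, so the result is predictable:

```python
import pandas as pd

df = pd.DataFrame({'A': [-1.0, 0.5, 2.0, -0.3],
                   'B': [1.0, -2.0, 3.0, -4.0]})

both = df[(df['A'] > 0) & (df['B'] > 0)]    # rows where both columns are positive
either = df[(df['A'] > 0) | (df['B'] > 0)]  # rows where at least one is positive
masked = df[df > 0]                         # same shape; failing cells become NaN

print(both.index.tolist())
print(either.index.tolist())
print(masked)
```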

Filter with isin():
import pandas as pd
import numpy as np

dates = pd.date_range('20220718', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
df['E'] = [1,2,3,4,5,6]
print(df)

print(df[df['E'].isin([1, 3])])
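
`isin()` returns a boolean Series, so it can be inverted with `~` to keep the rows whose value is not in the list. A small sketch on a one-column frame (the names `kept` and `dropped` are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'E': [1, 2, 3, 4, 5, 6]})

kept = df[df['E'].isin([1, 3])]      # rows whose E is 1 or 3
dropped = df[~df['E'].isin([1, 3])]  # every other row

print(kept['E'].tolist())
print(dropped['E'].tolist())
```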

13. Assignment
import pandas as pd
import numpy as np

dates = pd.date_range('20220718', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
df['E'] = [1,2,3,4,5,6]
print(df)

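# Set a single value by label
df.at[dates[0], 'A'] = 0.666

# Set a single value by integer position
df.iat[0, 2] = 3

# Overwrite a whole column with a NumPy array
df.loc[:, 'E'] = np.array([5] * len(df))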
print(df)
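
Assignment also combines with boolean selection, and a Series assigned as a new column is aligned on the index. Both are sketched below with fixed values; the data and the column name 'C' are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [-1.0, 2.0, -3.0], 'B': [10, 20, 30]})

# Conditional assignment: replace the negative values in column A with 0.
df.loc[df['A'] < 0, 'A'] = 0.0

# A Series is matched to rows by index label; rows without a match get NaN.
df['C'] = pd.Series([100, 300], index=[0, 2])

print(df)
```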


Original: https://blog.csdn.net/weixin_43788986/article/details/125841967
Author: <编程路上>
Title: Pandas基本使用1
