Splitting a Pandas DataFrame: three ways to divide a DataFrame into several smaller DataFrames

1. Split a DataFrame using row indices
2. Split a DataFrame using the groupby() method
3. Split a DataFrame using the sample() method

We will use the apprix_df DataFrame below to show how to split one DataFrame into several smaller DataFrames.

Split a DataFrame using row indices

import pandas as pd

apprix_df = pd.DataFrame({
    'Name': ["Anish", "Rabindra", "Manish", "Samir", "Binam"],
    'Post': ["CEO", "CTO", "System Admin", "Consultant", "Engineer"],
    'Qualification': ["MBA", "MS", "MCA", "PhD", "BE"]
})

print("Apprix Team DataFrame:")
print(apprix_df, "\n")

# Rows at positions 0 and 1, all columns
apprix_1 = apprix_df.iloc[:2, :]
# Rows from position 2 to the end, all columns
apprix_2 = apprix_df.iloc[2:, :]

print("The DataFrames formed by splitting the Apprix Team DataFrame are:", "\n")
print(apprix_1, "\n")
print(apprix_2, "\n")

Output:

Apprix Team DataFrame:
       Name          Post Qualification
0     Anish           CEO           MBA
1  Rabindra           CTO            MS
2    Manish  System Admin           MCA
3     Samir    Consultant           PhD
4     Binam      Engineer            BE

The DataFrames formed by splitting the Apprix Team DataFrame are:

       Name Post Qualification
0     Anish  CEO           MBA
1  Rabindra  CTO            MS

     Name          Post Qualification
2  Manish  System Admin           MCA
3   Samir    Consultant           PhD
4   Binam      Engineer            BE

This splits apprix_df into two parts by row position. The first part holds the first two rows of apprix_df, and the second part holds the remaining three.

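The iloc indexer lets us specify which rows go into each part. [:2,:] selects the rows before position 2 (the row at position 2 is excluded) together with all columns of the DataFrame. apprix_df.iloc[:2,:] therefore selects the first two rows of apprix_df, at positions 0 and 1.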
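The same half-open slicing extends to more than two pieces. The following is a minimal sketch, not a pandas feature: the helper name split_into_chunks and its chunk_size parameter are made up for illustration. It cuts a DataFrame into consecutive pieces of at most chunk_size rows each.

```python
import pandas as pd


def split_into_chunks(df, chunk_size):
    """Hypothetical helper: list of consecutive row slices, each at most chunk_size rows."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    # iloc slices are position-based and exclude the stop position,
    # so consecutive slices never overlap and never skip a row.
    return [df.iloc[start:start + chunk_size]
            for start in range(0, len(df), chunk_size)]


apprix_df = pd.DataFrame({
    'Name': ["Anish", "Rabindra", "Manish", "Samir", "Binam"],
    'Post': ["CEO", "CTO", "System Admin", "Consultant", "Engineer"],
    'Qualification': ["MBA", "MS", "MCA", "PhD", "BE"]
})

chunks = split_into_chunks(apprix_df, 2)
for chunk in chunks:
    print(chunk, "\n")
```

With five rows and a chunk size of 2, the last piece holds the single leftover row; concatenating the pieces with pd.concat() gives back the original rows in order.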

Split a DataFrame using the groupby() method

import pandas as pd

apprix_df = pd.DataFrame({
    'Name': ["Anish", "Rabindra", "Manish", "Samir", "Binam"],
    'Post': ["CEO", "CTO", "System Admin", "Consultant", "Engineer"],
    'Qualification': ["MBA", "MS", "MS", "PhD", "MS"]
})

print("Apprix Team DataFrame:")
print(apprix_df, "\n")

# Group the rows by the value in the Qualification column
groups = apprix_df.groupby("Qualification")

# get_group() returns the rows of one group as a DataFrame
ms_df = groups.get_group("MS")
mba_df = groups.get_group("MBA")
phd_df = groups.get_group("PhD")

print("Group with Qualification MS:")
print(ms_df, "\n")

print("Group with Qualification MBA:")
print(mba_df, "\n")

print("Group with Qualification PhD:")
print(phd_df, "\n")

Output:

Apprix Team DataFrame:
       Name          Post Qualification
0     Anish           CEO           MBA
1  Rabindra           CTO            MS
2    Manish  System Admin            MS
3     Samir    Consultant           PhD
4     Binam      Engineer            MS

Group with Qualification MS:
       Name          Post Qualification
1  Rabindra           CTO            MS
2    Manish  System Admin            MS
4     Binam      Engineer            MS

Group with Qualification MBA:
    Name Post Qualification
0  Anish  CEO           MBA

Group with Qualification PhD:
    Name        Post Qualification
3  Samir  Consultant           PhD

This splits apprix_df into three parts according to the values in the Qualification column. Rows with the same Qualification value end up in the same group.
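When the group names are not known in advance, calling get_group() once per name is impractical. As a hedged sketch, iterating over the GroupBy object yields each key together with its sub-DataFrame, which can be collected in a dictionary; the variable name parts is only illustrative.

```python
import pandas as pd

apprix_df = pd.DataFrame({
    'Name': ["Anish", "Rabindra", "Manish", "Samir", "Binam"],
    'Post': ["CEO", "CTO", "System Admin", "Consultant", "Engineer"],
    'Qualification': ["MBA", "MS", "MS", "PhD", "MS"]
})

# Iterating over a GroupBy object yields (key, sub-DataFrame) pairs,
# one pair per distinct value in the grouping column.
parts = {key: group for key, group in apprix_df.groupby("Qualification")}

for key, group in parts.items():
    print(f"Group with Qualification {key}:")
    print(group, "\n")
```

Each value in parts is an ordinary DataFrame that keeps the row labels of the parent, so parts["MS"] holds the same rows as groups.get_group("MS").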

Split a DataFrame using the sample() method

The sample() method builds a DataFrame from rows drawn at random from the parent DataFrame. Its frac parameter sets the fraction of the parent's rows to draw.

import pandas as pd

apprix_df = pd.DataFrame({
    'Name': ["Anish", "Rabindra", "Manish", "Samir", "Binam"],
    'Post': ["CEO", "CTO", "System Admin", "Consultant", "Engineer"],
    'Qualification': ["MBA", "MS", "MS", "PhD", "MS"]
})

print("Apprix Team DataFrame:")
print(apprix_df, "\n")

# Draw 40% of the rows at random; random_state fixes the seed
random_df = apprix_df.sample(frac=0.4, random_state=60)

print("Random split from the Apprix Team DataFrame:")
print(random_df)

Output:

Apprix Team DataFrame:
       Name          Post Qualification
0     Anish           CEO           MBA
1  Rabindra           CTO            MS
2    Manish  System Admin            MS
3     Samir    Consultant           PhD
4     Binam      Engineer            MS

Random split from the Apprix Team DataFrame:
    Name      Post Qualification
0  Anish       CEO           MBA
4  Binam  Engineer            MS

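This draws 40% of the rows of apprix_df at random (2 of its 5 rows) and prints the DataFrame formed from them. Setting random_state makes the draw reproducible, so the same rows are selected on every run.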
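sample() on its own returns only the drawn rows. To obtain a full split, the rows that were not drawn can be recovered by dropping the sampled index labels from the parent. The sketch below assumes the index labels are unique, as they are with the default index used here; the names sampled_df and remaining_df are illustrative.

```python
import pandas as pd

apprix_df = pd.DataFrame({
    'Name': ["Anish", "Rabindra", "Manish", "Samir", "Binam"],
    'Post': ["CEO", "CTO", "System Admin", "Consultant", "Engineer"],
    'Qualification': ["MBA", "MS", "MS", "PhD", "MS"]
})

# Draw 40% of the rows at random, reproducibly
sampled_df = apprix_df.sample(frac=0.4, random_state=60)

# drop() removes rows by index label, leaving the rows that were not drawn.
# This relies on the index labels being unique.
remaining_df = apprix_df.drop(sampled_df.index)

print("Sampled rows:")
print(sampled_df, "\n")
print("Remaining rows:")
print(remaining_df)
```

The two results share no rows and together cover every row of apprix_df, which is the usual shape of a random two-way split.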

Original: https://blog.csdn.net/qq_41932272/article/details/119322273
Author: redy_
Title: Splitting a Pandas DataFrame: three ways to divide a DataFrame into several smaller DataFrames
