Using loc and iloc in Pandas

Three ways to select data: loc, iloc, and direct indexing, applied to the rows and columns of DataFrames and Series

import pandas as pd
# Read the college dataset
college = pd.read_csv('data/college.csv', index_col='INSTNM')

iloc selects rows by integer position (the zero-based offset into the index)

Select the 61st row (position 60)
pd.options.display.max_rows = 6
college.iloc[60]
'''
CITY                  Anchorage
STABBR                       AK
HBCU                          0
                        ...

UG25ABV                  0.4386
MD_EARN_WNE_P10           42500
GRAD_DEBT_MDN_SUPP      19449.5
Name: University of Alaska Anchorage, Length: 26, dtype: object
'''

Select several non-contiguous rows
college.iloc[[60, 99, 3]] # returns the 61st, 100th, and 4th rows of the DataFrame, in that order

                                                CITY  STABBR  HBCU  MENONLY  WOMENONLY  RELAFFIL  SATVRMID  SATMTMID  DISTANCEONLY     UGDS  ...  \
INSTNM
University of Alaska Anchorage             Anchorage      AK   0.0      0.0        0.0         0       NaN       NaN           0.0  12865.0  ...
International Academy of Hair Design           Tempe      AZ   0.0      0.0        0.0         0       NaN       NaN           0.0    188.0  ...
University of Alabama in Huntsville       Huntsville      AL   0.0      0.0        0.0         0     595.0     590.0           0.0   5451.0  ...

                                      UGDS_2MOR  UGDS_NRA  UGDS_UNKN  PPTUG_EF  CURROPER  PCTPELL  PCTFLOAN  UG25ABV  MD_EARN_WNE_P10  GRAD_DEBT_MDN_SUPP
INSTNM
University of Alaska Anchorage           0.0980    0.0181     0.0457    0.4539         1   0.2385    0.2647   0.4386            42500             19449.5
International Academy of Hair Design     0.0160    0.0000     0.0638    0.0000         0   0.7185    0.7346   0.3905            22200               10556
University of Alabama in Huntsville      0.0172    0.0332     0.0350    0.2146         1   0.3072    0.4596   0.2640            45500               24097

3 rows × 26 columns

iloc can select a contiguous range with a slice
college.iloc[99:102] # selects positions 99, 100, and 101; the stop position 102 is excluded

                                                CITY  STABBR  HBCU  MENONLY  WOMENONLY  RELAFFIL  SATVRMID  SATMTMID  DISTANCEONLY     UGDS  ...  \
INSTNM
International Academy of Hair Design           Tempe      AZ   0.0      0.0        0.0         0       NaN       NaN           0.0    188.0  ...
GateWay Community College                    Phoenix      AZ   0.0      0.0        0.0         0       NaN       NaN           0.0   5211.0  ...
Mesa Community College                          Mesa      AZ   0.0      0.0        0.0         0       NaN       NaN           0.0  19055.0  ...

                                      UGDS_2MOR  UGDS_NRA  UGDS_UNKN  PPTUG_EF  CURROPER  PCTPELL  PCTFLOAN  UG25ABV  MD_EARN_WNE_P10  GRAD_DEBT_MDN_SUPP
INSTNM
International Academy of Hair Design     0.0160    0.0000     0.0638    0.0000         0   0.7185    0.7346   0.3905            22200               10556
GateWay Community College                0.0127    0.0161     0.0702    0.7465         1   0.3270    0.2189   0.5832            29800                7283
Mesa Community College                   0.0205    0.0257     0.0682    0.6457         1   0.3423    0.2207   0.4010            35200                8000

3 rows × 26 columns
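
The three iloc forms above (a single position, a list of positions, a slice) can be tried without the college file. The sketch below uses a small made-up DataFrame; the school names, cities, and enrollment numbers are hypothetical and only stand in for the real dataset.

```python
import pandas as pd

# Hypothetical toy frame standing in for the college dataset.
toy = pd.DataFrame(
    {
        "CITY": ["Springfield", "Riverton", "Lakeside", "Hillview", "Fairmont"],
        "UGDS": [4200.0, 11300.0, 290.0, 5400.0, 4800.0],
    },
    index=pd.Index(
        ["College A", "College B", "College C", "College D", "College E"],
        name="INSTNM",
    ),
)

third_row = toy.iloc[2]     # one position: a Series named after that row's label
picked = toy.iloc[[4, 0]]   # list of positions: a DataFrame, rows in the order given
block = toy.iloc[1:3]       # slice: positions 1 and 2, the stop position is excluded

print(third_row.name)
print(picked.index.tolist())
print(block.index.tolist())
```

Note that the list form returns rows in the order the positions were written, not in their original order.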

loc selects rows by label (the index values)

A row can also be selected by its label
college.loc['University of Alaska Anchorage']
'''
CITY                  Anchorage
STABBR                       AK
HBCU                          0
MENONLY                       0
WOMENONLY                     0
                        ...

PCTPELL                  0.2385
PCTFLOAN                 0.2647
UG25ABV                  0.4386
MD_EARN_WNE_P10           42500
GRAD_DEBT_MDN_SUPP      19449.5
Name: University of Alaska Anchorage, Length: 26, dtype: object
'''

Selecting with loc and a list

loc also accepts a list of labels
labels = ['University of Alaska Anchorage','International Academy of Hair Design','University of Alabama in Huntsville']
college.loc[labels]

                                                CITY  STABBR  HBCU  MENONLY  WOMENONLY  RELAFFIL  SATVRMID  SATMTMID  DISTANCEONLY     UGDS  ...  \
INSTNM
University of Alaska Anchorage             Anchorage      AK   0.0      0.0        0.0         0       NaN       NaN           0.0  12865.0  ...
International Academy of Hair Design           Tempe      AZ   0.0      0.0        0.0         0       NaN       NaN           0.0    188.0  ...
University of Alabama in Huntsville       Huntsville      AL   0.0      0.0        0.0         0     595.0     590.0           0.0   5451.0  ...

                                      UGDS_2MOR  UGDS_NRA  UGDS_UNKN  PPTUG_EF  CURROPER  PCTPELL  PCTFLOAN  UG25ABV  MD_EARN_WNE_P10  GRAD_DEBT_MDN_SUPP
INSTNM
University of Alaska Anchorage           0.0980    0.0181     0.0457    0.4539         1   0.2385    0.2647   0.4386            42500             19449.5
International Academy of Hair Design     0.0160    0.0000     0.0638    0.0000         0   0.7185    0.7346   0.3905            22200               10556
University of Alabama in Huntsville      0.0172    0.0332     0.0350    0.2146         1   0.3072    0.4596   0.2640            45500               24097

3 rows × 26 columns

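loc can select a contiguous range with a label slice start:stop

Unlike a positional slice, a label slice includes both the start and the stop label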
start = 'Amridge University'
stop = 'Athens State University'
college.loc[start:stop]

                                                CITY  STABBR  HBCU  MENONLY  WOMENONLY  RELAFFIL  SATVRMID  SATMTMID  DISTANCEONLY     UGDS  ...  \
INSTNM
Amridge University                        Montgomery      AL   0.0      0.0        0.0         1       NaN       NaN           1.0    291.0  ...
University of Alabama in Huntsville       Huntsville      AL   0.0      0.0        0.0         0     595.0     590.0           0.0   5451.0  ...
Alabama State University                  Montgomery      AL   1.0      0.0        0.0         0     425.0     430.0           0.0   4811.0  ...
The University of Alabama                 Tuscaloosa      AL   0.0      0.0        0.0         0     555.0     565.0           0.0  29851.0  ...
Central Alabama Community College     Alexander City      AL   0.0      0.0        0.0         0       NaN       NaN           0.0   1592.0  ...
Athens State University                       Athens      AL   0.0      0.0        0.0         0       NaN       NaN           0.0   2991.0  ...

                                      UGDS_2MOR  UGDS_NRA  UGDS_UNKN  PPTUG_EF  CURROPER  PCTPELL  PCTFLOAN  UG25ABV  MD_EARN_WNE_P10  GRAD_DEBT_MDN_SUPP
INSTNM
Amridge University                       0.0000    0.0000     0.2715    0.4536         1   0.6801    0.7795   0.8540            40100               23370
University of Alabama in Huntsville      0.0172    0.0332     0.0350    0.2146         1   0.3072    0.4596   0.2640            45500               24097
Alabama State University                 0.0098    0.0243     0.0137    0.0892         1   0.7347    0.7554   0.1270            26600             33118.5
The University of Alabama                0.0261    0.0268     0.0026    0.0844         1   0.2040    0.4010   0.0853            41900               23750
Central Alabama Community College        0.0000    0.0000     0.0019    0.3882         1   0.5892    0.3977   0.3153            27500               16127
Athens State University                  0.0174    0.0057     0.0334    0.5517         1   0.4088    0.6296   0.6410            39000               18595

6 rows × 26 columns
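
One difference is easy to miss in the output above: a loc label slice keeps the stop label, while an iloc positional slice drops the stop position. A minimal sketch on a made-up frame (the labels a to e and the column x are hypothetical):

```python
import pandas as pd

# Hypothetical toy frame with a unique string index.
letters = pd.DataFrame({"x": [10, 20, 30, 40, 50]}, index=["a", "b", "c", "d", "e"])

by_label = letters.loc["b":"d"]   # label slice: both endpoints are included
by_position = letters.iloc[1:3]   # positional slice: the stop position is excluded

print(by_label.index.tolist())
print(by_position.index.tolist())
```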

index.tolist() extracts the row index as a list

# index.tolist() returns the row labels as a list; each additional row selected adds one more label
college.iloc[[60, 49, 3]].index.tolist() # three rows selected
['University of Alaska Anchorage',
 'Snead State Community College',
 'University of Alabama in Huntsville']
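
A list produced this way can be passed straight back to loc. The sketch below assumes a unique index and uses a hypothetical toy frame (the college names and values are made up) to show that the round trip from positions to labels returns the same rows.

```python
import pandas as pd

# Hypothetical toy frame; the index labels are unique.
toy = pd.DataFrame(
    {"UGDS": [4200.0, 11300.0, 290.0, 5400.0]},
    index=pd.Index(["College A", "College B", "College C", "College D"], name="INSTNM"),
)

positions = [3, 0, 2]
labels = toy.iloc[positions].index.tolist()   # positions -> labels

by_position = toy.iloc[positions]
by_label = toy.loc[labels]                    # labels -> the same rows, same order

print(labels)
print(by_label.equals(by_position))
```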

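Selecting the first 3 rows and first 4 columns with iloc and with loc

Read the college dataset with INSTNM as the row index, then select the first 3 rows and the first 4 columns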
college = pd.read_csv('data/college.csv', index_col='INSTNM')
college.iloc[:3, :4]
college.loc[:'Amridge University', :'MENONLY']
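
The two calls agree because 'Amridge University' is the third row label and 'MENONLY' is the fourth column label, and label slices include their endpoint. A sketch of the same idea on a hypothetical 4x5 frame (the names r0..r3 and c0..c4 are made up), where the end labels are looked up from the positions instead of typed by hand:

```python
import pandas as pd

# Hypothetical toy frame: 4 rows, 5 columns.
grid = pd.DataFrame(
    [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15], [16, 17, 18, 19, 20]],
    index=["r0", "r1", "r2", "r3"],
    columns=["c0", "c1", "c2", "c3", "c4"],
)

by_position = grid.iloc[:3, :4]          # first 3 rows, first 4 columns

last_row_label = grid.index[2]           # label of the 3rd row
last_col_label = grid.columns[3]         # label of the 4th column
by_label = grid.loc[:last_row_label, :last_col_label]

print(by_position.shape)
print(by_position.equals(by_label))
```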

Select all rows of two columns

college.iloc[:, [4,6]].head()
college.loc[:, ['WOMENONLY', 'SATVRMID']].head()

                                     WOMENONLY  SATVRMID
INSTNM
Alabama A & M University                   0.0     424.0
University of Alabama at Birmingham        0.0     570.0
Amridge University                         0.0       NaN
University of Alabama in Huntsville        0.0     595.0
Alabama State University                   0.0     425.0

Select non-contiguous rows and columns
college.iloc[[100, 200], [7, 15]]

                                      SATMTMID  UGDS_NHPI
INSTNM
GateWay Community College               NaN      0.0029
American Baptist Seminary of the West   NaN      NaN

Select the same non-contiguous rows and columns with loc and lists of labels
rows = ['GateWay Community College', 'American Baptist Seminary of the West']
columns = ['SATMTMID', 'UGDS_NHPI']
college.loc[rows, columns]

                                      SATMTMID  UGDS_NHPI
INSTNM
GateWay Community College               NaN      0.0029
American Baptist Seminary of the West   NaN      NaN
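
When only the labels are known, the matching positions for iloc can be computed rather than counted by hand. The sketch below uses `Index.get_indexer` on a hypothetical toy frame (all names and values are made up) and checks that the loc and iloc selections agree.

```python
import pandas as pd

# Hypothetical toy frame with unique row and column labels.
grid = pd.DataFrame(
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]],
    index=["r0", "r1", "r2", "r3"],
    columns=["c0", "c1", "c2", "c3"],
)

rows = ["r1", "r3"]
columns = ["c0", "c2"]

by_label = grid.loc[rows, columns]

# Translate the labels into integer positions, then select with iloc.
row_positions = grid.index.get_indexer(rows)
col_positions = grid.columns.get_indexer(columns)
by_position = grid.iloc[row_positions, col_positions]

print(row_positions.tolist(), col_positions.tolist())
print(by_label.equals(by_position))
```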

Row slicing without loc or iloc

# Read the college dataset; from position 10 up to (not including) 20, take every second row
college = pd.read_csv('data/college.csv', index_col='INSTNM')
college[10:20:2]

                                                CITY  STABBR  HBCU  MENONLY  WOMENONLY  RELAFFIL  SATVRMID  SATMTMID  DISTANCEONLY     UGDS  ...  \
INSTNM
Birmingham Southern College               Birmingham      AL   0.0      0.0        0.0         1     560.0     560.0           0.0   1180.0  ...
Concordia College Alabama                      Selma      AL   1.0      0.0        0.0         1     420.0     400.0           0.0    322.0  ...
Enterprise State Community College        Enterprise      AL   0.0      0.0        0.0         0       NaN       NaN           0.0   1729.0  ...
Faulkner University                       Montgomery      AL   0.0      0.0        0.0         1       NaN       NaN           0.0   2367.0  ...
New Beginning College of Cosmetology     Albertville      AL   0.0      0.0        0.0         0       NaN       NaN           0.0    115.0  ...

                                      UGDS_2MOR  UGDS_NRA  UGDS_UNKN  PPTUG_EF  CURROPER  PCTPELL  PCTFLOAN  UG25ABV  MD_EARN_WNE_P10  GRAD_DEBT_MDN_SUPP
INSTNM
Birmingham Southern College              0.0051    0.0000     0.0051    0.0017         1   0.1920    0.4809   0.0152            44200               27000
Concordia College Alabama                0.0031    0.0466     0.0000    0.1056         1   0.8667    0.9333   0.2367            19900   PrivacySuppressed
Enterprise State Community College       0.0254    0.0012     0.0069    0.3823         1   0.4895    0.2263   0.3399            24600                8273
Faulkner University                      0.0173    0.0182     0.0258    0.2302         1   0.5812    0.7253   0.4589            37200               22000
New Beginning College of Cosmetology     0.0000    0.0000     0.0000    0.0783         1   0.8224    0.8553   0.3933              NaN                5500

5 rows × 26 columns

Slicing a Series: every second value from position 10 through 19

A Series can be sliced in the same way
city = college['CITY']
city[10:20:2]
'''
INSTNM
Birmingham Southern College              Birmingham
Concordia College Alabama                     Selma
Enterprise State Community College       Enterprise
Faulkner University                      Montgomery
New Beginning College of Cosmetology    Albertville
Name: CITY, dtype: object
'''

Look up the 4002nd row index label
college.index[4001]
#'Spokane Community College'
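
Indexing the index gives the label at a position; `Index.get_loc` goes the other way, from a label to its position. A minimal sketch on a hypothetical three-label index (for a unique index `get_loc` returns a plain integer):

```python
import pandas as pd

# Hypothetical index with unique labels.
names = pd.Index(["alpha", "beta", "gamma"], name="INSTNM")

label_at_1 = names[1]                     # position -> label
position_of_gamma = names.get_loc("gamma")  # label -> position

print(label_at_1, position_of_gamma)
```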

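Slicing a DataFrame by label

Both Series and DataFrames can be sliced by label. Here a DataFrame is sliced by label, taking every 1500th row between the two labels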
start = 'Mesa Community College'
stop = 'Spokane Community College'
college[start:stop:1500]

                                                CITY  STABBR  HBCU  MENONLY  WOMENONLY  RELAFFIL  SATVRMID  SATMTMID  DISTANCEONLY     UGDS  ...  \
INSTNM
Mesa Community College                          Mesa      AZ   0.0      0.0        0.0         0       NaN       NaN           0.0  19055.0  ...
Hair Academy Inc-New Carrollton       New Carrollton      MD   0.0      0.0        0.0         0       NaN       NaN           0.0    504.0  ...
National College of Natural Medicine        Portland      OR   0.0      0.0        0.0         0       NaN       NaN           0.0      NaN  ...

                                      UGDS_2MOR  UGDS_NRA  UGDS_UNKN  PPTUG_EF  CURROPER  PCTPELL  PCTFLOAN  UG25ABV  MD_EARN_WNE_P10  GRAD_DEBT_MDN_SUPP
INSTNM
Mesa Community College                   0.0205    0.0257     0.0682    0.6457         1   0.3423    0.2207   0.4010            35200                8000
Hair Academy Inc-New Carrollton          0.0000    0.0000     0.0000    0.4683         1   0.9756    1.0000   0.5882            15200                9666
National College of Natural Medicine        NaN       NaN        NaN       NaN         1      NaN       NaN      NaN              NaN   PrivacySuppressed

3 rows × 26 columns

Slicing a Series by label

Here the same label slice is applied to a Series
city[start:stop:1500]
'''
INSTNM
Mesa Community College                            Mesa
Hair Academy Inc-New Carrollton         New Carrollton
National College of Natural Medicine          Portland
Name: CITY, dtype: object
'''
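
The direct slices shown so far come in two flavors: integer slices are positional and exclude the stop, while label slices include the stop label when the step lands on it. The sketch below reproduces both on a hypothetical eight-row frame (the school_N labels and the CITY letters are made up).

```python
import pandas as pd

# Hypothetical toy frame with a unique string index.
toy = pd.DataFrame(
    {"CITY": list("ABCDEFGH")},
    index=[f"school_{i}" for i in range(8)],
)
city = toy["CITY"]

# Integer slices on a string-labelled index are positional: positions 2, 4, 6.
rows_by_position = toy[2:7:2]
values_by_position = city[2:7:2]

# Label slices with a step: school_1, school_3, school_5 (the stop label is kept).
rows_by_label = toy["school_1":"school_5":2]
values_by_label = city["school_1":"school_5":2]

print(rows_by_position.index.tolist())
print(values_by_label.tolist())
```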

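Direct slicing cannot be applied to columns. It works only on the rows of a DataFrame and on a Series, and it cannot select rows and columns at the same time.

The attempt below to slice rows and pick two columns in one step raises an error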
college[:10, ['CITY', 'STABBR']]
TypeError: '(slice(None, 10, None), ['CITY', 'STABBR'])' is an invalid key
A combined row and column selection has to go through .loc or .iloc
first_ten_instnm = college.index[:10]
college.loc[first_ten_instnm, ['CITY', 'STABBR']]

                                                            CITY  STABBR
INSTNM
A & W Healthcare Educators                           New Orleans      LA
A T Still University of Health Sciences               Kirksville      MO
ABC Beauty Academy                                       Garland      TX
ABC Beauty College Inc                               Arkadelphia      AR
AI Miami International University of Art and Design        Miami      FL
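
The failure and the two working alternatives can be reproduced on a small frame. This is a sketch on hypothetical data (made-up names and values); the exact exception class raised by the direct form differs between pandas versions, so the code catches a generic `Exception` rather than assuming `TypeError`.

```python
import pandas as pd

# Hypothetical toy frame standing in for the college dataset.
toy = pd.DataFrame(
    {
        "CITY": ["Springfield", "Riverton", "Lakeside", "Hillview"],
        "STABBR": ["AA", "BB", "CC", "DD"],
        "UGDS": [4200.0, 11300.0, 290.0, 5400.0],
    },
    index=pd.Index(["College A", "College B", "College C", "College D"], name="INSTNM"),
)
cols = ["CITY", "STABBR"]

# Direct indexing with a (row slice, column list) pair is rejected.
try:
    toy[:3, cols]
    direct_failed = False
except Exception:  # the exception class varies across pandas versions
    direct_failed = True

# Option 1: turn the positions into labels and use loc.
via_loc = toy.loc[toy.index[:3], cols]

# Option 2: turn the column labels into positions and use iloc.
via_iloc = toy.iloc[:3, toy.columns.get_indexer(cols)]

print(direct_failed)
print(via_loc.equals(via_iloc))
```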

Original: https://blog.csdn.net/weixin_48135624/article/details/113845102
Author: 缘 源 园
Title: Pandas中loc,iloc函数的用法

Original articles are protected by copyright. When reposting, please cite the source: https://www.johngo689.com/679169/
