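1. Creating a DataFrame
Intuitively, a DataFrame is a new data structure formed by putting many Series together. It looks much like an Excel sheet. Here is a quick way to create one.
import pandas as pd

marvel_data = [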
['Spider-Man', 'male', 1962],
['Captain America', 'male', 1941],
['Wolverine', 'male', 1974],
['Iron Man', 'male', 1963],
['Thor', 'male', 1963],
['Thing', 'male', 1961],
['Mister Fantastic', 'male', 1961],
]
pd.DataFrame(marvel_data)
                  0     1     2
0        Spider-Man  male  1962
1   Captain America  male  1941
2         Wolverine  male  1974
3          Iron Man  male  1963
4              Thor  male  1963
5             Thing  male  1961
6  Mister Fantastic  male  1961
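To make the "many Series put together" picture concrete, here is a minimal sketch that builds a small DataFrame from two Series. The variable names `names`, `years` and `heroes_df` are illustrative and do not come from the article.

```python
import pandas as pd

# Each column starts out as its own Series; both use the default 0..n-1 index.
names = pd.Series(['Spider-Man', 'Thor'])
years = pd.Series([1962, 1963])

# A dict of Series becomes a DataFrame; the dict keys become the column names.
heroes_df = pd.DataFrame({'name': names, 'first_appearance': years})

# Selecting a single column gives back a Series.
name_column = heroes_df['name']
print(type(name_column))
print(heroes_df.shape)
```

Passing column names at construction time this way avoids the default integer column labels shown above.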
2. Modifying the index
marvel_df = pd.DataFrame(marvel_data)
col_names = ['name', 'sex', 'first_appearance']
marvel_df.columns = col_names
               name   sex  first_appearance
0        Spider-Man  male              1962
1   Captain America  male              1941
2         Wolverine  male              1974
3          Iron Man  male              1963
4              Thor  male              1963
5             Thing  male              1961
6  Mister Fantastic  male              1961
If we want to use the superheroes' names as the index, a single line is enough:
marvel_df.index = marvel_df['name']
                              name   sex  first_appearance
name
Spider-Man              Spider-Man  male              1962
Captain America    Captain America  male              1941
Wolverine                Wolverine  male              1974
Iron Man                  Iron Man  male              1963
Thor                          Thor  male              1963
Thing                        Thing  male              1961
Mister Fantastic  Mister Fantastic  male              1961
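As the output shows, assigning to `.index` leaves `name` in place as an ordinary column as well. A possible alternative, sketched here on a two-row stand-in frame (`df`, `assigned` and `moved` are illustrative names), is `set_index`, which by default moves the column into the index instead of duplicating it.

```python
import pandas as pd

df = pd.DataFrame(
    [['Spider-Man', 'male', 1962], ['Thor', 'male', 1963]],
    columns=['name', 'sex', 'first_appearance'],
)

# Assigning to .index keeps "name" as a regular column too.
assigned = df.copy()
assigned.index = assigned['name']

# set_index() returns a new DataFrame and, by default, removes "name" from the columns.
moved = df.set_index('name')

print(list(assigned.columns))
print(list(moved.columns))
```

Which form is preferable depends on whether the names are still needed as a regular column afterwards.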
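3. Dropping rows and columns
drop discards a column or a row. The inplace parameter controls whether the original data is modified directly; the default is False, which leaves the original untouched and returns a new copy.
# Option 1: drop() returns a new DataFrame, so reassign the result
marvel_df = marvel_df.drop("sex", axis=1, inplace=False)
# Option 2 (instead of option 1; dropping "sex" twice raises KeyError):
# marvel_df.drop("sex", axis=1, inplace=True)
# Drop the row labeled "Thor"
marvel_df.drop("Thor", axis=0, inplace=True)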
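The axis argument for dropping rows and columns is very easy to mix up, so make sure you remember it. In NumPy, computing the sum or variance of each row uses axis=1, yet dropping an entire row uses axis=0. It is best to find a mnemonic that works for you.
My mnemonic: with axis=1 the operation is applied to every row, so removing that attribute from every row amounts to deleting the whole column.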
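The two conventions can be compared side by side. The sketch below uses a small made-up array and the illustrative labels `r0`, `r1`, `a`, `b`, `c`. It also shows the `index=` and `columns=` keywords of `drop`, which avoid the axis number altogether.

```python
import numpy as np
import pandas as pd

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

# axis=1 collapses the columns, giving one sum per row.
row_sums = arr.sum(axis=1)
# axis=0 collapses the rows, giving one sum per column.
col_sums = arr.sum(axis=0)

df = pd.DataFrame(arr, index=['r0', 'r1'], columns=['a', 'b', 'c'])

# In drop(), axis says which labels to search: 0 = row labels, 1 = column labels.
without_row = df.drop('r0', axis=0)
without_col = df.drop('a', axis=1)

# The index= / columns= keywords express the same thing without an axis number.
same_row = df.drop(index='r0')
same_col = df.drop(columns='a')
```

Read this way, axis=0 always refers to the row labels and axis=1 to the column labels: a reduction collapses along that axis, while drop looks up the label on it.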
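4. Locating data
# All rows of the "first_appearance" column
marvel_df.loc[:, "first_appearance"]
# Rows from "Spider-Man" through "Iron Man"; loc label slices include both endpoints
marvel_df.loc["Spider-Man": "Iron Man"]
Unlike loc, iloc locates elements by integer position: positions start at 0 and slices are half-open (the start is included, the end is excluded).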
marvel_df.iloc[:, 1]
marvel_df.iloc[0:4, :]
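The difference in slice endpoints is easy to check on a small frame. This is a minimal sketch using a four-row stand-in for the article's data; `df`, `by_label` and `by_position` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    {'first_appearance': [1962, 1941, 1974, 1963]},
    index=['Spider-Man', 'Captain America', 'Wolverine', 'Iron Man'],
)

# Label slice with loc: both endpoints are included.
by_label = df.loc['Spider-Man':'Wolverine']

# Position slice with iloc: the end position is excluded, like a Python list slice.
by_position = df.iloc[0:2]

print(len(by_label))
print(len(by_position))
```

Here the label slice returns three rows while the position slice `0:2` returns two.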
5. Modifying DataFrame elements
marvel_df.loc['Thor', 'first_appearance'] = 111111
marvel_df['years_since'] = 2022 - marvel_df['first_appearance']
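One detail worth knowing about assignment through `loc`: if the row label does not exist (for example because that row was dropped earlier), pandas adds a new row rather than raising an error. The following sketch illustrates this on a two-row stand-in frame; `df` and `reference_year` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    {'first_appearance': [1962, 1941]},
    index=['Spider-Man', 'Captain America'],
)

# Existing label: the cell is overwritten.
df.loc['Spider-Man', 'first_appearance'] = 1963

# Missing label: a new row labeled "Thor" is appended.
df.loc['Thor', 'first_appearance'] = 1963

# Arithmetic on a column is applied element by element.
reference_year = 2022  # the same constant the article uses
df['years_since'] = reference_year - df['first_appearance']
```

Columns not named in such an assignment are filled with NaN in the new row, so it is worth checking the index before assigning.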
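6. Masks
# Assumes marvel_df still has the "sex" column and the "Thor" row (that is, before the drops in section 3)
marvel_df['sex'] == 'female'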
'''
name
Spider-Man False
Captain America False
Wolverine False
Iron Man False
Thor False
Thing False
Mister Fantastic False
Name: sex, dtype: bool
'''
A small exercise: convert the male and female values in the sex column of marvel_df to 0 and 1, without using map or replace. Hint: True can be treated as 1 and False as 0.
marvel_df['sex'] = (marvel_df['sex'] == 'female').astype("int64")
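The article's data contains only male characters, so the result there is all zeros. To see both values, here is a minimal sketch on a made-up Series that includes a `female` entry; `sex`, `is_female` and `encoded` are illustrative names.

```python
import pandas as pd

sex = pd.Series(['male', 'female', 'male'], name='sex')

# The comparison yields a boolean Series (the mask).
is_female = sex == 'female'

# Casting booleans to integers maps False to 0 and True to 1.
encoded = is_female.astype('int64')

print(encoded.tolist())
```

Because True counts as 1, `is_female.sum()` also gives the number of matching rows.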
Logical operations
marvel_df['first_appearance'] > 1970
marvel_df[marvel_df['first_appearance'] > 1970]
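Several conditions can be combined into one mask. The sketch below, on a three-row stand-in frame with illustrative names, uses `&` (and) and `~` (not); `|` (or) works the same way. Each comparison needs its own parentheses because these operators bind more tightly than `>=` or `<`.

```python
import pandas as pd

df = pd.DataFrame(
    {'sex': ['male', 'male', 'male'], 'first_appearance': [1962, 1941, 1974]},
    index=['Spider-Man', 'Captain America', 'Wolverine'],
)

# Rows whose first appearance falls in the 1960s.
in_sixties = (df['first_appearance'] >= 1960) & (df['first_appearance'] < 1970)
sixties = df[in_sixties]

# ~ inverts the mask: every row outside the 1960s.
not_sixties = df[~in_sixties]
```

Python's `and`, `or` and `not` keywords do not work element by element on a Series, which is why the bitwise operators are used here.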
Original: https://blog.csdn.net/m0_57273156/article/details/123586563
Author: LoveData_
Title: Some Simple Operations on Pandas DataFrames