Python pandas: Checking Whether a DataFrame Is Empty and Iterating Over a DataFrame

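1. pandas

pandas is a library built on NumPy, created for data analysis tasks. It brings together a number of libraries and standard data models, and provides the tools needed to work efficiently with large datasets, along with many functions and methods for handling data quickly and conveniently. It is one of the main reasons Python is a powerful and productive environment for data analysis.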

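2. Checking whether a DataFrame is empty with an if statement

Use DataFrame.empty in an if condition to check whether reading a file produced any data. If the returned DataFrame is empty and that case is not handled, later logic may fail or give wrong results.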

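import pandas as pd

# filename is the path of the CSV file to read.
# on_bad_lines='skip' requires pandas 1.3 or later; older versions used
# error_bad_lines=False, which was removed in pandas 2.0.
data = pd.read_csv(filename, skiprows=1, header=None, on_bad_lines='skip')

# Variant 1: test for the empty case first
if data.empty:
    print('empty')        # handle the empty case here
else:
    print('not empty')    # handle the non-empty case here

# Variant 2: test for the non-empty case first
if not data.empty:
    print('not empty')
else:
    print('empty')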
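As a self-contained sketch of the check above, the following reads CSV text from an in-memory buffer instead of a file. The sample data and the helper name `load_rows` are assumptions made for illustration. It also covers one case worth knowing: when there is nothing at all to parse, `pd.read_csv` raises `pandas.errors.EmptyDataError` instead of returning an empty DataFrame, so the sketch treats that case as empty too.

```python
import io

import pandas as pd
from pandas.errors import EmptyDataError


def load_rows(buffer):
    """Hypothetical helper: return a DataFrame, empty if there is no data."""
    try:
        return pd.read_csv(buffer)
    except EmptyDataError:
        # Nothing to parse at all (for example a zero-byte file).
        return pd.DataFrame()


header_only = load_rows(io.StringIO("a,b\n"))      # header but no rows
with_rows = load_rows(io.StringIO("a,b\n1,2\n"))   # one data row
no_content = load_rows(io.StringIO(""))            # nothing at all

for name, frame in [("header_only", header_only),
                    ("with_rows", with_rows),
                    ("no_content", no_content)]:
    if frame.empty:
        print(name, "is empty")
    else:
        print(name, "has", len(frame), "row(s)")
```

A DataFrame that has column names but no rows still counts as empty, because `empty` is true whenever either axis has length zero.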

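3. Selecting a value from a DataFrame column

Here a and b are column labels. Both forms return the first value of column b among the rows where column a equals 1.

Method 1: select column b, then filter the rows
dataframe[b][dataframe[a]==1].values[0]

Method 2: filter the rows, then select column b
dataframe[dataframe[a]==1][b].values[0]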
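Both expressions combine a boolean mask with a column lookup. A minimal sketch of the same selection on made-up data follows, also written with `.loc`, which applies the row filter and the column lookup in one step. The column names `a` and `b`, the sample values, and the helper `first_match` are hypothetical. Because `.values[0]` raises an IndexError when no row matches, the helper checks for that case first.

```python
import pandas as pd

df = pd.DataFrame({"a": [0, 1, 1], "b": [10, 20, 30]})

# Chained form from the text: filter the rows, pick the column, take the first value.
chained = df[df["a"] == 1]["b"].values[0]

# Equivalent single-step form with .loc.
matches = df.loc[df["a"] == 1, "b"]
single_step = matches.iloc[0]


def first_match(frame, key_col, key, value_col, default=None):
    """Hypothetical helper: first value_col entry where key_col == key."""
    hits = frame.loc[frame[key_col] == key, value_col]
    return default if hits.empty else hits.iloc[0]


print(chained, single_step)            # both are 20
print(first_match(df, "a", 5, "b"))    # no row has a == 5, so None
```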

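4. Iterating over a DataFrame by row and by column

A DataFrame is a two-dimensional, table-like structure. The row labels are stored in index and the column labels in columns. A DataFrame can be created as follows: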

import pandas as pd
import numpy as np

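The number of row labels and column labels must match the shape of the data. Passing only three column names for a 3 x 4 array raises a ValueError: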
>>> df = pd.DataFrame(np.arange(12).reshape(3, 4), index = ['row1', 'row2', 'row3'], columns=['col1', 'col2','col3'])
Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.6/site-packages/pandas/core/internals.py", line 4857, in create_block_manager_from_blocks
    placement=slice(0, len(axes[0])))]
  File "/root/miniconda3/lib/python3.6/site-packages/pandas/core/internals.py", line 3205, in make_block
    return klass(values, ndim=ndim, placement=placement)
  File "/root/miniconda3/lib/python3.6/site-packages/pandas/core/internals.py", line 125, in __init__
    '{mgr}'.format(val=len(self.values), mgr=len(self.mgr_locs)))
ValueError: Wrong number of items passed 4, placement implies 3

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/miniconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 379, in __init__
    copy=copy)
  File "/root/miniconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 536, in _init_ndarray
    return create_block_manager_from_blocks([values], [columns, index])
  File "/root/miniconda3/lib/python3.6/site-packages/pandas/core/internals.py", line 4866, in create_block_manager_from_blocks
    construction_error(tot_items, blocks[0].shape[1:], axes, e)
  File "/root/miniconda3/lib/python3.6/site-packages/pandas/core/internals.py", line 4843, in construction_error
    passed, implied))
ValueError: Shape of passed values is (4, 3), indices imply (3, 3)

>>> df = pd.DataFrame(np.arange(12).reshape(3, 4), index = ['row1', 'row2', 'row3'], columns=['col1', 'col2', 'col3', 'col4'])
>>>
>>> df
      col1  col2  col3  col4
row1     0     1     2     3
row2     4     5     6     7
row3     8     9    10    11

>>> df.index
Index(['row1', 'row2', 'row3'], dtype='object')
>>>
>>> df.columns
Index(['col1', 'col2', 'col3', 'col4'], dtype='object')
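One way to avoid the shape mismatch shown in the traceback is to derive the labels from the array itself. This is only a sketch; the `rowN` and `colN` naming simply follows the example above.

```python
import numpy as np
import pandas as pd

values = np.arange(12).reshape(3, 4)
n_rows, n_cols = values.shape

# Build the labels from the array's shape so they cannot disagree with it.
index = [f"row{i + 1}" for i in range(n_rows)]
columns = [f"col{j + 1}" for j in range(n_cols)]

df = pd.DataFrame(values, index=index, columns=columns)

print(df.shape)                 # (3, 4)
print(df.loc["row2", "col3"])   # 6
```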


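items() (named iteritems() before pandas 2.0, where it was removed): iterates by column, yielding each column as a (column name, Series) pair. Elements of the Series can be accessed by index label, for example row[0].

iterrows(): iterates by row, yielding each row as an (index, Series) pair. Elements can be accessed by column name with row[name].

itertuples(): iterates by row, yielding each row as a named tuple. Elements can be accessed with getattr(row, name). It is more efficient than iterrows().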

>>> import pandas as pd
>>>
>>> pdd = [{'c1':10, 'c2':100}, {'c1':11, 'c2':111}, {'c1':22, 'c2':222}]
>>>
>>> print(type(pdd))
<class 'list'>
>>>
>>> df = pd.DataFrame(pdd)
>>>
>>> print(df)
   c1   c2
0  10  100
1  11  111
2  22  222
>>> print(type(df))
<class 'pandas.core.frame.DataFrame'>
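Before going through each method in turn, this short sketch takes the first item from each iterator to show what it yields. It rebuilds the same sample DataFrame so that it runs on its own, and it uses `items()`, which is available in both old and new pandas versions.

```python
import pandas as pd

df = pd.DataFrame([{"c1": 10, "c2": 100},
                   {"c1": 11, "c2": 111},
                   {"c1": 22, "c2": 222}])

# items(): one (column name, Series) pair per column.
col_name, col = next(iter(df.items()))
print(col_name, type(col).__name__)        # c1 Series

# iterrows(): one (index label, Series) pair per row.
row_label, row = next(iter(df.iterrows()))
print(row_label, type(row).__name__)       # 0 Series

# itertuples(): one namedtuple per row; the first field is the index.
tup = next(iter(df.itertuples()))
print(tup._fields)                         # ('Index', 'c1', 'c2')
print(tup.c1, tup.c2)                      # 10 100
```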

Iterating by column with items() (iteritems() in pandas versions before 2.0):

# index is the column name
>>> for index, row in df.items():
...     print(index)
...

c1
c2

# row is one column (a Series); row[0] is the first row of that column
>>> for index, row in df.items():
...     print(row[0], row[1], row[2])
...

10 11 22
100 111 222

Iterating by row with iterrows():

# index is the row label
>>> for index, row in df.iterrows():
...     print(index)
...

0
1
2

# Access the elements of a row by column name
>>> for index, row in df.iterrows():
...     print(row['c1'], row['c2'])
...

10 100
11 111
22 222

Iterating by row with itertuples():

# getattr(row, 'name') returns the element of the row in column 'name'
>>> for row in df.itertuples():
...     print(getattr(row, 'c1'), getattr(row, 'c2'))
...

10 100
11 111
22 222
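The named tuples also allow plain attribute access (`row.c1`), and `itertuples()` accepts two optional parameters that change what each row looks like. The sketch below, on the same sample values, shows the default form, `index=False`, and `name=None`.

```python
import pandas as pd

df = pd.DataFrame({"c1": [10, 11, 22], "c2": [100, 111, 222]})

# Default: namedtuples with the index as the first field, named "Index".
with_index = [(row.Index, row.c1, row.c2) for row in df.itertuples()]

# index=False drops the index field from each namedtuple.
without_index = [tuple(row) for row in df.itertuples(index=False)]

# name=None yields plain tuples, so only positional access is available.
plain = list(df.itertuples(index=False, name=None))

print(with_index)
print(without_index)
print(plain)
```

With `index=False`, each entry holds only the column values, so `without_index` and `plain` contain the same values. Attribute access only works when the column names are valid Python identifiers; otherwise pandas falls back to positional field names.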


Original: https://blog.csdn.net/liveshow021_jxb/article/details/113062275
Author: liveshow021_jxb
Title: Python pandas判断DataFrame是否为空和DataFrame遍历

