python 将多个df写入xlsx文件的ExcelWriter

关于pandas.ExcelWriter用法的介绍,本文参考了python内置文档 Help on class ExcelWriter in module pandas.io.excel._base。通过查看ExcelWriter的py文件,可以看到该类的定义框架如下所示:

class ExcelWriter(builtins.object):
    def ExcelWriter(
        path: 'FilePathOrBuffer | ExcelWriter',
        engine=None,
        date_format=None,
        datetime_format=None,
        mode: 'str' = 'w',
        storage_options: 'StorageOptions' = None,
        if_sheet_exists: 'str | None' = None,
        engine_kwargs: 'dict | None' = None,
        **kwargs):

其中,最基本的应用是:

ExcelWriter(path, engine=None, date_format=None, datetime_format=None,mode='w')

以下首先对ExcelWriter的帮助文档进行整合:

ExcelWriter是一个用于将DataFrame对象写入Excel工作表的类。默认的处理是对xls文件使用xlwt方法,对xlsx文件使用openpyxl方法,对ods文件使用odf方法。可以参见DataFrame.to_excel的文档查看这三种方法的典型用法。ExcelWriter应当视为一个上下文管理器(context manager),否则,call close()函数进行保存并关闭任何已打开的文件的处理(handles)

Class for writing DataFrame objects into excel sheets. Default is to use xlwt for xls, openpyxl for xlsx, odf for ods. See DataFrame.to_excel for typical usage.

参数:

path : str or typing.BinaryIO
    Path to xls or xlsx or ods file.

engine : str (optional)
    Engine to use for writing.

    If None, defaults to .excel..writer.

date_format : str, default None
    Format string for dates written into Excel files.

    (e.g. 'YYYY-MM-DD')

datetime_format : str, default None
    Format string for datetime objects written into Excel files.

    (e.g. 'YYYY-MM-DD HH:MM:SS')

mode : {'w', 'a'}, default 'w'
    File mode to use (write or append).

    Append does not work with fsspec URLs.

storage_options : dict, optional
    Extra options that make sense for a particular storage connection,
    e.g. host, port, username, password, etc.,
    if using a URL that will be parsed by ,
    e.g., starting "s3://", "gcs://".

if_sheet_exists : {'error', 'new', 'replace'}, default 'error'
    How to behave when trying to write to a sheet that already exists
    (append mode only).
    * error: raise a ValueError.
    * new: Create a new sheet, with a name determined by the engine.
    * replace: Delete the contents of the sheet before writing to it.

engine_kwargs : dict, optional
    Keyword arguments to be passed into the engine.

**kwargs : dict, optional
    Keyword arguments to be passed into the engine.

    deprecated:: 1.3.0 Use engine_kwargs instead.

NOTE:

  1. parameter ‘engine’ can only be passed as a keyword argument
  2. deprecated:: 1.2.0 As the `xlwt
  3. For compatibility with CSV writers, ExcelWriter serializes lists and dicts to strings before writing. 这一点使得ExcelWriter的效率很差

例子:

Default usage:

>>> df = pd.DataFrame([["ABC", "XYZ"]], columns=["Foo", "Bar"])
>>> with ExcelWriter("path_to_file.xlsx") as writer:
...     df.to_excel(writer)

To write to separate sheets in a single file:

>>> df1 = pd.DataFrame([["AAA", "BBB"]], columns=["Spam", "Egg"])
>>> df2 = pd.DataFrame([["ABC", "XYZ"]], columns=["Foo", "Bar"])
>>> with ExcelWriter("path_to_file.xlsx") as writer:
...     df1.to_excel(writer, sheet_name="Sheet1")
...     df2.to_excel(writer, sheet_name="Sheet2")

You can set the date format or datetime format:

>>> from datetime import date, datetime
>>> df = pd.DataFrame(
...     [
...         [date(2014, 1, 31), date(1999, 9, 24)],
...         [datetime(1998, 5, 26, 23, 33, 4), datetime(2014, 2, 28, 13, 5, 13)],
...     ],
...     index=["Date", "Datetime"],
...     columns=["X", "Y"],
... )
>>> with ExcelWriter(
...     "path_to_file.xlsx",
...     date_format="YYYY-MM-DD",
...     datetime_format="YYYY-MM-DD HH:MM:SS"
... ) as writer:
...     df.to_excel(writer)

You can also append to an existing Excel file:

>>> with ExcelWriter("path_to_file.xlsx", mode="a", engine="openpyxl") as writer:
...     df.to_excel(writer, sheet_name="Sheet3")

You can store Excel file in RAM:

>>> import io
>>> df = pd.DataFrame([["ABC", "XYZ"]], columns=["Foo", "Bar"])
>>> buffer = io.BytesIO()
>>> with pd.ExcelWriter(buffer) as writer:
...     df.to_excel(writer)

You can pack Excel file into zip archive:

>>> import zipfile
>>> df = pd.DataFrame([["ABC", "XYZ"]], columns=["Foo", "Bar"])
>>> with zipfile.ZipFile("path_to_file.zip", "w") as zf:
...     with zf.open("filename.xlsx", "w") as buffer:
...         with pd.ExcelWriter(buffer) as writer:
...             df.to_excel(writer)

实例用法

实践中常常使用path,engine,mode作为应用ExcelWriter的主要参数。其中再次强调,mode: ‘w’ for write , ‘a’ for append.default ‘w’: write

使用上下文管理器这样就不用担心忘记excel_writer.close()

import pandas as pd
from openpyxl import load_workbook
import os

if os.path.exists(output_file):
    with pd.ExcelWriter(output_file, engine='openpyxl') as excel_writer:
        book = load_workbook(excel_writer.path)
        excel_writer.book = book
        result_df.to_excel(excel_writer, sheet_name=sheet_name)
    else:
        result_df.to_excel(output_file, sheet_name=sheet_name)

如果导出的数据中含有大量的URL,EXCEL单张表格中对URL是存在数量限制的,如果超过这个限制,将无法将超出范围的URL数据导出至EXCEL(该过程不会报错)。通过增加options参数将URL转化为字符(即,不自动将str转化为url),这样可以避免该情况。

writer = pd.ExcelWriter(path,options={'strings_to_urls': False})

Original: https://blog.csdn.net/yingxuanzhang/article/details/124865398
Author: NickTheRock
Title: python 将多个df写入xlsx文件的ExcelWriter

原创文章受到原创版权保护。转载请注明出处:https://www.johngo689.com/678195/

转载文章受原作者版权保护。转载请注明原作者出处!

(0)

大家都在看

亲爱的 Coder【最近整理,可免费获取】👉 最新必读书单  | 👏 面试题下载  | 🌎 免费的AI知识星球