目录
前言
做数据分析的小伙伴分析的数据经常来自于Excel,想实现自动化办公的小伙伴除了学习VBA编程,python更容易上手,也是不错的选择。不管如何,第一步首先是能够用python读取Excel文件中的数据,下面我来教你!
Excel的文件扩展名有两种:xlsx和xls。其中xls是excel 2007之前版本的使用的默认格式,xlsx是excel 2007之后的版本使用的默认格式,包括2007的版本。两种格式的文件,在python中的处理方式不同。我会分别介绍。
*说明:本文使用的是python3.7
————
从文件读数据
代码中用到的文件内容如下:
; 读取xls文件
使用的库:xlrd。xlrd最新的版本已不支持读取xlsx文件。
xlrd读取xls文件分为以下几步:
- 打开工作簿
- 获取指定的工作表sheet
- 读取行、读取列、读取单元格
打开工作簿
>>> import xlrd
>>> wb = xlrd.open_workbook("/Users/songzp/Downloads/python读Excel.xls")
>>> wb
<xlrd.book.Book object at 0x7fa32854df50>
如果我们打开的是xlsx文件,则会报错
>>> wb = xlrd.open_workbook("/Users/songzp/Downloads/python读Excel.xlsx")
Traceback (most recent call last):
File "", line 1, in <module>
File "/opt/anaconda3/envs/python3/lib/python3.7/site-packages/xlrd/__init__.py", line 170, in open_workbook
raise XLRDError(FILE_FORMAT_DESCRIPTIONS[file_format]+'; not supported')
xlrd.biffh.XLRDError: Excel xlsx file; not supported
获取工作表
>>> sheet_names = wb.sheet_names()
>>> sheet_names
['Sheet1']
>>> ws = wb.sheet_by_index(0)
>>> ws
Sheet 0:<Sheet1>
>>> ws = wb.sheet_by_name('Sheet1')
>>> ws
Sheet 0:<Sheet1>
读取行
>>> ws.nrows
6
>>> ws.row_values(0)
['姓名', '年龄', '身高', '体重', '性别']
>>> ws.row_values(1)
['张三', 25.0, 175.0, 70.0, '男']
>>> ws.row_values(1, 2, 5)
[175.0, 70.0, '男']
>>> for i in range(ws.nrows):
... print(ws.row_values(i))
...
['姓名', '年龄', '身高', '体重', '性别']
['张三', 25.0, 175.0, 70.0, '男']
['李四', 40.0, 180.0, 73.0, '男']
['杨过', 18.0, 181.0, 75.0, '男']
['小龙女', 28.0, 165.0, 58.0, '女']
['东方不败', 38.0, 173.0, 65.0, '都可']
读取列
>>> ws.ncols
5
>>> ws.col_values(0)
['姓名', '张三', '李四', '杨过', '小龙女', '东方不败']
>>> ws.col_values(1)
['年龄', 25.0, 40.0, 18.0, 28.0, 38.0]
>>> ws.col_values(1, 2, 5)
[40.0, 18.0, 28.0]
>>> for j in range(ws.ncols):
... print(ws.col_values(j))
...
['姓名', '张三', '李四', '杨过', '小龙女', '东方不败']
['年龄', 25.0, 40.0, 18.0, 28.0, 38.0]
['身高', 175.0, 180.0, 181.0, 165.0, 173.0]
['体重', 70.0, 73.0, 75.0, 58.0, 65.0]
['性别', '男', '男', '男', '女', '都可']
读取单元格
ws.cell_value(0,0)
'姓名'
ws.cell_value(0,2)
'身高'
读取xlsx文件
使用的库:openpyxl,openpyxl库支持xlsx格式的文件,不支持xls格式的文件
同样的流程
打开工作簿
>>> import openpyxl
>>> wb = openpyxl.load_workbook("/Users/songzp/Downloads/python读Excel.xlsx")
>>> wb
<openpyxl.workbook.workbook.Workbook object at 0x7fabd1813410>
获取工作表
>>> wb.sheetnames
['Sheet1']
>>> wb.worksheets
[<Worksheet "Sheet1">]
>>> wb.get_sheet_names()
<input>:1: DeprecationWarning: Call to deprecated function get_sheet_names (Use wb.sheetnames).
['Sheet1']
>>> wb.get_sheet_by_name('Sheet1')
<Worksheet "Sheet1">
<input>:1: DeprecationWarning: Call to deprecated function get_sheet_by_name (Use wb[sheetname]).
>>> wb['Sheet1']
<Worksheet "Sheet1">
读取行数据
>>> ws.max_row
6
>>> ws.rows
<generator object Worksheet._cells_by_row at 0x7fabd1851cd0>
>>> list(ws.rows)
[(<Cell 'Sheet1'.A1>, <Cell 'Sheet1'.B1>, <Cell 'Sheet1'.C1>, <Cell 'Sheet1'.D1>, <Cell 'Sheet1'.E1>), (<Cell 'Sheet1'.A2>, <Cell 'Sheet1'.B2>, <Cell 'Sheet1'.C2>, <Cell 'Sheet1'.D2>, <Cell 'Sheet1'.E2>), (<Cell 'Sheet1'.A3>, <Cell 'Sheet1'.B3>, <Cell 'Sheet1'.C3>, <Cell 'Sheet1'.D3>, <Cell 'Sheet1'.E3>), (<Cell 'Sheet1'.A4>, <Cell 'Sheet1'.B4>, <Cell 'Sheet1'.C4>, <Cell 'Sheet1'.D4>, <Cell 'Sheet1'.E4>), (<Cell 'Sheet1'.A5>, <Cell 'Sheet1'.B5>, <Cell 'Sheet1'.C5>, <Cell 'Sheet1'.D5>, <Cell 'Sheet1'.E5>), (<Cell 'Sheet1'.A6>, <Cell 'Sheet1'.B6>, <Cell 'Sheet1'.C6>, <Cell 'Sheet1'.D6>, <Cell 'Sheet1'.E6>)]
>>> for row in ws.rows:
... print([c.value for c in row])
...
['姓名', '年龄', '身高', '体重', '性别']
['张三', 25, 175, 70, '男']
['李四', 40, 180, 73, '男']
['杨过', 18, 181, 75, '男']
['小龙女', 28, 165, 58, '女']
['东方不败', 38, 173, 65, '都可']
读取列数据
>>> ws.max_column
5
>>> list(ws.columns)
[(<Cell 'Sheet1'.A1>, <Cell 'Sheet1'.A2>, <Cell 'Sheet1'.A3>, <Cell 'Sheet1'.A4>, <Cell 'Sheet1'.A5>, <Cell 'Sheet1'.A6>), (<Cell 'Sheet1'.B1>, <Cell 'Sheet1'.B2>, <Cell 'Sheet1'.B3>, <Cell 'Sheet1'.B4>, <Cell 'Sheet1'.B5>, <Cell 'Sheet1'.B6>), (<Cell 'Sheet1'.C1>, <Cell 'Sheet1'.C2>, <Cell 'Sheet1'.C3>, <Cell 'Sheet1'.C4>, <Cell 'Sheet1'.C5>, <Cell 'Sheet1'.C6>), (<Cell 'Sheet1'.D1>, <Cell 'Sheet1'.D2>, <Cell 'Sheet1'.D3>, <Cell 'Sheet1'.D4>, <Cell 'Sheet1'.D5>, <Cell 'Sheet1'.D6>), (<Cell 'Sheet1'.E1>, <Cell 'Sheet1'.E2>, <Cell 'Sheet1'.E3>, <Cell 'Sheet1'.E4>, <Cell 'Sheet1'.E5>, <Cell 'Sheet1'.E6>)]
>>> for col in ws.columns:
... print([c.value for c in col])
...
['姓名', '张三', '李四', '杨过', '小龙女', '东方不败']
['年龄', 25, 40, 18, 28, 38]
['身高', 175, 180, 181, 165, 173]
['体重', 70, 73, 75, 58, 65]
['性别', '男', '男', '男', '女', '都可']
读取单元格
与xlrd不同,openpyxl的索引从1开始
>>> ws.cell(1,1).value
'姓名'
写数据到文件
这一节介绍如何将以下数据写入Excel文件:
data = [{"province": "河南", "area_code": "410000", "population": 9883},
{"province": "广东", "area_code": "440000", "population": 12684},
{"province": "山东", "area_code": "370000", "population": 10169.99},
{"province": "江苏", "area_code": "320000", "population": 8505.4}
]
写数据到Excel文件包含以下四步:
- 新建Excel工作簿
- 新建一个工作表
- 将数据写入对应单元格
- 保存文件
使用的库:xlwt,xlwt可以写xls个xlsx格式的文件,但是xlwt最多只能写65536行数据
新建工作簿
>>> import xlwt
>>> wb = xlwt.Workbook()
>>> wb
<xlwt.Workbook.Workbook object at 0x7fabd18137d0>
新建工作表
>>> new_ws = wb.add_sheet('人口排名前四省')
>>> new_ws
<xlwt.Worksheet.Worksheet object at 0x7fabd10ee790>
将数据写入对应单元格
>>> data = [{"province": "河南", "area_code": "410000", "population": 9883},
... {"province": "广东", "area_code": "440000", "population": 12684},
... {"province": "山东", "area_code": "370000", "population": 10169.99},
... {"province": "江苏", "area_code": "320000", "population": 8505.4}
... ]
>>> title = ["province", "area_code", "population"]
>>> for i in range(len(title)):
... new_ws.write(0, i, title[i])
...
>>> start_row_num = 1
>>> for row in data:
... values = list(row.values())
... for i in range(len(values)):
... new_ws.write(start_row_num, i, values[i])
... start_row_num += 1
保存文件
>>> wb.save("/Users/songzp/Downloads/python写Excel.xlsx")
打开文件并看效果
; 其他方法
除了将值写入单个单元格,还可以进行合并单元格操作,设置单元格的格式等,具体的实现本文不再详解,有兴趣的可以查看官方文档。在这里给出合并单元格的示例。
xlwt文档链接:https://xlwt.readthedocs.io/en/latest/
>>> new_ws.write_merge(5, 6, 0, 1, "合并单元格")
总结
除了本文介绍的xlrd、xlwt、openpyxl库外,还有其他读写Excel文件的库,比如:xlsxwriter、xlutils。
其中xlrd只能读,xlwt只能写,openpyxl除了可以读Excel文件,还可以写数据到Excel文件。每个库支持的文件格式不同,可写的最大行数也不同。可以根据实际需要选择合适的库进行处理Excel文件。
更多的相关python库可查看:https://www.python-excel.org/
附:
微信关注公众号,获取最新文章
Original: https://blog.csdn.net/songszp/article/details/126203567
Author: 追光的鲲
Title: 如何用Python读写Excel文件
原创文章受到原创版权保护。转载请注明出处:https://www.johngo689.com/668444/
转载文章受原作者版权保护。转载请注明原作者出处!