Python Data Processing: Article Index

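This series uses Python to give readers the ability to process data and an appreciation of why data representation matters, and then shows how to present the characteristics of the data visually, in a form that is easier to understand.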

References

  • Python Documentation Contents, https://docs.python.org/zh-cn/3.7/contents.html
  • PEP 397 – Python launcher for Windows, https://www.python.org/dev/peps/pep-0397/
  • What Is the Python Launcher?, https://blog.csdn.net/wuShiJingZuo/article/details/103535381
  • Getting Started with Python in VS Code, https://code.visualstudio.com/docs/python/python-tutorial
  • Using Python environments in VS Code, https://code.visualstudio.com/docs/python/environments
  • Shallow Copy and Deep Copy, https://zhuanlan.zhihu.com/p/56741046
  • Several Methods for Finding the Greatest Common Divisor, https://so.html5.qq.com/page/real/search_news?docid=70000021_7675e858c2215914

2-02 Array Indexing and Slicing

References

  • NumPy Tutorial, https://www.runoob.com/numpy/numpy-tutorial.html
  • Python 3 Tutorial, https://www.runoob.com/python3/python3-tutorial.html
  • pandas documentation, https://pandas.pydata.org/pandas-docs/stable/index.html
  • Installing pandas, https://pandas.pydata.org/pandas-docs/stable/getting_started/install.html
  • Python Package Index, https://pypi.org/
  • Installing the OpenCV Library for Python: Process and Common Problems, https://www.cnblogs.com/BIXIABUMO/p/12440634.html
  • Links for opencv-python, https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple/opencv-python/

3-02 Data Loading

3-03 Data Cleaning and Merging

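Data preprocessing comprises data cleansing and feature engineering. This section covers data cleansing, whose main goal is to convert raw data into a tidy, well-organized form that the later feature engineering step can use. Data cleansing involves many kinds of work, for example:

  • Basic operations - selecting, filtering, and removing duplicates.
  • Sampling - based on an absolute count, a relative fraction, or a probability.
  • Data partitioning - splitting the dataset into training, validation, and test sets.
  • Binning - a technique for reducing the effect of minor observation errors; a common application is the histogram.
  • Transformations - such as normalization, standardization, scaling, and rotation.
  • Data replacement - cutting, splitting, and merging.
  • Imputation - replacing missing observations by means of statistical algorithms.
  • Weighting - attribute weighting.

This section introduces filtering, finding missing values, and removing duplicates from the basic operations, as well as cutting, splitting, and merging from data replacement.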
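The operations named above can be sketched with pandas as follows. This is only a minimal illustration, not the series' own code: the column names are borrowed from the Titanic dataset listed in the references, every value is made up, and the `tickets` table and the bin edges are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical data: column names follow the Titanic dataset,
# but every value here is invented for illustration.
passengers = pd.DataFrame({
    "PassengerId": [1, 2, 2, 3, 4],
    "Name": ["Smith, John", "Lee, Anna", "Lee, Anna", "Chen, Wei", "Diaz, Maria"],
    "Age": [22.0, 38.0, 38.0, None, 4.0],
    "Fare": [7.25, 71.28, 71.28, 7.92, 8.05],
})
# Hypothetical second table used to demonstrate merging.
tickets = pd.DataFrame({
    "PassengerId": [1, 2, 3],
    "Cabin": ["C85", "E46", "G6"],
})

# Filtering: keep only the rows that satisfy a boolean condition.
expensive = passengers[passengers["Fare"] > 8]

# Missing values: flag them with isna(), then count the flags.
missing_age_count = int(passengers["Age"].isna().sum())

# Duplicates: drop rows that repeat an earlier row exactly.
deduped = passengers.drop_duplicates().reset_index(drop=True)

# Cutting: bin a continuous column into labeled intervals
# (the bin edges are an arbitrary choice for this sketch).
deduped["AgeGroup"] = pd.cut(
    deduped["Age"], bins=[0, 18, 60, 120], labels=["child", "adult", "senior"]
)

# Splitting: break one text column into two new columns.
deduped[["LastName", "FirstName"]] = deduped["Name"].str.split(", ", expand=True)

# Merging: a left join keeps every passenger, even one without a ticket row.
merged = deduped.merge(tickets, on="PassengerId", how="left")

print(merged)
```

A left join is used here so that a passenger with no matching row in `tickets` is kept with a missing `Cabin` rather than dropped; an inner join would silently discard that row.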

References

  • Pandas Tutorial (Chinese), https://www.w3cschool.cn/hyspo/
  • Pandas cookbook, https://github.com/jvns/pandas-cookbook
  • pandas.read_csv Function Parameters Explained, https://zhuanlan.zhihu.com/p/129858983
  • Data Preprocessing vs. Data Wrangling in Machine Learning Projects, https://www.infoq.com/articles/ml-data-processing/
  • Data Preparation, https://rapidminer.com/products/studio/feature-list/#data_prep
  • Titanic: Machine Learning from Disaster, https://www.kaggle.com/c/titanic
  • DataFrame – pandas 1.1.4 documentation, https://pandas.pydata.org/pandas-docs/stable/reference/frame.html
  • Merge, join, concatenate and compare, https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html

4-03 Pandas

References

  • Pandas Visualization, https://pandas.pydata.org/pandas-docs/stable/user_guide/visualization.html
  • Matplotlib: Visualization with Python, https://matplotlib.org/
  • seaborn: statistical data visualization, http://seaborn.pydata.org/
  • Top 50 matplotlib Visualizations – The Master Plots, https://www.machinelearningplus.com/plots/top-50-matplotlib-visualizations-the-master-plots-python/
  • Datasets collected from R packages, https://github.com/selva86/datasets
  • Midwest demographics, https://ggplot2.tidyverse.org/reference/midwest.html#midwest-demographics
  • midwest: Midwest demographics, https://rdrr.io/github/SahaRahul/ggplot2/man/midwest.html
  • mtcars: mtcars, https://rdrr.io/github/matthewhirschey/bespokelearnr/man/mtcars.html
  • seaborn dataset, https://github.com/mwaskom/seaborn-data
  • seaborn: statistical data visualization, https://github.com/mwaskom/seaborn

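When a user finishes a Python project and hands the code over to someone else, three kinds of problems may arise:

  • Python interpreter: it may not be installed, or its version may differ.
  • Dependencies: the code relies on packages that must be available.
  • Operating system: the target may run Windows, macOS, or Linux.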
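The three points above can be checked on a target machine with a short standard-library script. The sketch below is an assumption-laden illustration rather than part of the original series: the function name `describe_environment` and the package names passed to it are hypothetical.

```python
import importlib.util
import platform


def describe_environment(required_packages):
    """Report the interpreter version, the OS name, and any missing packages.

    `required_packages` is a list of top-level import names; dotted
    submodule names are not handled by this sketch.
    """
    missing = [
        name for name in required_packages
        if importlib.util.find_spec(name) is None
    ]
    return {
        "python_version": platform.python_version(),
        "operating_system": platform.system(),
        "missing_packages": missing,
    }


if __name__ == "__main__":
    # "json" ships with Python; the second name is deliberately bogus.
    print(describe_environment(["json", "no_such_package_xyz_123"]))
```

Running this on both the developer's machine and the recipient's machine makes differences in interpreter version, operating system, or installed packages visible before the code is handed over.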

References

  • PEP 0 – Index of Python Enhancement Proposals (PEPs), https://www.python.org/dev/peps/
  • Virtual Environment, https://book.pythontips.com/en/latest/virtual_environment.html
  • Installing packages using pip and virtual environments, https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/
  • venv — Creation of virtual environments, https://docs.python.org/3/library/venv.html
  • PEP 405 – Python Virtual Environments, https://www.python.org/dev/peps/pep-0405/
  • PyInstaller, http://www.pyinstaller.org/
  • How to Install PyInstaller, https://pyinstaller.readthedocs.io/en/latest/installation.html
  • PyInstaller Manual, https://pyinstaller.readthedocs.io/en/stable/
  • GUI Applications, https://pythonguidecn.readthedocs.io/zh/latest/scenarios/gui.html
  • Usage – Matplotlib 2.0.2 documentation, https://matplotlib.org/faq/usage_faq.html
  • Packaging PyQt5 & PySide2 applications for Windows, with PyInstaller, https://www.learnpyqt.com/tutorials/packaging-pyqt5-pyside2-applications-windows-pyinstaller/
  • The Hitchhiker’s Guide to Python!, https://docs.python-guide.org/en/latest/
  • Installing Tk on Windows, https://tkdocs.com/tutorial/install.html#installwin
  • Freezing Your Code, https://docs.python-guide.org/shipping/freezing/
  • Install Docker Desktop on Windows, https://docs.docker.com/docker-for-windows/install/
  • python Docker Official Images, https://hub.docker.com/_/python?tab=tags&page=1&ordering=last_updated
  • Windows Subsystem for Linux Installation Guide (Windows 10), https://docs.microsoft.com/zh-cn/windows/wsl/install-win10#step-4—download-the-linux-kernel-update-package
  • The base command for the Docker CLI, https://docs.docker.com/engine/reference/commandline/docker/
  • Docker Command Reference, https://www.runoob.com/docker/docker-command-manual.html
  • [Day 15] Docker (1), https://ithelp.ithome.com.tw/articles/10206556
  • Is Python Development on Windows Too Painful? Take a Look at Docker, https://zhuanlan.zhihu.com/p/50864774
  • Day5: Writing Your First Dockerfile, https://ithelp.ithome.com.tw/articles/10191016
  • AWS Lambda, https://aws.amazon.com/cn/lambda/?nc1=h_ls
  • AWS Lambda Deployment Packages in Python, https://docs.aws.amazon.com/zh_cn/lambda/latest/dg/python-package.html#python-package-venv
  • Building Applications with Dependencies, https://docs.aws.amazon.com/zh_cn/serverless-application-model/latest/developerguide/serverless-sam-cli-using-build.html
  • Creating New AWS Lambda Layer For Python Pandas Library, https://medium.com/@qtangs/creating-new-aws-lambda-layer-for-python-pandas-library-348b126e9f3e

Original: https://blog.csdn.net/m0_50614038/article/details/124241822
Author: Yehchitsai
Title: Python Data Processing: Article Index
