Python Study Notes (6): Threads

May 25, 2023, 12:29 AM • Python

1. Custom processes

Define a custom process class that inherits from Process and overrides its run method.

from multiprocessing import Process
import time
import os

class MyProcess(Process):
    def __init__(self, name):
        # __init__ is overridden here to accept a new parameter.
        # Process.__init__(self) must not be omitted; otherwise Python raises
        # AttributeError: 'XXXX' object has no attribute '_closed'
        Process.__init__(self)
        self.name = name

    def run(self):
        print("Child process (%s-%s) started" % (self.name, os.getpid()))
        time.sleep(3)
        print("Child process (%s-%s) finished" % (self.name, os.getpid()))

if __name__ == '__main__':
    print("Parent process started")
    p = MyProcess("Ail")
    # start() runs MyProcess.run() in the child process
    p.start()
    p.join()
    print("Parent process finished")

Output:
Parent process started
Child process (Ail-38512) started
Child process (Ail-38512) finished
Parent process finished

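2. Processes and threads

Multiprocessing suits CPU-bound work (many CPU instructions, such as scientific computing or high-precision floating-point arithmetic).

Multithreading suits I/O-bound work (mostly reading and writing data, such as web crawlers, file uploads, and downloads).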
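As a rough illustration of the I/O-bound case, the sketch below uses time.sleep as a stand-in for a network or disk wait. The helper names fake_download, run_sequential, and run_threaded are hypothetical and do not come from the original notes. It compares running four waits one after another with running them in four threads.

```python
import time
from threading import Thread

def fake_download(delay):
    # Hypothetical stand-in for an I/O wait such as a network request.
    time.sleep(delay)

def run_sequential(n, delay):
    # Perform n waits one after another and return the elapsed time.
    start = time.perf_counter()
    for _ in range(n):
        fake_download(delay)
    return time.perf_counter() - start

def run_threaded(n, delay):
    # Perform n waits in n threads and return the elapsed time.
    start = time.perf_counter()
    threads = [Thread(target=fake_download, args=(delay,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

if __name__ == '__main__':
    print('sequential: %.2fs' % run_sequential(4, 0.2))
    print('threaded:   %.2fs' % run_threaded(4, 0.2))
```

While a thread is sleeping (or waiting on real I/O) it does not need the CPU, so the waits overlap and the threaded run should take roughly one delay instead of four.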

Threads run concurrently, while processes can run in parallel. Processes are independent of one another, and a process is the smallest unit to which the system allocates resources; all threads in the same process share that process's resources.

Process: a program or piece of code that is running is a process; code that is not running is just a program. A process is the smallest unit of resource allocation in the system and has its own memory space, so data is not shared between processes and processes carry a high overhead.

A process is a program in execution. Each process has its own address space, memory, data stack, and other auxiliary data used to track its execution. The operating system manages the execution of all processes running on it and divides execution time among them.

Thread: the smallest unit of scheduling and execution, also called an execution path. A thread cannot exist on its own; it depends on a process. Every process has at least one thread, called the main thread, and multiple threads share memory (shared data and global variables), which can improve a program's efficiency.
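A minimal sketch of threads sharing memory: every worker thread appends to the same module-level list, and the main thread sees all of the results afterwards. The names shared, record, and run_workers are made up for this illustration.

```python
from threading import Thread

shared = []  # module-level object, visible to every thread in this process

def record(tag):
    # Each worker appends to the same list object.
    shared.append(tag)

def run_workers(count):
    shared.clear()
    threads = [Thread(target=record, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Sort because the order in which threads run is not guaranteed.
    return sorted(shared)

if __name__ == '__main__':
    print(run_workers(5))  # prints [0, 1, 2, 3, 4]
```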

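A thread is the smallest unit that the operating system can schedule. It is contained in a process and is the unit that actually does the work. A thread is a single sequential flow of control within a process; a process can run several threads concurrently, each performing a different task. A thread is an execution context, that is, the sequence of instructions a CPU needs in order to execute.

Main thread: the first thread created in a process, that is, the thread that runs the main function.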

Coroutine: a lightweight user-space thread whose scheduling is controlled by the user program. It has its own register context and stack; switching between coroutines involves essentially no kernel context-switch overhead and is flexible.
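The original notes do not show a coroutine, so the following is only a sketch using the standard asyncio module. The names worker and main are hypothetical. Each coroutine hands control back voluntarily with await, which is the user-controlled switching described above.

```python
import asyncio

async def worker(name, log):
    for step in range(2):
        log.append((name, step))
        # Explicitly yield control back to the event loop.
        await asyncio.sleep(0)

async def main():
    log = []
    # Run both coroutines on one thread; they switch only at await points.
    await asyncio.gather(worker('a', log), worker('b', log))
    return log

if __name__ == '__main__':
    print(asyncio.run(main()))
```

Both workers run in a single thread and take turns at each await, so no kernel-level thread switch is involved.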

[Figure: the relationship between processes and threads]

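3. Multithreading

The operating system schedules threads by giving each one a time slice (a span of CPU time). When the CPU finishes one thread's time slice it quickly switches to the next thread; the slices are so short and the switches so fast that the user does not notice. Threads take turns on the CPU according to their time slices. Most computers today have multi-core CPUs, so under the operating system's scheduling several threads can run in parallel on several cores, which greatly improves execution speed and CPU utilization. The vast majority of mainstream programming languages support multithreading well. Python (specifically CPython), however, cannot achieve true multithreaded parallelism because of the GIL: only one thread executes Python bytecode at a time.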

[Figure: threads in memory]

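4. Thread class methods

(1) start(): starts the thread's activity.

(2) run(): the method that defines what the thread does (developers may override it in a subclass). The standard run() method invokes the callable object passed to the constructor as the target argument, if any, with the positional and keyword arguments taken from args and kwargs.

(3) join(timeout=None): blocks the caller until the started thread terminates, unless a timeout (in seconds) is given. Because join() always returns None, call is_alive() after join() to find out whether a timeout occurred; if the thread is still alive, the join() timed out. A thread can be joined many times. join() raises RuntimeError if the current thread tries to join itself, since that would cause a deadlock. Calling join() on a thread that has not been started raises the same exception.

(4) is_alive(): returns a Boolean indicating whether the thread is alive. It returns True from just before run() starts until just after run() finishes.

(5) threading.current_thread(): a module-level function that returns the Thread object for the caller's thread of control. For example, current_thread().name gives the current thread's name.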
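The sketch below exercises the calls listed above: target, args, and kwargs are passed to the constructor, join() is tried before start(), and is_alive() is used to detect a join() timeout. The function names greet and demo are hypothetical.

```python
import time
from threading import Thread, current_thread

def greet(greeting, name='world'):
    # Runs in the new thread; current_thread() identifies that thread.
    print('%s, %s (from %s)' % (greeting, name, current_thread().name))
    time.sleep(0.3)

def demo():
    t = Thread(target=greet, args=('hello',), kwargs={'name': 'Ail'},
               name='worker-1')
    try:
        t.join()  # joining a thread that has not been started
        join_error = False
    except RuntimeError:
        join_error = True
    t.start()
    t.join(timeout=0.05)      # returns None whether or not it timed out
    timed_out = t.is_alive()  # still alive, so the join timed out
    t.join()                  # now wait without a timeout
    return join_error, timed_out, t.is_alive()

if __name__ == '__main__':
    print(demo())  # prints (True, True, False) after the greeting line
```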

5. A small multithreading and multiprocessing example

from threading import Thread
from multiprocessing import Process
import os

def work():
    print('hello,', os.getpid())

if __name__ == '__main__':
    # Start several threads in the main process; every thread has the same pid as the main process
    t1 = Thread(target=work)  # create the first thread
    t2 = Thread(target=work)  # create the second thread
    # start() must be called at most once per thread object. It arranges for the
    # object's run() method to be invoked in a separate thread of control, and it
    # raises RuntimeError if called more than once on the same thread object.
    t1.start()
    t2.start()
    print('main thread/main process pid', os.getpid())

    # Start several processes; every process has a different pid
    p1 = Process(target=work)
    p2 = Process(target=work)
    p1.start()
    p2.start()
    print('main thread/main process pid', os.getpid())

Source: https://cloud.tencent.com/developer/article/1175618

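6. Thread life cycle

A thread's states are: created, ready, running, blocked, and terminated.

(1) Creating the object initializes the Thread internally.

(2) After start() is called, the thread enters the queue and waits to run; until it obtains CPU and memory resources it is in the ready state. Once it is scheduled it enters the running state, and if it hits a sleep it enters the blocked state.

(3) When the thread's code finishes normally or raises an exception, the thread terminates.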
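The states above can be observed roughly through is_alive(). In this sketch (the function name lifecycle is hypothetical) a threading.Event keeps the worker blocked until the main thread releases it. Note that is_alive() only distinguishes "started and not yet finished" from everything else; it cannot tell ready, running, and blocked apart.

```python
import threading

def lifecycle():
    gate = threading.Event()
    states = []
    # The worker simply blocks until the gate is set.
    t = threading.Thread(target=gate.wait)
    states.append(t.is_alive())  # created but not started
    t.start()
    states.append(t.is_alive())  # started; blocked inside gate.wait()
    gate.set()                   # unblock the worker so it can finish
    t.join()
    states.append(t.is_alive())  # terminated
    return states

if __name__ == '__main__':
    print(lifecycle())  # prints [False, True, False]
```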

7. Custom threads

(1) Define a class that inherits from Thread.

(2) Override __init__ and run().

(3) Create an instance of the thread class.

(4) Start the thread.

import time
import threading

class MyThread(threading.Thread):
    def __init__(self, num):
        super().__init__()  # or threading.Thread.__init__(self)
        self.num = num

    def run(self):
        print('Thread name:', threading.current_thread().name, 'argument:', self.num, 'start time:', time.strftime('%Y-%m-%d %H:%M:%S'))

if __name__ == '__main__':
    print('Main thread started:', time.strftime('%Y-%m-%d %H:%M:%S'))
    t1 = MyThread(1)
    t2 = MyThread(2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    print('Main thread finished:', time.strftime('%Y-%m-%d %H:%M:%S'))

8. Shared data between threads and the GIL (global interpreter lock)

Global variables are shared by every thread.

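The GIL can be pictured with basketball games. Treat a basketball court as a CPU and a game as a thread. With only one court, multiple games must queue up, which resembles a simple single-core multithreaded program; with several courts, multiple games can be played at the same time, which resembles a simple multi-core multithreaded program. Python, however, has a special rule: every game must be supervised by a referee, and there is only one referee. So no matter how many courts you have, only one court can host a game at any given time; the other courts sit idle and the other games have to wait.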
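One way to see the effect is to time a pure-Python, CPU-bound loop run twice in a row and then in two threads. This is only a sketch with hypothetical names (count_down, sequential, threaded, timed); the timings depend on the machine and interpreter, so no expected numbers are given. On a standard CPython build with the GIL, the threaded run is typically no faster than the sequential one.

```python
import time
from threading import Thread

def count_down(n):
    # Pure-Python CPU-bound loop; the thread holds the GIL while it runs.
    while n > 0:
        n -= 1

def sequential(n):
    count_down(n)
    count_down(n)

def threaded(n):
    threads = [Thread(target=count_down, args=(n,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

def timed(fn, n):
    # Return how long fn(n) takes, in seconds.
    start = time.perf_counter()
    fn(n)
    return time.perf_counter() - start

if __name__ == '__main__':
    n = 5_000_000
    print('sequential: %.2fs' % timed(sequential, n))
    print('threaded:   %.2fs' % timed(threaded, n))
```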

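9. GIL and Lock

The GIL ensures that, although a process may have many threads, only one of them executes at any given time. A lock protects shared data: only one thread at a time may modify the shared data.

The class is threading.Lock.

It has two basic methods, acquire() and release().

When the state is unlocked, acquire() changes the state to locked and returns immediately. When the state is locked, acquire() blocks until a call to release() in another thread changes it to unlocked; the acquire() call then resets it to locked and returns.

release() should only be called in the locked state; it changes the state to unlocked and returns immediately. Attempting to release an unlocked lock raises RuntimeError.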
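The state changes described above can be checked in a single thread. This sketch (the function name lock_states is hypothetical) uses acquire(blocking=False), which returns False immediately instead of blocking when the lock is already held, and also shows the with form, which acquires on entry and releases on exit.

```python
from threading import Lock

def lock_states():
    lock = Lock()
    first = lock.acquire()                 # unlocked -> locked, returns True
    second = lock.acquire(blocking=False)  # already locked: returns False at once
    lock.release()                         # locked -> unlocked
    try:
        lock.release()                     # releasing an unlocked lock
        raised = False
    except RuntimeError:
        raised = True
    with lock:                             # acquire on entry, release on exit
        held = lock.locked()
    return first, second, raised, held, lock.locked()

if __name__ == '__main__':
    print(lock_states())  # prints (True, False, True, True, False)
```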

An example:

from threading import Thread
from threading import Lock

number = 0

def task(lock):
    global number
    lock.acquire()  # acquire the lock
    for i in range(100000):
        number += 1
    lock.release()  # release the lock

if __name__ == '__main__':
    lock = Lock()
    t1 = Thread(target=task, args=(lock,))
    t2 = Thread(target=task, args=(lock,))
    t3 = Thread(target=task, args=(lock,))

    t1.start()
    t2.start()
    t3.start()
    t1.join()
    t2.join()
    t3.join()

    print('number:', number)

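10. Semaphores

class threading.Semaphore(value=1)
value is the initial value of the internal counter and defaults to 1; a value less than 0 raises ValueError. A semaphore can be used to limit how many threads run concurrently.

How a semaphore works: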

s=Semaphore(?)

Internally there is a counter whose value is the number of threads that may hold the semaphore at the same time. Each s.acquire() decrements the counter by 1 and each s.release() increments it by 1; when the counter is 0, other threads that call acquire() wait.

A program uses a semaphore to cap the number of threads running at any moment, which helps prevent crashes or other failures caused by running too many threads at once.

Case

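import time
import threading

s = threading.Semaphore(5)    # create a semaphore with a counter of 5

def task():
    s.acquire()    # acquire the semaphore (counter - 1)
    time.sleep(2)    # sleep for 2 seconds
    print("The task run at ", time.ctime())
    s.release()    # release the semaphore (counter + 1)

for i in range(40):
    t1 = threading.Thread(target=task, args=())    # create a thread
    t1.start()    # start the thread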

The with statement can replace acquire() and release(). The code above then becomes:

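import time
import threading

s = threading.Semaphore(5)    # create a semaphore with a counter of 5

def task():
    with s:    # similar to the with statement used to open files
        # s.acquire() is called on entry
        time.sleep(2)    # sleep for 2 seconds
        print("The task run at ", time.ctime())
        # s.release() is called on exit

for i in range(40):
    t1 = threading.Thread(target=task, args=())    # create a thread
    t1.start()    # start the thread

Using with is recommended.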
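To check that a semaphore really caps concurrency, the sketch below records the largest number of tasks that were inside the with block at the same time. The function peak_concurrency and the extra Lock that guards the counters are assumptions added for this illustration.

```python
import time
import threading

def peak_concurrency(limit, workers):
    sem = threading.Semaphore(limit)
    guard = threading.Lock()  # protects the two counters below
    active = 0
    peak = 0

    def task():
        nonlocal active, peak
        with sem:  # at most `limit` threads get past this line at once
            with guard:
                active += 1
                peak = max(peak, active)
            time.sleep(0.05)  # simulated work
            with guard:
                active -= 1

    threads = [threading.Thread(target=task) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak

if __name__ == '__main__':
    print('peak concurrent tasks:', peak_concurrency(3, 10))
```

The returned peak should never exceed the limit passed to Semaphore, however many worker threads are started.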

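11. Getting a thread's return value

Multithreading in Python usually relies on the threading module, but threading has one limitation: a Thread does not hand back the return value of the function it runs.
One solution is to define a custom class that inherits from Thread and overrides run(): run() calls the function and stores its return value in result, and get_result() is then called to fetch each thread's return value.

Sample code: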

import threading

def is_even(value):
    if value % 2 == 0:
        return True
    else:
        return False

class MyThread(threading.Thread):
    def __init__(self, func, args=()):
        super(MyThread, self).__init__()
        self.func = func
        self.args = args

    def run(self):
        # Run the function and store its return value in self.result,
        # to be fetched later through get_result().
        self.result = self.func(*self.args)

    def get_result(self):
        try:
            return self.result
        except Exception:
            # self.result does not exist if run() has not completed
            return None

result = []
threads = []
for i in range(10):
    t = MyThread(is_even, args=(i,))
    t.start()
    threads.append(t)
for t in threads:
    t.join()  # always join: wait for the thread to finish before the main thread continues
    result.append(t.get_result())

print(result)
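As an alternative to subclassing Thread (a different technique from the one in the notes above), the standard library's concurrent.futures.ThreadPoolExecutor collects return values directly. A minimal sketch follows; the function name collect is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def is_even(value):
    return value % 2 == 0

def collect(values):
    # The executor runs is_even in worker threads; map() yields the
    # return values in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(is_even, values))

if __name__ == '__main__':
    print(collect(range(10)))
```

Leaving the with block waits for all submitted work to finish, so no explicit join() is needed.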

References

1. Using the multiprocessing module in Python 3

https://www.jianshu.com/p/a5f10c152c20

2. python3: the threading module (threads)

https://cloud.tencent.com/developer/article/1175618

3. threading: thread-based parallelism

https://docs.python.org/zh-cn/3.7/library/threading.html

4. Python concurrent programming with threads: GIL and Lock

https://www.cnblogs.com/mingerlcm/p/9026090.html

5. Several ways to get return values from Python threads

https://wenku.baidu.com/view/2734833bc181e53a580216fc700abb68a982addd.html

Original: https://www.cnblogs.com/xuliuzai/p/15488546.html
Author: 东山絮柳仔
Title: Python 学习笔记（六）–线程

