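Exploring Variable Sharing Between Processes in Python Multiprocessing
Author: xrzs  Published: 2023-09-27 15:42:47
1. The problem:
Someone in a chat group posted the following code and asked why the list is empty when it is printed at the end.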
from multiprocessing import Process, Manager
import os

# Module-level setup like this relies on the fork start method.
manager = Manager()
vip_list = []
#vip_list = manager.list()

def testFunc(cc):
    vip_list.append(cc)
    print('process id: %d' % os.getpid())

if __name__ == '__main__':
    threads = []
    for ll in range(10):
        t = Process(target=testFunc, args=(ll,))
        t.daemon = True
        threads.append(t)
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("------------------------")
    print('process id: %d' % os.getpid())
    print(vip_list)
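If you know Python's threading model and the GIL, and you understand how threads and processes work, this question is not hard to answer. If you don't, that is fine too: run the code above and the problem shows itself.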
python aa.py
process id: 632
process id: 635
process id: 637
process id: 633
process id: 636
process id: 634
process id: 639
process id: 638
process id: 641
process id: 640
------------------------
process id: 619
[]
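Every child runs under its own process id and appends to its own copy of vip_list, so the list in the parent (process id 619) is still empty. Now uncomment the vip_list = manager.list() line and you will see the following: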
process id: 32074
process id: 32073
process id: 32072
process id: 32078
process id: 32076
process id: 32071
process id: 32077
process id: 32079
process id: 32075
process id: 32080
------------------------
process id: 32066
[3, 2, 1, 7, 5, 0, 6, 8, 4, 9]
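To see the two copies side by side, the child can report what its own list looks like. The following is a minimal sketch, not part of the original post: it assumes Python 3 on a POSIX system and asks for the fork start method explicitly so that it behaves like the example above. The names items, worker and run_demo are made up for illustration.

```python
import multiprocessing as mp

items = []  # plain module-level list; every process works on its own copy


def worker(value, conn):
    items.append(value)       # changes only this child's copy
    conn.send(list(items))    # report what the child sees
    conn.close()


def run_demo():
    # Assumption: fork is available (POSIX); not portable to Windows.
    ctx = mp.get_context("fork")
    parent_conn, child_conn = ctx.Pipe()
    p = ctx.Process(target=worker, args=(42, child_conn))
    p.start()
    child_view = parent_conn.recv()
    p.join()
    return child_view, list(items)


if __name__ == "__main__":
    child_view, parent_view = run_demo()
    print("child saw:", child_view)      # [42]
    print("parent sees:", parent_view)   # []
```

With manager.list() the object is not copied: each process holds a proxy, and append is forwarded to the single list that lives in the manager's server process.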
2. Ways to share variables between processes in Python:
(1) Shared memory:
Data can be stored in a shared memory map using Value or Array. For example, the following code, taken from
http://docs.python.org/2/library/multiprocessing.html#sharing-state-between-processes
from multiprocessing import Process, Value, Array

def f(n, a):
    n.value = 3.1415927
    for i in range(len(a)):
        a[i] = -a[i]

if __name__ == '__main__':
    num = Value('d', 0.0)        # 'd': double
    arr = Array('i', range(10))  # 'i': signed int

    p = Process(target=f, args=(num, arr))
    p.start()
    p.join()

    print(num.value)
    print(arr[:])
Result:
3.1415927
[0, -1, -2, -3, -4, -5, -6, -7, -8, -9]
(2) Server process:
A manager object returned by Manager() controls a server process which holds Python objects and allows other processes to manipulate them using proxies.
A manager returned by Manager() will support types list, dict, Namespace, Lock, RLock, Semaphore, BoundedSemaphore, Condition, Event, Queue, Value and Array.
For code, see the example at the beginning of this article.
http://docs.python.org/2/library/multiprocessing.html#managers
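The opening example only uses manager.list(). As a further illustration, here is a small sketch with manager.dict(), again assuming Python 3 and the fork start method; record and run_demo are hypothetical names, not from the original post. Each process writes a different key, so no lock is needed here.

```python
import multiprocessing as mp


def record(shared, key):
    # The assignment is forwarded to the dict held by the manager's server process.
    shared[key] = key * key


def run_demo():
    # Assumption: fork is available (POSIX).
    ctx = mp.get_context("fork")
    with ctx.Manager() as manager:
        squares = manager.dict()
        procs = [ctx.Process(target=record, args=(squares, n)) for n in range(5)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        # Copy the contents out before the manager shuts down.
        return dict(squares)


if __name__ == "__main__":
    print(run_demo())
```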
3. The problems with multiprocessing go well beyond this: data synchronization
Consider a short piece of code, a simple counter:
from multiprocessing import Process, Manager
import os

manager = Manager()
sum = manager.Value('tmp', 0)

def testFunc(cc):
    # A read followed by a write through the proxy; not atomic.
    sum.value += cc

if __name__ == '__main__':
    threads = []
    for ll in range(100):
        t = Process(target=testFunc, args=(1,))
        t.daemon = True
        threads.append(t)
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("------------------------")
    print('process id: %d' % os.getpid())
    print(sum.value)
Result:
------------------------
process id: 17378
97
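You may well ask: WTF? One hundred processes each added 1, yet the total is 97, because sum.value += cc reads the value and then writes it back, and another process can slip in between the two steps. This problem already existed in the multithreading era; the same tragedy simply replays itself with multiple processes. The fix is the same as well: Lock!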
from multiprocessing import Process, Manager, Lock
import os

lock = Lock()
manager = Manager()
sum = manager.Value('tmp', 0)

def testFunc(cc, lock):
    # Hold the lock for the whole read-modify-write so no update is lost.
    with lock:
        sum.value += cc

if __name__ == '__main__':
    threads = []
    for ll in range(100):
        t = Process(target=testFunc, args=(1, lock))
        t.daemon = True
        threads.append(t)
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("------------------------")
    print('process id: %d' % os.getpid())
    print(sum.value)
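What the lock changes can be replayed by hand, without starting any processes. The sketch below is only an illustration of one possible interleaving, written out as straight-line code; replay_unlocked and replay_locked are made-up names, and the local variables a and b stand for the values two processes A and B have read.

```python
def replay_unlocked():
    value = 0
    a = value        # process A reads 0
    b = value        # process B reads 0 before A has written
    value = a + 1    # A writes 1
    value = b + 1    # B writes 1 as well; A's update is lost
    return value


def replay_locked():
    value = 0
    a = value        # A takes the lock and reads 0
    value = a + 1    # A writes 1 and releases the lock
    b = value        # B takes the lock and reads 1
    value = b + 1    # B writes 2
    return value


if __name__ == "__main__":
    print(replay_unlocked())  # 1
    print(replay_locked())    # 2
```

Two increments that end at 1 instead of 2 are exactly the kind of loss that turned 100 into 97 in the run without the lock.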
How does the lock-based version perform? Run it and see, or increase the loop count and try again...
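One thing worth comparing against is a counter kept in shared memory rather than in a manager. Every access to a manager.Value goes through the server process, while a multiprocessing Value lives in shared memory and carries its own lock, reachable through get_lock(). The following is a sketch under the same assumptions as before (Python 3, fork start method); add_many and run_demo are illustrative names, and no timing figures are claimed here.

```python
import multiprocessing as mp


def add_many(counter, times):
    for _ in range(times):
        # get_lock() returns the lock that was created together with the Value.
        with counter.get_lock():
            counter.value += 1


def run_demo(workers=4, times=1000):
    # Assumption: fork is available (POSIX).
    ctx = mp.get_context("fork")
    counter = ctx.Value("i", 0)  # shared C int, no server process involved
    procs = [ctx.Process(target=add_many, args=(counter, times))
             for _ in range(workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return counter.value


if __name__ == "__main__":
    print(run_demo())  # 4000
```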
4. Final advice:
Note that usually sharing data between processes may not be the best choice, because of all the synchronization issues; an approach involving actors exchanging messages is usually seen as a better choice.
See also the Python documentation: "As mentioned above, when doing concurrent programming it is usually best to avoid using shared state as far as possible. This is particularly true when using multiple processes. However, if you really do need to use some shared data then multiprocessing provides a couple of ways of doing so."
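As a rough idea of what exchanging messages can look like for the counter, the workers below never touch a shared value. Each one puts its contribution on a Queue and the parent adds them up, so there is nothing to lock. This is a sketch assuming Python 3 and the fork start method; worker and run_demo are hypothetical names.

```python
import multiprocessing as mp


def worker(amount, results):
    # Send a message instead of mutating shared state.
    results.put(amount)


def run_demo(count=10):
    # Assumption: fork is available (POSIX).
    ctx = mp.get_context("fork")
    results = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(1, results)) for _ in range(count)]
    for p in procs:
        p.start()
    # Only the parent does the adding; read one message per worker before joining.
    total = sum(results.get() for _ in procs)
    for p in procs:
        p.join()
    return total


if __name__ == "__main__":
    print(run_demo())  # 10
```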
5. References:
http://stackoverflow.com/questions/14124588/python-multiprocessing-shared-memory
http://eli.thegreenplace.net/2012/01/04/shared-counter-with-pythons-multiprocessing/
http://docs.python.org/2/library/multiprocessing.html#multiprocessing.sharedctypes.synchronized