I originally published this article on a certain "sdn" site in 2016; this is a repost.
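Background
As hardware has improved, and multi-core processors and large clusters have become common, distributed computing has become a key way to speed up programs and handle large data sets. Parallel Python (PP for short) is a lightweight distributed computing framework that aims to simplify running Python code in parallel on SMP systems (multiple processors or cores) and on clusters. Tutorials on using PP for multiprocessing on a single machine are easy to find online, but its cluster mode is rarely explored in depth. This article shares my experience building a PP cluster on two physical machines (one quad-core, one dual-core) and running a distributed computation on it.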
Platform Setup and PP Cluster Deployment
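The first step in deploying a PP cluster is to install Python on every node that will take part in the computation, then install the PP library with pip install pp. After installation, the ppserver.py script appears in Python's scripts directory; it is used to start a listening port on a child node. For example, run the following on the child node:

    python ppserver.py -p 35000 -i 192.168.1.104 -s "123456"

Here -p specifies the listening port, -i is the local IP address, and -s is followed by the secret key that the master and child nodes must share, which secures the communication between them.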
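The master node creates the job server, a pp.Server instance. It must be given the list of child node addresses and the same secret key so that a secure channel can be established. The master uses dynamic load balancing to distribute jobs across the child nodes, making use of the multiple cores on every machine in the cluster.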
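The master-side setup described above can be sketched as follows. This is a minimal sketch rather than the official example: the helper names `build_ppservers` and `connect` are hypothetical, and the host, port, and secret are simply the values used in the ppserver.py command in this article. The `pp` import is deferred into `connect` so that the address-building helper can be tried without the library installed.

```python
from __future__ import print_function


def build_ppservers(hosts, port=35000):
    """Return the tuple of "host:port" strings passed to pp.Server as ppservers.

    Hypothetical helper for illustration; 35000 is the port used in this article.
    """
    return tuple("%s:%d" % (host, port) for host in hosts)


def connect(hosts, secret, ncpus=None):
    """Create a pp job server on the master that also uses the listed child nodes."""
    import pp  # deferred import: only needed when actually connecting

    ppservers = build_ppservers(hosts)
    if ncpus is None:
        # Let pp autodetect the number of local worker processes.
        return pp.Server(ppservers=ppservers, secret=secret)
    return pp.Server(ncpus, ppservers=ppservers, secret=secret)


if __name__ == "__main__":
    print(build_ppservers(["192.168.1.104"]))
    # With pp installed and ppserver.py already running on the child node:
    # job_server = connect(["192.168.1.104"], secret="123456")
    # print(job_server.get_active_nodes())
```

The secret passed to `connect` has to match the value given to `-s` on the child node; otherwise the child node should refuse the connection and all jobs would run locally.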
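Hands-On Example: Summing Primes
To test PP's performance in cluster mode, I used the official example sum_primes.py, whose main job is to compute the sum of all primes below a given integer. For this experiment I increased the size of the inputs and used the computing resources of both physical machines to observe PP's distributed computing capability.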
    #!/usr/bin/python
    # File: sum_primes.py
    # Author: Vitalii Vanovschi
    # Desc: This program demonstrates parallel computations with pp module
    # It calculates the sum of prime numbers below a given integer in parallel
    # Parallel Python Software: http://www.parallelpython.com
    # Note: this is Python 2 code (print statements, xrange).

    import math, sys, time, datetime
    import pp

    def isprime(n):
        """Returns True if n is prime and False otherwise"""
        if not isinstance(n, int):
            raise TypeError("argument passed to is_prime is not of 'int' type")
        if n < 2:
            return False
        if n == 2:
            return True
        max = int(math.ceil(math.sqrt(n)))
        i = 2
        while i <= max:
            if n % i == 0:
                return False
            i += 1
        return True

    def sum_primes(n):
        """Calculates sum of all primes below given integer n"""
        return sum([x for x in xrange(2,n) if isprime(x)])

    print """Usage: python sum_primes.py [ncpus]
        [ncpus] - the number of workers to run in parallel,
        if omitted it will be set to the number of processors in the system
    """

    # tuple of all parallel python servers to connect with
    #ppservers = ()
    ppservers = ("192.168.1.104:35000",)
    #ppservers=("*",)

    if len(sys.argv) > 1:
        ncpus = int(sys.argv[1])
        # Creates jobserver with ncpus workers
        job_server = pp.Server(ncpus, ppservers=ppservers, secret="123456")
    else:
        # Creates jobserver with automatically detected number of workers
        job_server = pp.Server(ppservers=ppservers, secret="123456")

    print "Starting pp with", job_server.get_ncpus(), "workers"

    # Submit a job of calculating sum_primes(100) for execution.
    # sum_primes - the function
    # (100,) - tuple with arguments for sum_primes
    # (isprime,) - tuple with functions on which function sum_primes depends
    # ("math",) - tuple with module names which must be imported before sum_primes execution
    # Execution starts as soon as one of the workers will become available
    job1 = job_server.submit(sum_primes, (100,), (isprime,), ("math",))

    # Retrieves the result calculated by job1
    # The value of job1() is the same as sum_primes(100)
    # If the job has not been finished yet, execution will wait here until result is available
    result = job1()

    print "Sum of primes below 100 is", result

    start_time = time.time()

    # The following submits 16 jobs and then retrieves the results
    inputs = (500000, 500100, 500200, 500300, 500400, 500500, 500600, 500700,
              500000, 500100, 500200, 500300, 500400, 500500, 500600, 500700)
    #inputs = (1000000, 1000100, 1000200, 1000300, 1000400, 1000500, 1000600, 1000700)
    jobs = [(input, job_server.submit(sum_primes,(input,), (isprime,), ("math",))) for input in inputs]
    for input, job in jobs:
        print datetime.datetime.now()
        print "Sum of primes below", input, "is", job()

    print "Time elapsed: ", time.time() - start_time, "s"
    job_server.print_stats()
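As a sanity check on the distributed results, the same sum can be computed serially on one machine. The sketch below uses a sieve of Eratosthenes instead of the trial division in `isprime`, so it is a different technique from the official example; the name `sum_primes_sieve` is hypothetical.

```python
def sum_primes_sieve(n):
    """Return the sum of all primes strictly below n, using a sieve of Eratosthenes."""
    if n <= 2:
        return 0  # there are no primes below 2
    # is_prime[k] == 1 means k has not been crossed out yet.
    is_prime = bytearray([1]) * n
    is_prime[0] = is_prime[1] = 0
    i = 2
    while i * i < n:
        if is_prime[i]:
            # Cross out every multiple of i, starting at i * i.
            for multiple in range(i * i, n, i):
                is_prime[multiple] = 0
        i += 1
    return sum(k for k in range(2, n) if is_prime[k])
```

Calling `sum_primes_sieve(100)` gives 1060, which matches what the script reports for `job1`; the larger inputs can be compared against the job results in the same way.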
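The results show that PP balanced the load across the two machines: of the 17 jobs, 6 ran on the child node (192.168.1.104) and 11 ran locally. The 16 large jobs finished in about 17.3 s, an estimated speedup of roughly 5.1x over running the same jobs one after another. In other words, PP's cluster mode made effective use of the multi-core CPUs on both computers and substantially improved throughput.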
    c:\Python27\python.exe test_pp_official.py
    Usage: python sum_primes.py [ncpus]
        [ncpus] - the number of workers to run in parallel,
        if omitted it will be set to the number of processors in the system

    Starting pp with 4 workers
    Sum of primes below 100 is 1060
    2016-08-28 19:07:26.579000
    Sum of primes below 500000 is 9914236195
    2016-08-28 19:07:33.032000
    Sum of primes below 500100 is 9917236483
    2016-08-28 19:07:33.035000
    Sum of primes below 500200 is 9922237979
    2016-08-28 19:07:33.296000
    Sum of primes below 500300 is 9926740220
    2016-08-28 19:07:33.552000
    Sum of primes below 500400 is 9930743046
    2016-08-28 19:07:33.821000
    Sum of primes below 500500 is 9934746636
    2016-08-28 19:07:34.061000
    Sum of primes below 500600 is 9938250425
    2016-08-28 19:07:37.199000
    Sum of primes below 500700 is 9941254397
    2016-08-28 19:07:37.202000
    Sum of primes below 500000 is 9914236195
    2016-08-28 19:07:41.640000
    Sum of primes below 500100 is 9917236483
    2016-08-28 19:07:41.742000
    Sum of primes below 500200 is 9922237979
    2016-08-28 19:07:41.746000
    Sum of primes below 500300 is 9926740220
    2016-08-28 19:07:41.749000
    Sum of primes below 500400 is 9930743046
    2016-08-28 19:07:41.752000
    Sum of primes below 500500 is 9934746636
    2016-08-28 19:07:41.756000
    Sum of primes below 500600 is 9938250425
    2016-08-28 19:07:43.846000
    Sum of primes below 500700 is 9941254397
    Time elapsed:  17.2770001888 s
    Job execution statistics:
     job count | % of all jobs | job time sum | time per job | job server
             6 |         35.29 |      27.4460 |     4.574333 | 192.168.1.104:35000
            11 |         64.71 |      60.2950 |     5.481364 | local
    Time elapsed since server creation 17.2849998474
    0 active tasks, 4 cores
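The 5.1x figure can be reproduced from the statistics table, assuming speedup is taken as the summed job time across all workers divided by the wall-clock time. That definition is an assumption inferred from the numbers, not something the pp output states, and the variable names are illustrative.

```python
from __future__ import print_function

# Values copied from the "Job execution statistics" output above (seconds).
job_time_sum = {"192.168.1.104:35000": 27.4460, "local": 60.2950}
elapsed = 17.2770001888  # "Time elapsed" printed by the script

total_job_time = sum(job_time_sum.values())
# Assumed definition: time the jobs would take back to back / wall-clock time.
speedup = total_job_time / elapsed

print("total job time: %.3f s" % total_job_time)
print("estimated speedup: %.1fx" % speedup)
```

Because the summed job time mixes timings from two different machines, this is only an estimate; a stricter measurement would rerun the script with an empty `ppservers` tuple and a single worker and compare the elapsed times directly.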
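Summary and Outlook
This experiment showed that PP performs well in cluster mode and is both flexible and efficient for distributed computing. PP simplifies writing parallel code, and it coped well with a small heterogeneous cluster of machines with different core counts. Going forward, I hope to explore how PP behaves on more complex workloads and larger clusters, and to look at its potential in areas such as scientific computing and data analysis.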
References
- Parallel Python Official Documentation: http://www.parallelpython.com
- Parallel Python Examples: http://www.parallelpython.com/examples.php
- Mandelbrot Set
- NumPy Documentation
- Matplotlib Documentation