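Running request sessions in parallel in Python

I am trying to open multiple web sessions and save the data as CSV. I wrote my code with a for loop and requests.get, but visiting 90 web locations takes a long time. Can anyone tell me how to run the whole process in parallel over loc_var?

The code works fine; the only problem is that it handles the loc_var entries one at a time, which takes a long time.

I want to access all the loc_var URLs from the for loop in parallel and write the CSV files.

Here is the code: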

import pandas as pd
import numpy as np
import os
import requests
import datetime
import zipfile
t=datetime.date.today()-datetime.timedelta(2)
server = [("A","web1",":5000","username=usr&password=p7Tdfr")]
'''List of all web_ips'''
web_1 = ["Web1","Web2","Web3","Web4","Web5","Web6","Web7","Web8","Web9","Web10","Web11","Web12","Web13","Web14","Web15"]
'''List of All location'''
loc_var =["post1","post2","post3","post4","post5","post6","post7","post8","post9","post10","post11","post12","post13","post14","post15","post16","post17","post18"]

for name, web, port, usr in server:
    login_url = 'http://'+web+port+'/api/v1/system/login/?'+usr
    print(login_url)
    s = requests.session()
    login_response = s.post(login_url)
    print("login response", login_response)
    # Start accessing the web for each entry in loc_var
    for mkt in loc_var:
        # The output is a CSV file
        com_actions_url='http://'+web+port+'/api/v1/3E+date(%5C%22'+str(t)+'%5C%22)and+location+%3D%3D+%27'+mkt+'%27%22&page_size=-1&format=%22csv%22'
        print("com_action_url", com_actions_url)
        r = s.get(com_actions_url)
        print("action", r)
        if r.ok:
            with open(os.path.join("/home/Reports_DC/", "relation_%s.csv"%mkt),'wb') as f:
                f.write(r.content)
        else:
            # If the location is not accessible, try each host in web_1 once
            # and stop at the first one that succeeds
            for web_2 in web_1:
                login_url='http://'+web_2+port+'/api/v1/system/login/?'+usr
                com_actions_url='http://'+web_2+port+'/api/v1/3E+date(%5C%22'+str(t)+'%5C%22)and+location+%3D%3D+%27'+mkt+'%27%22&page_size=-1&format=%22csv%22'
                login_response = s.post(login_url)
                print("login response", login_response)
                print("com_action_url", com_actions_url)
                r = s.get(com_actions_url)
                if r.ok:
                    with open(os.path.join("/home/Reports_DC/", "relation_%s.csv"%mkt),'wb') as f:
                        f.write(r.content)
                    break

一码平川MACHEL 2019-02-28 14:10:40

2 answers

游客aasf2nc2ujisi

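Python threads do not run Python code truly in parallel (in CPython, because of the GIL), so multithreading only suits I/O-bound work, not CPU-bound work.

I suggest using gevent, a third-party coroutine library.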
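To illustrate the I/O-bound case: a thread that is blocked waiting (on a socket, or here on `time.sleep`) releases the interpreter lock, so the waits overlap even with plain threads. The following is a minimal sketch using only the standard library; `fake_io` is a hypothetical stand-in for a network call, and the exact timings depend on the machine.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(delay):
    """Hypothetical stand-in for a blocking network call: it only waits."""
    time.sleep(delay)  # sleeping releases the GIL, like waiting on a socket
    return delay


def timed(run):
    """Return the wall-clock seconds taken by calling run()."""
    start = time.perf_counter()
    run()
    return time.perf_counter() - start


delays = [0.2] * 5

# One after another: the waits add up (roughly 5 * 0.2 s).
sequential = timed(lambda: [fake_io(d) for d in delays])

# In a thread pool: the waits overlap (roughly 0.2 s in total).
with ThreadPoolExecutor(max_workers=5) as pool:
    threaded = timed(lambda: list(pool.map(fake_io, delays)))

print(f"sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```

Downloading 90 reports is this kind of workload, since nearly all of the time is spent waiting for the server.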

2019-11-18 18:07:10

一码平川MACHEL
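There are several ways to make concurrent HTTP requests. The two I use most are a thread pool via concurrent.futures.ThreadPoolExecutor and asynchronous requests via asyncio/aiohttp.
To send requests in parallel with a thread pool, first build the list of URLs you want to fetch in parallel (in your case, a list of login_urls and a list of com_action_urls), then request all of them concurrently, like this: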
```python
from concurrent.futures import ThreadPoolExecutor
import requests


def fetch(url):
    # Download one page and return its body as text
    page = requests.get(url)
    return page.text


pool = ThreadPoolExecutor(max_workers=5)
urls = ['http://www.google.com', 'http://www.yahoo.com', 'http://www.evergreen.edu']  # Create a list of urls
for page in pool.map(fetch, urls):
    # Do whatever you want with the results ...
    print(page[0:100])
```
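Applied to the question, one possible shape is a per-location function that tries the primary host and then the fallback hosts, mapped over `loc_var` by the pool. This is only a sketch under assumptions: `download_location`, `download_all`, `make_url` and `fetch` are hypothetical names, the URL building is left to the caller-supplied `make_url(host, location)`, and `fetch(url)` is assumed to return the response bytes or `None` on failure. The demo at the end runs offline with a fake `fetch`.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def download_location(location, hosts, make_url, fetch, out_dir):
    """Try each host in order and save the first successful CSV body.

    make_url(host, location) -> str and fetch(url) -> bytes or None are
    supplied by the caller (hypothetical hooks, not part of the question).
    Returns the host that answered, or None if every host failed.
    """
    for host in hosts:
        body = fetch(make_url(host, location))
        if body is not None:
            path = os.path.join(out_dir, "relation_%s.csv" % location)
            with open(path, "wb") as f:
                f.write(body)
            return host
    return None


def download_all(locations, hosts, make_url, fetch, out_dir, max_workers=10):
    """Run download_location for every location in a thread pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        hosts_used = pool.map(
            lambda loc: download_location(loc, hosts, make_url, fetch, out_dir),
            locations,
        )
        # pool.map yields results in input order, so zip pairs them correctly
        return dict(zip(locations, hosts_used))


# Offline demo with a fake fetch: "web1" fails for "post2", "Web1" succeeds.
def fake_fetch(url):
    return None if url == "web1/post2" else b"col\n1\n"


out_dir = tempfile.mkdtemp()
result = download_all(
    ["post1", "post2"],
    ["web1", "Web1"],
    make_url=lambda host, loc: host + "/" + loc,
    fetch=fake_fetch,
    out_dir=out_dir,
)
print(result)
```

With requests, `fetch` could wrap `requests.get(url)` and return `r.content` when `r.ok` is true, otherwise `None`. Whether a single logged-in session can safely be shared between threads is a separate question, so a cautious version would log in once per worker.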
Using asyncio/aiohttp is usually faster than the thread approach above, but the learning curve is steeper. Here is a simple example (Python 3.7+):
```python
import asyncio
import aiohttp

urls = ['http://www.google.com', 'http://www.yahoo.com', 'http://www.evergreen.edu']


async def fetch(session, url):
    # Request one URL and return the response body as text
    async with session.get(url) as resp:
        return await resp.text()


async def fetch_concurrent(urls):
    async with aiohttp.ClientSession() as session:
        # Schedule every request up front so they run concurrently
        tasks = [asyncio.create_task(fetch(session, u)) for u in urls]

        # Handle each page as soon as its request completes
        for result in asyncio.as_completed(tasks):
            page = await result
            # Do whatever you want with results
            print(page[0:100])


asyncio.run(fetch_concurrent(urls))
```
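With around 90 URLs it may be worth capping how many requests are in flight at once, so the server is not hit by all of them simultaneously. One way to sketch this is with `asyncio.Semaphore`. Here `fetch_all_limited` is a hypothetical helper and `fetch` is any coroutine function taking a URL; with aiohttp it could wrap `session.get` as in the example above. The demo uses a fake coroutine so it runs without a network.

```python
import asyncio


async def fetch_all_limited(urls, fetch, limit=10):
    """Await fetch(url) for every URL, with at most `limit` in flight.

    `fetch` is any coroutine function taking a URL (a hypothetical hook).
    Results come back in the same order as `urls`.
    """
    semaphore = asyncio.Semaphore(limit)

    async def guarded(url):
        async with semaphore:  # waits here once `limit` fetches are running
            return await fetch(url)

    return await asyncio.gather(*(guarded(u) for u in urls))


# Offline demo: a fake fetch that records how many calls overlap.
running = 0
peak = 0


async def fake_fetch(url):
    global running, peak
    running += 1
    peak = max(peak, running)
    await asyncio.sleep(0.01)  # stands in for network latency
    running -= 1
    return url.upper()


urls = ["post%d" % i for i in range(1, 10)]
pages = asyncio.run(fetch_all_limited(urls, fake_fetch, limit=3))
print(pages[:2], "peak concurrency:", peak)
```

Because `asyncio.gather` preserves input order, each result can be matched back to its location name when writing the CSV files.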
2019-07-17 23:29:43
