Using HTTPX, the Next-Generation HTTP Request Library for Python 3 (Detailed Guide), Part 1: https://developer.aliyun.com/article/1462934

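III. Proxies

1. Introduction

HTTPX supports setting up HTTP proxies through the proxies parameter, which can be passed when initializing a client or to top-level API functions such as httpx.get(..., proxies=...). Note that proxies was deprecated in HTTPX 0.26 and removed in 0.28 in favour of proxy=... (a single proxy) and mounts=... (per-pattern routing); the examples in this section target versions that still accept proxies.

2. Usage

2.1 Basic usage

To route all traffic (HTTP and HTTPS) to a proxy located at http://localhost:8030, pass the proxy URL to the client...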

with httpx.Client(proxies="http://localhost:8030") as client:
    ...

For more advanced use cases, pass a proxies dict. For example, to route HTTP and HTTPS requests to two different proxies, located at http://localhost:8030 and http://localhost:8031 respectively, pass a dict of proxy URLs:

proxies = {
    "http://": "http://localhost:8030",
    "https://": "http://localhost:8031",
}
with httpx.Client(proxies=proxies) as client:
    ...

2.2 Authentication

Proxy credentials can be passed as the userinfo section of the proxy URL. For example:

proxies = {
    "http://": "http://username:password@localhost:8030",
    # ...
}
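
Because the credentials sit in the userinfo part of the URL, characters such as @ or : inside a username or password would be misread as URL delimiters unless they are percent-encoded. The sketch below builds such a URL with the standard library only. The helper name proxy_url_with_auth and the sample credentials are illustrative assumptions, not part of HTTPX.

```python
from urllib.parse import quote


def proxy_url_with_auth(host, port, username, password, scheme="http"):
    """Build a proxy URL whose userinfo part is percent-encoded."""
    # safe="" makes quote() encode every reserved character, including "@" and ":".
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"{scheme}://{user}:{pwd}@{host}:{port}"


# Hypothetical credentials: the "@" and ":" in the password get encoded.
proxy_url = proxy_url_with_auth("localhost", 8030, "username", "p@ss:word")
proxies = {"http://": proxy_url}
```

The resulting dict can then be used wherever a proxies mapping is expected.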

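2.3 Routing

HTTPX provides fine-grained control over which requests should go through a proxy and which should not. This process is known as proxy routing.

The proxies dictionary maps URL patterns ("proxy keys") to proxy URLs. HTTPX matches the URL of each request against the proxy keys to decide which proxy, if any, should be used. Matching goes from the most specific proxy keys (e.g. https://<domain>:<port>) to the least specific ones (e.g. https://).

HTTPX supports routing to proxies based on scheme, domain, port, or a combination of these.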
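
The "most specific key wins" rule can be pictured with a small pure-Python model. This is a simplified sketch under stated assumptions, not HTTPX's actual implementation: the names pick_proxy, _parse_key and _host_matches are hypothetical, IPv6 hosts are ignored, and a port is compared only when the request URL spells one out.

```python
from urllib.parse import urlsplit


def _parse_key(key):
    """Split a proxy key such as "all://*.example.com:1234" into its parts."""
    scheme, _, rest = key.partition("://")
    host, _, port = rest.partition(":")
    # "all" means "any scheme", modelled here as an empty string.
    return ("" if scheme == "all" else scheme), host, (int(port) if port else None)


def _host_matches(pattern, host):
    if pattern in ("", "*"):
        return True  # no host restriction
    if pattern.startswith("*."):
        # Strict subdomains only: "www.example.com" yes, "example.com" no.
        return host.endswith(pattern[1:]) and len(host) > len(pattern) - 1
    if pattern.startswith("*"):
        # The domain itself or any of its subdomains.
        base = pattern[1:]
        return host == base or host.endswith("." + base)
    return host == pattern


def pick_proxy(url, proxies):
    """Return the proxy URL for `url`; None means "send directly"."""
    parts = urlsplit(url)
    candidates = []
    for key, proxy in proxies.items():
        scheme, host, port = _parse_key(key)
        if scheme and scheme != parts.scheme:
            continue
        if not _host_matches(host, parts.hostname or ""):
            continue
        if port is not None and port != parts.port:
            continue
        # Lower rank = more specific: port first, then longer host, then scheme.
        rank = (port is None, -len(host), -len(scheme))
        candidates.append((rank, proxy))
    if not candidates:
        return None
    return min(candidates, key=lambda item: item[0])[1]


proxies = {
    "all://": "http://localhost:8031",
    "all://example.com": None,
    "https://example.com:1234": "http://localhost:8030",
}
print(pick_proxy("https://example.com:1234/data", proxies))
print(pick_proxy("http://example.com/", proxies))
print(pick_proxy("http://www.python.org/", proxies))
```

In this model the port-specific key beats the domain key, and the domain key (mapped to None) beats the catch-all, which is the ordering the paragraph above describes.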

2.3.1 Wildcard routing

Route everything through a proxy...

proxies = {
    "all://": "http://localhost:8030",
}

2.3.2 Scheme routing

Route HTTP requests through one proxy and HTTPS requests through another...

proxies = {
    "http://": "http://localhost:8030",
    "https://": "http://localhost:8031",
}

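2.3.3 Domain routing

# Proxy all requests on domain "example.com", let other requests pass through...
proxies = {
    "all://example.com": "http://localhost:8030",
}
# Proxy HTTP requests on domain "example.com", let HTTPS and other requests pass through...
proxies = {
    "http://example.com": "http://localhost:8030",
}
# Proxy all requests to "example.com" and its subdomains, let other requests pass through...
proxies = {
    "all://*example.com": "http://localhost:8030",
}
# Proxy all requests to strict subdomains of "example.com", let "example.com" and other requests pass through...
proxies = {
    "all://*.example.com": "http://localhost:8030",
}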

2.3.4 Port routing

Proxy HTTPS requests on port 1234 to "example.com"...

proxies = {
    "https://example.com:1234": "http://localhost:8030",
}

Proxy all requests on port 1234...

proxies = {
    "all://*:1234": "http://localhost:8030",
}

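2.3.5 No-proxy support

It is also possible to define requests that should not be routed through a proxy.

To do so, pass None as the proxy URL. For example...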

proxies = {
    # Route requests through a proxy by default...
    "all://": "http://localhost:8031",
    # Except those for "example.com".
    "all://example.com": None,
}

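3. Differences

3.1 Preface

Attentive readers will have noticed: didn't I say earlier that most parameters are the same as in the requests library? Why do proxies look a little different? Note that I said most of them are the same, which makes things easier to understand and remember.

So where exactly do proxies differ?

Let's look at how proxies are used in requests.

3.2 Proxies in requests

Configure an individual request with the proxies argument of any request method. Passing it explicitly ensures the proxy is used even when environment proxies are present:

# A plain proxy
import requests
proxies = {
  'http': 'http://10.10.1.10:3128',
  'https': 'http://10.10.1.10:1080',
}
requests.get('http://example.org', proxies=proxies)
# Authentication
proxies = {'http': 'http://user:pass@10.10.1.10:3128/'}
# Give a proxy for a specific scheme and host; this matches any request to the given scheme and exact hostname.
proxies = {'http://example.org': 'http://10.10.1.10:5323'}  # a simple form of routing that dispatches requests to different proxies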

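3.3 Summary

Having reviewed proxies in requests, you have probably spotted the difference:

In the proxy dictionary, httpx keys end with "://" (for example "http://"), whereas requests keys are bare scheme names (for example 'http').

My understanding is that third-party libraries never agreed on a common syntax, which is why proxy configuration differs from one library to the next.
For example, aiohttp proxies are used like this:

import asyncio
import aiohttp
async def main():
    async with aiohttp.ClientSession() as session:
        proxy_auth = aiohttp.BasicAuth('user', 'pass')
        async with session.get("http://python.org",
                               proxy="http://proxy.com",
                               proxy_auth=proxy_auth) as resp:
            print(resp.status)
asyncio.run(main())

Note: proxy_auth = aiohttp.BasicAuth('your_user', 'your_password') handles authentication. Alternatively, the credentials can be embedded in the proxy URL itself, e.g. proxy = 'http://your_user:your_password@your_proxy_url:your_proxy_port'

And in the scrapy framework, proxies are used like this: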

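from scrapy import Request
def start_requests(self):
    for url in self.start_urls:
        # yield (not return) so that every start URL is scheduled
        yield Request(url=url, callback=self.parse,
                      headers={"User-Agent": "scrape web"},
                      meta={"proxy": "http://192.168.1.1:8050"})
# Authentication (basic_auth_header comes from w3lib.http):
# request.headers["Proxy-Authorization"] = basic_auth_header("<proxy_user>", "<proxy_pass>")

Here the proxy is added to the request's meta object: request.meta["proxy"] = "http://192.168.1.1:8050"

Of course, if you have a better take on this, feel free to message me!

Also, httpx's proxy support is more comprehensive and can make our code more elegant!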
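
To make the key difference summarised above concrete, here is a minimal sketch that rewrites a requests-style proxy dict into the "scheme://" key style used by httpx. The function name to_httpx_proxy_keys is hypothetical, and the sketch only renames keys: it assumes the proxy URLs themselves are valid for both libraries and does not account for any differences in how each library matches host-specific keys.

```python
def to_httpx_proxy_keys(requests_proxies):
    """Convert requests-style proxy keys to the httpx "scheme://" style."""
    converted = {}
    for key, proxy_url in requests_proxies.items():
        # "http" -> "http://"; "http://example.org" already has the separator.
        new_key = key if "://" in key else f"{key}://"
        converted[new_key] = proxy_url
    return converted


requests_style = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
    "http://example.org": "http://10.10.1.10:5323",
}
httpx_style = to_httpx_proxy_keys(requests_style)
print(httpx_style)
```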

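IV. Async client

1. Introduction

HTTPX offers a standard synchronous API by default, but also gives you the option of an async client if you need it.

For I/O-bound work, async is a more efficient concurrency model than multi-threading. It can provide significant performance benefits and enables long-lived network connections such as WebSockets.

If you are working with an async web framework, you will also want to use an async client for sending outgoing HTTP requests.

Sending async requests:

import asyncio
import httpx
async def fetch(client):
    r = await client.get("https://www.baidu.com")
    print(r)
async def main():
    # Share one client so that connections are pooled across requests.
    async with httpx.AsyncClient() as client:
        # gather() runs all the coroutines concurrently and waits for them.
        await asyncio.gather(*(fetch(client) for _ in range(100)))
asyncio.run(main())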

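2. API differences

If you are using the async client, a few parts of the API use async methods.

2.1 Making requests

The request methods are all async, so you should use the response = await client.get(...) style for all of the following: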

AsyncClient.get(url, ...)
AsyncClient.options(url, ...)
AsyncClient.head(url, ...)
AsyncClient.post(url, ...)
AsyncClient.put(url, ...)
AsyncClient.patch(url, ...)
AsyncClient.delete(url, ...)
AsyncClient.request(method, url, ...)
AsyncClient.send(request, ...)

2.2 Opening and closing clients

Use async with httpx.AsyncClient() if you want a context-managed client...

async with httpx.AsyncClient() as client:
    ...

Alternatively, use await client.aclose() if you want to close the client explicitly:

client = httpx.AsyncClient()
...
await client.aclose()

2.3 Streaming responses

The AsyncClient.stream(method, url, ...) method is an async context block:

client = httpx.AsyncClient()
async with client.stream('GET', 'https://www.example.com/') as response:
    async for chunk in response.aiter_bytes():
        ...

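The async response streaming methods are:

Response.aread() - for conditionally reading a response inside a stream block.
Response.aiter_bytes() - for streaming the response content as bytes.
Response.aiter_text() - for streaming the response content as text.
Response.aiter_lines() - for streaming the response content as lines of text.
Response.aiter_raw() - for streaming the raw response bytes, without applying content decoding.
Response.aclose() - for closing the response. You usually don't need this, since the .stream block closes the response automatically on exit.

For situations where a context block is not practical, you can enter "manual mode" by sending a Request instance with client.send(..., stream=True). For example, forwarding a response to a streaming endpoint with Starlette:
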
import httpx
from starlette.background import BackgroundTask
from starlette.responses import StreamingResponse
client = httpx.AsyncClient()
async def home(request):
    req = client.build_request("GET", "https://www.example.com/")
    r = await client.send(req, stream=True)
    return StreamingResponse(r.aiter_text(), background=BackgroundTask(r.aclose))

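When using this "manual streaming mode", it is your responsibility as a developer to make sure that Response.aclose() is eventually called. Failing to do so leaves connections open, most likely resulting in resource leaks.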

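2.4 Streaming requests

When sending a streaming request body with an AsyncClient instance, use an async bytes generator rather than a regular bytes generator:

async def upload_bytes():
    ...  # yield byte content
await client.post(url, content=upload_bytes())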
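
To show what such an async bytes generator might look like, here is a small standard-library sketch. The payload, the chunk_size parameter, and the collect helper (which stands in for the client draining the body) are assumptions made for illustration.

```python
import asyncio


async def upload_bytes(data, chunk_size=4):
    """Yield the payload in fixed-size chunks, as an async bytes generator."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]
        # Hand control back to the event loop between chunks.
        await asyncio.sleep(0)


async def collect(generator):
    """Stand-in for the client: drain the generator the way a sender would."""
    return [chunk async for chunk in generator]


chunks = asyncio.run(collect(upload_bytes(b"hello world")))
print(chunks)
# With a real client this would be:
#     await client.post(url, content=upload_bytes(payload))
```

Because the generator yields lazily, the whole body never has to be held in memory at once, which is the point of streaming the request.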

3. Async environments

3.1 asyncio

AsyncIO is Python's built-in library for writing concurrent code with the async/await syntax.

import asyncio
import httpx
async def main():
    async with httpx.AsyncClient() as client:
        response = await client.get('https://www.example.com/')
        print(response)
asyncio.run(main())

3.2 trio

Trio is an alternative async library, designed around the principles of structured concurrency.

import httpx
import trio
async def main():
    async with httpx.AsyncClient() as client:
        response = await client.get('https://www.example.com/')
        print(response)
trio.run(main)

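The trio package must be installed to use the Trio backend.

3.3 anyio

AnyIO is an asynchronous networking and concurrency library that works on top of either asyncio or trio. It blends in with the native libraries of your chosen backend (asyncio by default).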

import httpx
import anyio
async def main():
    async with httpx.AsyncClient() as client:
        response = await client.get('https://www.example.com/')
        print(response)
anyio.run(main, backend='trio')

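4. Python web applications

Just as httpx.Client allows you to call directly into WSGI web applications, the httpx.AsyncClient class allows you to call directly into ASGI web applications.

Let's take this Starlette application as an example: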

from starlette.applications import Starlette
from starlette.responses import HTMLResponse
from starlette.routing import Route
async def hello(request):
    return HTMLResponse("Hello World!")
app = Starlette(routes=[Route("/", hello)])

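We can make requests directly against the application, as follows. The example hands the app to httpx.ASGITransport, which also works on HTTPX versions that no longer accept the older app=... client argument:

import asyncio
import httpx
async def main():
    # ASGITransport routes requests straight into the app, with no network involved.
    transport = httpx.ASGITransport(app=app)
    async with httpx.AsyncClient(transport=transport, base_url="http://testserver") as client:
        r = await client.get("/")
        assert r.status_code == 200
        assert r.text == "Hello World!"
asyncio.run(main())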

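For some more complex cases you might need to customise the ASGI transport. This allows you to:

Inspect 500 error responses rather than raise exceptions, by setting raise_app_exceptions=False.
Mount the ASGI application at a subpath, by setting root_path.
Use a given client address for requests, by setting client.

For example: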

# Instantiate a client that makes ASGI requests with a client IP of "1.2.3.4",
# on port 123.
transport = httpx.ASGITransport(app=app, client=("1.2.3.4", 123))
async with httpx.AsyncClient(transport=transport, base_url="http://testserver") as client:
    ...

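V. Summary and caveats

Benefits of coroutines with httpx

Coroutines let a single thread keep working while other requests wait on network I/O, which makes better use of CPU time and improves the program's throughput.

Caveats

When using coroutines with httpx, make sure the number of concurrent coroutines does not grow too large, to avoid wasting resources and overloading the server.
When handling requests and responses, avoid blocking calls as far as possible; use awaitable (async) calls instead.
When using coroutines with httpx, avoid global variables as far as possible, since shared mutable state can cause unexpected errors.
Use a connection pool wherever possible (that is, reuse a single client) to reduce the load on the server.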
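
The first caveat, keeping the number of in-flight coroutines bounded, is commonly handled with asyncio.Semaphore. The sketch below is standard library only. The fetch coroutine merely sleeps as a hypothetical stand-in for await client.get(...), and MAX_CONCURRENCY is an assumed limit, not a recommended value.

```python
import asyncio

MAX_CONCURRENCY = 5  # assumed limit; tune it for the target server


async def fetch(i, semaphore, stats):
    """Hypothetical stand-in for `await client.get(...)`."""
    async with semaphore:  # at most MAX_CONCURRENCY coroutines get past here
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
        await asyncio.sleep(0.01)  # simulates waiting on the network
        stats["running"] -= 1
        return i


async def main():
    semaphore = asyncio.Semaphore(MAX_CONCURRENCY)
    stats = {"running": 0, "peak": 0}
    tasks = (fetch(i, semaphore, stats) for i in range(20))
    results = await asyncio.gather(*tasks)
    return results, stats["peak"]


results, peak = asyncio.run(main())
print(len(results), "requests finished; peak concurrency:", peak)
```

With a real httpx.AsyncClient, the pool itself can additionally be capped by passing limits=httpx.Limits(max_connections=...) when creating the client.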
