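Question 1: A Machine Learning PAI service is in the Running state but unusable, and calls return an error. How can this be fixed?
The Machine Learning PAI service shows as Running but cannot be used; calls fail with the following error: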
[2024-03-06 11:38:11] ERROR: Exception in ASGI application
[2024-03-06 11:38:11] Traceback (most recent call last):
[2024-03-06 11:38:11] File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/http/httptools_impl.py", line 419, in run_asgi
[2024-03-06 11:38:11] result = await app( # type: ignore[func-returns-value]
[2024-03-06 11:38:11] File "/usr/local/lib/python3.10/dist-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
[2024-03-06 11:38:11] return await self.app(scope, receive, send)
[2024-03-06 11:38:11] File "/usr/local/lib/python3.10/dist-packages/fastapi/applications.py", line 1054, in __call__
[2024-03-06 11:38:11] await super().__call__(scope, receive, send)
[2024-03-06 11:38:11] File "/usr/local/lib/python3.10/dist-packages/starlette/applications.py", line 123, in __call__
[2024-03-06 11:38:11] await self.middleware_stack(scope, receive, send)
[2024-03-06 11:38:11] File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 186, in __call__
[2024-03-06 11:38:11] raise exc
[2024-03-06 11:38:11] File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 164, in __call__
[2024-03-06 11:38:11] await self.app(scope, receive, _send)
[2024-03-06 11:38:11] File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/cors.py", line 83, in __call__
[2024-03-06 11:38:11] await self.app(scope, receive, send)
[2024-03-06 11:38:11] File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/cors.py", line 83, in __call__
[2024-03-06 11:38:11] await self.app(scope, receive, send)
[2024-03-06 11:38:11] File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 62, in __call__
[2024-03-06 11:38:11] await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
[2024-03-06 11:38:11] File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 64, in wrapped_app
[2024-03-06 11:38:11] raise exc
[2024-03-06 11:38:11] File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 53, in wrapped_app
[2024-03-06 11:38:11] await app(scope, receive, sender)
[2024-03-06 11:38:11] File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 758, in __call__
[2024-03-06 11:38:11] await self.middleware_stack(scope, receive, send)
[2024-03-06 11:38:11] File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 778, in app
[2024-03-06 11:38:11] await route.handle(scope, receive, send)
[2024-03-06 11:38:11] File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 299, in handle
[2024-03-06 11:38:11] await self.app(scope, receive, send)
[2024-03-06 11:38:11] File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 79, in app
[2024-03-06 11:38:11] await wrap_app_handling_exceptions(app, request)(scope, receive, send)
[2024-03-06 11:38:11] File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 64, in wrapped_app
[2024-03-06 11:38:11] raise exc
[2024-03-06 11:38:11] File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 53, in wrapped_app
[2024-03-06 11:38:11] await app(scope, receive, sender)
[2024-03-06 11:38:11] File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 74, in app
[2024-03-06 11:38:11] response = await func(request)
[2024-03-06 11:38:11] File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 299, in app
[2024-03-06 11:38:11] raise e
[2024-03-06 11:38:11] File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 294, in app
[2024-03-06 11:38:11] raw_response = await run_endpoint_function(
[2024-03-06 11:38:11] File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 191, in run_endpoint_function
[2024-03-06 11:38:11] return await dependant.call(**values)
[2024-03-06 11:38:11] File "/code/ChatLLM-webui/webui/entrypoints/api_server.py", line 242, in chat_api
[2024-03-06 11:38:11] if cmd_opts.enable_lora:
[2024-03-06 11:38:11] AttributeError: 'Namespace' object has no attribute 'enable_lora'
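I added this, and the service now starts.
Reference answer:
It looks like vLLM could not get enough space when allocating its GPU memory pool. Try setting --gpu-memory-utilization (for example, 0.98) and --max-model-len (for example, 4096).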
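Separately from the memory settings, the last frame of the traceback is an AttributeError raised when api_server.py reads cmd_opts.enable_lora. That error appears whenever an argparse Namespace is read for an option that was never registered. The sketch below only illustrates that mechanism: build_parser, the --enable-lora flag, and lora_enabled are hypothetical names, not the actual ChatLLM-webui code, which is not shown in the question.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; the real ChatLLM-webui options are not shown in the question.
    parser = argparse.ArgumentParser()
    # argparse stores "--enable-lora" under the attribute name "enable_lora".
    parser.add_argument("--enable-lora", action="store_true")
    return parser


def lora_enabled(cmd_opts: argparse.Namespace) -> bool:
    # getattr with a default avoids AttributeError when the option was never registered.
    return bool(getattr(cmd_opts, "enable_lora", False))


if __name__ == "__main__":
    bare = argparse.Namespace()  # a Namespace without the option, as in the traceback
    try:
        bare.enable_lora
    except AttributeError as exc:
        print(exc)
    print(lora_enabled(bare))
    print(lora_enabled(build_parser().parse_args(["--enable-lora"])))
```

Either registering the option on the parser or reading it with a default removes the crash; which one is appropriate depends on how the image defines its startup arguments.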
For more answers to this question, see:
https://developer.aliyun.com/ask/602707
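Question 2: What is the format for multi-turn conversations with Tongyi Qianwen over HTTP? I could not get the format in the documentation to work and need help. Thanks.
{"model":"qwen-max","input":{"messages":[{"role": "system","content":"You are a helpful assistant."},{"role":"user","content":"input text"}]},"parameters":{}}
This request returns a correct response.
{"model":"qwen-max","input":{"messages":[{"role": "system","content":"You are a helpful assistant."},{"role":"user","content":"input text"},{"role": "system","content":"text of the first reply"},{"role":"user","content":"continue"}]},"parameters":{}}
This request returns an error; the message roughly says that the body format is invalid.
{"model":"qwen-max","input":{"messages":[{"role":"user","content":"continue"}]},"parameters":{}}
With this request, the reply is something like "How can I help you?".
Reference answer:
Multi-turn conversations with Tongyi Qianwen over HTTP are sent as JSON. Based on your example, the JSON body can be built like this: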
{
  "model": "qwen-max",
  "input": {
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "input text"
      }
    ]
  },
  "parameters": {}
}
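Note that in the example above, the "model" field specifies the model name and the "input" field contains the message list for the conversation. Each message has a "role" field that identifies the speaker (system or user) and a "content" field that holds the message text. The "parameters" field can carry additional parameters, set according to your needs.
Send this JSON as the body of the HTTP request to the Tongyi Qianwen API endpoint, with the appropriate request headers and URL. How you issue the request depends on your programming language and HTTP library; refer to the relevant documentation or sample code.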
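For the multi-turn case specifically, a likely cause of the body-format error in the question is that the model's first reply was sent back with role "system". To my knowledge the API expects earlier model replies under role "assistant", with a single system message at the start. A minimal sketch of building such a body follows; build_payload and the history structure are hypothetical helpers for illustration, and the "assistant" role is my assumption rather than something stated in the answer above.

```python
import json


def build_payload(history, user_message, model="qwen-max"):
    """Return the request body for the next turn.

    history is a list of (user_text, assistant_text) pairs from earlier turns.
    """
    messages = [{"role": "system", "content": "You are a helpful assistant."}]
    for user_text, assistant_text in history:
        messages.append({"role": "user", "content": user_text})
        # Assumption: earlier model replies go back as "assistant", not "system".
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": user_message})
    return {"model": model, "input": {"messages": messages}, "parameters": {}}


if __name__ == "__main__":
    body = build_payload([("input text", "text of the first reply")], "continue")
    print(json.dumps(body, ensure_ascii=False, indent=2))
```

The caller keeps the full history and resends it on every turn; sending only the latest user message, as in the third request in the question, gives the model no context to continue from.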
For more answers to this question, see:
https://developer.aliyun.com/ask/602363
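Question 3: Creating a feature view in Machine Learning PAI fails with an error. How should this be handled?
Reference answer:
For normal use you need both an offline data source and an online data source: features are fetched from the online data source once the model is serving in production, and the offline and online data must stay consistent. If you only use the offline part for now, consider TableStore as the online data source, since it is billed on a pay-as-you-go basis.
For more answers to this question, see: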
https://developer.aliyun.com/ask/602296
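Question 4: Following the official best-practice guide for Machine Learning PAI, the "数据同步Hologres" (sync data to Hologres) step fails with an error. How can this be resolved?
Reference answer:
Hi. For the error in the "数据同步Hologres" step, first check the configuration: make sure the Hologres connection information (database address, username, password, and so on) is set exactly as the guide describes.
Next, make sure your account has sufficient permissions for the sync, that is, read and write permissions on the Hologres database. Finally, confirm that the Hologres database is running normally; if it is not, try restarting or rebuilding it.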
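The configuration check above can be done before retrying the sync. This is only a rough sketch: the field names in REQUIRED_FIELDS and the find_config_problems helper are hypothetical and do not come from the PAI or Hologres documentation, so adapt them to whatever the guide actually asks you to fill in.

```python
# Hypothetical field names; adjust to the settings your sync task really uses.
REQUIRED_FIELDS = ("host", "port", "dbname", "user", "password")


def find_config_problems(config: dict) -> list[str]:
    """Return human-readable problems found in the connection settings."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = config.get(field)
        # Treat None and blank strings as missing.
        if value is None or str(value).strip() == "":
            problems.append(f"missing or empty: {field}")
    port = config.get("port")
    port_text = "" if port is None else str(port).strip()
    # A non-empty port must consist of digits only.
    if port_text and not port_text.isdigit():
        problems.append(f"port is not a number: {port!r}")
    return problems


if __name__ == "__main__":
    settings = {"host": "example-host", "port": "80", "dbname": "demo",
                "user": "demo_user", "password": ""}
    for problem in find_config_problems(settings):
        print(problem)
```

An empty result only means the settings are complete; permissions and instance status still need to be checked on the Hologres side.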
For more answers to this question, see:
https://developer.aliyun.com/ask/602295
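Question 5: In the Machine Learning PAI webui, file import and export work when run locally, but running from the page reports an error. Why?
In the Machine Learning PAI webui, file import and export work when executed locally. In remote mode against a Flink 1.13.0 cluster, running from the page reports an error. Why?
Reference answer:
Hi. This is probably a version mismatch: the Flink version of the cluster is incompatible with the environment the page runs in. Make sure the two versions match.
The error message mentions "Failed to deserialize JobGraph", which points to a serialization problem. Check the serialization in your code and make sure the objects can be serialized and deserialized correctly.
The error message also mentions "incompatible types for field cpuCores", which indicates that the type of the cpuCores field differs between the serialized data and the class that reads it. That is consistent with the client and the cluster running different Flink versions.
For more answers to this question, see: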