First, I deployed the DeepSeek-R1-Distill-Llama-70B model into container docker-a on a local server. Then, from a conda environment inside docker-b, I used evalscope to stress-test the model service. With the dataset set to random, the benchmark runs and returns results normally, but after switching the dataset to gsm8k, EvalScope cannot find a registered class for the gsm8k dataset.
The shell log is as follows:
(evalscope) root@yjz-eval-1750728279:/yjz_spacec/eval-muxi-poc/dataset-test# python performance-test.py
2025-07-02 16:32:24,694 - evalscope - INFO - Save the result to: ./outputs/20250702_163224/
2025-07-02 16:32:24,694 - evalscope - INFO - Starting benchmark with args:
2025-07-02 16:32:24,694 - evalscope - INFO - {
    "model": "/models/deepseek/DeepSeek-R1-Distill-Llama-70B/",
    "model_id": "",
    "attn_implementation": null,
    "api": "openai",
    "tokenizer_path": "deepseek-ai/DeepSeek-R1-Distill-Llama-70B",
    "port": 8877,
    "url": "http://10.118.17.119:9000/v1/chat/completions",
    "headers": {},
    "connect_timeout": 600,
    "read_timeout": 600,
    "api_key": null,
    "no_test_connection": false,
    "number": 2,
    "parallel": 1,
    "rate": -1,
    "log_every_n_query": 10,
    "debug": false,
    "wandb_api_key": null,
    "swanlab_api_key": null,
    "name": null,
    "outputs_dir": "./outputs/20250702_163224/",
    "max_prompt_length": 1024,
    "min_prompt_length": 1024,
    "prefix_length": 0,
    "prompt": null,
    "query_template": null,
    "apply_chat_template": true,
    "dataset": "gsm8k",
    "dataset_path": null,
    "frequency_penalty": null,
    "repetition_penalty": null,
    "logprobs": null,
    "max_tokens": 1024,
    "min_tokens": 1024,
    "n_choices": null,
    "seed": 0,
    "stop": null,
    "stop_token_ids": null,
    "stream": true,
    "temperature": 0.0,
    "top_p": null,
    "top_k": null,
    "extra_args": {
        "ignore_eos": true
    }
}
2025-07-02 16:32:24,753 - evalscope - INFO - Test connection successful.
2025-07-02 16:32:26,411 - evalscope - ERROR - Exception in async function 'benchmark': 'gsm8k'
Traceback (most recent call last):
  File "/opt/conda/envs/evalscope/lib/python3.10/site-packages/evalscope/perf/utils/handler.py", line 17, in async_wrapper
    return await func(*args, **kwargs)
  File "/opt/conda/envs/evalscope/lib/python3.10/site-packages/evalscope/perf/benchmark.py", line 197, in benchmark
    async for request in get_requests(args):
  File "/opt/conda/envs/evalscope/lib/python3.10/site-packages/evalscope/perf/benchmark.py", line 75, in get_requests
    async for request in generator:
  File "/opt/conda/envs/evalscope/lib/python3.10/site-packages/evalscope/perf/benchmark.py", line 41, in generate_requests_from_dataset
    message_generator_class = DatasetRegistry(args.dataset)
  File "/opt/conda/envs/evalscope/lib/python3.10/site-packages/evalscope/perf/plugin/registry.py", line 20, in __call__
    return self.get_class(name)
  File "/opt/conda/envs/evalscope/lib/python3.10/site-packages/evalscope/perf/plugin/registry.py", line 14, in get_class
    return self._registry[name]
KeyError: 'gsm8k'
2025-07-02 16:32:26,525 - asyncio - ERROR - Task was destroyed but it is pending!
task: <Task pending name='Task-8' coro=<statistic_benchmark_metric() running at /opt/conda/envs/evalscope/lib/python3.10/site-packages/evalscope/perf/utils/handler.py:14>>
sys:1: RuntimeWarning: coroutine 'statistic_benchmark_metric' was never awaited
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
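For context on what the traceback means: the failure is a plain dictionary lookup. EvalScope's perf module resolves `args.dataset` through a name-to-class registry (`DatasetRegistry` in `registry.py` above), and no class was ever registered under the key 'gsm8k'. A minimal sketch of that registry pattern (hypothetical names, not EvalScope's actual code):

```python
class Registry:
    """Minimal name -> class registry, mirroring the lookup in the traceback."""

    def __init__(self):
        self._registry = {}

    def register(self, name, cls):
        # Map a dataset name to the class that generates its requests.
        self._registry[name] = cls
        return cls

    def get_class(self, name):
        return self._registry[name]  # unregistered name -> KeyError

    def __call__(self, name):
        return self.get_class(name)


dataset_registry = Registry()
dataset_registry.register('random', dict)  # placeholder class for illustration

dataset_registry('random')     # resolves fine
try:
    dataset_registry('gsm8k')  # never registered
except KeyError as e:
    print(f'KeyError: {e}')
```

So the script itself is syntactically fine; the perf side of this EvalScope version simply has no plugin registered under the name passed in `dataset`.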
The Python file I ran is below. How should I modify it?
# Short context, low concurrency: 1k input tokens, 1k output tokens, dataset switched to gsm8k
from evalscope.perf.main import run_perf_benchmark
from evalscope.perf.arguments import Arguments

task_cfg = Arguments(
    parallel=[1],   # request concurrency levels; a list allows passing several levels
    number=[2],     # number of requests per concurrency level, paired with parallel
    model='/models/deepseek/DeepSeek-R1-Distill-Llama-70B/',  # model name; must match the "model" field the serving endpoint expects (e.g. in a curl request)
    url='http://10.118.17.119:9000/v1/chat/completions',  # request URL of the model service, deployed in docker beforehand
    api='openai',         # API format to use; defaults to openai
    dataset='gsm8k',      # dataset name; this is what triggers the KeyError above
    min_tokens=1*1024,    # minimum number of generated tokens; not all model services support this parameter
    max_tokens=1*1024,    # maximum number of generated tokens
    prefix_length=0,      # prompt prefix length, default 0; only effective for the random dataset
    min_prompt_length=1*1024,  # minimum input prompt length, default 0; shorter prompts are discarded
    max_prompt_length=1*1024,  # maximum input prompt length, default 131072; longer prompts are discarded
    tokenizer_path='deepseek-ai/DeepSeek-R1-Distill-Llama-70B',  # tokenizer path, used to count tokens
    extra_args={'ignore_eos': True}  # extra request parameters; here, ignore the end-of-sequence token
)
results = run_perf_benchmark(task_cfg)
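As for a fix: the perf (stress-test) side of EvalScope keeps its own dataset registry, separate from the evaluation datasets, and gsm8k is evidently not among the registered perf plugins in this version (random clearly is). One workaround that avoids touching EvalScope internals is to flatten a local gsm8k JSONL file into one prompt per line and feed it through a line-oriented dataset plugin together with `dataset_path`; check the documentation of your installed EvalScope version for the exact plugin names it registers (e.g. whether `line_by_line` or a custom plugin is available). The conversion step is plain Python; the file names here are hypothetical, and the demo records stand in for a real gsm8k download:

```python
import json

src = 'gsm8k_test.jsonl'   # hypothetical local copy of the gsm8k test split
dst = 'gsm8k_prompts.txt'  # one prompt per line, for a line-oriented plugin

# Self-contained demo input: two gsm8k-style records ("question"/"answer" per
# JSON line). With a real download, skip this block and point src at the file.
with open(src, 'w', encoding='utf-8') as f:
    f.write(json.dumps({'question': 'Natalia sold 48 clips in April.\nHow many in total?', 'answer': '72'}) + '\n')
    f.write(json.dumps({'question': 'Weng earns $12 an hour. How much for 50 minutes?', 'answer': '10'}) + '\n')

with open(src, encoding='utf-8') as fin, open(dst, 'w', encoding='utf-8') as fout:
    for line in fin:
        line = line.strip()
        if not line:
            continue
        item = json.loads(line)
        # Collapse internal newlines so each prompt occupies exactly one line.
        fout.write(item['question'].replace('\n', ' ') + '\n')
```

Then, in the benchmark config, replace `dataset='gsm8k'` with the line-oriented plugin name your version supports plus `dataset_path='gsm8k_prompts.txt'`. One more thing to watch: with `min_prompt_length=1*1024`, prompts shorter than 1024 are discarded, and gsm8k questions are far shorter than that, so you will likely want to lower it (e.g. to 0) when switching away from the random dataset.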