VLLM (Very Large Language Model)

2023-12-08


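Summary: VLLM (Very Large Language Model) refers to a large language model, typically with billions or even trillions of parameters, that is used to process natural-language text. Through pre-training and fine-tuning, a VLLM can perform a range of tasks such as text classification, machine translation, sentiment analysis, and question answering.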

The following example uses the `vllm` Python package to load the Qwen-1.8B model from ModelScope and generate completions for several prompts:

import os

# Set this environment variable so the model is downloaded from ModelScope.
# It is set before importing vllm so the setting is in place when vllm loads.
os.environ['VLLM_USE_MODELSCOPE'] = 'True'

from vllm import LLM, SamplingParams

llm = LLM(model="qwen/Qwen-1_8B", trust_remote_code=True)

prompts = [
    "Hello, my name is",
    "Today is a sunny day,",
    "The capital of France is",
    "The future of AI is",
]

# Stop generating as soon as the end-of-text marker appears.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, stop=["<|endoftext|>"])

outputs = llm.generate(prompts, sampling_params)

# Print each prompt together with its generated continuation.
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
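The `temperature=0.8` and `top_p=0.95` settings in the example above control how the next token is drawn from the model's scores. The sketch below illustrates the general idea in plain Python. It is a simplified illustration under stated assumptions, not the `vllm` implementation: the function names, the four-word vocabulary, and the scores are all made up for the example.

```python
import math
import random

def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose total mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Renormalize so the kept probabilities sum to 1.
    return {i: probs[i] / mass for i in kept}

def sample_token(logits, temperature=0.8, top_p=0.95, rng=random):
    """Sample one token index after temperature scaling and top-p filtering."""
    probs = apply_temperature(logits, temperature)
    candidates = top_p_filter(probs, top_p)
    ids = list(candidates)
    weights = [candidates[i] for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]

# Hypothetical vocabulary and scores, for illustration only.
vocab = ["Paris", "London", "banana", "the"]
logits = [4.0, 2.0, -1.0, 0.5]
token_id = sample_token(logits, rng=random.Random(0))
print(vocab[token_id])
```

With these toy scores, only the two most likely words survive the top-p cut, so unlikely continuations such as "banana" are never sampled. Raising `top_p` or `temperature` widens the set of candidates.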

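Understanding VLLM requires some basic concepts from deep learning and natural language processing. In deep learning, a model improves its accuracy by adjusting itself as it learns from large amounts of data. In natural language processing, a VLLM is a language model, that is, a model used to process natural-language text.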
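To make the idea of a language model learning from data concrete, here is a toy bigram model in plain Python. It is only a sketch: large models use neural networks rather than word counts, and the corpus and the function names `train_bigram` and `next_word_probs` are invented for this illustration.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count, for each word, which words follow it in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts

def next_word_probs(counts, word):
    """Convert the follower counts for `word` into probabilities."""
    followers = counts[word.lower()]
    total = sum(followers.values())
    if total == 0:
        return {}  # the word was never seen with a follower
    return {w: c / total for w, c in followers.items()}

# Made-up training data.
corpus = [
    "the capital of france is paris",
    "the capital of italy is rome",
    "the capital of france is paris",
]
model = train_bigram(corpus)
probs = next_word_probs(model, "is")
print(probs)
```

Here "paris" follows "is" twice and "rome" once, so the model assigns them probabilities of 2/3 and 1/3. More data changes the counts and therefore the predictions, which is the sense in which the model learns.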
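To apply a VLLM, you use a deep learning framework such as TensorFlow or PyTorch and load the VLLM model in that framework. You can then use the model to process input text and generate output text, for example to answer questions, translate text, or produce summaries.
The following is a simple VLLM application example: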

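import tensorflow as tf

# Load the VLLM model ('vllm_model.h5' is assumed to be a locally saved Keras model)
model = tf.keras.models.load_model('vllm_model.h5')

# Input text
input_text = "What is the capital of France?"

# Process the input text and generate the output.
# predict() expects a batch, so the string is wrapped in a one-element tensor;
# this assumes the saved model accepts raw strings as input.
output_text = model.predict(tf.constant([input_text]))

# Print the result
print(output_text)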

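VLLM is a useful technique that can be applied to a wide range of natural language processing tasks.

The following example runs a two-round conversation with the Qwen-1.8B-Chat model, building each prompt in the ChatML format: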


import sys

from vllm import LLM, SamplingParams
from modelscope import AutoTokenizer, snapshot_download

# Download the model from ModelScope
model_dir = snapshot_download("qwen/Qwen-1_8b-Chat")

# Make the helper module shipped in the model directory importable
sys.path.insert(0, model_dir)
from qwen_generation_utils import make_context

llm = LLM(model=model_dir, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)

prompts = [
    "Hello, my name is Alia",
    "Today is a sunny day,",
    "The capital of France is",
    "Introduce YaoMing to me.",
]

# Stop at the end-of-text marker or at the start of a new chat turn.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=128, stop=['<|endoftext|>', '<|im_start|>'])

# First round: wrap each prompt in the chat template, with no history yet.
inputs = []
for prompt in prompts:
    raw_text, context_tokens = make_context(
        tokenizer,
        prompt,
        history=[],
        system="You are a helpful assistant.",
        chat_format='chatml',
    )
    inputs.append(context_tokens)

# Call with prompt_token_ids, which already contain the template information.
# This keyword form matches the vLLM API as used when this article was written;
# other releases may expect a different call.
outputs = llm.generate(prompt_token_ids=inputs, sampling_params=sampling_params)

# Record each (prompt, reply) pair as the history of its own conversation.
histories = []
for prompt, output in zip(prompts, outputs):
    history = []
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
    history.append((prompt, generated_text))
    histories.append(history)

prompts_new = [
    'What is my name again?',
    'What is the weather I just said today?',
    'What is the city you mentioned just now?',
    'How tall is he?',
]

# Second round: each follow-up question is sent with its conversation history.
inputs = []
for prompt, history in zip(prompts_new, histories):
    raw_text, context_tokens = make_context(
        tokenizer,
        prompt,
        history=history,
        system="You are a helpful assistant.",
        chat_format='chatml',
    )
    inputs.append(context_tokens)

outputs = llm.generate(prompt_token_ids=inputs, sampling_params=sampling_params)

# Print the output
for prompt, output in zip(prompts_new, outputs):
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
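The example above relies on `make_context` with `chat_format='chatml'` to turn a question and its history into a prompt. The sketch below shows roughly what a ChatML-style layout looks like as plain text. It is an assumption-laden illustration, not the code of `qwen_generation_utils`: the helper `build_chatml_prompt` is hypothetical, the assistant reply in the history is made up, and the real function also tokenizes the text and may differ in details such as history truncation.

```python
IM_START, IM_END = "<|im_start|>", "<|im_end|>"

def build_chatml_prompt(query, history, system="You are a helpful assistant."):
    """Lay out the system message, earlier turns, and the new query as ChatML text."""
    parts = [f"{IM_START}system\n{system}{IM_END}"]
    for user_text, assistant_text in history:
        parts.append(f"{IM_START}user\n{user_text}{IM_END}")
        parts.append(f"{IM_START}assistant\n{assistant_text}{IM_END}")
    parts.append(f"{IM_START}user\n{query}{IM_END}")
    # Leave the assistant turn open so the model writes the reply.
    parts.append(f"{IM_START}assistant\n")
    return "\n".join(parts)

# One earlier turn; the assistant reply here is invented for illustration.
history = [("Hello, my name is Alia", "Nice to meet you, Alia!")]
text = build_chatml_prompt("What is my name again?", history)
print(text)
```

Because every turn begins with `<|im_start|>`, listing that marker as a stop string (as the sampling parameters above do) keeps the model from continuing into a new turn of its own.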
