Does ModelScope provide int8 or int4 quantized models for the Baichuan-13B series?


Baichuan-13B model

aly_lhh 2023-07-20 11:53:56

3 answers


1965120610311648

Here is 4-bit quantization code, help yourself.

import json

import streamlit as st
import torch
# ModelScope mirrors the transformers Auto* classes and downloads from its own hub.
# To load from Hugging Face instead, use the transformers imports:
# from transformers import AutoModelForCausalLM, AutoTokenizer
# from transformers.generation.utils import GenerationConfig
from modelscope import AutoModelForCausalLM, AutoTokenizer, GenerationConfig
from transformers import BitsAndBytesConfig

model_name = "baichuan-inc/Baichuan-13B-Chat"
# To load from a local ModelScope cache instead, point model_name at the directory:
# model_name = "/root/.cache/modelscope/hub/baichuan-inc/Baichuan-13B-Chat"
st.set_page_config(page_title="Baichuan-13B-Chat")
st.title("Baichuan-13B-Chat")


@st.cache_resource
def init_model():
    # 4-bit NF4 quantization through bitsandbytes; matmuls are computed in bfloat16.
    quantization_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type='nf4',
        bnb_4bit_compute_dtype=torch.bfloat16)

    print("model start.............")
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        # revision='v1.0.1',  # optionally pin a model revision
        torch_dtype=torch.float16,  # dtype for the modules that are not quantized
        quantization_config=quantization_config,
        device_map="balanced",  # spread layers evenly across the visible GPUs
        trust_remote_code=True
    )
    # Alternative from the Baichuan remote code (not used here):
    # model = model.quantize(4).cuda()
    print("model_generation_config start.............")
    model.generation_config = GenerationConfig.from_pretrained(
        model_name,
        trust_remote_code=True
    )
    print("tokenizer start.............")
    tokenizer = AutoTokenizer.from_pretrained(
        model_name,
        use_fast=False,
        # revision='v1.0.1',
        trust_remote_code=True
    )
    print("---------------Init End ---------------")
    return model, tokenizer


def clear_chat_history():
    del st.session_state.messages


def init_chat_history():
    with st.chat_message("assistant", avatar='🤖'):
        st.markdown("Hello, I am the Baichuan large language model. Glad to help 🥰")

    if "messages" in st.session_state:
        for message in st.session_state.messages:
            avatar = '🧑‍💻' if message["role"] == "user" else '🤖'
            with st.chat_message(message["role"], avatar=avatar):
                st.markdown(message["content"])
    else:
        st.session_state.messages = []

    return st.session_state.messages


def main():
    model, tokenizer = init_model()
    messages = init_chat_history()

    if prompt := st.chat_input("Shift + Enter for a new line, Enter to send"):
        with st.chat_message("user", avatar='🧑‍💻'):
            st.markdown(prompt)
        messages.append({"role": "user", "content": prompt})
        print(f"[user] {prompt}", flush=True)
        with st.chat_message("assistant", avatar='🤖'):
            placeholder = st.empty()
            response = ""  # stays defined even if the stream yields nothing
            for response in model.chat(tokenizer, messages, stream=True):
                placeholder.markdown(response)
                if torch.backends.mps.is_available():
                    torch.mps.empty_cache()
        messages.append({"role": "assistant", "content": response})
        print(json.dumps(messages, ensure_ascii=False), flush=True)

        st.button("Clear chat", on_click=clear_chat_history)


if __name__ == "__main__":
    main()
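
The script above only exercises the 4-bit path, while the question also asks about int8. As a rough sketch, the keyword arguments passed to `BitsAndBytesConfig` could be chosen per bit width as shown here. `bnb_kwargs` is a hypothetical helper, not part of ModelScope or transformers; only the dict keys are real `BitsAndBytesConfig` parameters. It returns a plain dict so it can be inspected without loading the model.

```python
def bnb_kwargs(bits: int) -> dict:
    """Return keyword arguments for transformers.BitsAndBytesConfig.

    Hypothetical helper for illustration only.
    """
    if bits == 4:
        return {
            "load_in_4bit": True,
            "bnb_4bit_quant_type": "nf4",          # same choice as the script above
            "bnb_4bit_compute_dtype": "bfloat16",  # assumed: a torch dtype name is accepted
        }
    if bits == 8:
        return {"load_in_8bit": True}
    raise ValueError(f"unsupported bit width: {bits}")


if __name__ == "__main__":
    # With transformers and bitsandbytes installed, the dict would be used as:
    #   from transformers import BitsAndBytesConfig
    #   quantization_config = BitsAndBytesConfig(**bnb_kwargs(8))
    print(bnb_kwargs(4))
    print(bnb_kwargs(8))
```

Swapping `bnb_kwargs(4)` for `bnb_kwargs(8)` would be the only change needed in `init_model`, assuming the installed bitsandbytes build supports 8-bit loading on the target GPU.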

2023-09-02 18:11:12


Star时光

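I can't say for certain which quantized checkpoints have been published. ModelScope is an open-source model community and Model-as-a-Service platform that hosts pretrained models, and the Baichuan-13B series is one of the model families available on it. To find out whether a quantized variant is hosted, check the model pages and official documentation on ModelScope, or ask in the developer community.

For the quantization status of the Baichuan-13B models themselves, refer to the documentation, technical reports, or papers from the model's publisher. If no int8 or int4 version is explicitly provided, you can try quantizing Baichuan-13B yourself with other quantization tools (for example TensorRT or ONNX Runtime).

Note that different tools and libraries may use different quantization strategies and support different precision options. When choosing a tool and method, study it in detail, then evaluate and test it against your own requirements and scenario.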
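
To make the point about strategies and precision options concrete, here is a minimal pure-Python sketch of one common strategy, symmetric absmax quantization, applied at two bit widths. It is an illustration only, not what any particular tool does internally; the function names and the sample weights are made up.

```python
def quantize_absmax(values, bits):
    """Symmetric absmax quantization to signed integers of the given width."""
    qmax = 2 ** (bits - 1) - 1  # 127 for int8, 7 for int4
    scale = max(abs(v) for v in values) / qmax
    if scale == 0:
        scale = 1.0  # all-zero input: any scale reproduces it
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map the integers back to approximate floating-point values."""
    return [x * scale for x in q]


def max_abs_error(values, bits):
    """Largest round-trip error after quantizing and dequantizing."""
    q, scale = quantize_absmax(values, bits)
    restored = dequantize(q, scale)
    return max(abs(a - b) for a, b in zip(values, restored))


if __name__ == "__main__":
    weights = [0.02, -0.51, 0.33, 1.27, -0.89, 0.004]  # made-up sample values
    for bits in (8, 4):
        print(f"int{bits}: max round-trip error = {max_abs_error(weights, bits):.4f}")
```

On this sample the int4 round-trip error is larger than the int8 one, which is why the precision option should be tested against the actual task rather than picked on memory savings alone.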

2023-07-24 14:07:29


北京阿里云ACE会长

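ModelScope is a model hosting community (Model-as-a-Service): it distributes model weights and provides a Python library for loading them, and quantization is typically applied by the user when loading a model. Baichuan-13B is an open-source large language model with about 13 billion parameters for Chinese and English, developed by Baichuan Intelligence (百川智能). It is not based on GPT-3 and is not a Baidu product. Its weights are openly available, and commercial use is subject to the publisher's license terms.

If you want to use Baichuan-13B for Chinese natural language processing tasks with an INT8 or INT4 quantized model, consider the quantization tooling around general deep learning frameworks such as PyTorch and TensorFlow (the released Baichuan-13B weights target PyTorch). These frameworks offer quantization tools and APIs that can convert a model to INT8 or INT4, reducing its size and memory footprint, usually at the cost of some accuracy that should be measured on your own task. They also offer further optimization techniques, such as model pruning and quantization-aware training, that can help improve performance and speed.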
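
The size reduction mentioned above can be estimated with simple arithmetic. The sketch below assumes roughly 13 billion parameters (an approximation, not an exact count) and counts the weights only; it ignores activations, the KV cache, and the extra scale factors that quantized formats store.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    NUM_PARAMS = 13e9  # assumption: about 13 billion parameters
    for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
        print(f"{name}: ~{weight_memory_gib(NUM_PARAMS, bits):.1f} GiB")
```

Under these assumptions the fp16 weights need about 24 GiB, and each halving of the bit width halves that figure, which is what lets a 4-bit model fit on a single consumer GPU.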

2023-07-21 08:03:35
