Data Cache Series (2): Building Stable Diffusion from Scratch in 23 Seconds

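Preface

The previous article, Data Cache Series (1): Another Way to Approach Large-Model Applications, compared ECI data caching with NAS and OSS storage in terms of user experience and performance. This article continues with a concrete scenario, a Stable Diffusion application, to show how much data caching speeds up elastic large-model deployments. Notably, even a developer with no preparation, no algorithm background, and no knowledge of large models can build a complete Stable Diffusion application through the ECI OpenAPI in just 23 seconds.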

Step 1: Create the model cache (422 ms)

The model used here is stabilityai/stable-diffusion-2-1, whose files total roughly 40 GB. Other models or versions work as well.

repoSource:HuggingFace/Model

repoId: stabilityai/stable-diffusion-2-1

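Next, create the cache directly in the data cache console. A Kubernetes CRD or the Alibaba Cloud OpenAPI works too; this example uses the console. Because connectivity to Hugging Face is currently unreliable, you can create the cache in the Hong Kong region, since caches can be copied across regions.

The full set of creation parameters:

[Screenshot: model cache creation parameters in the console]

For more on the API parameters, see: https://help.aliyun.com/document_detail/2412236.html

Kubernetes users can refer to: https://help.aliyun.com/document_detail/2412299.html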
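When the cache is created through the OpenAPI or a CRD instead of the console, the two data-source values listed above have to be passed as parameters. The sketch below only collects and sanity-checks those two values. The helper name `build_cache_source` is hypothetical, and the real request schema (including any additional required fields) is defined in the documents linked above, not here.

```python
import json


def build_cache_source(repo_id: str, repo_source: str = "HuggingFace/Model") -> dict:
    """Return the two data-source parameters used in this step.

    Only the keys `repoSource` and `repoId` come from the article; any other
    fields the real API needs are deliberately left out of this sketch.
    """
    owner, sep, name = repo_id.partition("/")
    if not (owner and sep and name) or "/" in name:
        raise ValueError(f"expected a repo id of the form 'owner/name', got {repo_id!r}")
    return {"repoSource": repo_source, "repoId": repo_id}


if __name__ == "__main__":
    print(json.dumps(build_cache_source("stabilityai/stable-diffusion-2-1"), indent=2))
```

Validating the repo id locally catches typos before a cache creation task is submitted and fails on a model that does not exist.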

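Backend build time:

The backend finished caching in only 179 ms. Adding the 243 ms taken by the synchronous API request, the cache for this large model was ready in 422 ms in total. Such a short time is possible because commonly used public models already have a pre-warmed cache (see QA item 2).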

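Step 2: Prepare the runtime environment (0 s)

In a container scenario this means building an image that contains everything Stable Diffusion needs to run, such as CUDA, the transformers library, and other base dependencies.

ECI has already built a stable environment that can run most models:

General-purpose GPU image: registry.cn-hangzhou.aliyuncs.com/eci_open/ubuntu:cuda11.7.1-cudnn8-ubuntu20.04

General-purpose CPU image: registry.cn-hangzhou.aliyuncs.com/eci_open/ubuntu:hf-ubuntu20.04

GPU Stable Diffusion image: registry.cn-hangzhou.aliyuncs.com/eci_open/stable-diffusion:1.0.0

If your application has no special dependencies, you can use these images directly and skip the rest of this step.

The complete Dockerfile used to build the image: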

FROM nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu20.04
LABEL maintainer="Alibaba Cloud Serverless Container"

RUN apt update && \
    apt install -y bash \
                   vim \
                   build-essential \
                   git \
                   git-lfs \
                   curl \
                   ca-certificates \
                   libsndfile1-dev \
                   libgl1 \
                   python3.8 \
                   python3-pip \
                   python3.8-venv && \
    rm -rf /var/lib/apt/lists

# make sure to use venv
RUN python3 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

RUN mkdir -p /workspace/pic/
WORKDIR /workspace
COPY http-server.py http-server.py

RUN python3 -m pip install --no-cache-dir --upgrade pip && \
    python3 -m pip install --no-cache-dir \
        torch \
        torchvision \
        torchaudio \
        invisible_watermark && \
    python3 -m pip install --no-cache-dir \
        accelerate \
        datasets \
        hf-doc-builder \
        huggingface-hub \
        Jinja2 \
        librosa \
        numpy \
        scipy \
        tensorboard \
        transformers \
        omegaconf \
        pytorch-lightning \
        xformers \
        safetensors \
        diffusers

CMD ["/bin/bash"]

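Because the dependencies in this runtime environment are fairly generic, ECI has published it as a public base image and cached it publicly, so users need neither build nor pull it themselves.

The HTTP service script it uses: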

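import hashlib
import os

import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

MODEL_DIR_ENV = "MODEL_DIR"
APP_PORT_ENV = "APP_PORT"


def text2image(input):
    # Load the model from the mounted cache directory (default: /data/model/).
    # The pipeline is loaded on every request, which keeps the sample simple.
    model_id = os.getenv(MODEL_DIR_ENV, default="/data/model/")
    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
    pipe = pipe.to("cuda")
    image = pipe(input).images[0]
    # Name the output file after the MD5 hash of the prompt.
    name = "/workspace/pic/" + hashlib.md5(input.encode("utf8")).hexdigest() + ".png"
    image.save(name)
    return name


class GetHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)
        # Read the "input" query parameter; reject requests that lack it.
        values = query.get("input")
        if not values:
            self.send_error(400, "missing 'input' query parameter")
            return
        input = values[0]
        print("get user input:%s, try generate image" % input)
        picName = text2image(input)
        # Build the response.
        self.send_response(200)
        self.send_header("Content-type", "text/html")
        self.end_headers()
        self.wfile.write(bytes("<html><head><title>Stable Diffusion</title></head>", "utf-8"))
        self.wfile.write(bytes("<body><p>Success generate image:%s</p>" % picName, "utf-8"))
        self.wfile.write(bytes("</body></html>", "utf-8"))


# Startup is not part of the original listing. This block is an assumed
# reconstruction that uses the default port stated in the note below (8888).
if __name__ == "__main__":
    port = int(os.getenv(APP_PORT_ENV, default="8888"))
    HTTPServer(("0.0.0.0", port), GetHandler).serve_forever()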

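Note:

1. The application port defaults to 8888 and the model directory defaults to /data/model/. Both can be overridden through the container's environment variables.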
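As a minimal sketch of how these two defaults could be resolved, the snippet below reads the `APP_PORT` and `MODEL_DIR` variables named in the script and falls back to the values from the note. The helper `load_settings` and the port range check are illustrative additions, not part of the ECI image.

```python
import os

DEFAULT_PORT = 8888
DEFAULT_MODEL_DIR = "/data/model/"


def load_settings(env=None):
    """Return (port, model_dir), using the article's defaults when unset."""
    env = os.environ if env is None else env
    port = int(env.get("APP_PORT", DEFAULT_PORT))
    if not 0 < port < 65536:
        raise ValueError(f"APP_PORT out of range: {port}")
    model_dir = env.get("MODEL_DIR", DEFAULT_MODEL_DIR)
    return port, model_dir


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dictionary instead of `os.environ` makes the function easy to check without touching the real environment, for example `load_settings({"APP_PORT": "9000"})`.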

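Step 3: Deploy the application (22 s)

Deploy to ECI (general-purpose GPU image)

With the environment ready, the application can be deployed directly. Deployment mainly covers creating the resources the container needs, installing the GPU driver, and loading the model cache. This example again uses the console.

Select a GPU instance type

Select the inference image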

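The image can be specified directly in every region: registry.cn-hangzhou.aliyuncs.com/eci_open/ubuntu

Image tag: cuda11.7.1-cudnn8-ubuntu20.04

Startup command: the script that starts the HTTP server

[Screenshot: container image and startup command settings]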

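Select the model cache

Select the model cache created earlier and mount it into the container at /data/model/.

[Screenshot: model cache mount settings]

Enable public network access (if needed)

[Screenshot: public network access settings]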

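Notes:

1. The key parameters are the container image used for inference, the mounted model cache, and a GPU instance type to speed up inference.

2. This is the GPU version of the application. CPU is also supported: just use the CPU environment image provided by ECI, and nothing else changes. The application starts much faster than the GPU version, but inference is very slow.

3. If you access the application through a public IP, check that the security group rules allow port 8888.

End-to-end application startup took 22 s.

At this point every resource the application depends on is ready and the HTTP service has started.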
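The 23 seconds in the title follows from adding up the durations reported for the three steps. The short calculation below uses only the figures given in this article; note that the 22 s deployment time is itself a rounded value, so the total is approximate.

```python
import math

# Durations reported in this article, in milliseconds.
STEP_MS = {
    "create model cache (backend)": 179,
    "create model cache (API request)": 243,
    "prepare runtime environment": 0,
    "deploy application": 22_000,
}

total_ms = sum(STEP_MS.values())
rounded_up_s = math.ceil(total_ms / 1000)
print(f"total: {total_ms} ms, about {rounded_up_s} s when rounded up")
```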

Step 4: Test

http://localhost:8888/?input=__________________________________________

Input (text); the corresponding output images are not reproduced here:
A clear crystal sphere is suspended on the calm sea surface, reflecting the sky at sunset,super detailed, artistic, 8K wallpaper, HDR,high quality
A photo of a barbie girl in a barbie world, smile,the whole body photograph, super detailed, artistic, 8K wallpaper, HDR,high quality,brown long hair
A lake beside a high mountain, at dusk, artistic, high quality,HDR
A photo of clear natural scenery with high snow capped mountains, green grasslands, river, sheep herds,artistic, high quality, HDR
An oil painting in the style of Van Gogh, shot at dusk in autumn in the Netherlands, in the picture there are farmland, blue sky, and children playing, two boys and two girls
A painting is more than the sum of its parts. A cow by itself is just a cow. A meadow by itself is just grass, flowers. And the sun peeking through the trees is just a beam of light. But you put them all together and it can be magic
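Prompts like the ones above contain spaces and commas, so they need to be URL-encoded before being placed in the `input` query parameter. The following sketch builds the request URL with the standard library. The helper names and the 300-second timeout are assumptions chosen for illustration; the host and port default to the values used in this article.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_request_url(prompt, host="localhost", port=8888):
    """Return the GET URL for one text-to-image request."""
    return f"http://{host}:{port}/?{urlencode({'input': prompt})}"


def generate(prompt, host="localhost", port=8888, timeout=300):
    """Send the request and return the HTML response body as text."""
    with urlopen(build_request_url(prompt, host, port), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Only prints the URL; call generate(...) once the service is running.
    print(build_request_url("A lake beside a high mountain, at dusk, artistic, high quality,HDR"))
```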

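Note:

Images generated by this application are saved to /workspace/pic/ by default. To view them directly in a browser, start an HTTP server, for example:

python3 -m http.server 9999 --directory /workspace/pic/ &

The images can then be viewed in a browser at http://localhost:9999.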
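The same file server can also be started from Python, which may be convenient if it should run inside the application process. This is a sketch using only the standard library; the function name `serve_directory` is hypothetical, and the defaults mirror the command above.

```python
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def serve_directory(directory, port=9999, host="0.0.0.0"):
    """Serve `directory` over HTTP in a background thread and return the server."""
    # SimpleHTTPRequestHandler accepts a `directory` argument since Python 3.7.
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    server = ThreadingHTTPServer((host, port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A typical call would be `server = serve_directory("/workspace/pic/")`, followed by `server.shutdown()` and `server.server_close()` when the files no longer need to be served.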

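Extensions

There are a great many excellent Stable Diffusion models, and nearly any model you might want can be found, often ready to use or try out, on sites such as https://huggingface.co/, https://openai.wiki/lora-model-part-3.html, and https://civitai.com/.

If the HTTP request workflow above feels clumsy, ECI also provides a prebuilt public image with web UI support:

registry.cn-hangzhou.aliyuncs.com/eci_open/stable-diffusion:1.0.0, which is cached in all regions and can be used directly.

This example uses hanafuusen2001/BeautyProMix, a model that renders Asian portraits with fine detail, to show how to set up the SD web UI with a model cache.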

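Create the model cache

The process is essentially the same as in Step 1:

[Screenshot: model cache creation parameters for hanafuusen2001/BeautyProMix]

Deploy the SD web UI (console)

Once the model cache is created, start the application with the same process as in Step 3.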

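Select a GPU instance type

ecs.gn7i-c8g1.2xlarge

[Screenshot: GPU instance type selection]

Select the standard SD image

Image repository: registry.cn-hangzhou.aliyuncs.com/eci_open/stable-diffusion

Image tag: 1.0.0

The startup command is built into the image, so none needs to be set.

[Screenshot: container image settings]

Select the SD model cache

Mount the SD model cache created earlier into the container at /stable-diffusion-webui/models/Stable-diffusion/.

[Screenshot: model cache mount settings]

Enable public network access (if needed)

[Screenshot: public network access settings]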

Deploy the SD web UI (Kubernetes API)

For reference, here are the complete Kubernetes pod creation parameters:

{
    "metadata": {
        "annotations": {
            "k8s.aliyun.com/eci-image-cache": "true",
            "k8s.aliyun.com/eci-use-specs": "ecs.gn7i-c8g1.2xlarge",
            "k8s.aliyun.com/eci-data-cache-bucket": "huggingFace-model"
        },
        "name": "stable-diffusion",
        "namespace": "default"
    },
    "spec": {
        "containers": [
            {
                "image": "registry.cn-hangzhou.aliyuncs.com/eci_open/stable-diffusion:1.0.0",
                "imagePullPolicy": "IfNotPresent",
                "name": "stable-diffusion",
                "resources": {
                    "requests": {
                        "nvidia.com/gpu": "1"
                    }
                },
                "volumeMounts": [
                    {
                        "mountPath": "/stable-diffusion-webui/models/Stable-diffusion/",
                        "name": "model"
                    }
                ]
            }
        ],
        "restartPolicy": "Never",
        "volumes": [
            {
                "hostPath": {
                    "path": "/models/huggingFace/hanafuusen2001/BeautyProMix/"
                },
                "name": "model"
            }
        ]
    }
}

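Notes:

1. This image works with any Stable Diffusion model: mount any model cache you have created into /stable-diffusion-webui/models/Stable-diffusion/ to switch models.

2. If you access the web UI through a public IP, check that the security group rules allow port 8888.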
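Since only the cache path changes from model to model, the pod parameters can be generated rather than edited by hand. The sketch below mirrors the JSON shown in this section field for field; the function `build_sd_pod` and its parameter names are hypothetical, and the default bucket and instance type are simply the values from that example.

```python
import json

SD_IMAGE = "registry.cn-hangzhou.aliyuncs.com/eci_open/stable-diffusion:1.0.0"
SD_MODEL_DIR = "/stable-diffusion-webui/models/Stable-diffusion/"


def build_sd_pod(cache_path, bucket="huggingFace-model", gpu_spec="ecs.gn7i-c8g1.2xlarge"):
    """Return pod creation parameters that mount one model cache into the web UI."""
    return {
        "metadata": {
            "annotations": {
                "k8s.aliyun.com/eci-image-cache": "true",
                "k8s.aliyun.com/eci-use-specs": gpu_spec,
                "k8s.aliyun.com/eci-data-cache-bucket": bucket,
            },
            "name": "stable-diffusion",
            "namespace": "default",
        },
        "spec": {
            "containers": [
                {
                    "image": SD_IMAGE,
                    "imagePullPolicy": "IfNotPresent",
                    "name": "stable-diffusion",
                    "resources": {"requests": {"nvidia.com/gpu": "1"}},
                    "volumeMounts": [{"mountPath": SD_MODEL_DIR, "name": "model"}],
                }
            ],
            "restartPolicy": "Never",
            # The cache path inside the bucket is exposed as a hostPath volume.
            "volumes": [{"hostPath": {"path": cache_path}, "name": "model"}],
        },
    }


if __name__ == "__main__":
    pod = build_sd_pod("/models/huggingFace/hanafuusen2001/BeautyProMix/")
    print(json.dumps(pod, indent=4))
```

Keeping the mount path constant and varying only `cache_path` reflects note 1: the image is generic, and the mounted cache decides which model the web UI serves.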

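The Stable Diffusion web interface is then available at ip:8888 and is more intuitive to use:

[Screenshot: Stable Diffusion web UI]

Here are a few images I generated with this model; the details are rendered well.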

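Summary

The main selling point of ECI data caching for large models is fast startup. As the previous article noted, once data caching lets you separate the model from the application image, deployment also becomes far more agile. Today, data caching handles model-loading acceleration well for common model repositories such as Hugging Face and ModelScope. For models already stored on NAS or OSS, the open-source Fluid project is an alternative to ECI data caching.

What else does fast startup bring? Most obviously, lower cost and better elasticity. When deploying an application is slow and the process is long, developers tend to hold on to resources for deployment, sacrificing elasticity and paying more.

ECI is not a specialized model platform and has little advice to offer on model tuning or model applications. As an IaaS service, though, its job is to help developers integrate the cloud's basic capabilities and to solve the challenges of deploying large-model applications, especially serverless ones. ECI can already bring the cold start of a large-model application down to around 20 s, and it continues to optimize GPU application startup and model loading. The goal is to make large-model applications elastic on Alibaba Cloud too.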

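QA

1. Model cache creation fails with a network error or a "model does not exist" message.

In regions other than Hong Kong and the overseas regions, connectivity to Hugging Face is very unstable and cache creation is likely to fail. Create the cache in a different region, then copy it across regions.

2. Cache creation hangs for a long time.

Commonly used models, and public models that have been used before, have a pre-warmed cache, so caching them is fast. If creation hangs for a long time, click the creation task and check in the logs whether the model is being pulled normally. If it is, the model is being cached for the first time, so please wait.

3. Cache creation hangs for a long time and eventually fails.

The model cache defaults to only 20 GB of space. For a very large model, set the cache size according to the model size.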
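As a quick check of whether the default is enough, the sketch below compares a model's size with the 20 GB default mentioned above. The function name is hypothetical, and rounding up to whole gigabytes without extra headroom is a simplifying assumption; a real cache may need some additional space.

```python
import math

DEFAULT_CACHE_GB = 20


def cache_size_gb(model_size_gb, default_gb=DEFAULT_CACHE_GB):
    """Return a cache size in GB that is at least as large as the model."""
    return max(default_gb, math.ceil(model_size_gb))


if __name__ == "__main__":
    # The stable-diffusion-2-1 files in this article total roughly 40 GB.
    print(cache_size_gb(40))
```

For the roughly 40 GB model used in Step 1, the default of 20 GB would be too small, so the size has to be set explicitly.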

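4. The application cannot be reached over the public network.

Check the container's status and logs to confirm that it started normally, then check that the security group rules allow the port.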
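A simple TCP connection test from the client side can help tell these two causes apart. This sketch uses only the standard library; the function name and the 3-second timeout are illustrative choices. A failed connection does not say which of the two causes applies, only that the port is not reachable from where the check runs.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Running the check from inside the container against `localhost` and then from outside against the public IP narrows the problem down: if only the second call fails, the security group rules are the likely cause.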

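5. Can other models be used?

Yes. Model caching is generic: models from any model source can be cached, not just the ones used in these examples.

6. Image generation fails and the error log reports OOM.

Switch to a larger instance type. OOM is likely when GPU memory is too small or the image is too large.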

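Appendix

Data Cache Series (1): Another Way to Approach Large-Model Applications

Data Cache Series (2): Building Stable Diffusion from Scratch in 23 Seconds

Data Cache Series (3): Creating Viral Light-and-Shadow Text Images with a Stable Diffusion Extension

Data Cache Series (4): A Quick Look at the Open-Source Large Language Model Tongyi Qianwen

Data Cache Series (5): Building Miaoya Camera with Zero Code

Data Cache Series (6): A Quick Look at the Tongyi Qianwen Qwen-14B Large Model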
