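Today I'd like to share an interesting project called "K8s ChatGPT Bot[1]". Its goal is to deploy a ChatGPT bot for a K8s cluster. We can ask ChatGPT to help us resolve Prometheus alerts and get concise answers, so no one has to troubleshoot alerts alone in the dark anymore!
We need Robusta[2] for this. If you don't have Robusta yet, you can set up a Robusta platform by following "K8s — Robusta, K8s Troubleshooting Platform[3]".
The screenshot below shows how the Robusta platform works:
You can watch the full demo video here: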
https://www.loom.com/share/964cd8735a874287a9155c77320bdcdb
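Running the K8s ChatGPT Bot Project
The bot is built on Robusta.dev[4], an open-source platform for responding to K8s alerts. Its workflow is roughly as follows:
- Prometheus forwards alerts to Robusta.dev using a webhook receiver.
- Robusta.dev asks ChatGPT how to fix the Prometheus alert.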
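To make the two workflow steps concrete, here is a minimal Python sketch of the first half: turning an Alertmanager-style webhook payload into a question for ChatGPT. The function name `build_questions`, the sample payload, and the wording of the question are assumptions made for illustration. The actual prompt used by the bot and the call to ChatGPT are not shown in this article, so neither is reproduced here.

```python
from typing import Any, Dict, List

# Hypothetical sample shaped like an Alertmanager webhook payload:
# a top-level "alerts" list whose items carry "status", "labels" and "annotations".
SAMPLE_PAYLOAD: Dict[str, Any] = {
    "status": "firing",
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "KubePodCrashLooping", "namespace": "default"},
            "annotations": {"summary": "Pod is crash looping."},
        },
        {
            "status": "resolved",
            "labels": {"alertname": "KubeNodeNotReady"},
            "annotations": {},
        },
    ],
}


def build_questions(webhook_payload: Dict[str, Any]) -> List[str]:
    """Return one question per firing alert in the payload."""
    questions = []
    for alert in webhook_payload.get("alerts", []):
        if alert.get("status") != "firing":
            continue  # resolved alerts do not need a fix
        name = alert.get("labels", {}).get("alertname", "unknown")
        summary = alert.get("annotations", {}).get("summary", "")
        # Assumed wording; the real bot may phrase its prompt differently.
        question = f"How do I fix the Prometheus alert {name} on Kubernetes?"
        if summary:
            question += f" Alert summary: {summary}"
        questions.append(question)
    return questions


if __name__ == "__main__":
    for q in build_questions(SAMPLE_PAYLOAD):
        print(q)
```

In the real setup Robusta receives the webhook and runs the playbook action for you; the sketch only illustrates the kind of data that flows between the two steps.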
Prerequisites
- Slack
- Kubernetes cluster
- Python 3.7 or later
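Before continuing, it can help to check the local prerequisites in one step. The sketch below is an assumption-laden convenience, not part of the project: it checks the Python version and only looks for the `kubectl` and `helm` executables on `PATH` (both are used later in this walkthrough). It does not verify that the cluster is reachable, and it cannot check your Slack workspace.

```python
import shutil
import sys
from typing import Callable, List, Optional, Sequence

MIN_PYTHON = (3, 7)                  # from the prerequisites list
REQUIRED_TOOLS = ("kubectl", "helm")  # assumed: CLIs used in later steps


def missing_prerequisites(
    python_version: Sequence[int] = sys.version_info,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    problems = []
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append("Python 3.7 or later is required")
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} was not found on PATH")
    return problems


if __name__ == "__main__":
    found = missing_prerequisites()
    for problem in found:
        print(problem)
    sys.exit(1 if found else 0)
```

The `python_version` and `which` parameters exist only so the checks can be exercised without depending on the machine they run on.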
How to Install Robusta
Generate the Robusta Configuration File
Prepare a Python virtual environment for Robusta.
$ python3.10 -m venv robusta
$ source robusta/bin/activate
(robusta) $ pip install -U robusta-cli --no-cache
Collecting robusta-cli
  Downloading robusta_cli-0.10.10-py3-none-any.whl (223 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 223.8/223.8 kB 30.0 MB/s eta 0:00:00
Collecting pymsteams<0.2.0,>=0.1.16
  Downloading pymsteams-0.1.16.tar.gz (7.6 kB)
  Preparing metadata (setup.py) ... done
...
Successfully installed PyJWT-2.4.0 appdirs-1.4.4 autopep8-2.0.1 black-21.5b2 cachetools-5.2.1 certifi-2022.12.7 cffi-1.15.1 charset-normalizer-3.0.1 ... ruamel.yaml.clib-0.2.7 six-1.16.0 slack-sdk-3.19.5 tenacity-8.1.0 toml-0.10.2 tomli-2.0.1 typer-0.4.2 typing-extensions-4.4.0 urllib3-1.26.14 watchgod-0.7 webexteamssdk-1.6.1 websocket-client-1.3.3
Use the robusta CLI to generate a configuration file:
$ robusta gen-config
Robusta reports its findings to external destinations (we call them "sinks").
We'll define some of them now.
Configure Slack integration? This is HIGHLY recommended. [Y/n]: Y
If your browser does not automatically launch, open the below url:
https://api.robusta.dev/integrations/slack?id=xxxx
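Configure the Slack Integration
Open this page in a browser: https://api.robusta.dev/integrations/slack?id=xxxx
Update the permissions:
Congratulations, the Slack integration is now configured.
Back in the terminal, the following output indicates that the operation succeeded: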
$ robusta gen-config
Robusta reports its findings to external destinations (we call them "sinks").
We'll define some of them now.
Configure Slack integration? This is HIGHLY recommended. [Y/n]: Y
If your browser does not automatically launch, open the below url:
https://api.robusta.dev/integrations/slack?id=xxxx
You've just connected Robusta to the Slack of: devopsfans
Which slack channel should I send notifications to? # k8s-chatgpt-bot
Configure Robusta UI sink? This is HIGHLY recommended. [Y/n]: Y
Enter your Gmail/Google address. This will be used to login: xxx@gmail.com
Choose your account name (e.g your organization name): devopsfans
Successfully registered.
Robusta can use Prometheus as an alert source.
If you haven't installed it yet, Robusta can install a pre-configured Prometheus.
Would you like to do so? [y/N]: y
Please read and approve our End User License Agreement: https://api.robusta.dev/eula.html
Do you accept our End User License Agreement? [y/N]: y
Last question! Would you like to help us improve Robusta by sending exception reports? [y/N]: N
Saved configuration to ./generated_values.yaml - save this file for future use!
Finish installing with Helm (see the Robusta docs). Then login to Robusta UI at https://platform.robusta.dev
By the way, we'll send you some messages later to get feedback. (We don't store your API key, so we scheduled future messages using Slack's API)
In the Slack channel, we can also see:
Install Robusta with Helm 3
Add the robusta chart repository and update it.
$ helm repo add robusta https://robusta-charts.storage.googleapis.com && helm repo update
"robusta" has been added to your repositories
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "kedacore" chart repository
...Successfully got an update from the "robusta" chart repository
...Successfully got an update from the "grafana" chart repository
...Successfully got an update from the "prometheus-community" chart repository
...Successfully got an update from the "stable" chart repository
Update Complete. ⎈Happy Helming!⎈
Update the generated_values.yaml File
Add the following content to the generated_values.yaml file:
playbookRepos:
  chatgpt_robusta_actions:
    url: "https://github.com/robusta-dev/kubernetes-chatgpt-bot.git"
customPlaybooks:
# Add the 'Ask ChatGPT' button to all Prometheus alerts
- triggers:
  - on_prometheus_alert: {}
  actions:
  - chat_gpt_enricher: {}
globalConfig:
  chat_gpt_token: YOUR KEY GOES HERE
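If you would rather apply this change programmatically than edit the file by hand, the merge logic can be sketched in Python as follows. This is only an illustration under assumptions: the helper name `with_chatgpt_playbook` is hypothetical, and the values are treated as an already-parsed dictionary. Reading and writing the YAML file itself would need a YAML library and is left out to keep the example to the standard library.

```python
import copy
from typing import Any, Dict

CHATGPT_REPO_URL = "https://github.com/robusta-dev/kubernetes-chatgpt-bot.git"


def with_chatgpt_playbook(values: Dict[str, Any], token: str) -> Dict[str, Any]:
    """Return a copy of the Helm values with the ChatGPT settings merged in."""
    merged = copy.deepcopy(values)  # never mutate the caller's dictionary

    # Register the playbook repository under the same key as the snippet above.
    repos = dict(merged.get("playbookRepos") or {})
    repos["chatgpt_robusta_actions"] = {"url": CHATGPT_REPO_URL}
    merged["playbookRepos"] = repos

    # Add the 'Ask ChatGPT' button to all Prometheus alerts, but only once.
    playbook = {
        "triggers": [{"on_prometheus_alert": {}}],
        "actions": [{"chat_gpt_enricher": {}}],
    }
    playbooks = list(merged.get("customPlaybooks") or [])
    if playbook not in playbooks:
        playbooks.append(playbook)
    merged["customPlaybooks"] = playbooks

    # Keep any existing globalConfig entries and set the token next to them.
    global_config = dict(merged.get("globalConfig") or {})
    global_config["chat_gpt_token"] = token
    merged["globalConfig"] = global_config
    return merged


if __name__ == "__main__":
    existing = {"clusterName": "dev-cluster", "globalConfig": {"account_id": "xxxx"}}
    print(with_chatgpt_playbook(existing, "YOUR KEY GOES HERE"))
```

Working on a deep copy and skipping a playbook that is already present means the helper can be run repeatedly without duplicating entries or touching unrelated settings.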
Deploy Robusta to K8s
$ helm install robusta robusta/robusta -f ./generated_values.yaml \
    --set clusterName=dev-cluster
Verify that the two Robusta pods (robusta-runner and robusta-forwarder) are running and that the logs contain no errors:
$ kubectl get pods -A | grep robusta
default   alertmanager-robusta-kube-prometheus-st-alertmanager-0   2/2   Running     1 (4m19s ago)   9m25s
default   prometheus-robusta-kube-prometheus-st-prometheus-0       2/2   Running     0               9m25s
default   robusta-forwarder-6b7d8d9d88-2rv9d                       1/1   Running     0               9m29s
default   robusta-grafana-64944bfcdc-v97xh                         3/3   Running     0               9m29s
default   robusta-kube-prometheus-st-admission-patch-6zj4b         0/1   Completed   0               9m28s
default   robusta-kube-prometheus-st-operator-7b985d7fb-c9f9t      1/1   Running     0               9m29s
default   robusta-kube-state-metrics-688d794968-ll6gf              1/1   Running     0               9m29s
default   robusta-prometheus-node-exporter-2k5f7                   1/1   Running     0               5m24s
default   robusta-prometheus-node-exporter-zxsrg                   1/1   Running     0               9m29s
default   robusta-runner-5868b494d6-m6292                          1/1   Running     0               9m29s
$ robusta logs
setting up colored logging
2023-01-14 22:57:01.428 INFO logger initialized using INFO log level
2023-01-14 22:57:01.429 INFO Creating hikaru monkey patches
2023-01-14 22:57:01.429 INFO Creating yaml monkey patch
2023-01-14 22:57:01.429 INFO Creating kubernetes ContainerImage monkey patch
2023-01-14 22:57:01.430 INFO watching dir /etc/robusta/playbooks/ for custom playbooks changes
2023-01-14 22:57:01.431 INFO watching dir /etc/robusta/config/active_playbooks.yaml for custom playbooks changes
2023-01-14 22:57:01.431 INFO Reloading playbook packages due to change on initialization
2023-01-14 22:57:01.431 INFO loading config /etc/robusta/config/active_playbooks.yaml
2023-01-14 22:57:01.467 INFO No custom playbooks defined at /etc/robusta/playbooks/storage
2023-01-14 22:57:01.468 INFO Cloning git repo https://github.com/robusta-dev/kubernetes-chatgpt-bot.git. repo name kubernetes-chatgpt-bot
...
2023-01-14 22:57:07.364 INFO connecting to server as account_id=8302df56-c554-4129-8b95-d143d1f2e3a2; cluster_name=dev-cluster
2023-01-14 22:57:07.977 INFO Initializing services cache
2023-01-14 22:57:08.203 INFO Initializing nodes cache
2023-01-14 22:57:08.395 INFO Initializing jobs cache
2023-01-14 22:57:08.603 INFO Getting events history
2023-01-14 22:57:10.403 INFO Cluster historical data sent.
2023-01-14 23:04:43.681 INFO cluster status {'account_id': '8302df56-c554-4129-8b95-d143d1f2e3a2', 'cluster_id': 'dev-cluster', 'version': '0.10.10', 'last_alert_at': '2023-01-14 23:04:18.959377', 'light_actions': ['related_pods', 'prometheus_enricher', 'add_silence', 'delete_pod', 'delete_silence', 'get_silences', 'logs_enricher', 'pod_events_enricher', 'deployment_events_enricher', 'job_events_enricher', 'job_pod_enricher', 'get_resource_yaml', 'node_cpu_enricher', 'node_disk_analyzer', 'node_running_pods_enricher', 'node_allocatable_resources_enricher', 'node_status_enricher', 'node_graph_enricher', 'oomkilled_container_graph_enricher', 'pod_oom_killer_enricher', 'oom_killer_enricher', 'volume_analysis', 'python_profiler', 'pod_ps', 'python_memory', 'debugger_stack_trace', 'python_process_inspector', 'prometheus_alert', 'create_pvc_snapshot'], 'updated_at': 'now()'}
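Scanning the pod list by eye works for a small cluster, but the same check can be automated. The following is a minimal sketch, assuming the default column layout of `kubectl get pods -A` (namespace, name, ready, status, ...). The function name `unhealthy_pods` and the rule "Running with all containers ready, or Completed" are my own simplifications, not something Robusta provides.

```python
from typing import List

# Two rows copied from the listing above, plus one made-up failing pod.
SAMPLE_OUTPUT = """\
default   robusta-forwarder-6b7d8d9d88-2rv9d                 1/1   Running            0   9m29s
default   robusta-kube-prometheus-st-admission-patch-6zj4b   0/1   Completed          0   9m28s
default   robusta-runner-hypothetical-pod                    0/1   CrashLoopBackOff   5   9m29s
"""


def unhealthy_pods(kubectl_output: str, name_filter: str = "robusta") -> List[str]:
    """Return a description of every matching pod that is not healthy."""
    problems = []
    for line in kubectl_output.splitlines():
        fields = line.split()
        # Only the first four columns are used; RESTARTS may contain spaces.
        if len(fields) < 4 or name_filter not in fields[1]:
            continue
        name, ready, status = fields[1], fields[2], fields[3]
        if status == "Completed":
            continue  # finished jobs are expected to show 0/1
        ready_count, _, total = ready.partition("/")
        if status != "Running" or ready_count != total:
            problems.append(f"{name}: {status} ({ready} ready)")
    return problems


if __name__ == "__main__":
    for problem in unhealthy_pods(SAMPLE_OUTPUT):
        print(problem)
```

To use it against a live cluster, you could pass in the text captured from `kubectl get pods -A`, for example via `subprocess.run(..., capture_output=True, text=True).stdout`.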