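[Didi Open-Source O&M Monitoring System] Deploying Nightingale V5 in Practice
Didi's open-source O&M monitoring system: Nightingale
Nightingale is a new-generation intelligent monitoring system developed in China. It supports both cloud-native environments and traditional physical-machine and virtual-machine environments well. It can be set up in 10 minutes and learned in an hour, has been validated against massive data volumes in Didi's production environment, and aims to become a benchmark for Chinese-built monitoring.
The new Nightingale released v1 on 2020.3.20 and is currently at v5.0. Starting with v5.0, it integrates with ecosystem projects such as Prometheus, VictoriaMetrics, Grafana, and Telegraf, with the goal of being the most usable open-source O&M monitoring system in China.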
This article was written with reference to the following links:
https://n9e.gitee.io/quickstart/standalone/
https://n9e.gitee.io/quickstart/telegraf/
https://blog.csdn.net/smallbird108/article/details/122497200
Preparing the component installation packages
1. https://downloads.mysql.com/archives/community/
2. https://github.com/prometheus/prometheus/releases/download/v2.33.1/prometheus-2.33.1.linux-amd64.tar.gz
3. https://dl.influxdata.com/telegraf/releases/telegraf-1.21.3-1.x86_64.rpm
4. https://github.com/n9e/fe-v5/releases (n9e-5.3.3.tar.gz)
I. Install MySQL
rpm -ivh mysql-community-common-5.7.36-1.el7.x86_64.rpm
rpm -ivh mysql-community-libs-5.7.36-1.el7.x86_64.rpm
rpm -ivh mysql-community-client-5.7.36-1.el7.x86_64.rpm
rpm -ivh mysql-community-server-5.7.36-1.el7.x86_64.rpm
systemctl start mysqld
netstat -anp | grep 3306
systemctl enable mysqld
Look up the initial password:
grep 'temporary password' /var/log/mysqld.log
Change the password (run these statements inside the mysql client, after logging in with mysql -uroot -p and the temporary password):
set password for root@localhost=password('MySQL_2022');
grant all privileges on *.* to root@'%' identified by 'MySQL_2022';
flush privileges;
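The temporary-password lookup above can also be scripted. The following is a minimal sketch, assuming the MySQL 5.7 log line ends with the password as its last whitespace-separated field (as in "A temporary password is generated for root@localhost: <password>"). The function name mysql_temp_password is hypothetical and not part of MySQL.

```shell
# Hypothetical helper: print the most recent temporary root password
# recorded in a mysqld log file.
# Assumption: the password is the last whitespace-separated field of the
# line that contains "temporary password".
mysql_temp_password() {
  log_file="${1:-/var/log/mysqld.log}"
  awk '/temporary password/ { pw = $NF } END { if (pw != "") print pw }' "$log_file"
}

# Example use:
#   mysql_temp_password /var/log/mysqld.log
```

If the log contains several such lines (for example after re-initializing the data directory), the sketch returns the last one, which is usually the one still in effect.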
II. Install Prometheus
mkdir -p /opt/prometheus
tar xf prometheus-2.33.1.linux-amd64.tar.gz
cp -far prometheus-2.33.1.linux-amd64/* /opt/prometheus/
cd /opt/prometheus
chown -R root:root *
# service
cat <<EOF >/etc/systemd/system/prometheus.service
[Unit]
Description="prometheus"
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
ExecStart=/opt/prometheus/prometheus --config.file=/opt/prometheus/prometheus.yml --storage.tsdb.path=/opt/prometheus/data --web.enable-lifecycle --enable-feature=remote-write-receiver --query.lookback-delta=2m
Restart=on-failure
SuccessExitStatus=0
LimitNOFILE=65536
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=prometheus
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable prometheus
systemctl restart prometheus
systemctl status prometheus
Note that Prometheus must be started with --enable-feature=remote-write-receiver, which allows it to accept remote-write requests.
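Because the remote-write receiver flag is easy to drop when editing the unit file, a quick check can help. This is a minimal sketch, assuming the unit file lives at /etc/systemd/system/prometheus.service as in the steps above. The helper name has_remote_write_receiver is hypothetical.

```shell
# Hypothetical helper: succeed only if the ExecStart line of the unit file
# passes the remote-write-receiver feature flag (alone or in a
# comma-separated --enable-feature list).
has_remote_write_receiver() {
  unit_file="${1:-/etc/systemd/system/prometheus.service}"
  grep -E '^ExecStart=' "$unit_file" |
    grep -Eq -- '--enable-feature=[^ ]*remote-write-receiver'
}

# Example use:
#   has_remote_write_receiver && echo "remote write receiver enabled"
```

The check only inspects the unit file; it does not confirm that the running process was restarted with the new arguments.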
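III. Install Redis
Setting a password for Redis is recommended.
curl -o /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
yum install -y redis
systemctl enable redis
vim /etc/redis.conf
In redis.conf, set the password with the requirepass directive, then restart Redis:
systemctl restart redis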
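The password can also be set without opening an editor. The following is a minimal sketch that rewrites the requirepass directive in a Redis configuration file. The helper name set_redis_requirepass and the example password Redis_2022 are hypothetical, so substitute your own values.

```shell
# Hypothetical helper: make sure the config file contains exactly one
# active "requirepass <password>" line.
set_redis_requirepass() {
  conf_file="$1"
  new_pass="$2"
  tmp_file="$(mktemp)"
  # Copy everything except active requirepass lines (commented ones are kept).
  grep -Ev '^[[:space:]]*requirepass[[:space:]]' "$conf_file" > "$tmp_file" || true
  printf 'requirepass %s\n' "$new_pass" >> "$tmp_file"
  # Overwrite in place so the file keeps its owner and permissions.
  cat "$tmp_file" > "$conf_file"
  rm -f "$tmp_file"
}

# Example use (hypothetical password):
#   set_redis_requirepass /etc/redis.conf 'Redis_2022'
#   systemctl restart redis
```

Whatever password is chosen here must match the Redis password configured for n9e.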
IV. Deploy n9e
mkdir /usr/local/n9e
tar -zxvf n9e-5.3.3.tar.gz -C /usr/local/n9e/
Edit the configuration files to update the MySQL and Redis connection passwords and the IP addresses of the connected services:
vim /usr/local/n9e/etc/server.conf
vim /usr/local/n9e/etc/webapi.conf
Import the initial database schema:
mysql -uroot -p'MySQL_2022' < /usr/local/n9e/docker/initsql/a-n9e.sql
Create the systemd units:
mkdir /opt/n9e
cat <<EOF >/etc/systemd/system/n9e-server.service
[Unit]
Description="n9e-server"
After=network.target
[Service]
Type=simple
ExecStart=/usr/local/n9e/n9e server
WorkingDirectory=/usr/local/n9e
Restart=on-failure
SuccessExitStatus=0
LimitNOFILE=65536
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=n9e-server
[Install]
WantedBy=multi-user.target
EOF
cat <<EOF >/etc/systemd/system/n9e-webapi.service
[Unit]
Description="n9e-webapi"
After=network.target
[Service]
Type=simple
ExecStart=/usr/local/n9e/n9e webapi
WorkingDirectory=/usr/local/n9e
Restart=on-failure
SuccessExitStatus=0
LimitNOFILE=65536
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=n9e-webapi
[Install]
WantedBy=multi-user.target
EOF
Enable and start the services, then open the firewall ports:
systemctl daemon-reload
systemctl enable n9e-server.service
systemctl enable n9e-webapi.service
systemctl restart n9e-server.service n9e-webapi.service
systemctl status n9e-server.service
systemctl status n9e-webapi.service
firewall-cmd --permanent --zone=public --add-port=18000/tcp
firewall-cmd --permanent --zone=public --add-port=19000/tcp
firewall-cmd --reload
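After reloading the firewall, it is worth confirming that both ports are really open. This is a minimal sketch that compares a list of required ports with the space-separated list printed by firewall-cmd --list-ports. The helper name missing_ports is hypothetical.

```shell
# Hypothetical helper: print each required port that is absent from the
# list of opened ports. Both arguments are space-separated "port/proto" lists.
missing_ports() {
  required="$1"   # e.g. "18000/tcp 19000/tcp"
  opened="$2"     # e.g. output of: firewall-cmd --zone=public --list-ports
  for port in $required; do
    case " $opened " in
      *" $port "*) ;;                   # port is open, nothing to report
      *) printf '%s\n' "$port" ;;       # port is missing
    esac
  done
}

# Example use:
#   missing_ports "18000/tcp 19000/tcp" "$(firewall-cmd --zone=public --list-ports)"
```

An empty result means both n9e ports are present in the public zone.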
V. Install the telegraf collector on the monitored host
For example, pick one host to be monitored and use it as the client for testing.
rpm -ivh telegraf-1.21.3-1.x86_64.rpm
In the configuration below, replace 192.168.31.127 with the address of your own n9e server.
cat <<EOF > /etc/telegraf/telegraf.conf
[global_tags]
[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  hostname = ""
  omit_hostname = false
[[outputs.opentsdb]]
  host = "http://192.168.31.127"
  port = 19000
  http_batch_size = 50
  http_path = "/opentsdb/put"
  debug = false
  separator = "_"
[[inputs.cpu]]
  percpu = true
  totalcpu = true
  collect_cpu_time = false
  report_active = true
[[inputs.disk]]
  ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]
[[inputs.diskio]]
[[inputs.kernel]]
[[inputs.mem]]
[[inputs.processes]]
[[inputs.system]]
  fielddrop = ["uptime_format"]
[[inputs.net]]
  ignore_protocol_stats = true
EOF
systemctl restart telegraf.service
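To check the write path independently of telegraf, a single test point can be pushed by hand. The following is a minimal sketch, assuming the /opentsdb/put endpoint accepts an uncompressed, standard OpenTSDB-style JSON array and identifies the host through a host tag. Both are assumptions rather than documented n9e behavior. The helper names and the metric name n9e_smoke_test are hypothetical.

```shell
# Hypothetical helper: build one OpenTSDB-style JSON data point.
# Arguments: metric name, numeric value, host tag, optional Unix timestamp.
opentsdb_point() {
  metric="$1"
  value="$2"
  ident="$3"
  ts="${4:-$(date +%s)}"
  printf '[{"metric":"%s","timestamp":%s,"value":%s,"tags":{"host":"%s"}}]\n' \
    "$metric" "$ts" "$value" "$ident"
}

# Hypothetical helper: POST a test point to the endpoint configured in
# telegraf.conf (address taken from the example configuration).
push_test_point() {
  endpoint="${1:-http://192.168.31.127:19000/opentsdb/put}"
  opentsdb_point "n9e_smoke_test" 1 "$(hostname)" |
    curl -sS -X POST -H 'Content-Type: application/json' --data-binary @- "$endpoint"
}

# Example use:
#   opentsdb_point cpu_usage_active 12.5 web01 1644480000
#   push_test_point
```

If the request is rejected, the telegraf log remains the more reliable source of truth, since telegraf formats and encodes the payload itself.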
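VI. Log in to the n9e web server to view the monitoring metrics
The default username and password are root/root.2000
This setup uses telegraf as the collector. This article only covers a basic introductory deployment; more features remain to be explored and put into practice.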