PostgreSQL 实时健康监控 大屏 - 低频指标

本文涉及的产品
RDS MySQL Serverless 基础系列,0.5-2RCU 50GB
云原生数据库 PolarDB MySQL 版,Serverless 5000PCU 100GB
云原生数据库 PolarDB 分布式版,标准版 2核8GB
简介:

标签

PostgreSQL , 大屏指标 , qps , long query , locks , active , idle in transaction , long idle in transaction , 2PC


背景

最关键的一些数据库健康指标,趋势监测。

1 数据年龄

年龄不能超过20亿(大概值),建议当达到15亿时,应尽快安排freeze。

postgres=# select datname,age(datfrozenxid) from pg_database order by 2 desc;  
  datname  |    age      
-----------+-----------  
 postgres  | 172127964  
 template1 | 172127964  
 template0 | 172127964  
 test      | 172127964  
(4 rows)  

2 年龄大于N的对象占用空间数

年龄大于12亿的数据库,占用的空间数

postgres=# select pg_size_pretty(sum(pg_database_size(oid))) from pg_database where age(datfrozenxid)>1200000000;  
 pg_size_pretty   
----------------  
   
(1 row)  

3 freeze 风暴预测

《PostgreSQL Freeze 风暴预测续 - 珍藏级SQL》

4 表膨胀

https://raw.githubusercontent.com/digoal/pgsql_admin_script/master/generate_report.sh

for db in `psql --pset=pager=off -t -A -q -c 'select datname from pg_database where datname not in ($$template0$$, $$template1$$)'`  
do  
psql -d $db --pset=pager=off -q -x -c 'SELECT  
  current_database() AS db, schemaname, tablename, reltuples::bigint AS tups, relpages::bigint AS pages, otta,  
  ROUND(CASE WHEN otta=0 OR sml.relpages=0 OR sml.relpages=otta THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS tbloat,  
  CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END AS wastedpages,  
  CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes,  
  CASE WHEN relpages < otta THEN $$0 bytes$$::text ELSE (bs*(relpages-otta))::bigint || $$ bytes$$ END AS wastedsize,  
  iname, ituples::bigint AS itups, ipages::bigint AS ipages, iotta,  
  ROUND(CASE WHEN iotta=0 OR ipages=0 OR ipages=iotta THEN 0.0 ELSE ipages/iotta::numeric END,1) AS ibloat,  
  CASE WHEN ipages < iotta THEN 0 ELSE ipages::bigint - iotta END AS wastedipages,  
  CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes,  
  CASE WHEN ipages < iotta THEN $$0 bytes$$ ELSE (bs*(ipages-iotta))::bigint || $$ bytes$$ END AS wastedisize,  
  CASE WHEN relpages < otta THEN  
    CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta::bigint) END  
    ELSE CASE WHEN ipages < iotta THEN bs*(relpages-otta::bigint)  
      ELSE bs*(relpages-otta::bigint + ipages-iotta::bigint) END  
  END AS totalwastedbytes  
FROM (  
  SELECT  
    nn.nspname AS schemaname,  
    cc.relname AS tablename,  
    COALESCE(cc.reltuples,0) AS reltuples,  
    COALESCE(cc.relpages,0) AS relpages,  
    COALESCE(bs,0) AS bs,  
    COALESCE(CEIL((cc.reltuples*((datahdr+ma-  
      (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)),0) AS otta,  
    COALESCE(c2.relname,$$?$$) AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,  
    COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta -- very rough approximation, assumes all cols  
  FROM  
     pg_class cc  
  JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname <> $$information_schema$$  
  LEFT JOIN  
  (  
    SELECT  
      ma,bs,foo.nspname,foo.relname,  
      (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,  
      (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2  
    FROM (  
      SELECT  
        ns.nspname, tbl.relname, hdr, ma, bs,  
        SUM((1-coalesce(null_frac,0))*coalesce(avg_width, 2048)) AS datawidth,  
        MAX(coalesce(null_frac,0)) AS maxfracsum,  
        hdr+(  
          SELECT 1+count(*)/8  
          FROM pg_stats s2  
          WHERE null_frac<>0 AND s2.schemaname = ns.nspname AND s2.tablename = tbl.relname  
        ) AS nullhdr  
      FROM pg_attribute att   
      JOIN pg_class tbl ON att.attrelid = tbl.oid  
      JOIN pg_namespace ns ON ns.oid = tbl.relnamespace   
      LEFT JOIN pg_stats s ON s.schemaname=ns.nspname  
      AND s.tablename = tbl.relname  
      AND s.inherited=false  
      AND s.attname=att.attname,  
      (  
        SELECT  
          (SELECT current_setting($$block_size$$)::numeric) AS bs,  
            CASE WHEN SUBSTRING(SPLIT_PART(v, $$ $$, 2) FROM $$#"[0-9]+.[0-9]+#"%$$ for $$#$$)  
              IN ($$8.0$$,$$8.1$$,$$8.2$$) THEN 27 ELSE 23 END AS hdr,  
          CASE WHEN v ~ $$mingw32$$ OR v ~ $$64-bit$$ THEN 8 ELSE 4 END AS ma  
        FROM (SELECT version() AS v) AS foo  
      ) AS constants  
      WHERE att.attnum > 0 AND tbl.relkind=$$r$$  
      GROUP BY 1,2,3,4,5  
    ) AS foo  
  ) AS rs  
  ON cc.relname = rs.relname AND nn.nspname = rs.nspname  
  LEFT JOIN pg_index i ON indrelid = cc.oid  
  LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid  
) AS sml order by wastedbytes desc limit 5'  
done  

5 索引膨胀

https://raw.githubusercontent.com/digoal/pgsql_admin_script/master/generate_report.sh

for db in `psql --pset=pager=off -t -A -q -c 'select datname from pg_database where datname not in ($$template0$$, $$template1$$)'`  
do  
psql -d $db --pset=pager=off -q -x -c 'SELECT  
  current_database() AS db, schemaname, tablename, reltuples::bigint AS tups, relpages::bigint AS pages, otta,  
  ROUND(CASE WHEN otta=0 OR sml.relpages=0 OR sml.relpages=otta THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS tbloat,  
  CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END AS wastedpages,  
  CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes,  
  CASE WHEN relpages < otta THEN $$0 bytes$$::text ELSE (bs*(relpages-otta))::bigint || $$ bytes$$ END AS wastedsize,  
  iname, ituples::bigint AS itups, ipages::bigint AS ipages, iotta,  
  ROUND(CASE WHEN iotta=0 OR ipages=0 OR ipages=iotta THEN 0.0 ELSE ipages/iotta::numeric END,1) AS ibloat,  
  CASE WHEN ipages < iotta THEN 0 ELSE ipages::bigint - iotta END AS wastedipages,  
  CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes,  
  CASE WHEN ipages < iotta THEN $$0 bytes$$ ELSE (bs*(ipages-iotta))::bigint || $$ bytes$$ END AS wastedisize,  
  CASE WHEN relpages < otta THEN  
    CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta::bigint) END  
    ELSE CASE WHEN ipages < iotta THEN bs*(relpages-otta::bigint)  
      ELSE bs*(relpages-otta::bigint + ipages-iotta::bigint) END  
  END AS totalwastedbytes  
FROM (  
  SELECT  
    nn.nspname AS schemaname,  
    cc.relname AS tablename,  
    COALESCE(cc.reltuples,0) AS reltuples,  
    COALESCE(cc.relpages,0) AS relpages,  
    COALESCE(bs,0) AS bs,  
    COALESCE(CEIL((cc.reltuples*((datahdr+ma-  
      (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)),0) AS otta,  
    COALESCE(c2.relname,$$?$$) AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,  
    COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta -- very rough approximation, assumes all cols  
  FROM  
     pg_class cc  
  JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname <> $$information_schema$$  
  LEFT JOIN  
  (  
    SELECT  
      ma,bs,foo.nspname,foo.relname,  
      (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,  
      (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2  
    FROM (  
      SELECT  
        ns.nspname, tbl.relname, hdr, ma, bs,  
        SUM((1-coalesce(null_frac,0))*coalesce(avg_width, 2048)) AS datawidth,  
        MAX(coalesce(null_frac,0)) AS maxfracsum,  
        hdr+(  
          SELECT 1+count(*)/8  
          FROM pg_stats s2  
          WHERE null_frac<>0 AND s2.schemaname = ns.nspname AND s2.tablename = tbl.relname  
        ) AS nullhdr  
      FROM pg_attribute att   
      JOIN pg_class tbl ON att.attrelid = tbl.oid  
      JOIN pg_namespace ns ON ns.oid = tbl.relnamespace   
      LEFT JOIN pg_stats s ON s.schemaname=ns.nspname  
      AND s.tablename = tbl.relname  
      AND s.inherited=false  
      AND s.attname=att.attname,  
      (  
        SELECT  
          (SELECT current_setting($$block_size$$)::numeric) AS bs,  
            CASE WHEN SUBSTRING(SPLIT_PART(v, $$ $$, 2) FROM $$#"[0-9]+.[0-9]+#"%$$ for $$#$$)  
              IN ($$8.0$$,$$8.1$$,$$8.2$$) THEN 27 ELSE 23 END AS hdr,  
          CASE WHEN v ~ $$mingw32$$ OR v ~ $$64-bit$$ THEN 8 ELSE 4 END AS ma  
        FROM (SELECT version() AS v) AS foo  
      ) AS constants  
      WHERE att.attnum > 0 AND tbl.relkind=$$r$$  
      GROUP BY 1,2,3,4,5  
    ) AS foo  
  ) AS rs  
  ON cc.relname = rs.relname AND nn.nspname = rs.nspname  
  LEFT JOIN pg_index i ON indrelid = cc.oid  
  LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid  
) AS sml order by wastedibytes desc limit 5'  
done  

6 TOP SQL

《PostgreSQL 如何查找TOP SQL (例如IO消耗最高的SQL) - 珍藏级》

7 未使用索引

for db in `psql --pset=pager=off -t -A -q -c 'select datname from pg_database where datname not in ($$template0$$, $$template1$$)'`  
do  
psql -d $db --pset=pager=off -q -x -c '  
select current_database(),* from pg_stat_all_indexes where idx_scan=0 or idx_tup_read=0 or idx_tup_fetch=0;  
'  
done  

8 未使用(查询)表

for db in `psql --pset=pager=off -t -A -q -c 'select datname from pg_database where datname not in ($$template0$$, $$template1$$)'`  
do  
psql -d $db --pset=pager=off -q -x -c '  
select current_database(),* from pg_stat_all_tables where seq_scan=0 and idx_scan=0;  
'  
done  

9 热表

for db in `psql --pset=pager=off -t -A -q -c 'select datname from pg_database where datname not in ($$template0$$, $$template1$$)'`  
do  
psql -d $db --pset=pager=off -q -x -c '  
select current_database(),* from pg_stat_all_tables order by seq_scan+idx_scan desc limit 10;  
'  
done  

10 冷表

针对性删除表,或者使用OSS冷存储。

for db in `psql --pset=pager=off -t -A -q -c 'select datname from pg_database where datname not in ($$template0$$, $$template1$$)'`  
do  
psql -d $db --pset=pager=off -q -x -c '  
select current_database(),* from pg_stat_all_tables order by seq_scan+idx_scan limit 10;  
'  
done  

11 热索引

for db in `psql --pset=pager=off -t -A -q -c 'select datname from pg_database where datname not in ($$template0$$, $$template1$$)'`  
do  
psql -d $db --pset=pager=off -q -x -c '  
select current_database(),* from pg_stat_all_indexes order by idx_scan desc limit 10;  
'  
done  

12 冷索引

针对性删除索引。

for db in `psql --pset=pager=off -t -A -q -c 'select datname from pg_database where datname not in ($$template0$$, $$template1$$)'`  
do  
psql -d $db --pset=pager=off -q -x -c '  
select current_database(),* from pg_stat_all_indexes order by idx_scan limit 10;  
'  
done  

13 全表扫描次数TOP对象

针对性创建索引。

for db in `psql --pset=pager=off -t -A -q -c 'select datname from pg_database where datname not in ($$template0$$, $$template1$$)'`  
do  
psql -d $db --pset=pager=off -q -x -c '  
select current_database(),* from pg_stat_all_tables order by seq_scan desc limit 10;  
'  
done  

14 全表扫描返回记录数TOP对象

针对性创建索引。

for db in `psql --pset=pager=off -t -A -q -c 'select datname from pg_database where datname not in ($$template0$$, $$template1$$)'`  
do  
psql -d $db --pset=pager=off -q -x -c '  
select current_database(),* from pg_stat_all_tables order by seq_tup_read desc limit 10;  
'  
done  

15 监控统计信息不准的对象(表、物化视图)

背景知识

1、建立测试表

postgres=# create table a(id int);  
CREATE TABLE  

2、禁止收集统计信息

postgres=# alter table a set (autovacuum_enabled =off);  
ALTER TABLE  

3、写入大量数据

postgres=# insert into a select generate_series(1,10000000);  
INSERT 0 10000000  

4、查看统计信息项,目前占用多少空间(pages)

postgres=# select relpages from pg_class where relname='a';  
 relpages   
----------  
        0  
(1 row)  

5、查看真实空间占用

postgres=# select pg_relation_size('a');  
 pg_relation_size   
------------------  
        362479616  
(1 row)  

6、真实空间占用,转换为PAGES

postgres=# select pg_relation_size('a')/current_setting('block_size')::float8;  
 ?column?   
----------  
    44248  
(1 row)  

7、收集统计信息,查看统计信息项,目前占用多少空间(pages)

postgres=# analyze a;  
ANALYZE  
  
postgres=# select relpages from pg_class where relname='a';  
 relpages   
----------  
    44248  
(1 row)  

8、根据以上原理,可以设计评估统计信息不准确的表 SQL如下:

select oid::regclass as table, relpages, pg_relation_size(oid)/current_setting('block_size')::float8 as real_pages from pg_class   
where relkind in ('r', 'm')  -- 表和物化视图  
and pg_relation_size(oid) > 1048576  -- 大于1MB  
and (pg_relation_size(oid)/current_setting('block_size')::float8 - relpages)/(pg_relation_size(oid)/current_setting('block_size')::float8) > 0.2;  -- 大于 20% 偏差   
  
  
 table | relpages | real_pages   
-------+----------+------------  
 a     |        0 |      44248  
(1 row)  

内核增加功能

1、读写磁盘吞吐快照区间统计,区分索引,表,垃圾回收,FREEZE,AUTOANALYZE。分类统计。

2、锁等待时长快照区间统计,区分锁粒度,下钻到对象。

注意

stat的信息在使用pg_stat_reset()后会清零。请注意。

参考

https://raw.githubusercontent.com/digoal/pgsql_admin_script/master/generate_report.sh

《PostgreSQL Freeze 风暴预测续 - 珍藏级SQL》

相关实践学习
使用PolarDB和ECS搭建门户网站
本场景主要介绍基于PolarDB和ECS实现搭建门户网站。
阿里云数据库产品家族及特性
阿里云智能数据库产品团队一直致力于不断健全产品体系,提升产品性能,打磨产品功能,从而帮助客户实现更加极致的弹性能力、具备更强的扩展能力、并利用云设施进一步降低企业成本。以云原生+分布式为核心技术抓手,打造以自研的在线事务型(OLTP)数据库Polar DB和在线分析型(OLAP)数据库Analytic DB为代表的新一代企业级云原生数据库产品体系, 结合NoSQL数据库、数据库生态工具、云原生智能化数据库管控平台,为阿里巴巴经济体以及各个行业的企业客户和开发者提供从公共云到混合云再到私有云的完整解决方案,提供基于云基础设施进行数据从处理、到存储、再到计算与分析的一体化解决方案。本节课带你了解阿里云数据库产品家族及特性。
相关文章
|
8月前
|
弹性计算 监控 测试技术
基于Prometheus+Grafana的PolarDB-X监控体系
《PolarDB-X 动手实践》系列第十期,体验如何监控PolarDB-X集群。
438 1
|
8月前
|
关系型数据库 分布式数据库 数据库
沉浸式学习PostgreSQL|PolarDB 8: 电商|短视频|新闻|内容推荐业务(根据用户行为推荐相似内容)、监控预测报警系统(基于相似指标预判告警)、音视图文多媒体相似搜索、人脸|指纹识别|比对 - 向量搜索应用
1、在电商业务中, 用户浏览商品的行为会构成一组用户在某个时间段的特征, 这个特征可以用向量来表达(多维浮点数组), 同时商品、店铺也可以用向量来表达它的特征. 那么为了提升用户的浏览体验(快速找到用户想要购买的商品), 可以根据用户向量在商品和店铺向量中进行相似度匹配搜索. 按相似度来推荐商品和店铺给用户. 2、在短视频业务中, 用户浏览视频的行为, 构成了这个用户在某个时间段的兴趣特征, 这个特征可以用向量来表达(多维浮点数组), 同时短视频也可以用向量来表达它的特征. 那么为了提升用户的观感体验(推荐他想看的视频), 可以在短视频向量中进行与用户特征向量的相似度搜索.
235 0
|
12月前
|
监控 Cloud Native 关系型数据库
《阿里云认证的解析与实战-数据仓库ACP认证》——云原生数据仓库AnalyticDB PostgreSQL版解析与实践(下)——二、监控报警使用、监控巡检
《阿里云认证的解析与实战-数据仓库ACP认证》——云原生数据仓库AnalyticDB PostgreSQL版解析与实践(下)——二、监控报警使用、监控巡检
|
12月前
|
监控 Cloud Native 关系型数据库
《阿里云认证的解析与实战-数据仓库ACP认证》——云原生数据仓库AnalyticDB PostgreSQL版功能演示(下)——二、功能演示2:增加监控报警规则
《阿里云认证的解析与实战-数据仓库ACP认证》——云原生数据仓库AnalyticDB PostgreSQL版功能演示(下)——二、功能演示2:增加监控报警规则
|
SQL 弹性计算 监控
实践教程之基于Prometheus+Grafana的PolarDB-X监控体系
PolarDB-X 为了方便用户体验,提供了免费的实验环境,您可以在实验环境里体验 PolarDB-X 的安装部署和各种内核特性。除了免费的实验,PolarDB-X 也提供免费的视频课程,手把手教你玩转 PolarDB-X 分布式数据库。
实践教程之基于Prometheus+Grafana的PolarDB-X监控体系
QGS
|
监控 关系型数据库 Linux
zabbix5.0—agent2监控PostgreSQL-11.4(linux)
记zabbix5.0—agent2监控PostgreSQL-11.4(linux)
QGS
557 0
zabbix5.0—agent2监控PostgreSQL-11.4(linux)
|
监控 关系型数据库 PostgreSQL
|
搜索推荐 关系型数据库 数据库
《阿里云RDS PostgreSQL实践课 2 实时用户画像数据库实践》电子版地址
阿里云RDS PostgreSQL实践课 2 实时用户画像数据库实践
114 0
《阿里云RDS PostgreSQL实践课 2 实时用户画像数据库实践》电子版地址
|
SQL 缓存 监控
PostgreSQL 监控1统计进程和统计信息的解读|学习笔记(二)
快速学习PostgreSQL 监控1统计进程和统计信息的解读
329 0
PostgreSQL 监控1统计进程和统计信息的解读|学习笔记(二)
|
SQL 监控 IDE
PostgreSQL监控1统计进程和统计信息的解读|学习笔记(一)
快速学习PostgreSQL监控1统计进程和统计信息的解读
495 0
PostgreSQL监控1统计进程和统计信息的解读|学习笔记(一)

相关产品

  • 云原生数据库 PolarDB