Analyzing the MySQL slow query log with pt-query-digest

2017-11-09

Download:

http://www.percona.com/software/percona-toolkit/

Official documentation:

http://www.percona.com/doc/percona-toolkit/pt-query-digest.html

1. Install via yum; configure the yum repository first

        vim /etc/yum.repos.d/percona.repo

        [percona]
        name = CentOS $releasever - Percona
        baseurl=http://repo.percona.com/centos/$releasever/os/$basearch/
        enabled = 1
        gpgkey = file:///etc/pki/rpm-gpg/RPM-GPG-KEY-percona
        gpgcheck = 0

2. Install:

        yum install percona-toolkit -y

3. Enable the slow query log

First make sure that slow_query_log is enabled for MySQL in my.cnf and that long_query_time is set to a sensible threshold.

        vim my.cnf

        slow-query-log
        slow-query-log-file = slow.log
        long-query-time = 1

4. Download the analysis tool

pt-query-digest is a Perl script, so downloading it is all that is needed.

        [arno.sun@srv-nc-ssh1 slowlog]$ wget percona.com/get/pt-query-digest
        [arno.sun@srv-nc-ssh1 slowlog]$ file pt-query-digest
        pt-query-digest: a perl script text executable
        [arno.sun@srv-nc-ssh1 slowlog]$ chmod +x pt-query-digest

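5. Usage:

Just point it at the log; the blunt, direct approach works fine.

In fact the tool itself is fairly blunt: if the slow log is large, it consumes a considerable amount of CPU and memory, so it is best to copy the slow log to another server and run pt-query-digest there.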

pt-query-digest  slow.log > slow_report.log

Interpreting the report:

The first part is the summary:

cat slow_report.log

# 620ms user time, 10ms system time, 19.76M rss, 115.84M vsz
# Current date: Wed Mar 20 16:09:35 2013
# Hostname: srv-nc-ssh1
# Files: slow.log
# Overall: 371 total, 35 unique, 0.00 QPS, 0.05x concurrency _____________
# Time range: 2013-03-18 14:08:55 to 2013-03-19 12:23:36
# Attribute         total     min     max     avg     95%  stddev  median
# ============    ======= ======= ======= ======= ======= ======= =======
# Exec time         3959s      1s     73s     11s     37s     12s      7s
# Lock time          246s       0     42s   663ms   204us      4s    66us
# Rows sent        37.53M       0   6.10M 103.58k 485.50k 580.16k       0
# Rows examine     71.32M       0   6.10M 196.86k 961.27k 607.20k       0
# Rows affecte      1.03M       0 973.91k   2.83k    0.99  49.98k    0.99
# Rows read        37.53M       0   6.10M 103.58k 485.50k 580.16k       0
# Bytes sent        4.48G      14 383.55M  12.36M 101.56M  45.74M   13.83
# Tmp tables          110       0       5    0.30    0.99    0.79       0
# Tmp disk tbl         12       0       1    0.03       0    0.18       0
# Tmp tbl size     21.67M       0 1009.90k  59.82k 245.21k 158.04k       0
# Query size       71.10k      31     983  196.25  400.73  100.16  166.51

Explanation of the summary in the first part:

========================================================================================================

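# 620ms user time, 10ms system time, 19.76M rss, 115.84M vsz

User time, system time, physical memory (rss), and virtual memory (vsz) consumed by the tool while analyzing the log.

# Hostname: srv-nc-ssh1

Hostname of the machine on which the analysis tool was run.

# Files: slow.log

Name of the file that was analyzed.

# Overall: 371 total, 35 unique, 0.00 QPS, 0.05x concurrency _____________

Total number of statements (371), number of unique statements (35), QPS, and concurrency.

# Time range: 2013-03-18 14:08:55 to 2013-03-19 12:23:36

Time range covered by the log entries.

# Attribute total min max avg 95% stddev median

Of these values, the most meaningful is probably 95%. Like the median, it is obtained by sorting all values in ascending order; it is the value found at the 95% position.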
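To make the 95% column concrete, here is a minimal shell sketch of a nearest-rank 95th percentile. The function name `percentile95` and the sample execution times are invented for illustration, and this is not pt-query-digest's own implementation, whose calculation may differ in detail.

```shell
# percentile95: hypothetical helper. Reads one number per line on stdin,
# sorts ascending, and prints the value at the 95% position (nearest rank).
percentile95() {
  sort -n | awk '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      i = int((NR * 95 + 99) / 100)   # ceil(0.95 * NR) without float rounding issues
      if (i < 1) i = 1
      print v[i]
    }'
}

# Twenty made-up execution times in seconds; the 19th smallest is printed.
printf '%s\n' 1 2 2 3 3 4 5 5 6 7 7 8 8 9 10 12 15 21 37 73 | percentile95
```

With these sample values the single 73s outlier is ignored and the result is the next value down, which is why the 95% column is usually more informative than max.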

# Rows sent: number of rows sent to the client

# Query size: size of the query text in characters

========================================================================================================
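The QPS and concurrency figures on the Overall line can be reproduced approximately from the other numbers in the summary. The sketch below assumes that QPS is the statement count divided by the length of the time range, and that concurrency is the total Exec time divided by the same range; the variable names are illustrative only.

```shell
# Figures copied from the sample summary.
total_queries=371
total_exec_seconds=3959
# 2013-03-18 14:08:55 to 2013-03-19 12:23:36 is 22h 14m 41s.
range_seconds=$(( 22 * 3600 + 14 * 60 + 41 ))

# Assumption: QPS = statements / seconds in range.
qps=$(awk -v n="$total_queries" -v t="$range_seconds" 'BEGIN { printf "%.2f", n / t }')
# Assumption: concurrency = total execution time / seconds in range.
concurrency=$(awk -v e="$total_exec_seconds" -v t="$range_seconds" 'BEGIN { printf "%.2f", e / t }')

echo "QPS: $qps, concurrency: ${concurrency}x"
```

Under these assumptions the result rounds to 0.00 QPS and 0.05x concurrency, matching the report: the slow log only records a few hundred statements over almost a full day.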

Now the second part:

# Profile
# Rank Query ID           Response time   Calls R/Call  Apdx V/M   Item
# ==== ================== =============== ===== ======= ==== ===== =======
#    1 0x3BE81BF6A30F4C74 1702.9604 43.0%   182  9.3569 0.15  5.91 INSERT u_search_record
#    2 0x861AC23E20A17B65 1490.0836 37.6%    54 27.5941 0.05 13.54 SELECT UNION t_ask_price_info t_vouch_info t_cust_book t_hn_info
#    3 0xD43C719B4CE15C37   96.9039  2.4%    14  6.9217 0.11  1.42 SELECT u_car_info t_stas_auc_car
#    4 0x414D67056BE15CF4   58.2516  1.5%    20  2.9126 0.40  0.56 INSERT u_auction_back_cache
#    5 0x4A78E978D2543BCD   56.5418  1.4%     3 18.8473 0.00  3.14 SELECT t_cust_book
#    6 0x3A12FD01A8D9DA10   52.8541  1.3%     3 17.6180 0.00  0.00 SELECT t_auction_back_cache
#    7 0x9186BF39CBE58A0E   50.4508  1.3%     3 16.8169 0.00  0.01 SELECT t_check_result
#    8 0x68738A978FAB0D06   42.6112  1.1%     6  7.1019 0.17  4.66 SELECT t_sys_config t_hn_info t_hn_quote_list u_car_info
#    9 0x65EBDC4319D9955A   36.9794  0.9%     1 36.9794 0.00  0.00 INSERT SELECT t_hn_info t_hn_audit_quote
#   10 0x5203D60E3716D608   35.1022  0.9%     5  7.0204 0.00  0.05 SELECT mina_send
#   11 0x64C380BEB00DFB63   28.7720  0.7%    12  2.3977 0.50  0.40 SELECT u_vehicle_type
#   12 0xCDDA52E5A6B9F0B7   27.5927  0.7%     3  9.1976 0.00  0.00 SELECT rbvehicle
#   13 0xA5E766B81112B13A   27.5218  0.7%     4  6.8804 0.12  3.51 SELECT u_auction_back_cache
#   14 0x597A26236611758F   26.7460  0.7%     3  8.9153 0.00  0.65 SELECT t_hn_audit_quote t_ask_price_info
#   15 0x443D2230FC99811C   25.2928  0.6%     3  8.4309 0.00  0.00 SELECT t_hn_quote_list

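Response time: the total response time of this class of query, followed by its share of the total time in this analysis.
Calls: number of executions, i.e. how many statements of this class were found in this analysis.
R/Call: average response time per execution.
Item: the query object (statement type and the tables involved).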
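As a quick check of how these columns relate, the figures for the rank 1 row can be recomputed from the numbers in the report. This is only an arithmetic sketch; the variable names are invented, and the overall total of 3959s is the rounded Exec time from the summary section.

```shell
# Rank 1 row of the profile: total response time (seconds) and call count.
response_total=1702.9604
calls=182
# Total Exec time from the summary section (rounded to whole seconds there).
overall_exec=3959

# R/Call is the total response time divided by the number of calls.
r_per_call=$(awk -v r="$response_total" -v c="$calls" 'BEGIN { printf "%.4f", r / c }')
# The percentage is this class's response time as a share of the overall total.
share=$(awk -v r="$response_total" -v t="$overall_exec" 'BEGIN { printf "%.1f", 100 * r / t }')

echo "R/Call: $r_per_call  share: ${share}%"
```

Both values agree with the row in the profile (9.3569 and 43.0%).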

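This part lists the slowest classes of SQL statements, ranked by total response time; the excerpt above shows 15 of them.

The slowest one here is INSERT INTO u_search_record ..., with 182 statements in total. Although every insert carries different data, they are all grouped into the same class of statement.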
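The grouping works because statements are compared after their literal values have been replaced by placeholders (the report calls the result a fingerprint). The sed sketch below is a heavy simplification of that idea, not the tool's actual algorithm; `fingerprint` is a hypothetical helper and the two sample statements are shortened, made-up variants of the insert in this report.

```shell
# fingerprint: hypothetical helper. Lowercases a statement, replaces quoted
# strings and digit runs with "?", and collapses whitespace.
# Simplification: digits inside identifiers (e.g. a table named t1) are also replaced.
fingerprint() {
  tr '[:upper:]' '[:lower:]' \
    | sed -E "s/'[^']*'/?/g; s/[0-9]+/?/g; s/[[:space:]]+/ /g"
}

q1="INSERT INTO u_search_record (ip, uri) VALUES ('127.0.0.1', '/car/a.action')"
q2="insert into u_search_record (ip, uri)   values ('10.0.0.8', '/car/b.action')"

fp1=$(printf '%s\n' "$q1" | fingerprint)
fp2=$(printf '%s\n' "$q2" | fingerprint)

# Different data, same normalized form, so both count toward one class.
[ "$fp1" = "$fp2" ] && echo "same class: $fp1"
```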

The third part is the most important.

Take the top-ranked SQL statement as an example.

# Query 1: 0.00 QPS, 0.02x concurrency, ID 0x3BE81BF6A30F4C74 at byte 121216
# This item is included in the report because it matches --limit.
# Scores: Apdex = 0.15 [1.0], V/M = 5.91
# Query_time sparkline: |
# Time range: 2013-03-18 15:53:08 to 2013-03-19 12:23:36
# Attribute    pct   total     min     max     avg     95%  stddev  median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count         49     182
# Exec time     43   1703s      1s     42s      9s     21s      7s      8s
# Lock time      0    12ms    41us   133us    64us    84us    15us    60us
# Rows sent      0       0       0       0       0       0       0       0
# Rows examine   0       0       0       0       0       0       0       0
# Rows affecte   0     182       1       1       1       1       0       1
# Rows read      0       0       0       0       0       0       0       0
# Bytes sent     0   2.49k      14      14      14      14       0      14
# Tmp tables     0       0       0       0       0       0       0       0
# Tmp disk tbl   0       0       0       0       0       0       0       0
# Tmp tbl size   0       0       0       0       0       0       0       0
# Query size    44  31.47k     166     291  177.04  192.76   24.93  158.58
# String:
# Databases    xinche
# Hosts
# InnoDB trxID 855383 (1/0%), 85538E (1/0%), 855391 (1/0%)... 179 more
# Last errno   0
# Users        carsingweb
# Query_time distribution
#   1us
#  10us
# 100us
#   1ms
#  10ms
# 100ms
#    1s  #################################################################
#  10s+  #############################################
# Tables
#    SHOW TABLE STATUS FROM `xinche` LIKE 'u_search_record'\G
#    SHOW CREATE TABLE `xinche`.`u_search_record`\G
insert into u_search_record (ip, uri, params, create_date) values ('127.0.0.1', '/car/car!ajaxGetCarInfo.action', '[{"carinfo.id":["91202"]}]', '2013-03-18 17:17:08')\G

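As the output above shows, there were 182 statements in total.

The 95% Exec time is 21s, which is unreasonably long.

The database is xinche and the user is carsingweb.

Next comes the query time distribution chart. The chart is crude, but it still shows that most executions took between 1s and 10s, and that some took longer than 10 seconds.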
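The distribution chart groups execution times into power-of-ten buckets. A rough sketch of that bucketing with awk follows; `bucket_query_times` is a hypothetical helper that reads one execution time in seconds per line, and the sample values are invented rather than taken from the log.

```shell
# bucket_query_times: hypothetical helper. Counts execution times (seconds,
# one per line on stdin) in the same eight buckets the report uses.
bucket_query_times() {
  awk '
    BEGIN { split("1us 10us 100us 1ms 10ms 100ms 1s 10s+", label, " ") }
    {
      us = $1 * 1000000          # work in microseconds so bucket limits are exact
      b = 1; limit = 10
      while (b < 8 && us >= limit) { b++; limit *= 10 }
      count[b]++
    }
    END { for (b = 1; b <= 8; b++) printf "%-6s %d\n", label[b], count[b] + 0 }
  '
}

# Three times between 1s and 10s, one above 10s, one below 1s.
printf '%s\n' 1.2 9.3 8 42 0.5 | bucket_query_times
```

Each label is the lower bound of its bucket, so the `1s` row covers executions from 1 second up to, but not including, 10 seconds.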

Finally come a few SQL statements generated by pt-query-digest, which help when investigating the problem.

Common usage examples:

(1) Analyze a slow query log file directly:
pt-query-digest slow.log > slow_report.log

(2) Analyze queries from the last 12 hours:
pt-query-digest --since=12h slow.log > slow_report2.log

(3) Analyze queries within a given time range:
pt-query-digest slow.log --since '2014-04-17 09:30:00' --until '2014-04-17 10:00:00' > slow_report3.log

(4) Analyze only slow queries that are SELECT statements:
pt-query-digest --filter '$event->{fingerprint} =~ m/^select/i' slow.log > slow_report4.log

(5) Analyze slow queries from a particular user:
pt-query-digest --filter '($event->{user} || "") =~ m/^root/i' slow.log > slow_report5.log

(6) Find all slow queries that perform a full table scan or a full join:
pt-query-digest --filter '(($event->{Full_scan} || "") eq "yes") || (($event->{Full_join} || "") eq "yes")' slow.log > slow_report6.log

(7) Save queries to the query_review table:
pt-query-digest --user=root --password=abc123 --review h=localhost,D=test,t=query_review --create-review-table slow.log

(8) Save queries to the query_history table:
pt-query-digest --user=root --password=abc123 --history h=localhost,D=test,t=query_history --create-history-table slow.log_20140401
pt-query-digest --user=root --password=abc123 --history h=localhost,D=test,t=query_history --create-history-table slow.log_20140402

(9) Capture MySQL TCP protocol traffic with tcpdump, then analyze it:
tcpdump -s 65535 -x -nn -q -tttt -i any -c 1000 port 3306 > mysql.tcp.txt
pt-query-digest --type tcpdump mysql.tcp.txt > slow_report9.log

(10) Analyze a binlog:
mysqlbinlog mysql-bin.000093 > mysql-bin000093.sql
pt-query-digest --type=binlog mysql-bin000093.sql > slow_report10.log

(11) Analyze the general log:
pt-query-digest --type=genlog localhost.log > slow_report11.log

Official reference:

http://dev.mysql.com/doc/refman/5.1/en/slow-query-log.html

This article is reposted from crazy_charles's 51CTO blog. Original link: http://blog.51cto.com/douya/1597679. To repost it, please contact the original author.
