pt-tools系统:pt-kill 实战

本文涉及的产品
RDS MySQL Serverless 基础系列,0.5-2RCU 50GB
云数据库 RDS MySQL,集群系列 2核4GB
推荐场景:
搭建个人博客
云数据库 RDS MySQL,高可用系列 2核4GB
简介: 列出几种常用场景,并进行分析实战测试 特殊、 打印出执行时间超过3秒的connection,仅仅打印,不kill 每2秒循环一次,超过10秒就退出pt-kill程序pt-kill --host xx.

列出几种常用场景,并进行分析实战测试

前提

* pt-kill必须获得的权限

1. PROCESS ,  SUPER 
2. 否则,你只能看到打印出kill id,但是实际上并没有被kill掉  

特殊、 打印出执行时间超过3秒的connection,仅仅打印,不kill

  • 每2秒循环一次,超过10秒就退出pt-kill程序
pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --print --ignore-self --busy-time=3 --interval 2 --run-time=10

重点注意: 这里的--busy-time=3,指的是Command=Query的连接,其他的并不会被匹配哦 , 所以一般情况下删除的都是比较安全的用户thread
重点注意2: ddl,dml,select,都是属于Command=Query

+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------+
| Id   | User          | Host                 | db           | Command          | Time   | State                                                         | Info                                      |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------+
|    3 | repl          | xx.xxx.126.166:60528 | NULL         | Binlog Dump GTID | 492777 | Master has sent all binlog to slave; waiting for more updates | NULL                                      |
|    4 | repl          | xx.xxx.126.165:48604 | NULL         | Binlog Dump GTID | 492765 | Master has sent all binlog to slave; waiting for more updates | NULL                                      |
|  502 | job_heartbeat | xx.xxx.2.217:34626   | heartbeat_db | Sleep            |      0 |                                                               | NULL                                      |
| 1053 | dbadmin       | localhost            | heartbeat_db | Query            |      1 | altering table                                                | alter table heartbeat add column ts2 date |
| 1055 | pt_kill       | xx.xxx.126.166:63167 | NULL         | Sleep            |      1 |                                                               | NULL                                      |
| 1056 | dbadmin       | localhost            | NULL         | Query            |      0 | starting                                                      | show processlist                          |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------+


+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------------+
| Id   | User          | Host                 | db           | Command          | Time   | State                                                         | Info                                            |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------------+
|    3 | repl          | xx.xxx.126.166:60528 | NULL         | Binlog Dump GTID | 492716 | Master has sent all binlog to slave; waiting for more updates | NULL                                            |
|    4 | repl          | xx.xxx.126.165:48604 | NULL         | Binlog Dump GTID | 492704 | Master has sent all binlog to slave; waiting for more updates | NULL                                            |
|  502 | job_heartbeat | xx.xxx.2.217:34626   | heartbeat_db | Sleep            |      0 |                                                               | NULL                                            |
| 1053 | dbadmin       | localhost            | heartbeat_db | Query            |      3 | updating                                                      | update heartbeat set ts = '2018-09-06 00:07:58' |
| 1055 | pt_kill       | xx.xxx.126.166:63167 | NULL         | Sleep            |      0 |                                                               | NULL                                            |
| 1056 | dbadmin       | localhost            | NULL         | Query            |      0 | starting                                                      | show processlist                                |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------------+

+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------+
| Id   | User          | Host                 | db           | Command          | Time   | State                                                         | Info              |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------+
|    3 | repl          | xx.xxx.126.166:60528 | NULL         | Binlog Dump GTID | 492613 | Master has sent all binlog to slave; waiting for more updates | NULL              |
|    4 | repl          | xx.xxx.126.165:48604 | NULL         | Binlog Dump GTID | 492601 | Master has sent all binlog to slave; waiting for more updates | NULL              |
|  502 | job_heartbeat | xx.xxx.2.217:34626   | heartbeat_db | Sleep            |      0 |                                                               | NULL              |
| 1053 | dbadmin       | localhost            | heartbeat_db | Query            |      3 | User sleep                                                    | select 1,sleep(4) |
| 1055 | pt_kill       | xx.xxx.126.166:63167 | NULL         | Sleep            |      1 |                                                               | NULL              |
| 1056 | dbadmin       | localhost            | NULL         | Query            |      0 | starting                                                      | show processlist  |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------+

+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+------------------+
| Id   | User          | Host                 | db           | Command          | Time   | State                                                         | Info             |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+------------------+
|    3 | repl          | xx.xxx.126.166:60528 | NULL         | Binlog Dump GTID | 492740 | Master has sent all binlog to slave; waiting for more updates | NULL             |
|    4 | repl          | xx.xxx.126.165:48604 | NULL         | Binlog Dump GTID | 492728 | Master has sent all binlog to slave; waiting for more updates | NULL             |
|  502 | job_heartbeat | xx.xxx.2.217:34626   | heartbeat_db | Sleep            |      0 |                                                               | NULL             |
| 1053 | dbadmin       | localhost            | heartbeat_db | Query            |      4 | starting                                                      | rollback         |
| 1055 | pt_kill       | xx.xxx.126.166:63167 | NULL         | Sleep            |      0 |                                                               | NULL             |
| 1056 | dbadmin       | localhost            | NULL         | Query            |      0 | starting                                                      | show processlist |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+------------------+

一、打印出sleep时间超过3秒的connection,仅仅打印,不kill

  • 每2秒循环一次,无限循环下去
pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --print --ignore-self --idle-time=3 --interval 2
  • 每2秒循环一次,超过10秒就退出pt-kill程序
pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --print --ignore-self --idle-time=3 --interval 2 --run-time=10

二、kill掉query语句中带有sleep关键字(不区分大小写)的connection, 且Time超过3秒

pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --print --ignore-self  --interval 2 --match-info "(?i-xsm:(sleep))" --busy-time=3 --kill --victims all

三、kill掉非系统用户的select开头,且执行时间超过3秒的 connection

pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --ignore-self --print  --interval 2 --match-info "(?i-xsm:^(select))"  --ignore-user="root|repl"  --busy-time=3 --kill --victims all

四、kill掉非系统用户的select,update,delete开头,且执行时间超过3秒的 connection

pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --ignore-self --print  --interval 2 --match-info "(?i-xsm:^(select))|(?i-xsm:^(update))|(?i-xsm:^(delete))"  --ignore-user="root|repl"  --busy-time=3 --kill --victims all

五、kill掉指定特征的query语句

  • kill掉非系统用户,且query语句中同时包含heartbeat 和 where ,且heartbeat在前where在后,且执行时间超过3秒的 connection

pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --ignore-self --print  --interval 2 --match-info ".*heartbeat.*where.*"  --ignore-user="root|repl"  --busy-time=3 --kill --victims all


# 2018-11-15T10:38:03 KILL 1053 (Query 27 sec) select *,sleep(10) from heartbeat  where id < '1000000000'
# 2018-11-15T10:38:05 KILL 1053 (Query 29 sec) select *,sleep(10) from heartbeat  where id < '1000000000'
# 2018-11-15T10:38:07 KILL 1053 (Query 31 sec) select *,sleep(10) from heartbeat  where id < '1000000000'
# 2018-11-15T10:38:09 KILL 1053 (Query 33 sec) select *,sleep(10) from heartbeat  where id < '1000000000'

六、kill掉非系统库的select开头,且执行时间超过3秒的 connection


pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --ignore-self --print  --interval 2 --match-info "(?i-xsm:^(select))"  --ignore-db="mysql|information_schema"   --ignore-user="root|repl"  --busy-time=3 --kill --victims all

七、kill掉非系统用户的select,update,delete开头,且执行时间超过3秒,且不是被locked住 的 connection

pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --ignore-self --print  --interval 2 --match-info "(?i-xsm:^(select))|(?i-xsm:^(update))|(?i-xsm:^(delete))"  --ignore-state="Locked"   --ignore-user="root|repl"  --busy-time=3  --kill  --victims all

八、kill掉非系统用户,指定state(Locked、login、Updating、Sorting for order等状态),且执行时间超过3秒 的 connection

  • 指定Locked的connection删除掉
pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --ignore-self --print  --interval 2 --match-info "(?i-xsm:^(select))|(?i-xsm:^(update))|(?i-xsm:^(delete))"  --match-state="Locked"   --ignore-user="root|repl"  --busy-time=3 --kill --victims all

九、kill掉非系统用户,指定Command(Query、Sleep、Binlog Dump、Connect等状态),且执行时间超过3秒 的 connection

  • kill掉指定Connect的command connection
pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --ignore-self --print  --interval 2 --match-info "(?i-xsm:^(select))|(?i-xsm:^(update))|(?i-xsm:^(delete))"  --match-command="Connect"   --ignore-user="root|repl"  --busy-time=3 --kill --victims all

十、kill掉指定来源host ip ,且select开头的,且执行时间超过3s的connection

pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --ignore-self --print  --interval 2 --match-info "(?i-xsm:^(select))"  --ignore-host="x.x.x.x"   --ignore-user="root|repl"  --busy-time=3 --kill --victims all

十一、kill掉非系统用户,Command=Sleep,且空闲时间为3s的connection


pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --ignore-self --print  --interval 2  --ignore-user="root|repl"  --idle-time=3 --kill --victims all

十二、kill掉非系统用户,指定特征的query,在后台运行,并打印日志

pt-kill  --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass  --ignore-self --print  --interval 2 --match-info ".*heartbeat.*where.*"  --ignore-user="root|repl"  --busy-time=3 --daemonize --log='/root/kill.log' --kill --victims all

十三、--victims的用法

  • 背景
pid:1103 , 是最早开启事务的空闲进程 T1
pid:1095 , 是第二早开启事务的ddl进程 T2
pid:502  , 是最后一个开启事务的dml进程 T3

事务顺序是: T1 锁住了 T2,T3,  T2锁住了T3 , T3被T1,T2锁住

+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+---------------------------------------------------------+
| Id   | User          | Host                 | db           | Command          | Time   | State                                                         | Info                                                    |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+---------------------------------------------------------+
|    3 | repl          | xx.xxx.126.166:60528 | NULL         | Binlog Dump GTID | 507504 | Master has sent all binlog to slave; waiting for more updates | NULL                                                    |
|    4 | repl          | xx.xxx.126.165:48604 | NULL         | Binlog Dump GTID | 507492 | Master has sent all binlog to slave; waiting for more updates | NULL                                                    |
|  502 | job_heartbeat | xx.xxx.2.217:34626   | heartbeat_db | Query            |    328 | Waiting for table metadata lock                               | insert into heartbeat(ts) values('2018-11-15 13:43:50') |
| 1095 | dbadxxx       | localhost            | heartbeat_db | Query            |    329 | Waiting for table metadata lock                               | alter table heartbeat add column ts3 date               |
| 1103 | dbadxxx       | localhost            | heartbeat_db | Sleep            |    341 |                                                               | NULL                                                    |
| 1104 | dbadxxx       | localhost            | NULL         | Query            |      0 | starting                                                      | show processlist                                        |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+---------------------------------------------------------+
6 rows in set (0.00 sec)
  • --match-command="Query|Sleep" --victims oldest --busy-time=3

只kill最老的command为Query|Sleep的最老的链接

pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --ignore-user="root|repl"   --match-command="Query|Sleep"      --busy-time=3  --victims oldest

Enter MySQL password:
# 2018-11-15T13:49:07 KILL 1103 (Sleep 330 sec) NULL
# 2018-11-15T13:49:09 KILL 1103 (Sleep 332 sec) NULL
  • --match-command="Query|Sleep" --victims all --busy-time=3

kill 所有command="Query|Sleep" 的所有链接

Enter MySQL password:
# 2018-11-15T13:48:41 KILL 1103 (Sleep 304 sec) NULL
# 2018-11-15T13:48:41 KILL 1095 (Query 292 sec) alter table heartbeat add column ts3 date
# 2018-11-15T13:48:41 KILL 502 (Query 291 sec) insert into heartbeat(ts) values('2018-11-15 13:43:50')
# 2018-11-15T13:48:41 KILL 1104 (Sleep 262 sec) NULL

  • --match-command="Query" --victims all --busy-time=3

kill 所有 command="Query"(默认不填也就是Query)的所有链接

Enter MySQL password:
# 2018-11-15T13:46:59 KILL 1095 (Query 190 sec) alter table heartbeat add column ts3 date
# 2018-11-15T13:46:59 KILL 502 (Query 189 sec) insert into heartbeat(ts) values('2018-11-15 13:43:50')
  • --match-command="Query" --victims oldest --busy-time=3

kill 所有 command="Query"(默认不填也就是Query)的最老的链接

Enter MySQL password:
# 2018-11-15T13:59:01 KILL 1095 (Query 912 sec) alter table heartbeat add column ts3 date

十四、kill掉非系统用户,指定库的所有链接

举例一、kill掉非root和repl账号,使用open_db或xf_loupan_db的所有链接

 pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --ignore-user="root|repl"  --match-db="open_db|xf_loupan_db"  --victims all --kill

dbadmin:xf_loupan_db> select *,sleep(100) from loupan_grade_photo;
ERROR 2013 (HY000): Lost connection to MySQL server during query
dbadmin:xf_loupan_db> select *,sleep(100) from loupan_grade_photo;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    245
Current database: xf_loupan_db

ERROR 2013 (HY000): Lost connection to MySQL server during query


Enter MySQL password:
# 2018-12-26T12:34:46 KILL 243 (Query 7 sec) select *,sleep(100) from loupan_grade_photo
# 2018-12-26T12:35:00 KILL 245 (Query 1 sec) select *,sleep(100) from loupan_grade_photo

特别注意: 这里面只针对db这一列是open_db或xf_loupan_db才会有效果,如果这个用户拥有多个库的权限
那么他在xx库去 select * from xf_loupan_db.loupan_grade_photo 是不会生效的
所以,要保证这个库账号只能对应一个库,才会有效果

比如这样是没有效果的:

root:(none)> show processlist;
+-----+---------+----------------------+------+---------+------+------------+----------------------------------------------------------+
| Id  | User    | Host                 | db   | Command | Time | State      | Info                                                     |
+-----+---------+----------------------+------+---------+------+------------+----------------------------------------------------------+
| 248 | root    | localhost            | NULL | Query   |    0 | starting   | show processlist                                         |
| 252 | pt_kill | xx.xxx.126.166:33088 | NULL | Sleep   |    1 |            | NULL                                                     |
| 257 | dbadmin | localhost            | sys  | Query   |   19 | User sleep | select *,sleep(100) from xf_loupan_db.loupan_grade_photo |
+-----+---------+----------------------+------+---------+------+------------+----------------------------------------------------------+
3 rows in set (0.00 sec)

十五、kill掉指定用户的所有链接

一般我们什么时候使用它呢? 一般是在迁移某个DB,需要将这个DB上的账号链接全部清理掉,防止长连接


举例:kill掉所有lc_rx 和 rc_ronly 两个账号的所有链接

pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-user="lc_rx|rc_ronly"  --victims all --kill



相关实践学习
如何在云端创建MySQL数据库
开始实验后,系统会自动创建一台自建MySQL的 源数据库 ECS 实例和一台 目标数据库 RDS。
全面了解阿里云能为你做什么
阿里云在全球各地部署高效节能的绿色数据中心,利用清洁计算为万物互联的新世界提供源源不断的能源动力,目前开服的区域包括中国(华北、华东、华南、香港)、新加坡、美国(美东、美西)、欧洲、中东、澳大利亚、日本。目前阿里云的产品涵盖弹性计算、数据库、存储与CDN、分析与搜索、云通信、网络、管理与监控、应用服务、互联网中间件、移动服务、视频服务等。通过本课程,来了解阿里云能够为你的业务带来哪些帮助 &nbsp; &nbsp; 相关的阿里云产品:云服务器ECS 云服务器 ECS(Elastic Compute Service)是一种弹性可伸缩的计算服务,助您降低 IT 成本,提升运维效率,使您更专注于核心业务创新。产品详情: https://www.aliyun.com/product/ecs
目录
相关文章
|
关系型数据库 MySQL 数据库
|
SQL 关系型数据库 MySQL
pt-tools系列:pt-online-schema-change 最佳实践
pt的详细步骤 Step 1: Create the new table. Step 2: Alter the new, empty table. This should be very quick, or die if the user specified a bad alter statement.
5012 0
|
Linux 程序员 C语言
start_kernel函数部分分析
start_kernel函数部分分析
156 0
start_kernel函数部分分析
|
编解码 并行计算
wrf--运行real.exe时报错:“Could not find level above ground“ error
在修改wrf初始场资料时,如果做了带通滤波处理,会发现在运行real.exe时报错:“Could not find level above ground” error 。
wrf--运行real.exe时报错:“Could not find level above ground“ error
|
开发者
Stack Overflow 宣布 :将于 3 月 31 日中断 Jobs、Developer Story 等功能
Stack Overflow 宣布 :将于 3 月 31 日中断 Jobs、Developer Story 等功能
156 0
Stack Overflow 宣布 :将于 3 月 31 日中断 Jobs、Developer Story 等功能
|
Linux 计算机视觉
Linux下OpenCV出现错误:ASSERT false in file qasciikey.cpp, line 501
Linux下OpenCV出现错误:ASSERT false in file qasciikey.cpp, line 501
150 2
|
关系型数据库 MySQL
pt-kill
pt-kill 是属于percona-toolkit中的一个工具,作用是kill mysql的连接,pt-kill连接mysql,通过show processlist获得查询. 参数: --kill #kill mysql的连接 --busy-time=m   #匹配运行时...
1062 0