
MySQL是最流行、使用最广泛的开源数据库,上手很快,但对一个DBA来说,要真正掌握MySQL,需要花很多时间进行系统化的学习和实践。MySQL基础运维,需要掌握MySQL软件的安装、配置,了解MySQL大致的架构,业界这方面的书很多。MySQL官方文档要熟悉,知道官方文档包含哪些内容,遇到问题的时候能知道在什么地方查找需要的信息。

MySQL官方文档介绍
每个版本的mysql都有一个对应的参考手册,现在建议可以从MySQL 5.7 reference入手学习。

MySQL安装部署
学习MySQL安装部署的几种方式:rpm包安装、源码编译安装、二进制安装、基于docker的mysql安装部署。
熟悉MySQL软件中的各类文件,了解各种安装方式下mysql二进制文件和lib文件的路径,以及资源文件、数据文件的位置。

常用的mysql命令行工具及其使用
mysql、mysqld、mysqladmin(mysqladmin extended-status)、mysqldump、mysqlcheck、mysqlbinlog、mysql_config。熟悉这些命令的作用,熟悉这些命令的常用参数。

mysql数据库初始化过程
mysql_install_db,mysql 5.7 的 mysqld --bootstrap。

熟悉MySQL实例启动脚本
mysqld_safe,rpm安装的mysql启动脚本。

熟悉MySQL的启动过程
mysql参数文件加载过程。

熟悉MySQL实例相关文件
熟悉mysql实例中各类文件的作用和路径:pid文件、sock文件、error log、slow log、general log,表相关的文件(frm, ibd),innodb共享表空间和日志,binlog、binlog index、relay log、relay log index,master.info、relay-log.info,ssl认证相关文件。
熟悉上面这些文件的默认路径,以及配置这些路径的参数:basedir、datadir、innodb_data_home_dir、innodb_log_group_home_dir、lc_messages_dir、plugin_dir、slave_load_tmpdir、tmpdir。

熟悉MySQL各类配置参数
学习官方文档参数相关内容:参数列表和参数含义。了解参数的作用域,全局参数和session参数。动态参数是不是修改了就立刻生效?是否对当前会话生效?是否对已经建立的老连接生效?文末附了一个小脚本,可以用来验证这几个问题。
单机多实例MySQL安装配置。

使用MySQL
MySQL支持的数据类型:掌握常见数据类型的适用场景。
MySQL字符集:掌握字符集转换,掌握常见引起乱码的原因,掌握字符集相关的几个配置参数的作用(show global variables like '%char%';)。
MySQL内建函数:从官方文档学习MySQL的内建函数。
MySQL DDL:DDL的语法,不同类型DDL的执行过程,了解DDL对业务的影响,了解在线DDL的原理。
MySQL账号和权限体系:MySQL账号认证过程,MySQL支持的不同类型的权限,权限相关的表(user, db, tables_priv)。
MySQL的存储过程:熟悉存储过程的语法,能使用存储过程处理批量数据订正。
MySQL触发器:熟悉触发器的语法,不同类型触发器的作用。
MySQL event scheduler。
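下面是一个简单的验证脚本(示意性质:用MySQLdb连接,连接参数和sort_buffer_size只是举例,需要换成自己环境里的实例和参数)。先在一个老连接上看会话值,修改全局值后,再分别看老连接和新连接的会话值,可以观察到动态全局参数的修改只影响之后新建立的连接。

# -*- coding: utf-8 -*-
import MySQLdb

def session_value(conn, name):
    # 查看当前连接的会话级参数值
    cur = conn.cursor()
    cur.execute("SHOW SESSION VARIABLES LIKE %s", (name,))
    return cur.fetchone()[1]

# 连接参数仅为示例
conn_old = MySQLdb.connect(host='127.0.0.1', port=3306, user='root', passwd='')
print('old conn before: %s' % session_value(conn_old, 'sort_buffer_size'))

cur = conn_old.cursor()
cur.execute("SET GLOBAL sort_buffer_size = 1048576")   # 修改全局值

# 老连接的会话值不会变,新建立的连接会继承新的全局值
print('old conn after : %s' % session_value(conn_old, 'sort_buffer_size'))
conn_new = MySQLdb.connect(host='127.0.0.1', port=3306, user='root', passwd='')
print('new conn       : %s' % session_value(conn_new, 'sort_buffer_size'))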
数据库优化是DBA日常工作中很重要的职责。能在各种场景下优化好数据库,也是DBA能力的重要体现。而SQL优化,是数据库优化中的一项核心任务。如何真正掌握MySQL的SQL优化呢?

MySQL优化

理解执行计划
MySQL中使用explain查看执行计划,需要对执行计划输出中的每一项内容都非常熟悉。官方文档中对此有详细的描述:https://dev.mysql.com/doc/refman/5.7/en/explain-output.html 。《深入理解MariaDB与MySQL》中第四章、第五章对MySQL执行计划和SQL优化的各种方法有详细介绍,建议详细阅读。
执行计划中的几项关键内容:possible_keys, key, key_len, rows。extra列中有时候也会有一些重要的信息,官方文档中对此有详细描述。

Explain extended
执行explain extended之后,再执行show warnings,可以看到一些额外的信息,如字段类型隐式转换导致索引不可用:

+---------+------+------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                  |
+---------+------+------------------------------------------------------------------------------------------+
| Warning | 1739 | Cannot use ref access on index 'ind' due to type or collation conversion on field 'a'   |
| Warning | 1739 | Cannot use range access on index 'ind' due to type or collation conversion on field 'a' |
| Note    | 1003 | /* select#1 */ select `test`.`a`.`a` AS `a` from `test`.`a` where (`test`.`a`.`a` = 1)  |
+---------+------+------------------------------------------------------------------------------------------+

理解索引
索引在SQL优化中起着很重要的作用,需要深入理解索引、联合索引对各类SQL的作用。
理解单表访问路径:全表扫描,索引扫描。
理解索引扫描的过程。
理解覆盖索引和非覆盖索引的差别。
了解最基本的分页SQL优化方法。
不走索引的几种情况,隐式转换的规律。
理解InnoDB Clustered Index。
学习官方文档关于Index的部分:Optimization and Indexes。

SQL优化案例
MySQL的优化器基于COST和一些规则来选择具体的执行路径。学习使用optimizer_trace来观察优化器的优化过程(文末附了一个使用optimizer_trace的小例子)。从官方文档学习optimizer trace的几个相关参数的作用:optimizer_trace、optimizer_trace_features、optimizer_trace_limit、optimizer_trace_max_mem_size、optimizer_trace_offset。

MySQL SQL优化
仔细阅读官方文档中的Optimizing SQL Statements章节的内容,搞清楚下面这些内容的含义:
Range Optimization,参数range_optimizer_max_mem_size的作用,参数eq_range_index_dive_limit的作用。
Index Merge。
Engine Condition Pushdown 和 Index Condition Pushdown。
表关联的方法,nested loop,join buffer的作用,理解参数join_buffer_size。
order by的优化,排序相关几个参数的作用:max_length_for_sort_data, max_sort_length, sort_buffer_size。
group by, distinct。
limit对执行计划的影响。
学习子查询、派生表和视图等相关的优化内容:Optimizing Subqueries, Derived Tables, and View References。

到官方文档查找排序相关参数的作用:

show global variables like '%sort%'
+--------------------------------+---------------------+
| Variable_name                  | Value               |
+--------------------------------+---------------------+
| innodb_disable_sort_file_cache | OFF                 |
| innodb_ft_sort_pll_degree      | 2                   |
| innodb_sort_buffer_size        | 1048576             |
| max_length_for_sort_data       | 1024                |
| max_sort_length                | 1024                |
| myisam_max_sort_file_size      | 9223372036853727232 |
| myisam_sort_buffer_size        | 8388608             |
| sort_buffer_size               | 262144              |
+--------------------------------+---------------------+

其它优化相关的参数
optimizer_search_depth。
optimizer_switch中每个开关的作用,大致了解对应的算法和适用场景。
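前面提到可以用optimizer_trace观察优化器的优化过程,下面是一个示意脚本:打开当前会话的optimizer_trace,执行一条SQL,然后从INFORMATION_SCHEMA.OPTIMIZER_TRACE里读出优化器的决策过程(用MySQLdb连接,连接参数和test.t1这张表只是举例)。

# -*- coding: utf-8 -*-
import MySQLdb

conn = MySQLdb.connect(host='127.0.0.1', user='root', passwd='', db='test')
cur = conn.cursor()

cur.execute("SET optimizer_trace='enabled=on'")   # 只对当前会话生效
cur.execute("SELECT * FROM t1 WHERE a = 1")       # 假设test库里有t1这张表
cur.fetchall()

# trace列是JSON格式,记录了优化器考虑过的各种执行路径和对应的cost
cur.execute("SELECT trace FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE")
for (trace,) in cur.fetchall():
    print(trace)

cur.execute("SET optimizer_trace='enabled=off'")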
常见的数据库很多,包括MySQL,Oracle,SQL Server,PostgreSQL,DB2,新型的数据库如MongoDB,Redis,ElasticSearch,Hbase。国产的数据库如Oceanbase、Polar DB, 分布式数据库如DRDS等。 全面掌握所有数据库的方方面面,当然有难度,一个人的时间是有限的。但是熟练掌握几种平时工作中最常遇到的数据库。或者在新的工作任务中,遇到以前没有接触过的数据库,需要有快速学习的能力,能在几个月或几周的时间里迅速上手,并在工作的过程中持续加深掌握的程度,最终成为某一数据库的熟练工,成为专家。 何为熟练?就是遇到任何问题,都能迅速找到解决思路。 何为专家?就是大家遇到某个解决不了的问题时,第一个想要求助的人就是你,而且在这个问题上,你通常不会让大家失望,能得到满意的答案,那你就是解决这个问题的专家。 数据库种类很多,运维的技术栈很宽,阿里云云产品种类繁多,客户会遇到的问题也纷繁复杂,很难有一个人在所有的方面都是专家,但是每个人都能在其中某几个方面成为专家。 培养良好的运维意识和习惯 具备基本的职业道德。运维和DBA的权限很大,要很清楚哪些事情是不能做的。不查看和工作无关的数据,不泄漏帐号,不泄漏数据,不执行未经授权的操作。也要清楚哪些事情是要去做的。要有责任心,认真完成每天的日常工作。伙伴遇到麻烦了,要有意愿去帮忙。客户遇到麻烦了,有些事情不在服务范围,但在我们的能力范围内的事情,也要通过合适的途径去支持。心存敬畏之心,不作恶,与人为善。 运维风险意识。权限越大,风险越大。任何操作都要谨慎,了解任何一个操作的影响(如大表DDL的影响),进行不可逆操作时要尤其谨慎。 良好的工作习惯。生产环境的变更要做好变更计划、审批,在测试环境验证。操作前确认操作环境。注意观察和记录变更过程中的日志。变更过程中和变更完成后要确认上下游是否都正常。平时工作中多做记录,多总结分享。细心,遇到疑惑的问题多思考、讨论。 团队合作。团队合作分两方面看,首先要承担自己在团队中的那部分职责,做好自己的本职工作,这是最基本的。同时也要关注团队中发生的事情,如果别的小伙伴遇到麻烦,要能主动提供帮助。另一方面,要利用团队的力量来完成一些任务。比如我因各种原因休假,就要提前安排好相关工作,让团队中其它人能临时处理相关任务。比如通宵变更发布,那第二天,要让团队中其它人能顶上来做白天的值班和相关任务处理。怎么让其他人能接替我在做的任务,或者我如何顺利接手其他伙伴手头的任务?要在平时的工作中培养,要互相关注,了解其它同学在做的事情。避免成为单点。 应对故障。要迅速评估故障的业务影响面,定位故障原因。重大问题及时(5分钟)升级。 提高效率,学会使用自动化工具完成重复的工作。 学会学习 要有好的学习渠道。各类数据库的官方文档是最重要的学习资料。MySQL,Oracle,SQL Server,MongoDB,Redis,ElasticSearch等产品都有很详细的官方文档。学会看官方文档。官方文档通常很长,要熟悉文档的结构,遇到问题,能快速在文档中定位到相关的章节。平时也要花一些时间系统地阅读文档,关于数据库架构、运维、优化,很多问题都能从文档里都能找到靠谱的答案。除了官方文档,也要阅读每个领域内写得最好的书。 要实践,做实验,设计一些场景,验证文档中的内容。 在工作中学习,从实践中学习,理论的知识要能和实际工作中遇到的问题结合起来。用理论指导实践,在实践中加深对理论的理解。工作中遇到的每一个难题,每一个自己不曾遇到过的问题,每一次故障,都是很好的学习的机会。每一个问题都有一个简单,但是没用的答案。要花时间找到那个真正的答案。 从别人的工作中学习,向同行学习,向合作的伙伴学习。要找到每个领域内做得好的那些人,看看他们是怎么思考问题解决问题的,看看他们写的书和文章。 总结、沉淀、提高。要善于总结,处理了一个故障,就把故障处理的过程记录下来。问题是怎么发现的,有哪些征兆,怎么精确定位问题的,最终又是通过什么方法解决问题的。用于分析问题的理论,是不是能解释所有现象,是不是有存在矛盾的地方。暂时无法搞清楚所有的问题也不要紧,把现场的各种数据保留下来,进行归类整理。以后也可能会遇到相同或类似的问题,当时想不清楚的问题,或许过一段时间就能想清楚了。不要轻易放个一个问题,一个疑点。 持续学习。每天学习。充分利用好自己的时间。如果每天比别人多花1-2个小时学习,一年就能多几百个小时。长期持续下来,一定能有所收获。 学会使用数据库 熟练掌握常用的访问数据库的工具,包括命令行工具和图形化工具。如MySQL客户端,Oracle的sqlplus,SQL Server的企业管理器、SQLCMD,ElasticSearch的kibana。 掌握和数据库交互的语言。如SQL(DML,DDL),redis的各类command,mongodb、ElasticSearch的DSL。学会使用存储过程和触发器完成特定的需求。 掌握数据库的各类对象(如各种类型的表、索引、分区,redis的list、dict、kv,mongodb的collection、document、index,elasticsearch的index、分片)。 熟悉数据库支持的各自数据类型和存储方式(精确数字类型int,decimal,非精确的数值类型如float、double,文本类型char,varchar, json,日期和时间),熟悉各类数据类型的使用场景,熟悉操作各类数据类型的常见函数。掌握字符编码和字符集的概念,掌握各种乱码问题的分析、解决。 能使用脚本自动化处理工作中的需求,如批量数据导入导出、大批量数据订正。 学习使用常用的编程语言访问数据库,学习常用的编程接口。如使用jdbc访问数据库,使用python、perl等接口访问数据库。安装部署数据库 熟悉数据库的各种安装部署方式。如rpm安装、二进制安装、源码编译安装,Oracle静默安装,SQL Server脚本安装。 熟悉数据库的参数配置方式,核心配置参数的作用和设置标准。(如oracle的pfile, spfile。mysql的session参数和全局参数、参数生效范围) 熟悉组成数据库的各类文件和目录结构。(如软件二进制文件和lib,oracle的配置文件、控制文件、数据文件、日志文件、归档文件、告警日志、监听日志。MySQL的配置文件、数据文件、binlog、relaylog、慢日志、错误日志等) 熟悉组成数据库实例的相关进程和线程,监听端口。(如oracle的监听、各个进程pmon,smon,dbwr,lgwr等,mysql的各类线程如master thread、server thread等) 掌握数据库高可用环境的搭建。如Oracle的DG,SQL Server的mirror、always on,MySQL的slave, group replication。使用proxy进行读写分离和高可用环境的搭建。使用keepalived、mha等进行自动切换。 了解数据库集群的运作方式,数据如何分片,查询如何路由、聚合,分布式事物。 掌握数据库备份和恢复 熟练掌握数据库备份的原理、工具、操作。理解数据库保障数据不丢的机制。掌握不同备份类型的适用场景和作用,如MySQL的物理备份、逻辑备份、binlog备份,全量备份、增量备份、日志备份、快照备份等方式。给自己运维的数据库制定合理的备份策略,备份文件的存储和保存策略。 熟练掌握各类数据库各种备份方式的恢复原理、工具、操作。熟练掌握各种异常场景下的数据恢复过程。如误操作delete如何恢复,truncate table如何恢复,sql server数据文件损害如何恢复,oracle控制文件丢失如何恢复,在线redo日志丢失如何恢复。如何将数据库恢复到一个指定的时间点。如何快速恢复数据。 定期演练。定期验证备份的有效性。定期模拟各种数据丢失场景进行恢复演练。 掌握数据库运行状况 学会观察数据库的内部运行情况,观察数据库的工作负载和资源消耗情况。如通过mysql的global status, innodb status, Oracle的sysstat, Sql Server的performance counter观察数据库的QPS,TPS,网络吞吐量,IO请求情况等,CPU使用情况。 观察数据库session运行情况。如mysql的process list,oracle的v$session, sql 
server的task运行情况,了解任务的运行时间、运行状态,任务是谁发起的,执行了什么操作。 观察实例SQL的整体运行情况。如mysql的慢sql,oracle的awr,或通过各类审计工具,或通过一些解析网络报文的工具统计sql执行情况,哪些sql执行次数多,哪些sql运行时间多,哪些sql消耗资源多。 观察空间使用情况。观察各类文件占用的空间情况,如数据文件、日志文件、临时文件的空间使用情况。观察表空间和表的空间占用情况。 观察实例内部锁定情况,其它各类等待事件。oracle可以通过等待事件相关视图,active_session_history,sql trace等方式查看细粒度的情况,其它数据库或多或少有一些途径能查看内部等待情况。 观察实例内存使用情况。 观察数据库告警日志。如oracle 的alert log, 监听日志,mysql的错误日志。通过各类告警日志有时能发现其他途径观察不到或容易遗漏的问题。 观察单个sql的执行情况,熟练掌握各类SQL优化的技能。熟悉优化器和sql的执行计划,通过各种方法优化sql执行效率。 对于集群数据库,观察集群的运行情况,观察各个节点的运行状况,集群间的通信情况,不同节点的负载分配情况。 日常经常用到的工具、命令、小脚本、SQL语句等,记录下来,整理好,遇到问题时能快速拿来复用。 熟悉数据库的运行环境 我们接触到的大部分数据库运行在linux环境下。要熟悉linux的使用。从os层面查看各类资源(cpu,内存,网络流量,IO等,文件系统,tcp连接等)的消耗情况。熟悉常用工具的使用。查看OS的各类日志,如/var/log/message。OS的参数配置,如网络栈参数配置,系统资源限制(ulimit)。 如果是windows,同样有完善的工具集,能查看各种数据。 了解硬件的能力,cpu、io设备、网络设备等能力。 熟悉tcp网络协议。我们接触到的数据库,基本上绝大多数都是使用tcp协议通信。熟悉网络协议,学会使用网络报文分析,在解决某些类型的问题上很有用。 学习一些理论的知识 我们接触的很多数据库,都属于关系型数据库。学习一些关系型数据库的理论。大学的数据库课程,就是很好的一个基础。学习关系型理论中的几个范式,每种范式都是为了解决什么问题,有什么缺点。 关系型数据库理论中, transaction是很基础的一个概念。有时间可以读读《Transaction Processing》这本书。据说InnoDB的设计,很多就是从这本书里参考的。关于事物,隔离级别,锁,数据一致性,Btree索引,数据库文件结构,这本书里有详细的描述。 对于分布式数据库,有时间可以学习一些分布式的理论,Cap理论,保障多节点数据一致性的paxos算法等。 学习一些新型数据库的存储结构,如LSM tree,Hbase,OceanBase,LevelDB等数据库就使用LSM tree的存储模型。也可以了解下Elasticsearch底层索引的存储格式。 这些理论知识不一定你的工作有直接的帮助,但是能帮你更好的认识各种数据库。 学习一些操作系统、网络的基础知识。计算机系大学课程应该都有。 学习一些编程和数据结构和算法的知识。计算机系大学课程应该都有。学习几门编程语言。python就很好,其它一些使用广泛的语言有时间也可以学学,golang,javascript,c/c++,当然还有java 。 我们接触的大多是OLTP类型的数据库、OLTP类型的业务场景。有时间也可以了解一些数据仓库的知识,了解数仓分层建模的方式,了解MaxCompute或Hive的sql是如何转化为分布式的map reduce job进行运算、join,了解一些实时计算的原理。
使用single_beat,但是发现不起作用,2台机器都启动了celery beat。

root@web01:# ps -elf | grep beat
0 S root      5719  5676  0  80   0 - 12807 ep_pol May18 ?        00:00:35 single-beat celery -A taskqueue beat -l info
0 S root      6137  5719  0  80   0 - 49567 poll_s May18 ?        00:00:13 celery -A taskqueue beat -l info

root@web02:# ps -elf | grep beat
0 S root     12872 12869  0  80   0 - 12807 ep_pol May18 ?        00:00:35 single-beat celery -A taskqueue beat -l info
0 S root     13326 12872  0  80   0 - 49568 poll_s May18 ?        00:00:14 celery -A taskqueue beat -l info

到redis看了下:

> get SINGLE_BEAT_myserver
"iZx33xw1xxT:28827"

redis里记录的还是旧的主机名,而机器的主机名已经改成了web01、web02。重启web01、web02上的single-beat程序,再次查看:

> get SINGLE_BEAT_myserver
"web02:8234"

这次beat只启动了一个。

root@web01:# ps -elf | grep beat
0 S root     15307  5676  0  80   0 - 12807 ep_pol 15:18 ?        00:00:00 single-beat celery -A taskqueue beat -l info

root@web02:# ps -elf | grep beat
0 S root      8003 12869  0  80   0 - 12807 ep_pol 15:17 ?        00:00:00 single-beat celery -A taskqueue beat -l info
0 S root      8234  8003  0  80   0 - 49568 poll_s 15:18 ?        00:00:00 celery -A taskqueue beat -l info

所以,修改主机名之后,single-beat需要重启一下。
cx_Oracle是python连Oracle常用的lib,今天介绍如何使用静态链接的方式build cx_Oracle,避免对oracle共享库的依赖。 首先需要生成oracle客户端的静态lib: cd $ORACLE_HOME/bin ./genclntst ar: creating /u01/app/oracle/product/11.2.0/db_1/lib/libclntst11.a Created /u01/app/oracle/product/11.2.0/db_1/lib/libclntst11.a 再build cx_Oracle python setup.py build running build running build_ext building 'cx_Oracle' extension gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/u01/app/oracle/product/11.2.0/db_1/rdbms/demo -I/u01/app/oracle/product/11.2.0/db_1/rdbms/public -I/usr/include/python2.6 -c cx_Oracle.c -o build/temp.linux-x86_64-2.6-11g/cx_Oracle.o -DBUILD_VERSION=5.2.1 creating build/lib.linux-x86_64-2.6-11g gcc -pthread -shared build/temp.linux-x86_64-2.6-11g/cx_Oracle.o -L/u01/app/oracle/product/11.2.0/db_1/lib -L/usr/lib64 -lclntsh -lpython2.6 -o build/lib.linux-x86_64-2.6-11g/cx_Oracle.so 最后一行命令,使用了-lclntsh来链接Oracle的客户端动态库,我们需要把这个改成静态lib就可以了 gcc -pthread -shared build/temp.linux-x86_64-2.6-11g/cx_Oracle.o /u01/app/oracle/product/11.2.0/db_1/lib/libclntst11.a -L/usr/lib64 -lpython2.6 -o build/lib.linux-x86_64-2.6-11g/cx_Oracle.so 再运行 python setup.py install running install running bdist_egg ... creating 'dist/cx_Oracle-5.2.1-py2.6-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it 在dist目录下生成了cx_Oracle-5.2.1-py2.6-linux-x86_64.egg,这个就是可以直接使用的cx_Oracle,对oracle动态库没有依赖。
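build完成后,可以用一小段代码验证静态链接的cx_Oracle能否正常连接(连接串只是示例,需要换成自己的实例):

import cx_Oracle

# 连接串格式:user/password@host:port/service_name,这里只是示例
conn = cx_Oracle.connect('scott/tiger@127.0.0.1:1521/orcl')
cur = conn.cursor()
cur.execute("select sysdate from dual")
print(cur.fetchone()[0])
conn.close()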
如何build一个静态编译的MySQL-python 这里的静态是指静态链接mysqlclient lib。 MySQL-python是python访问MySQL数据库常用的一个模块,但是它的编译依赖os的很多环境, build-requires = python-devel mysql-devel zlib-devel openssl-devel 要是能先静态编译好,就不需要依赖这些环境了。 修改site.cfg static = True python setup.py build building '_mysql' extension creating build/temp.linux-x86_64-2.6 gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -Dversion_info=(1,2,5,'final',1) -D__version__=1.2.5 -I/usr/local/mysql/include -I/usr/include/python2.6 -c _mysql.c -o build/temp.linux-x86_64-2.6/_mysql.o -fPIC -g -fabi-version=2 -fno-omit-frame-pointer -fno-strict-aliasing 在包含自 _mysql.c:44 的文件中: /usr/local/mysql/include/my_config.h:443:1: 警告:“HAVE_WCSCOLL”重定义 在包含自 /usr/include/python2.6/pyconfig.h:6 的文件中, 从 /usr/include/python2.6/Python.h:8, 从 _mysql.c:29: /usr/include/python2.6/pyconfig-64.h:808:1: 警告:这是先前定义的位置 gcc -pthread -shared build/temp.linux-x86_64-2.6/_mysql.o /usr/local/mysql/lib/libmysqlclient.a -L/usr/local/mysql/lib -L/usr/lib64 -lpthread -lm -lrt -ldl -lpython2.6 -o build/lib.linux-x86_64-2.6/_mysql.so 看上面最后的gcc命令,链接的是MySQL静态lib: libmysqlclient.a, 如果site.cfg里配置了 thread_safe=True,会链接 libmysqlclient_r.a。 gcc -pthread -shared build/temp.linux-x86_64-2.6/_mysql.o /usr/local/mysql/lib/libmysqlclient_r.a -L/usr/local/mysql/lib -L/usr/lib64 -lmysqlclient -lpthread -lm -lrt -ldl -lpython2.6 -o build/lib.linux-x86_64-2.6/_mysql.so 上面的命令还链接了 -lmysqlclient动态库,把这个去掉再执行。 build好之后, python setup.py install >>> import MySQLdb /usr/lib64/python2.6/site-packages/MySQL_python-1.2.5-py2.6-linux-x86_64.egg/_mysql.py:3: UserWarning: Module _mysql was already imported from /usr/lib64/python2.6/site-packages/MySQL_python-1.2.5-py2.6-linux-x86_64.egg/_mysql.pyc, but /root/MySQL-python-1.2.5 is being added to sys.path Traceback (most recent call last): File "<stdin>", line 1, in <module> File "MySQLdb/__init__.py", line 19, in <module> import _mysql File "build/bdist.linux-x86_64/egg/_mysql.py", line 7, in <module> File "build/bdist.linux-x86_64/egg/_mysql.py", line 6, in __bootstrap__ ImportError: /root/.python-eggs/MySQL_python-1.2.5-py2.6-linux-x86_64.egg-tmp/_mysql.so: undefined symbol: __cxa_pure_virtual import 报错,网络上搜一下,可以找到解决方案:把gcc换成g++ g++ -pthread -shared build/temp.linux-x86_64-2.6/_mysql.o /usr/local/mysql/lib/libmysqlclient_r.a -L/usr/local/mysql/lib -L/usr/lib64 -lpthread -lm -lrt -ldl -lpython2.6 -o build/lib.linux-x86_64-2.6/_mysql.so python setup.py install 这次安装之后,MySQLdb就可以正常访问,连接数据库了。 如何在其它机器上安装 以 python 2.6为例 把上面安装好的python egg文件(cp /usr/lib64/python2.6/site-packages/MySQL_python-1.2.5-py2.6-linux-x86_64.egg) copy到其他机器上对应的目录, cp MySQL_python-1.2.5-py2.6-linux-x86_64.egg /usr/lib64/python2.6/site-packages/ echo MySQL_python-1.2.5-py2.6-linux-x86_64.egg > /usr/lib64/python2.6/site-packages/pythonmysql.pth 就可以直接使用MySQLdb模块了。不依赖机器上的mysqlclient。
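copy过去之后,可以用一小段代码确认MySQLdb模块能正常加载和连接(连接参数只是示例):

import MySQLdb

conn = MySQLdb.connect(host='127.0.0.1', user='root', passwd='', db='test')
cur = conn.cursor()
cur.execute("SELECT VERSION()")
print(cur.fetchone()[0])
conn.close()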
使用python requests访问一个rest api时,总是报错:

403 {"detail":"Invalid username/password."}

换成curl访问就没有问题。为什么会这样呢?

把requests的请求header打印出来,看到加了Authorization头,但是代码里并没有使用任何认证。

{'Content-Length': '650', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'user-agent': 'a', 'Connection': 'keep-alive', 'content-type': 'application/json', 'Authorization': 'Basic dGVzdDoxMjM='}

猜测和机器环境有关系。用户认证信息应该是存在某个文件中的吧?用strace看一下打开了哪些文件:

strace python test.py
open("/usr/lib64/python2.7/shlex.py", O_RDONLY) = 4
open("/usr/lib64/python2.7/shlex.pyc", O_RDONLY) = 5
open("/root/.netrc", O_RDONLY) = 3
open("/usr/lib64/python2.7/encodings/idna.py", O_RDONLY) = 3
open("/usr/lib64/python2.7/encodings/idna.pyc", O_RDONLY) = 4
open("/usr/lib64/python2.7/stringprep.py", O_RDONLY) = 4

访问了一个可疑文件 ~/.netrc:

$ cat ~/.netrc
default login test password 123

把这个文件拿掉,接口访问就没有报错了。查一下requests的文档,requests果然会读取.netrc文件:http://docs.python-requests.org/en/master/user/authentication/?highlight=netrc
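如果不想删除机器上的.netrc文件,也可以在代码里显式关闭requests对环境配置(包括.netrc)的读取。下面是一个简单的示意(URL只是举例):

import requests

session = requests.Session()
session.trust_env = False   # 不读取.netrc以及环境变量里的代理等配置

resp = session.get('http://example.com/api/resource')
print(resp.status_code)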
Linux tail -n的功能是如何实现的?如果文件非常大,如何快速地取到文件的最后n行数据呢?

使用strace命令跟踪一下,发现如下的输出:

strace tail -1000 test.log 2>&1 | grep lseek
lseek(3, 0, SEEK_CUR)        = 0
lseek(3, 0, SEEK_END)        = 52876697
lseek(3, 52871168, SEEK_SET) = 52871168
lseek(3, 52862976, SEEK_SET) = 52862976
lseek(3, 52854784, SEEK_SET) = 52854784
...

大致过程是:
1. 使用SEEK_END找到文件的大小(为52876697)。
2. 使用SEEK_SET定位到文件的最后一个page。从上面这个case看,page大小是8192,最后一个page没满8192字节,seek到 52876697 - 52876697 % 8192 == 52871168 这个position。
3. 读取这一个page的所有数据,统计总共出现多少个换行。
4. 如果里面的行数不到n,再往前seek一个page(52871168 - 8192 == 52862976),读取数据,统计行数。
5. 循环步骤4,直到取到n行数据。
6. 顺序读取数据,输出n行数据。
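按照同样的思路,可以用python写一个简化版的tail -n(示意实现:只处理普通文件,按固定大小的block从文件末尾往回读):

# -*- coding: utf-8 -*-
import os
import sys

def tail(path, n, block_size=8192):
    """仿照tail的做法:seek到文件末尾,按block往回读,
    直到读到的数据里换行数足够n行,再取出最后n行。"""
    with open(path, 'rb') as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        data = b''
        while pos > 0 and data.count(b'\n') <= n:
            read_size = min(block_size, pos)
            pos -= read_size
            f.seek(pos, os.SEEK_SET)
            data = f.read(read_size) + data
        return data.splitlines(True)[-n:]

if __name__ == '__main__':
    # 用法:python tail.py 文件名 行数
    for line in tail(sys.argv[1], int(sys.argv[2])):
        sys.stdout.write(line.decode('utf-8', 'replace'))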
交付服务器或数据库的时候,我们需要对服务器和数据库的性能有一定的了解。可以使用sysbench对系统做一些压测。 sysbench sysbench下载 https://github.com/akopytov/sysbench 可以使用0.5版本,支持lua插件压测数据库 sysbench 数据库压测lua脚本例子 https://github.com/percona/sysbench-scripts Usage: sysbench --test=<test-name> [options]... <command> Commands: prepare run cleanup help version General options: --num-threads=N number of threads to use [1] --max-requests=N limit for total number of requests [10000] --max-time=N limit for total execution time in seconds [0] --forced-shutdown=STRING amount of time to wait after --max-time before forcing shutdown [off] --thread-stack-size=SIZE size of stack per thread [64K] --tx-rate=N target transaction rate (tps) [0] --report-interval=N periodically report intermediate statistics with a specified interval in seconds. 0 disables intermediate reports [0] --report-checkpoints=[LIST,...]dump full statistics and reset all counters at specified points in time. The argument is a list of comma-separated values representing the amount of time in seconds elapsed from start of test when report checkpoint(s) must be performed. Report checkpoints are off by default. [] --test=STRING test to run --debug=[on|off] print more debugging info [off] --validate=[on|off] perform validation checks where possible [off] --help=[on|off] print help and exit --version=[on|off] print version and exit [off] --rand-init=[on|off] initialize random number generator [off] --rand-type=STRING random numbers distribution {uniform,gaussian,special,pareto} [special] --rand-spec-iter=N number of iterations used for numbers generation [12] --rand-spec-pct=N percentage of values to be treated as 'special' (for special distribution) [1] --rand-spec-res=N percentage of 'special' values to use (for special distribution) [75] --rand-seed=N seed for random number generator, ignored when 0 [0] --rand-pareto-h=N parameter h for pareto distibution [0.2] --config-file=FILENAME File containing command line options Log options: --verbosity=N verbosity level {5 - debug, 0 - only critical messages} [3] --percentile=N percentile rank of query response times to count [95] General database options: --db-driver=STRING specifies database driver to use ('help' to get list of available drivers) --db-ps-mode=STRING prepared statements usage mode {auto, disable} [auto] --db-debug=[on|off] print database-specific debug information [off] Compiled-in database drivers: mysql - MySQL driver mysql options: --mysql-host=[LIST,...] MySQL server host [localhost] --mysql-port=N MySQL server port [3306] --mysql-socket=[LIST,...] 
MySQL socket --mysql-user=STRING MySQL user [sbtest] --mysql-password=STRING MySQL password [] --mysql-db=STRING MySQL database name [sbtest] --mysql-table-engine=STRING storage engine to use for the test table {myisam,innodb,bdb,heap,ndbcluster,federated} [innodb] --mysql-engine-trx=STRING whether storage engine used is transactional or not {yes,no,auto} [auto] --mysql-ssl=[on|off] use SSL connections, if available in the client library [off] --mysql-compression=[on|off] use compression, if available in the client library [off] --myisam-max-rows=N max-rows parameter for MyISAM tables [1000000] --mysql-debug=[on|off] dump all client library calls [off] --mysql-ignore-errors=[LIST,...]list of errors to ignore, or "all" [1213,1020,1205] --mysql-dry-run=[on|off] Dry run, pretent that all MySQL client API calls are successful without executing them [off] Compiled-in tests: fileio - File I/O test cpu - CPU performance test memory - Memory functions speed test threads - Threads subsystem performance test mutex - Mutex performance test See 'sysbench --test=<name> help' for a list of options for each test. cpu sysbench --test=cpu run 指定多个线程: sysbench --test=cpu --num-threads=30 run 指定测试时间: sysbench --test=cpu --max-time=10 run 指定测试请求数 sysbench --test=cpu --max-requests=100000 --num-threads=100 run io $ sysbench --test=fileio help sysbench 0.5: multi-threaded system evaluation benchmark fileio options: --file-num=N number of files to create [128] --file-block-size=N block size to use in all IO operations [16384] --file-total-size=SIZE total size of files to create [2G] --file-test-mode=STRING test mode {seqwr, seqrewr, seqrd, rndrd, rndwr, rndrw} --file-io-mode=STRING file operations mode {sync,async,mmap} [sync] --file-async-backlog=N number of asynchronous operatons to queue per thread [128] --file-extra-flags=STRING additional flags to use on opening files {sync,dsync,direct} [] --file-fsync-freq=N do fsync() after this number of requests (0 - don't use fsync()) [100] --file-fsync-all=[on|off] do fsync() after each write operation [off] --file-fsync-end=[on|off] do fsync() at the end of test [on] --file-fsync-mode=STRING which method to use for synchronization {fsync, fdatasync} [fsync] --file-merged-requests=N merge at most this number of IO requests if possible (0 - don't merge) [0] --file-rw-ratio=N reads/writes ratio for combined test [1.5] io测试中比较重要的几个参数: file-test-mode: 读还是写?顺序访问还是随机访问? 
对于数据库,需要测试重点关注随机IO的性能。 file-fsync-freq: 测试fsync的性能。 file-rw-ratio:读写比例 io 测试 准备测试文件 mkdir test cd test sysbench --test=fileio --file-test-mode=rndrw prepare dtstack@test01:~/test$ ls -lh total 2.0G -rw------- 1 dtstack dtstack 16M Jun 28 23:00 test_file.0 -rw------- 1 dtstack dtstack 16M Jun 28 23:00 test_file.1 -rw------- 1 dtstack dtstack 16M Jun 28 23:00 test_file.10 -rw------- 1 dtstack dtstack 16M Jun 28 23:00 test_file.100 -rw------- 1 dtstack dtstack 16M Jun 28 23:00 test_file.101 运行压测, 随机读写 sysbench --test=fileio --file-test-mode=rndrw --max-requests=100000 --num-threads=16 run 压测运行过程中,注意观察io性能。比如我们的这个测试机器,写入iops 500左右,写数据吞度量8M/s, 响应时间近30ms,性能比较一般。 iostat -kx 3 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sda 0.33 115.00 5.33 488.00 114.67 7934.67 32.63 13.67 27.76 14.00 27.91 2.03 100.00 avg-cpu: %user %nice %system %iowait %steal %idle 9.62 0.17 3.12 54.37 0.00 32.73 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sda 0.00 121.00 5.00 474.00 128.00 7732.00 32.82 14.15 29.00 6.40 29.24 2.09 100.00 avg-cpu: %user %nice %system %iowait %steal %idle 8.98 0.17 2.12 62.49 0.00 26.25 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sda 2.33 118.00 3.67 488.67 106.67 7909.33 32.56 14.56 30.03 17.09 30.12 2.03 100.00 测试内存分配和访问 dtstack@test01:~/test$ sysbench --test=memory help sysbench 0.5: multi-threaded system evaluation benchmark memory options: --memory-block-size=SIZE size of memory block for test [1K] --memory-total-size=SIZE total size of data to transfer [100G] --memory-scope=STRING memory access scope {global,local} [global] --memory-hugetlb=[on|off] allocate memory from HugeTLB pool [off] --memory-oper=STRING type of memory operations {read, write, none} [write] --memory-access-mode=STRING memory access mode {seq,rnd} [seq] dtstack@test01:~/test$ sysbench --test=memory --num-threads=64 run sysbench 0.5: multi-threaded system evaluation benchmark Running the test with following options: Number of threads: 64 Random number generator seed is 0 and will be ignored Initializing worker threads... Threads started! Operations performed: 104857600 (890624.94 ops/sec) 102400.00 MB transferred (869.75 MB/sec) MySQL压测 sysbench 0.5 使用lua插件的形式压测数据库。 --test 参数指定lua脚本,lua脚本的内容参考下面的例子, 或者参考 https://github.com/percona/sysbench-scripts dtstack@test01:~/junda/sysbench-scripts/timeseries$ sysbench --test=insertv1.lua --mysql-host=127.0.0.1 --mysql-user=root --mysql-password=xx --max-requests=1000 prepare sysbench 0.5: multi-threaded system evaluation benchmark Creating table 'sensordata' ... dtstack@test01:~/junda/sysbench-scripts/timeseries$ sysbench --test=insertv1.lua --mysql-host=127.0.0.1 --mysql-user=root --mysql-password=xx --max-requests=1000 run sysbench 0.5: multi-threaded system evaluation benchmark Running the test with following options: Number of threads: 1 Random number generator seed is 0 and will be ignored Initializing worker threads... 最简单的lua脚本,往一个表插入数据 $ cat test.lua function event(thread_id) db_query("insert into t1(a ) values(100)") end $ cat create.lua function event(thread_id) for i = 1,10 do db_query("create table t" .. i .. "(a int)") end end $ cat drop.lua function event(thread_id) for i = 1,10 do db_query("drop table if exists t" .. i ) end end
MySQL性能优化 平时我们在使用MySQL的时候,怎么评估系统的运行状态,怎么快速定位系统瓶颈,又如何快速解决问题呢? 本文总结了多年来MySQL优化的经验,系统介绍MySQL优化的方法。 OS性能分析 使用top观察top cpu/memory进程 ~ top top - 09:34:29 up 10 days, 20:11, 1 user, load average: 0.61, 0.59, 0.60 Tasks: 208 total, 1 running, 207 sleeping, 0 stopped, 0 zombie %Cpu0 : 7.0 us, 2.3 sy, 0.0 ni, 90.0 id, 0.3 wa, 0.3 hi, 0.0 si, 0.0 st %Cpu1 : 3.0 us, 2.7 sy, 0.7 ni, 91.3 id, 2.0 wa, 0.3 hi, 0.0 si, 0.0 st KiB Mem: 4046820 total, 3851092 used, 195728 free, 133212 buffers KiB Swap: 0 total, 0 used, 0 free. 270700 cached Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 1035 root 20 0 436892 260444 2444 S 4.0 6.4 91:51.04 celery 9644 root 25 5 845648 120256 1808 S 1.3 3.0 60:00.06 python 3718 root 20 0 808752 6292 2224 S 0.7 0.2 27:55.43 AliHids 4322 root 20 0 250296 5020 2016 S 0.7 0.1 2:28.77 zabbix_server 16331 admin 20 0 1575644 307948 1252 S 0.7 7.6 5:04.47 ruby2.1 22518 997 20 0 759948 251540 1184 S 0.7 6.2 73:24.72 mysqld 7 root 20 0 0 0 0 S 0.3 0.0 16:37.29 rcu_sched 8 root 20 0 0 0 0 S 0.3 0.0 15:10.74 rcuos/0 top - 09:35:29 up 10 days, 20:12, 1 user, load average: 0.22, 0.48, 0.56 Tasks: 207 total, 4 running, 203 sleeping, 0 stopped, 0 zombie %Cpu(s): 4.5 us, 2.5 sy, 0.7 ni, 90.8 id, 1.2 wa, 0.3 hi, 0.0 si, 0.0 st PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 13103 mysql 20 0 1752220 580048 3852 S 0.3 14.3 5:01.95 mysqld 16331 admin 20 0 1575644 307948 1252 S 0.0 7.6 5:04.62 ruby2.1 1035 root 20 0 436892 260444 2444 S 2.0 6.4 91:52.69 celery 1034 root 20 0 433440 256924 2436 S 0.0 6.3 92:49.27 celery 13841 admin 20 0 681696 254392 2364 S 0.0 6.3 0:03.20 ruby2.1 13844 admin 20 0 681564 254180 2360 S 0.0 6.3 0:03.21 ruby2.1 13838 admin 20 0 680188 252484 2276 S 0.0 6.2 0:02.68 ruby2.1 22518 997 20 0 759948 251536 1180 S 0.7 6.2 73:24.95 mysqld 13189 admin 20 0 675812 250620 2448 S 0.0 6.2 0:21.44 ruby2.1 9644 root 25 5 845648 120300 1808 S 2.0 3.0 60:01.63 python 使用mpstat观察每个CPU核心的CPU使用情况 ~ mpstat -P ALL 1 Linux 3.13.0-32-generic (dtstack-dev1) 04/16/2016 _x86_64_ (2 CPU) 09:36:22 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle 09:36:23 AM all 3.02 0.50 2.01 1.01 0.00 0.00 0.00 0.00 0.00 93.47 09:36:23 AM 0 3.00 0.00 1.00 0.00 1.00 0.00 0.00 0.00 0.00 95.00 09:36:23 AM 1 2.97 0.99 2.97 1.98 0.00 0.00 0.00 0.00 0.00 91.09 使用iostat观察系统io状况 ~ iostat -kx 3 Linux 3.13.0-32-generic (dtstack-dev1) 04/16/2016 _x86_64_ (2 CPU) avg-cpu: %user %nice %system %iowait %steal %idle 6.86 0.18 3.08 3.41 0.09 86.39 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util xvda 1.13 18.64 14.80 50.55 237.69 804.84 31.91 1.38 21.16 58.68 10.18 1.14 7.48 使用sar -n DEV观察网卡流量 ~ sar -n DEV 3 Linux 3.13.0-32-generic (dtstack-dev1) 04/16/2016 _x86_64_ (2 CPU) 09:40:02 AM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s %ifutil 09:40:05 AM veth9b40f2c 68.67 64.67 5.81 6.50 0.00 0.00 0.00 0.00 09:40:05 AM eth0 76.33 69.00 5.26 5.79 0.00 0.00 0.00 0.00 09:40:05 AM eth1 162.33 159.00 26.10 27.22 0.00 0.00 0.00 0.00 09:40:05 AM lo 133.00 133.00 30.23 30.23 0.00 0.00 0.00 0.00 使用vmstat查看系统内存使用情况 ~ vmstat -w 3 100 procs ---------------memory-------------- ---swap-- -----io---- -system-- ------cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 0 0 0 265108 119284 218556 0 0 119 403 19 1 7 3 86 3 0 0 0 0 261616 119316 222416 0 0 4 6096 1198 1732 8 4 81 7 0 查看系统日志 ~ tail -2 /var/log/kern.log Apr 16 04:07:18 localhost kernel: [917047.906866] Out of memory: Kill process 2063 (ruby2.1) score 84 or sacrifice child Apr 16 
04:07:18 localhost kernel: [917047.907696] Killed process 2063 (ruby2.1) total-vm:710128kB, anon-rss:342320kB, file-rss:0kB 使用dstat ~ dstat -a -i --tcp --vm ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system-- ----interrupts--- ----tcp-sockets---- -----virtual-memory---- usr sys idl wai hiq siq| read writ| recv send| in out | int csw | 77 78 79 |lis act syn tim clo|majpf minpf alloc free 7 3 86 3 0 0| 475k 1610k| 0 0 | 0 0 |1138 1972 | 70 127 68 | 20 117 0 545 0| 5 6100 3234 3242 2 2 95 1 1 0|8192B 928k| 149k 138k| 0 0 |1005 2396 | 30 97 10 | 20 117 1 546 0| 0 1357 544 818 2 2 94 2 0 0| 0 984k| 23k 27k| 0 0 | 958 1687 | 39 198 8 | 20 118 0 547 0| 0 2880 1324 1053 记录和查看历史数据 ~ dpkg -l | grep sysstat ii sysstat 10.2.0-1 amd64 system performance tools for Linux cron.d cat /etc/cron.d/sysstat # The first element of the path is a directory where the debian-sa1 # script is located PATH=/usr/lib/sysstat:/usr/sbin:/usr/sbin:/usr/bin:/sbin:/bin # Activity reports every 10 minutes everyday 5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1 # Additional run at 23:59 to rotate the statistics file 59 23 * * * root command -v debian-sa1 > /dev/null && debian-sa1 60 2 查看昨天的数据 cron.d sar -u -1 | head Linux 3.13.0-32-generic (dtstack-dev1) 04/15/2016 _x86_64_ (2 CPU) 12:00:01 AM CPU %user %nice %system %iowait %steal %idle 12:05:01 AM all 9.75 0.78 3.99 5.06 0.10 80.31 12:15:01 AM all 9.59 0.83 3.84 4.27 0.10 81.37 12:25:01 AM all 8.90 0.83 3.78 4.22 0.10 82.17 查看swap cron.d sar -W | head Linux 3.13.0-32-generic (dtstack-dev1) 04/16/2016 _x86_64_ (2 CPU) 12:00:01 AM pswpin/s pswpout/s 12:05:01 AM 0.00 0.00 12:15:01 AM 0.00 0.00 12:25:01 AM 0.00 0.00 查看load cron.d sar -q | head Linux 3.13.0-32-generic (dtstack-dev1) 04/16/2016 _x86_64_ (2 CPU) 12:00:01 AM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15 blocked 12:05:01 AM 4 511 0.14 0.34 0.39 0 12:15:01 AM 6 512 0.14 0.30 0.37 0 12:25:01 AM 3 507 0.41 0.40 0.41 0 查看IO使用情况 cron.d sar -d | head Linux 3.13.0-32-generic (dtstack-dev1) 04/16/2016 _x86_64_ (2 CPU) 12:00:01 AM DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util 12:05:01 AM dev202-0 81.14 143.50 2076.97 27.37 0.61 7.47 1.36 11.00 12:15:01 AM dev202-0 62.26 47.31 1836.64 30.26 0.56 8.97 1.29 8.02 12:25:01 AM dev202-0 67.85 46.25 1904.55 28.75 0.52 7.59 1.28 8.72 查看内存使用情况 cron.d sar -r | head Linux 3.13.0-32-generic (dtstack-dev1) 04/16/2016 _x86_64_ (2 CPU) 12:00:01 AM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit kbactive kbinact kbdirty 12:05:01 AM 238284 3808536 94.11 95652 180648 6231508 153.99 3486028 135316 0 12:15:01 AM 280972 3765848 93.06 82324 173376 6215156 153.58 3437744 142076 0 12:25:01 AM 240208 3806612 94.06 110392 190312 6195792 153.10 3467544 153012 0 查看多核CPU使用情况 cron.d sar -u -P ALL | head Linux 3.13.0-32-generic (dtstack-dev1) 04/16/2016 _x86_64_ (2 CPU) 12:00:01 AM CPU %user %nice %system %iowait %steal %idle 12:05:01 AM all 7.31 0.80 3.59 4.41 0.10 83.80 12:05:01 AM 0 7.89 0.93 3.95 6.55 0.06 80.61 12:05:01 AM 1 6.72 0.68 3.24 2.26 0.13 86.97 MySQL性能分析 配置slowlog mysql> show variables like 'long_query_time'; +-----------------+----------+ | Variable_name | Value | +-----------------+----------+ | long_query_time | 2.000000 | +-----------------+----------+ 1 row in set (0.00 sec) mysql> show variables like '%slow%'; +---------------------------+--------------------------------------+ | Variable_name | Value | +---------------------------+--------------------------------------+ | log_slow_admin_statements | OFF | | 
log_slow_slave_statements | OFF | | slow_launch_time | 2 | | slow_query_log | OFF | | slow_query_log_file | /var/lib/mysql/dtstack-dev1-slow.log | mysql> set global slow_query_log=ON; Query OK, 0 rows affected (0.06 sec) mysql> set global long_query_time=0.01; Query OK, 0 rows affected (0.02 sec) 分析slow log # User@Host: dtlog[dtlog] @ [10.0.0.5] Id: 279883 # Query_time: 0.012148 Lock_time: 0.000088 Rows_sent: 0 Rows_examined: 0 use dtops_perf; SET timestamp=1460773285; insert into dtops_process_stat(user_id, tenant_id, host_name, proc_exec, proc_args, proc_cnt, cpu_used, io_read_mb, io_write_mb, mem_rss_mb, check_time ) values ( 1,1,'iZ233z4w1cdZ', 'python', '/usr/bin/python manage.py runserver 0.0.0.0:8000 ', '1', '0.71', '0.0', '0.02734375', '490.796875', '2016-04-16 02:21:25.636334' ); 使用pk-query-digest mysql pt-query-digest dtstack-dev1-slow.log | head -100 # 540ms user time, 20ms system time, 26.32M rss, 84.35M vsz # Current date: Sat Apr 16 10:30:05 2016 # Hostname: dtstack-dev1 # Files: dtstack-dev1-slow.log # Overall: 1.04k total, 46 unique, 0.00 QPS, 0.00x concurrency ___________ # Time range: 2016-02-18 18:00:39 to 2016-04-16 10:29:08 # Attribute total min max avg 95% stddev median # ============ ======= ======= ======= ======= ======= ======= ======= # Exec time 15021s 3ms 110s 14s 40s 15s 15s # Lock time 5141s 0 69s 5s 18s 7s 1s # Rows sent 92.31k 0 6.53k 91.15 329.68 313.17 0 # Rows examine 785.62k 0 360.68k 775.77 329.68 14.45k 0 # Query size 265.90k 6 1.69k 262.57 874.75 332.24 56.92 # Profile # Rank Query ID Response time Calls R/Call V/M Item # ==== ================== =============== ===== ======= ===== ============ # 1 0x8D6600C5091DA2FB 2545.6864 16.9% 83 30.6709 3.77 INSERT dtops_mysql_stat # 2 0xFA23984B8E060B24 2179.4857 14.5% 53 41.1224 10.99 SELECT INFORMATION_SCHEMA.INNODB_METRICS # 3 0x28FC5B5D583E2DA6 1773.7506 11.8% 83 21.3705 0.48 SHOW GLOBAL STATUS # 4 0xE2A7DFF3037A3A81 1537.1841 10.2% 80 19.2148 0.53 SHOW /*!50003 GLOBAL*/ STATUS # 5 0xAC0F6B2C861D21D7 826.3786 5.5% 31 26.6574 6.46 SELECT ecs_ecsinstance # 6 0x026EA5484A97C3C0 801.5408 5.3% 31 25.8562 3.18 SELECT information_schema.processlist # 7 0x94F6F954ED486D6F 791.8242 5.3% 23 34.4271 9.62 SELECT ecs_ecsdisk # 8 0xD8AB5099D4B1CC21 675.0423 4.5% 30 22.5014 0.57 SELECT auth_user # 9 0xDE28A3CBA34F3335 589.1568 3.9% 26 22.6599 1.72 SHOW /*!40003 GLOBAL*/ VARIABLES # 10 0x0ED17AA0B68A517F 521.2626 3.5% 29 17.9746 10.80 INSERT dtops_linux_stat # 11 0x53168008EC50F512 358.3756 2.4% 29 12.3578 0.17 SELECT # 12 0x3EB3BC982F72ED66 319.3946 2.1% 13 24.5688 1.07 INSERT dtops_disk_free # 13 0x50C47AC11BB2ABD8 280.2534 1.9% 14 20.0181 2.62 SELECT ecs_hourinstancemonitordata # 14 0x3783A78AF527ACA4 207.8257 1.4% 10 20.7826 2.79 SELECT dtops_host_list # 15 0xEF2E30AE81C60FE3 192.6594 1.3% 18 10.7033 13.90 INSERT dtops_disk_stat # 16 0xEE7686F2E0817183 181.4489 1.2% 7 25.9213 5.01 SELECT ecs_ecsinstancestatistic # 17 0x6238F0E765A0C029 133.1747 0.9% 5 26.6349 0.99 SELECT information_schema.innodb_trx information_schema.innodb_lock_waits # 18 0x577997B04D055D37 114.8435 0.8% 5 22.9687 2.10 SHOW TABLES # 19 0xED823F6211063056 106.3436 0.7% 4 26.5859 3.22 UPDATE ecs_ecsinstancestatistic # 20 0xD2BA5922D8365D80 86.9031 0.6% 2 43.4515 0.16 SELECT rds_rdssqlreport # MISC 0xMISC 798.2521 5.3% 461 1.7316 0.0 <26 ITEMS> # Query 1: 0.00 QPS, 0.00x concurrency, ID 0x8D6600C5091DA2FB at byte 144946 # This item is included in the report because it matches --limit. 
# Scores: V/M = 3.77 # Time range: 2016-02-18 22:22:02 to 2016-03-01 14:03:43 # Attribute pct total min max avg 95% stddev median # ============ === ======= ======= ======= ======= ======= ======= ======= # Count 8 83 # Exec time 16 2546s 13s 83s 31s 42s 11s 29s # Lock time 30 1549s 1s 69s 19s 32s 11s 17s # Rows sent 0 0 0 0 0 0 0 0 # Rows examine 0 0 0 0 0 0 0 0 # Query size 26 71.63k 866 912 883.69 874.75 20.83 874.75 # String: # Databases dtops_perf # Hosts dtstack-dev1 # Users lazydba # Query_time distribution # 1us # 10us # 100us # 1ms # 10ms # 100ms # 1s # 10s+ ################################################################ # Tables # SHOW TABLE STATUS FROM `dtops_perf` LIKE 'dtops_mysql_stat'\G # SHOW CREATE TABLE `dtops_perf`.`dtops_mysql_stat`\G insert into dtops_mysql_stat(user_id, tenant_id, instance_name, innodb_rows_deleted, threads_connected, innodb_buffer_pool_pages_flushed, ins, innodb_os_log_written, innodb_rows_updated, innodb_data_fsyncs, select_scan, questions, innodb_log_writes, innodb_data_read, qps, open_files, sel, recv, threads_running, innodb_rows_read, innodb_rows_inserted, innodb_row_lock_waits, sort_rows, innodb_row_lock_time_avg, upd, open_tables, innodb_data_written, sqps, innodb_data_writes, tps, innodb_bp_reads, innodb_buffer_pool_pages_dirty, del, innodb_data_reads, innodb_bp_read_requests, sent, innodb_os_log_fsyncs, check_time) values(1,1,'dtstack-dev:3306', '0', '14', '1', '0', '204', '0', '1', '2', '8', '0', '0', '2', '-1', '2', '690', '5', '0', '0', '0', '0', '0', '0', '0', '39628', '0', '1', '0', '0', '0', '0', '0', '3', '14535', '0', '2016-02-26 16:31:50.036858')\G TCPDUMP + pt-query-digest # tcpdump -s 65535 -i any -c 1000 port 3306 -w tmp.pcap tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes 1000 packets captured 1323 packets received by filter 0 packets dropped by kernel # tcpdump -r tmp.pcap -s 65535 -x -nn -q -tttt | pt-query-digest --type tcpdump reading from file tmp.pcap, link-type LINUX_SLL (Linux cooked) Reading from STDIN ... 
TCP session 192.168.0.1:53744 had errors, will save them in /tmp/pt-query-digest-errors.dYTKmpS # 480ms user time, 20ms system time, 28.12M rss, 86.31M vsz # Current date: Sat Apr 16 10:57:03 2016 # Hostname: dtstack-dev1 # Files: STDIN # Overall: 131 total, 38 unique, 28.04 QPS, 0.03x concurrency ____________ # Time range: 2016-04-16 10:56:21.810539 to 10:56:26.482281 # Attribute total min max avg 95% stddev median # ============ ======= ======= ======= ======= ======= ======= ======= # Exec time 128ms 0 18ms 976us 5ms 3ms 98us # Rows affecte 51 0 10 0.39 0.99 1.37 0 # Query size 14.78k 6 920 115.52 537.02 204.71 26.08 # Warning coun 0 0 0 0 0 0 0 # Boolean: # No index use 12% yes, 87% no # Profile # Rank Query ID Response time Calls R/Call V/M Item # ==== ================== ============= ===== ====== ===== =============== # 1 0x5D51E5F01B88B79E 0.0667 52.1% 12 0.0056 0.01 ADMIN CONNECT # 2 0x11D0C2162F1C5721 0.0216 16.9% 2 0.0108 0.01 INSERT dtops_disk_stat # 3 0x0ED17AA0B68A517F 0.0154 12.0% 2 0.0077 0.00 INSERT dtops_linux_stat # 4 0x0E9D1E6D473DBD20 0.0052 4.1% 1 0.0052 0.00 SELECT hosts items functions triggers # 5 0x251BDDE46C5899A9 0.0034 2.6% 1 0.0034 0.00 SELECT hosts items functions triggers # 6 0x3AEAAD0E15D725B5 0.0015 1.2% 24 0.0001 0.00 SET # 7 0x4256B25BD80D888C 0.0013 1.0% 4 0.0003 0.00 SELECT dtuic_userprofile # 8 0x860902B0157FAFD9 0.0012 0.9% 5 0.0002 0.00 INSERT history_uint # 9 0x60038AB9468C9A61 0.0009 0.7% 1 0.0009 0.00 SELECT items hosts # 10 0x8A10718E7D316421 0.0009 0.7% 8 0.0001 0.00 BEGIN # 11 0xCC47B42511EA22DD 0.0009 0.7% 12 0.0001 0.00 SET # 12 0x3B93991D5396A384 0.0008 0.6% 12 0.0001 0.00 SET # 13 0x2719E98AB2E682FE 0.0007 0.5% 2 0.0003 0.00 UPDATE hosts # 14 0xAC7F5F65B4A51C07 0.0006 0.5% 8 0.0001 0.00 COMMIT # 15 0xBF081E5AB46DF811 0.0006 0.4% 1 0.0006 0.00 SELECT httptest hosts # MISC 0xMISC 0.0063 5.0% 36 0.0002 0.0 <23 ITEMS> # Query 1: 12.87 QPS, 0.07x concurrency, ID 0x5D51E5F01B88B79E at byte 1657076 # This item is included in the report because it matches --limit. # Scores: V/M = 0.01 # Time range: 2016-04-16 10:56:23.901443 to 10:56:24.833671 # Attribute pct total min max avg 95% stddev median # ============ === ======= ======= ======= ======= ======= ======= ======= # Count 9 12 # Exec time 52 67ms 279us 18ms 6ms 16ms 7ms 445us # Rows affecte 0 0 0 0 0 0 0 0 # Query size 2 360 30 30 30 30 0 30 # Warning coun 0 0 0 0 0 0 0 0 # String: # Databases dtuicmysq... (8/66%), (4/33%) # Hosts 192.168.1.3 (8/66%), 218.74.37.4 (4/33%) # Users dtcloud_de... (8/66%), dtlog (4/33%) # Query_time distribution # 1us # 10us # 100us ################################################################ # 1ms # 10ms ################################ # 100ms # 1s # 10s+ administrator command: Connect\G # Query 2: 2.57 QPS, 0.03x concurrency, ID 0x11D0C2162F1C5721 at byte 1660770 # This item is included in the report because it matches --limit. 
# Scores: V/M = 0.01 # Time range: 2016-04-16 10:56:24.096396 to 10:56:24.874351 # Attribute pct total min max avg 95% stddev median # ============ === ======= ======= ======= ======= ======= ======= ======= # Count 1 2 # Exec time 16 22ms 5ms 16ms 11ms 16ms 8ms 11ms # Rows affecte 3 2 1 1 1 1 0 1 # Query size 6 962 481 481 481 481 0 481 # Warning coun 0 0 0 0 0 0 0 0 # String: # Databases # Hosts 123.74.33.4 # Users dtlog # Query_time distribution # 1us # 10us # 100us # 1ms ################################################################ # 10ms ################################################################ # 100ms # 1s # 10s+ # Tables # SHOW TABLE STATUS FROM `1` LIKE 'dtops_disk_stat'\G # SHOW CREATE TABLE `1`.`dtops_disk_stat`\G insert into dtops_disk_stat( user_id, tenant_id, host_name, disk_name, rd_s, rd_avgkb, rd_mb_s, rd_mrg_s, rd_cnc, rd_rt, wr_s, wr_avgkb, wr_mb_s, wr_mrg_s, wr_cnc, wr_rt, busy, in_prg, io_s, qtime, stime, check_time) values (1, 1, 'test01', 'sda', '0.0','0.0','0.0','0.0','0.0','0.0','6.4','11.56','0.07','1.0','0.07','10.0','6.48','0.0','6.4','1.24','8.76','2016-04-16 02:54:12.369120') \G 通过processlist查看系统当前运行情况 mysql> select * from information_schema.processlist where command !='Sleep' order by time desc limit 10\G *************************** 1. row *************************** ID: 292486 USER: root HOST: localhost DB: NULL COMMAND: Query TIME: 0 STATE: executing INFO: select * from information_schema.processlist where command !='Sleep' order by time desc limit 10 通过genenral log查看系统运行的SQL mysql> show variables like '%general%'; +------------------+---------------------------------+ | Variable_name | Value | +------------------+---------------------------------+ | general_log | OFF | | general_log_file | /var/lib/mysql/dtstack-dev1.log | +------------------+---------------------------------+ 2 rows in set (0.00 sec) mysql> set global general_log=ON; tail -10f /var/lib/mysql/dtstack-dev1.log 294645 Query SET NAMES utf8 294645 Query set autocommit=0 294645 Query set autocommit=1 294645 Query SET SQL_AUTO_IS_NULL = 0 294645 Quit 294646 Connect user@host on dtuic 294646 Query SET NAMES utf8 294646 Query set autocommit=0 294646 Query set autocommit=1 294646 Query SET SQL_AUTO_IS_NULL = 0 294646 Query SELECT * FROM `dtuic_userprofile` WHERE `dtuic_userprofile`.`agent_token` = 'xx' 294646 Quit 294648 Connect user@host on dtops_db 294648 Query SET NAMES utf8 294648 Query SET NAMES utf8 294648 Query set autocommit=0 294648 Query set autocommit=1 294648 Query SET SQL_AUTO_IS_NULL = 0 294648 Query insert into dtops_linux_stat(user_id, tenant_id, host_name, load1, load5, load15, cpu_usr, cpu_sys, cpu_idle, cpu_wai, mem_used, mem_free, mem_buff, mem_cache, swap_used, swap_free, io_read, io_write, io_queue, io_await, io_svctm, io_util, io_read_mb, io_write_mb, net_recv, net_send, proc_run, proc_blk, proc_new, intr, csw, softirq, page_in, page_out, swap_in, swap_out, page_fault, page_mjfault, check_time) values(1, 1, 'test01', '0.63', '0.72', '0.79', '6.44', '1.21', '91.4', '0.95', '9264.0', '281.0', '253.0', '5851.0', '15680.0', '696.0', '0.0', '11.2', '0.0', '13.15', '5.15', '6.8', '0.0', '0.1', '1.2', '1.46', '0.4', '1.0', '0.0', '9953.4', '10081.6', '516.6', '0.0', '102.0', '0.0', '0.0', '13.0', '0.0', '2016-04-16 02:39:03.487738') 294648 Quit 通过binlog查看系统DML root@test01:/var/lib/my3307# mysqlbinlog -v mysql-bin.000002 > tmp.sql root@test01:/var/lib/my3307# head -100 tmp.sql /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/; /*!40019 SET 
@@session.max_insert_delayed_threads=0*/; /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/; DELIMITER /*!*/; # at 4 #160413 19:28:37 server id 10 end_log_pos 120 CRC32 0xef6cb6a8 Start: binlog v 4, server v 5.6.29-log created 160413 19:28:37 # Warning: this binlog is either in use or was not closed properly. BINLOG ' ZS0OVw8KAAAAdAAAAHgAAAABAAQANS42LjI5LWxvZwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAEzgNAAgAEgAEBAQEEgAAXAAEGggAAAAICAgCAAAACgoKGRkAAai2 bO8= '/*!*/; # at 120 #160413 19:28:37 server id 10 end_log_pos 191 CRC32 0xaaca06a8 Previous-GTIDs # 08eef41e-df72-11e5-be92-0024e84cb72f:1-12 # at 191 #160413 19:29:43 server id 6033 end_log_pos 239 CRC32 0xde16d70b GTID [commit=yes] SET @@SESSION.GTID_NEXT= '08eef41e-df72-11e5-be92-0024e84cb72f:13'/*!*/; # at 239 #160413 19:29:43 server id 6033 end_log_pos 333 CRC32 0x61812d71 Query thread_id=539 exec_time=0 error_code=0 SET TIMESTAMP=1460546983/*!*/; SET @@session.pseudo_thread_id=539/*!*/; SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/; SET @@session.sql_mode=1075838976/*!*/; SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/; /*!\C utf8 *//*!*/; SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=33/*!*/; SET @@session.lc_time_names=0/*!*/; SET @@session.collation_database=DEFAULT/*!*/; create database hell /*!*/; # at 333 #160413 20:10:37 server id 10 end_log_pos 381 CRC32 0x590a33e4 GTID [commit=yes] SET @@SESSION.GTID_NEXT= 'c59b9266-0090-11e6-968b-0024e84cb72f:1'/*!*/; # at 381 #160413 20:10:37 server id 10 end_log_pos 472 CRC32 0x1b5e7367 Query thread_id=512 exec_time=0 error_code=0 SET TIMESTAMP=1460549437/*!*/; create database yyy # at 863 #160413 20:14:21 server id 6033 end_log_pos 906 CRC32 0xfd739e72 Table_map: `xxx`.`a` mapped to number 71 # at 906 #160413 20:14:21 server id 6033 end_log_pos 946 CRC32 0x5d365030 Write_rows: table id 71 flags: STMT_END_F BINLOG ' HTgOVxORFwAAKwAAAIoDAAAAAEcAAAAAAAEAA3h4eAABYQABAwABcp5z/Q== HTgOVx6RFwAAKAAAALIDAAAAAEcAAAAAAAEAAgAB//5kAAAAMFA2XQ== '/*!*/; ### INSERT INTO `xxx`.`a` ### SET ### @1=100 # at 946 查看MySQL错误日志 mysql> show variables like 'log_error'; +---------------+--------------------------+ | Variable_name | Value | +---------------+--------------------------+ | log_error | /var/log/mysql/error.log | $ tail -100 /var/log/mysql/error.log 2016-04-12 20:38:21 25937 [Warning] Slave: Operation DROP USER failed for 'monitor'@'127.0.0.1' Error_code: 1396 2016-04-12 20:38:21 25937 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000002' position 2461 2016-04-13 19:28:37 25937 [Note] @@GLOBAL.GTID_PURGED was changed from '' to '08eef41e-df72-11e5-be92-0024e84cb72f:1-12'. 2016-04-13 19:28:37 25937 [Note] @@GLOBAL.GTID_EXECUTED was changed from '' to '08eef41e-df72-11e5-be92-0024e84cb72f:1-12'. 
2016-04-13 19:29:04 25937 [Warning] Slave SQL: If a crash happens this configuration does not guarantee that the relay log info will be consistent, Error_code: 0 2016-04-13 19:29:04 25937 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000002' at position 2461, relay log './mysqld-relay-bin.000003' position: 1178 2016-04-13 20:11:50 25937 [Note] Error reading relay log event: slave SQL thread was killed 2016-04-13 20:11:50 25937 [ERROR] Error reading packet from server: Lost connection to MySQL server during query ( server_errno=2013) 2016-04-13 20:11:50 25937 [Note] Slave I/O thread killed while reading event 2016-04-13 20:11:50 25937 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000002', position 346980 查看MySQL状态 mysql> show global status like '%threads_runn%'; +-----------------+-------+ | Variable_name | Value | +-----------------+-------+ | Threads_running | 1 | +-----------------+-------+ | Com_select | 307064 | | Com_insert | 203395 | | Com_delete | 0 | | Com_update | 32287 | mysql> show global status like '%innodb_rows%'; +----------------------+----------+ | Variable_name | Value | +----------------------+----------+ | Innodb_rows_deleted | 0 | | Innodb_rows_inserted | 184427 | | Innodb_rows_read | 26099334 | | Innodb_rows_updated | 32551 | show global status like '%innodb_buffer_pool_pages%'; +----------------------------------+--------+ | Variable_name | Value | +----------------------------------+--------+ | Innodb_buffer_pool_pages_data | 6901 | | Innodb_buffer_pool_pages_dirty | 69 | | Innodb_buffer_pool_pages_flushed | 205928 | | Innodb_buffer_pool_pages_free | 1024 | | Innodb_buffer_pool_pages_misc | 266 | | Innodb_buffer_pool_pages_total | 8191 | mysql> show global status like '%innodb_buffer_pool_read%'; +---------------------------------------+----------+ | Variable_name | Value | +---------------------------------------+----------+ | Innodb_buffer_pool_read_ahead_rnd | 0 | | Innodb_buffer_pool_read_ahead | 2426 | | Innodb_buffer_pool_read_ahead_evicted | 0 | | Innodb_buffer_pool_read_requests | 16119505 | | Innodb_buffer_pool_reads | 6115 | mysql> show global status like '%innodb_row_lock%'; +-------------------------------+-------+ | Variable_name | Value | +-------------------------------+-------+ | Innodb_row_lock_current_waits | 0 | | Innodb_row_lock_time | 10 | | Innodb_row_lock_time_avg | 0 | | Innodb_row_lock_time_max | 2 | | Innodb_row_lock_waits | 22 | mysqladmin extended-status -i 3 -r | grep -i innodb_rows | Innodb_rows_deleted | 0 | | Innodb_rows_inserted | 182053 | | Innodb_rows_read | 25740195 | | Innodb_rows_updated | 32287 | | Innodb_rows_deleted | 0 | | Innodb_rows_inserted | 43 | | Innodb_rows_read | 943 | | Innodb_rows_updated | 0 | | Innodb_rows_deleted | 0 | | Innodb_rows_inserted | 8 | | Innodb_rows_read | 390 | | Innodb_rows_updated | 0 | 查看Innodb状态 mysql> show engine innodb status\G *************************** 1. 
row *************************** Type: InnoDB Name: Status: ===================================== 2016-04-16 12:02:59 7fef13d0d700 INNODB MONITOR OUTPUT ===================================== Per second averages calculated from the last 47 seconds ----------------- BACKGROUND THREAD ----------------- srv_master_thread loops: 15203 srv_active, 0 srv_shutdown, 13276 srv_idle srv_master_thread log flush and writes: 28472 ---------- SEMAPHORES ---------- OS WAIT ARRAY INFO: reservation count 49397 OS WAIT ARRAY INFO: signal count 49296 Mutex spin waits 47682, rounds 1356865, OS waits 44752 RW-shared spins 3909, rounds 116413, OS waits 3858 RW-excl spins 828, rounds 24083, OS waits 755 Spin rounds per wait: 28.46 mutex, 29.78 RW-shared, 29.09 RW-excl ------------ TRANSACTIONS ------------ Trx id counter 65462789 Purge done for trx's n:o < 65461073 undo n:o < 0 state: running but idle History list length 2881 LIST OF TRANSACTIONS FOR EACH SESSION: ---TRANSACTION 0, not started MySQL thread id 349713, OS thread handle 0x7fef13d0d700, query id 2399249 localhost root init show engine innodb status ---TRANSACTION 65462442, not started MySQL thread id 355002, OS thread handle 0x7fef13c76700, query id 2396568 180.173.242.219 dtcloud_dev cleaning up ---TRANSACTION 65396897, not started MySQL thread id 316553, OS thread handle 0x7fef13e02700, query id 2129942 180.173.242.219 dtcloud_dev cleaning up ---TRANSACTION 65352135, not started MySQL thread id 288067, OS thread handle 0x7fef13bb2700, query id 1941170 180.173.242.219 dtcloud_dev cleaning up ---TRANSACTION 65327485, not started MySQL thread id 274710, OS thread handle 0x7fef13c45700, query id 1847425 180.173.242.219 dtcloud_dev cleaning up -------- FILE I/O -------- I/O thread 0 state: waiting for completed aio requests (insert buffer thread) I/O thread 1 state: waiting for completed aio requests (log thread) I/O thread 2 state: waiting for completed aio requests (read thread) I/O thread 3 state: waiting for completed aio requests (read thread) I/O thread 4 state: waiting for completed aio requests (read thread) I/O thread 5 state: waiting for completed aio requests (read thread) I/O thread 6 state: waiting for completed aio requests (write thread) I/O thread 7 state: waiting for completed aio requests (write thread) I/O thread 8 state: waiting for completed aio requests (write thread) I/O thread 9 state: waiting for completed aio requests (write thread) Pending normal aio reads: 0 [0, 0, 0, 0] , aio writes: 0 [0, 0, 0, 0] , ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0 Pending flushes (fsync) log: 0; buffer pool: 0 9793 OS file reads, 457309 OS file writes, 263759 OS fsyncs 0.00 reads/s, 0 avg bytes/read, 18.87 writes/s, 10.87 fsyncs/s ------------------------------------- INSERT BUFFER AND ADAPTIVE HASH INDEX ------------------------------------- Ibuf: size 1, free list len 11, seg size 13, 551 merges merged operations: insert 1709, delete mark 0, delete 0 discarded operations: insert 0, delete mark 0, delete 0 Hash table size 276671, node heap has 266 buffer(s) 13.79 hash searches/s, 21.34 non-hash searches/s --- LOG --- Log sequence number 9675441693 Log flushed up to 9675441693 Pages flushed up to 9675441693 Last checkpoint at 9675441693 0 pending log writes, 0 pending chkp writes 243472 log i/o's done, 9.98 log i/o's/second ---------------------- BUFFER POOL AND MEMORY ---------------------- Total memory allocated 137363456; in additional pool allocated 0 Dictionary memory allocated 1945330 Buffer pool size 8191 Free buffers 1024 Database 
pages 6901 Old database pages 2527 Modified db pages 0 Pending reads 0 Pending writes: LRU 0, flush list 0, single page 0 Pages made young 20983, not young 80923 0.00 youngs/s, 0.00 non-youngs/s Pages read 9086, created 2328, written 207366 0.00 reads/s, 0.06 creates/s, 8.60 writes/s Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000 Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s LRU len: 6901, unzip_LRU len: 0 I/O sum[398]:cur[6], unzip sum[0]:cur[0] -------------- ROW OPERATIONS -------------- 0 queries inside InnoDB, 0 queries in queue 0 read views open inside InnoDB Main thread process no. 13103, id 140664767006464, state: sleeping Number of rows inserted 186160, updated 33021, deleted 0, read 26128219 7.83 inserts/s, 0.00 updates/s, 0.00 deletes/s, 49.34 reads/s ---------------------------- END OF INNODB MONITOR OUTPUT ============================ 1 row in set (0.08 sec) 数据库参数调优 MySQL优化器参数 mysql> show variables like '%optimizer%'; +------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Variable_name | Value | +------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | optimizer_prune_level | 1 | | optimizer_search_depth | 62 | | optimizer_switch | index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,subquery_materialization_cost_based=on,use_index_extensions=on | | optimizer_trace | enabled=off,one_line=off | | optimizer_trace_features | greedy_search=on,range_optimizer=on,dynamic_range=on,repeated_subselect=on | | optimizer_trace_limit | 1 | | optimizer_trace_max_mem_size | 16384 | | optimizer_trace_offset | -1 myisam参数 show variables like '%myisam%'; +---------------------------+----------------------+ | Variable_name | Value | +---------------------------+----------------------+ | myisam_data_pointer_size | 6 | | myisam_max_sort_file_size | 9223372036853727232 | | myisam_mmap_size | 18446744073709551615 | | myisam_recover_options | OFF | | myisam_repair_threads | 1 | | myisam_sort_buffer_size | 8388608 | | myisam_stats_method | nulls_unequal | | myisam_use_mmap | OFF | mysql> show variables like '%key%'; +--------------------------+----------------------+ | Variable_name | Value | +--------------------------+----------------------+ | delay_key_write | ON | | foreign_key_checks | ON | | have_rtree_keys | YES | | key_buffer_size | 8388608 | | key_cache_age_threshold | 300 | | key_cache_block_size | 1024 | | key_cache_division_limit | 100 | | max_seeks_for_key | 18446744073709551615 | innodb mysql> show variables like '%innodb_flush%'; +--------------------------------+-------+ | Variable_name | Value | +--------------------------------+-------+ | innodb_flush_log_at_timeout | 1 | | 
innodb_flush_log_at_trx_commit | 1 | | innodb_flush_method | | | innodb_flush_neighbors | 1 | | innodb_flushing_avg_loops | 30 | mysql> show variables like '%innodb_buffer%'; +-------------------------------------+----------------+ | Variable_name | Value | +-------------------------------------+----------------+ | innodb_buffer_pool_dump_at_shutdown | OFF | | innodb_buffer_pool_dump_now | OFF | | innodb_buffer_pool_filename | ib_buffer_pool | | innodb_buffer_pool_instances | 8 | | innodb_buffer_pool_load_abort | OFF | | innodb_buffer_pool_load_at_startup | OFF | | innodb_buffer_pool_load_now | OFF | | innodb_buffer_pool_size | 134217728 | mysql> show variables like '%innodb_io%'; +------------------------+-------+ | Variable_name | Value | +------------------------+-------+ | innodb_io_capacity | 200 | | innodb_io_capacity_max | 2000 | mysql> show variables like '%innodb%thread%'; +---------------------------+-------+ | Variable_name | Value | +---------------------------+-------+ | innodb_purge_threads | 1 | | innodb_read_io_threads | 4 | | innodb_thread_concurrency | 0 | | innodb_thread_sleep_delay | 10000 | | innodb_write_io_threads | 4 | mysql> show variables like '%double%'; +--------------------+-------+ | Variable_name | Value | +--------------------+-------+ | innodb_doublewrite | ON | mysql> show variables like '%innodb_stats%'; +--------------------------------------+-------------+ | Variable_name | Value | +--------------------------------------+-------------+ | innodb_stats_auto_recalc | ON | | innodb_stats_method | nulls_equal | | innodb_stats_on_metadata | OFF | | innodb_stats_persistent | ON | | innodb_stats_persistent_sample_pages | 20 | | innodb_stats_sample_pages | 8 | | innodb_stats_transient_sample_pages | 8 |
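上面通过show global status和mysqladmin extended-status观察各类计数器,也可以用一个小脚本定期采样、计算每秒增量,方便持续观察QPS等指标。下面是一个示意脚本(用MySQLdb连接,连接参数和统计的计数器只是举例):

# -*- coding: utf-8 -*-
import time
import MySQLdb

INTERVAL = 3
COUNTERS = ['Questions', 'Com_select', 'Com_insert', 'Com_update', 'Com_delete',
            'Innodb_rows_read', 'Innodb_rows_inserted']

def snapshot(cur):
    # 取一份SHOW GLOBAL STATUS的快照
    cur.execute("SHOW GLOBAL STATUS")
    return dict(cur.fetchall())

conn = MySQLdb.connect(host='127.0.0.1', user='root', passwd='')
cur = conn.cursor()

prev = snapshot(cur)
while True:
    time.sleep(INTERVAL)
    now = snapshot(cur)
    # 计数器是累计值,两次快照相减再除以间隔,得到每秒增量
    print(' '.join('%s=%d' % (c, (int(now[c]) - int(prev[c])) / INTERVAL)
                   for c in COUNTERS))
    prev = now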
A process on one of our machines was using close to 100% CPU:

Tasks: 120 total,   2 running, 118 sleeping,   0 stopped,   0 zombie
%Cpu(s): 99.0 us,  0.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.3 hi,  0.0 si,  0.0 st
KiB Mem:   4048308 total,  3301480 used,   746828 free,    33500 buffers
KiB Swap:        0 total,        0 used,        0 free.   149784 cached Mem

  PID USER      PR  NI    VIRT    RES   SHR S  %CPU %MEM     TIME+ COMMAND
18400 admin     20   0 2201508 304712  3948 S  98.1  7.5  17335:45 java
17782 root      20   0 3130780 893164  4476 S   0.7 22.1 120:19.01 java
    7 root      20   0       0      0     0 S   0.3  0.0   4:01.17 rcu_sched
 1114 root      20   0  154200   9676  2360 S   0.3  0.2  40:41.38 AliYunDun

Checking what the process actually was, it turned out to be a logstash instance whose job is to read data from Kafka and write it into Elasticsearch:

cat /proc/18400/cmdline | sed -e 's/\x00/ /g'
/opt/dtstack/java/bin/java -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Djava.awt.headless=true -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -Xmx128m -Xss2048k -Djffi.boot.library.path=/opt/dtstack/logstash-2.1.1/vendor/jruby/lib/jni -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Djava.awt.headless=true -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/dtstack/logstash/heapdump.hprof -Xbootclasspath/a:/opt/dtstack/logstash-2.1.1/vendor/jruby/lib/jruby.jar -classpath : -Djruby.home=/opt/dtstack/logstash-2.1.1/vendor/jruby -Djruby.lib=/opt/dtstack/logstash-2.1.1/vendor/jruby/lib -Djruby.script=jruby -Djruby.shell=/bin/sh org.jruby.Main --1.9 /opt/dtstack/logstash/lib/bootstrap/environment.rb logstash/runner.rb agent -f /home/admin/logstash/dtlog/conf/logindexer-kafka.conf -l /home/admin/logstash/dtlog/log/logindexer-kafka.log

To see which thread was burning the CPU, use top's -H option to show individual threads and -p to select the pid. One thread, whose COMMAND showed as kafka, had persistently high CPU usage:

top -H -p 18400
top - 12:08:19 up 19 days, 57 min,  4 users,  load average: 1.04, 1.19, 1.26
Threads:  31 total,   1 running,  30 sleeping,   0 stopped,   0 zombie
%Cpu(s): 99.0 us,  0.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.3 hi,  0.0 si,  0.0 st
KiB Mem:   4048308 total,  3321036 used,   727272 free,    36228 buffers
KiB Swap:        0 total,        0 used,        0 free.   167372 cached Mem

  PID USER      PR  NI    VIRT    RES   SHR S %CPU %MEM     TIME+ COMMAND
18435 admin     20   0 2201508 304920  4120 R 97.9  7.5  17276:37
17837 admin     20   0 2201508 304920  4120 S  0.3  7.5   27:30.14
18400 admin     20   0 2201508 304920  4120 S  0.0  7.5    0:00.00 java
18413 admin     20   0 2201508 304920  4120 S  0.0  7.5    1:58.54 LogStash::Runne
18414 admin     20   0 2201508 304920  4120 S  0.0  7.5    0:00.00 java
18415 admin     20   0 2201508 304920  4120 S  0.0  7.5    0:23.43 java
18416 admin     20   0 2201508 304920  4120 S  0.0  7.5    1:42.88 java
18417 admin     20   0 2201508 304920  4120 S  0.0  7.5    0:00.35 java
18418 admin     20   0 2201508 304920  4120 S  0.0  7.5    0:00.00 java

jstack can show the thread's call stack, although in this case the output was not particularly enlightening:

jstack -F 18400 > tmp1.log
grep -A 50 18435 tmp1.log

Thread 18435: (state = IN_JAVA)
org.jruby.runtime.callsite.CachingCallSite.call(org.jruby.runtime.ThreadContext, org.jruby.runtime.builtin.IRubyObject, org.jruby.runtime.builtin.IRubyObject) @bci=36, line=134 (Compiled frame; information may be imprecise)
org.jruby.ast.CallNoArgNode.interpret(org.jruby.Ruby, org.jruby.runtime.ThreadContext, org.jruby.runtime.builtin.IRubyObject, org.jruby.runtime.Block) @bci=18, line=60 (Compiled frame)
org.jruby.ast.WhileNode.interpret(org.jruby.Ruby, org.jruby.runtime.ThreadContext, org.jruby.runtime.builtin.IRubyObject, org.jruby.runtime.Block) @bci=23, line=127 (Compiled frame)
org.jruby.ast.NewlineNode.interpret(org.jruby.Ruby, org.jruby.runtime.ThreadContext, org.jruby.runtime.builtin.IRubyObject, org.jruby.runtime.Block) @bci=41, line=105 (Compiled frame)
org.jruby.ast.BlockNode.interpret(org.jruby.Ruby, org.jruby.runtime.ThreadContext, org.jruby.runtime.builtin.IRubyObject, org.jruby.runtime.Block) @bci=33, line=71 (Compiled

Searching the web for "logstash kafka cpu high" turned up someone who had hit exactly the same problem:
http://stackoverflow.com/questions/34486960/logstash-kafka-input-filter-with-high-cpu-usage

The logstash-input-kafka plugin had a bug in its tight loop which unnecessarily checked for an empty queue and skipped to the next iteration instead of blocking. This has been fixed in this pull request and version 2.0.3 of the plugin has been released with it. To test this, please update the plugin using:
bin/plugin install --version 2.0.3 logstash-input-kafka

Check which plugin versions are installed on this machine:

$ /opt/dtstack/logstash/bin/plugin list --verbose kafka
logstash-input-kafka (2.0.2)
logstash-output-kafka (2.0.1)

Upgrade the plugin and try again:

$ /opt/dtstack/logstash/bin/plugin install logstash-input-kafka-2.0.3.gem
Validating logstash-input-kafka-2.0.3.gem
Installing logstash-input-kafka
Installation successful
$ /opt/dtstack/logstash/bin/plugin list --verbose kafka
logstash-input-kafka (2.0.3)
logstash-output-kafka (2.0.1)

After the upgrade, CPU usage returned to normal.
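One detail worth noting when mapping a hot thread back to a Java stack: plain jstack prints thread ids as hexadecimal nid values, while top -H shows decimal ids (jstack -F, as used above, prints "Thread <decimal>:" instead). A small sketch of the usual conversion, using the thread id 18435 from the output above; jstack without -F needs to run as the same user as the target JVM:

# Convert the busy thread id from top -H to hex and look it up in a normal jstack dump.
printf '%x\n' 18435            # -> 4803
jstack 18400 > jstack.out
grep -A 30 'nid=0x4803' jstack.out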
Starting a MySQL instance on an Ubuntu test machine should have been a trivial task, but it failed at startup:

mysqld_safe --defaults-file=/etc/mysql/my3307.cnf
160218 15:45:13 mysqld_safe Can't log to error log and syslog at the same time. Remove all --log-error configuration options for --syslog to take effect.
160218 15:45:13 mysqld_safe Logging to '/var/log/mysql/error3307.log'.
160218 15:45:13 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql3307
160218 15:45:15 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld3307.pid ended

The error log /var/log/mysql/error3307.log showed that the ibdata file could not be created because of a permission problem:

160218 12:19:09 InnoDB: Completed initialization of buffer pool
160218 12:19:09 InnoDB: Operating system error number 13 in a file operation.
InnoDB: The error means mysqld does not have the access rights to
InnoDB: the directory.
InnoDB: File name ./ibdata1
InnoDB: File operation call: 'create'.
InnoDB: Cannot continue operation.

The data directory itself looked fine; the mysql user should have had read/write access:

ls -ld /var/lib/mysql3307
drwx------ 4 mysql mysql 4096 Feb 18 15:09 /var/lib/mysql3307

Could some other permission mechanism be in the way? dmesg indeed showed denials:

[4489099.208660] type=1400 audit(1455768562.142:57): apparmor="DENIED" operation="mknod" profile="/usr/sbin/mysqld" name="/var/lib/mysql3307/ibdata1" pid=19775 comm="mysqld" requested_mask="c" denied_mask="c" fsuid=105 ouid=105

So AppArmor was blocking access. Edit the profile for /usr/sbin/mysqld, add rules for the mysql3307 directory, and restart AppArmor:

vi /etc/apparmor.d/usr.sbin.mysqld
/var/lib/mysql3307/ r,
/var/lib/mysql3307/** rwk,

/etc/init.d/apparmor restart

Restarting mysqld still failed, but this time with a different error:

160218 12:44:03 [ERROR] Can't start server : Bind on unix socket: Permission denied
160218 12:44:03 [ERROR] Do you already have another mysqld server running on socket: /var/run/mysqld/mysqld3307.sock ?
160218 12:44:03 [ERROR] Aborting

dmesg showed the corresponding denial as well:

[4491181.040673] type=1400 audit(1455770643.975:99): apparmor="DENIED" operation="mknod" profile="/usr/sbin/mysqld" name="/run/mysqld/mysqld3307.sock" pid=24093 comm="mysqld" requested_mask="c" denied_mask="c" fsuid=105 ouid=105

Add more rules to /etc/apparmor.d/usr.sbin.mysqld and restart AppArmor again:

/var/run/mysqld/mysqld3307.pid rw,
/var/run/mysqld/mysqld3307.sock w,
/run/mysqld/mysqld3307.pid rw,
/run/mysqld/mysqld3307.sock w,

Restart mysqld once more; still an error, but a new one:

160218 12:57:27 [Note] Server socket created on IP: '0.0.0.0'.
160218 12:57:27 [ERROR] Fatal error: Can't open and lock privilege tables: Table 'mysql.host' doesn't exist
160218 12:57:27 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld3307.pid ended

This simply means the system tables have not been initialized yet. Run
mysql_install_db --defaults-file=/etc/mysql/my3307.cnf
and this time mysqld finally starts normally:

160218 14:44:59 [Note] Server socket created on IP: '0.0.0.0'.
160218 14:44:59 [Note] Event Scheduler: Loaded 0 events
160218 14:44:59 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.46-0ubuntu0.14.04.2'  socket: '/var/run/mysqld/mysqld3307.sock'  port: 3307  (Ubuntu)
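When chasing this kind of problem, a couple of AppArmor commands (from the apparmor-utils package; the profile path below matches this machine but may differ elsewhere) shorten the edit-profile / reload / retry loop. A minimal sketch:

# Is mysqld confined by an AppArmor profile, and in which mode?
aa-status | grep -A2 mysqld

# Reload just the edited profile instead of restarting the whole apparmor service
apparmor_parser -r /etc/apparmor.d/usr.sbin.mysqld

# While testing, switch the profile to complain mode so denials are only logged,
# not enforced; switch back with aa-enforce afterwards
aa-complain /usr/sbin/mysqld

# Denials show up in the kernel log as apparmor="DENIED" lines
dmesg | grep 'apparmor="DENIED"'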