慢查询在数据库分析中是一个非常重要的参考。
我记得在PostgreSQL Conference 2009 中曾经就有人专门讲解了Hit the mole.(90%的数据库性能问题往往出现在10%的SQL上面.)
在MySQL,PostgreSQL数据库中,用日志来记录慢查询。(比如为数据库设置一下阀值100MS,运行时间超过100MS的SQL将记录到日志中.)
在mongoDB中也比较类似,使用profile 来配置慢查询的阀值.已经是否开启记录慢查询的功能。mongoDB的慢查询被记录到system.profile collection中。
在mongoDB 1.7.x 以下版本中使用getProfilingLevel()查看当前配置的数据库profile.1.7.x以后改为getProfilingStatus().
配置profile可以通过参数文件,或在mongo shell中执行命令来配置.
1. 参数文件
--profile arg 0=off 1=slow, 2=all (0表示关闭profile,1表示只记录执行时间超过slowms配置的值的执行内容,2表示记录所有执行内容)
--slowms arg (=100) value of slow for profile and console log (如果profile配置为1并且没有配置slowms的话默认是100毫秒)
2. mongo shell
db.setProfilingLevel( level , slowms )
3. 查看profile
> db.getProfilingLevel()
1
配置举例:
# 配置slowms=1,profile=1
> db.setProfilingLevel(1,1)
{ "was" : 0, "slowms" : 100, "ok" : 1 }
# 注意这里返回的slowms : 100,实际上已经配置为1了,估计是个BUG。在1.7.x里面可以通过以下命令查看:
> db.system.profile.find({"info" : /userinfo/})
{ "ts" : "Sun Jan 09 2011 09:02:56 GMT+0800 (CST)", "info" : "insert admin.userinfo", "millis" : 1 }
{ "ts" : "Sun Jan 09 2011 09:07:21 GMT+0800 (CST)", "info" : "query admin.userinfo reslen:202 nscanned:10000 \nquery: { query: { firstname: \"zhou\" }, $explain: true } nreturned:1 5ms", "millis" : 5 }
# 或者按millis排个序
> db.system.profile.find().sort({"millis" : -1})
{ "ts" : "Sun Jan 09 2011 09:07:21 GMT+0800 (CST)", "info" : "query admin.userinfo reslen:202 nscanned:10000 \nquery: { query: { firstname: \"zhou\" }, $explain: true } nreturned:1 5ms", "millis" : 5 }
{ "ts" : "Sun Jan 09 2011 09:02:56 GMT+0800 (CST)", "info" : "insert admin.userinfo", "millis" : 1 }
# 由于system.profile是capped collection,使用$natural 可以按照记录插入的插入顺序查看结果.
> db.system.profile.find().sort({"$natural" : -1})
{ "ts" : "Sun Jan 09 2011 09:07:21 GMT+0800 (CST)", "info" : "query admin.userinfo reslen:202 nscanned:10000 \nquery: { query: { firstname: \"zhou\" }, $explain: true } nreturned:1 5ms", "millis" : 5 }
{ "ts" : "Sun Jan 09 2011 09:02:56 GMT+0800 (CST)", "info" : "insert admin.userinfo", "millis" : 1 }
# 使用show profile 查看最近5%的超过1ms执行时间的事件
> show profile
5ms Sun Jan 09 2011 09:07:21
query admin.userinfo reslen:202 nscanned:10000
query: { query: { firstname: "zhou" }, $explain: true } nreturned:1 5ms
1ms Sun Jan 09 2011 09:02:56
insert admin.userinfo
输出格式:
1. ts : 记录下捕获的时间戳,(SQL执行结束时间,如下)
> Date()
Sun Jan 09 2011 09:42:31 GMT+0800 (CST)
> db.userinfo.ensureIndex({"lastname" : 1,"detail.city" : 1})
> db.system.profile.find().sort({"$natural" : -1}).limit(1)
{ "ts" : "Sun Jan 09 2011 09:42:46 GMT+0800 (CST)", "info" : "insert admin.system.indexes 14374ms", "millis" : 14374 }
# 从时间上看是执行结束时间.
2. millis : 执行时间,注意不包含lock请求时间和network传输时间,只包含了server处理这条请求的时间.
3. info : 执行详细信息,包含(query,update,insert,getmore)
query : 数据库查询操作,包含以下信息:
ntoreturn : Number of objects the client requested for return from a query. For example, <code>findOne()</code> sets ntoreturn to 1.<code>limit()</code> sets the appropriate limit. Zero indicates no limit.
query : Details of the query spec.
nscanned : Number of objects scanned in executing the operation.
reslen : Query result length in bytes.
nreturned : Number of objects returned from query.
update : A database update operation. <code>save()</code> calls generate either an update or insert operation.
fastmod : Indicates a fast modify operation. See Updates. These operations are normally quite fast.
fastmodinsert : indicates a fast modify operation that performed an upsert.
upsert : Indicates on upsert performed.
moved : Indicates the update moved the object on disk (not updated in place). This is slower than an in place update, and normally occurs when an object grows.
insert : A database insert.
getmore : For large queries, the database initially returns partial information. getmore indicates a call to retrieve further information.
# 例:
> db.userinfo.find().sort({"age" : -1}).limit(1)
{ "_id" : ObjectId("4d2910734f7371f2497bd6ad"), "firstname" : "zhou", "lastname" : "digoal999998", "age" : 999998, "detail" : { "city" : "HangZhou", "corp" : "sky-mobi" } }
> db.system.profile.find().sort({"$natural" : -1}).limit(1)
{ "ts" : "Sun Jan 09 2011 09:53:42 GMT+0800 (CST)", "info" : "query admin.userinfo ntoreturn:1 scanAndOrder reslen:169 nscanned:2009998 \nquery: { query: {}, orderby: { age: -1.0 } } nreturned:1 3140ms", "millis" : 3140 }
# 性能优化相关
从上面的profile结果看出ntoreturn = 1,nscanned = 2009998 ,也就是说返回结果是1条,但是扫描了200多W条记录。如果这种查询非常多的话建议在"age"上建立索引。对于update操作同样适用.(当然,还要考虑索引对写入带来的overhead)
如果reslen这个值比较大,表示返回结果非常多,建议调整find()限制输出结果.
调整profileSIZE:
1. 关闭profile
> db.setProfilingLevel(0)
{ "was" : 1, "slowms" : 1, "ok" : 1 }
2. 调整profile size
> db.system.profile.renameCollection("system.oldprofile")
{ "ok" : 1 }
> show collections
system.indexes
system.oldprofile
system.users
userinfo
version
# 调整为100MB
> db.createCollection("system.profile",{"capped" : true, "size" : 102400000})
{ "ok" : 1 }
> db.system.profile.stats()
{
"ns" : "admin.system.profile",
"count" : 0,
"size" : 0,
"avgObjSize" : NaN,
"storageSize" : 102400256,
"numExtents" : 1,
"nindexes" : 0,
"lastExtentSize" : 102400256,
"paddingFactor" : 1,
"flags" : 0,
"totalIndexSize" : 0,
"indexSizes" : {
},
"capped" : 1,
"max" : 2147483647,
"ok" : 1
}
3. 重新开启profile
> db.setProfilingLevel(1,1)
{ "was" : 0, "slowms" : 1, "ok" : 1 }
4. oldprofile中的数据无法插入到profile
测试:
> for (var c = db.system.oldprofile.find(); c.hasNext();) {
... b = c.next();
... db.system.profile.insert({"ts" : c.ts,"info" : c.info,"millis" : c.millis});
... }
> db.system.profile.count()
0
# 没有插进去
# 非常抱歉,插入到test是OK的. system.profile 不可以手工插入记录
> for (var c = db.system.oldprofile.find(); c.hasNext();) {
... b = c.next();
... db.test.insert({"ts" : c.ts,"info" : c.info,"millis" : c.millis});
... }
> db.test.count()
6
5. 删除oldprofile
> db.system.oldprofile.drop()
Sun Jan 9 10:38:57 uncaught exception: drop failed: {
"ns" : "admin.system.oldprofile",
"errmsg" : "exception: can't drop system ns",
"code" : 12502,
"ok" : 0
}
# system下面的collection不允许删除,所以要先rename一下.
> db.system.oldprofile.renameCollection("oldprofile")
{ "ok" : 1 }
> db.oldprofile.drop()
true
注意事项:
1. system.profile 是capped collection,默认配置比较小,所以如果需要放更多的内容需要调整size和max (具体的话可以参考我以前写过的capped collection topic)
如:
> db.system.profile.stats()
{
"ns" : "admin.system.profile",
"count" : 1,
"size" : 68,
"avgObjSize" : 68,
"storageSize" : 131328,
"numExtents" : 1,
"nindexes" : 0,
"lastExtentSize" : 131328,
"paddingFactor" : 1,
"flags" : 0,
"totalIndexSize" : 0,
"indexSizes" : {
},
"capped" : 1,
"max" : 2147483647,
"ok" : 1
}
> db.system.profile.totalSize()
131328
2. 开启profile对性能略有影响.
3. 通过命令行开启profile只对当前数据库生效,并且重启数据库后失效.
# 例如在admin库中使用mongo shell开启了profile,转到test库后是没有开启的.
> use test
switched to db test
> show collections
system.indexes
system.users
userinfo
> db.getProfilingLevel()
0
> db.setProfilingLevel(1,1)
{ "was" : 0, "slowms" : 1, "ok" : 1 }
> db.getProfilingLevel()
1
# 使用mongo shell开启的profile重启后失效,如下
# 重启mongoDB
[root@db5 ~]# /opt/mongodb/bin/mongo 127.0.0.1:5281/admin
MongoDB shell version: 1.6.5
connecting to: 127.0.0.1:5281/admin
> db.auth("digoal","DIGOAL")
1
> db.getProfilingLevel()
0
# 显然system.profile是还在的.
> show collections
system.indexes
system.profile
system.users
userinfo
version
> db.system.profile.find()
{ "ts" : "Sun Jan 09 2011 09:02:56 GMT+0800 (CST)", "info" : "insert admin.userinfo", "millis" : 1 }
{ "ts" : "Sun Jan 09 2011 09:07:21 GMT+0800 (CST)", "info" : "query admin.userinfo reslen:202 nscanned:10000 \nquery: { query: { firstname: \"zhou\" }, $explain: true } nreturned:1 5ms", "millis" : 5 }
4. system.profile 不可以手工插入记录
我记得在PostgreSQL Conference 2009 中曾经就有人专门讲解了Hit the mole.(90%的数据库性能问题往往出现在10%的SQL上面.)
在MySQL,PostgreSQL数据库中,用日志来记录慢查询。(比如为数据库设置一下阀值100MS,运行时间超过100MS的SQL将记录到日志中.)
在mongoDB中也比较类似,使用profile 来配置慢查询的阀值.已经是否开启记录慢查询的功能。mongoDB的慢查询被记录到system.profile collection中。
在mongoDB 1.7.x 以下版本中使用getProfilingLevel()查看当前配置的数据库profile.1.7.x以后改为getProfilingStatus().
配置profile可以通过参数文件,或在mongo shell中执行命令来配置.
1. 参数文件
--profile arg 0=off 1=slow, 2=all (0表示关闭profile,1表示只记录执行时间超过slowms配置的值的执行内容,2表示记录所有执行内容)
--slowms arg (=100) value of slow for profile and console log (如果profile配置为1并且没有配置slowms的话默认是100毫秒)
2. mongo shell
db.setProfilingLevel( level , slowms )
3. 查看profile
> db.getProfilingLevel()
1
配置举例:
# 配置slowms=1,profile=1
> db.setProfilingLevel(1,1)
{ "was" : 0, "slowms" : 100, "ok" : 1 }
# 注意这里返回的slowms : 100,实际上已经配置为1了,估计是个BUG。在1.7.x里面可以通过以下命令查看:
> db.getProfilingStatus()# 也可以使用规则表达式过滤出你关心的collection,如下.
{ "was" : 1, "slowms" : 1 }
# 插入测试数据
> for (var i=0;i<10000;i++) {
... db.userinfo.insert({"firstname" : "zhou","lastname" : "digoal" + i,"age" : i})
... }
# 在1.6.5的环境下,没有办法通过db.getProfilingLevel()查看slowms的具体配置,不过通过以下测试可以知道,前面配置的slowms=1已经生效.
> db.userinfo.find({"firstname" : "zhou"}).explain()
{
"cursor" : "BasicCursor",
"nscanned" : 10000,
"nscannedObjects" : 10000,
"n" : 10000,
"millis" : 5,
"indexBounds" : {
}
}
> db.system.profile.find()
{ "ts" : "Sun Jan 09 2011 09:02:56 GMT+0800 (CST)", "info" : "insert admin.userinfo", "millis" : 1 }
{ "ts" : "Sun Jan 09 2011 09:07:21 GMT+0800 (CST)", "info" : "query admin.userinfo reslen:202 nscanned:10000 \nquery: { query: { firstname: \"zhou\" }, $explain: true } nreturned:1 5ms", "millis" : 5 }
# 使用以下方式查看效果也是一样的
> db.system.profile.find( function() { return this.info.indexOf('$cmd')<0; } )
{ "ts" : "Sun Jan 09 2011 09:02:56 GMT+0800 (CST)", "info" : "insert admin.userinfo", "millis" : 1 }
{ "ts" : "Sun Jan 09 2011 09:07:21 GMT+0800 (CST)", "info" : "query admin.userinfo reslen:202 nscanned:10000 \nquery: { query: { firstname: \"zhou\" }, $explain: true } nreturned:1 5ms", "millis" : 5 }
> db.system.profile.find({"info" : /userinfo/})
{ "ts" : "Sun Jan 09 2011 09:02:56 GMT+0800 (CST)", "info" : "insert admin.userinfo", "millis" : 1 }
{ "ts" : "Sun Jan 09 2011 09:07:21 GMT+0800 (CST)", "info" : "query admin.userinfo reslen:202 nscanned:10000 \nquery: { query: { firstname: \"zhou\" }, $explain: true } nreturned:1 5ms", "millis" : 5 }
# 或者按millis排个序
> db.system.profile.find().sort({"millis" : -1})
{ "ts" : "Sun Jan 09 2011 09:07:21 GMT+0800 (CST)", "info" : "query admin.userinfo reslen:202 nscanned:10000 \nquery: { query: { firstname: \"zhou\" }, $explain: true } nreturned:1 5ms", "millis" : 5 }
{ "ts" : "Sun Jan 09 2011 09:02:56 GMT+0800 (CST)", "info" : "insert admin.userinfo", "millis" : 1 }
# 由于system.profile是capped collection,使用$natural 可以按照记录插入的插入顺序查看结果.
> db.system.profile.find().sort({"$natural" : -1})
{ "ts" : "Sun Jan 09 2011 09:07:21 GMT+0800 (CST)", "info" : "query admin.userinfo reslen:202 nscanned:10000 \nquery: { query: { firstname: \"zhou\" }, $explain: true } nreturned:1 5ms", "millis" : 5 }
{ "ts" : "Sun Jan 09 2011 09:02:56 GMT+0800 (CST)", "info" : "insert admin.userinfo", "millis" : 1 }
# 使用show profile 查看最近5%的超过1ms执行时间的事件
> show profile
5ms Sun Jan 09 2011 09:07:21
query admin.userinfo reslen:202 nscanned:10000
query: { query: { firstname: "zhou" }, $explain: true } nreturned:1 5ms
1ms Sun Jan 09 2011 09:02:56
insert admin.userinfo
输出格式:
1. ts : 记录下捕获的时间戳,(SQL执行结束时间,如下)
> Date()
Sun Jan 09 2011 09:42:31 GMT+0800 (CST)
> db.userinfo.ensureIndex({"lastname" : 1,"detail.city" : 1})
> db.system.profile.find().sort({"$natural" : -1}).limit(1)
{ "ts" : "Sun Jan 09 2011 09:42:46 GMT+0800 (CST)", "info" : "insert admin.system.indexes 14374ms", "millis" : 14374 }
# 从时间上看是执行结束时间.
2. millis : 执行时间,注意不包含lock请求时间和network传输时间,只包含了server处理这条请求的时间.
3. info : 执行详细信息,包含(query,update,insert,getmore)
query : 数据库查询操作,包含以下信息:
ntoreturn : Number of objects the client requested for return from a query. For example, <code>findOne()</code> sets ntoreturn to 1.<code>limit()</code> sets the appropriate limit. Zero indicates no limit.
query : Details of the query spec.
nscanned : Number of objects scanned in executing the operation.
reslen : Query result length in bytes.
nreturned : Number of objects returned from query.
update : A database update operation. <code>save()</code> calls generate either an update or insert operation.
fastmod : Indicates a fast modify operation. See Updates. These operations are normally quite fast.
fastmodinsert : indicates a fast modify operation that performed an upsert.
upsert : Indicates on upsert performed.
moved : Indicates the update moved the object on disk (not updated in place). This is slower than an in place update, and normally occurs when an object grows.
insert : A database insert.
getmore : For large queries, the database initially returns partial information. getmore indicates a call to retrieve further information.
# 例:
> db.userinfo.find().sort({"age" : -1}).limit(1)
{ "_id" : ObjectId("4d2910734f7371f2497bd6ad"), "firstname" : "zhou", "lastname" : "digoal999998", "age" : 999998, "detail" : { "city" : "HangZhou", "corp" : "sky-mobi" } }
> db.system.profile.find().sort({"$natural" : -1}).limit(1)
{ "ts" : "Sun Jan 09 2011 09:53:42 GMT+0800 (CST)", "info" : "query admin.userinfo ntoreturn:1 scanAndOrder reslen:169 nscanned:2009998 \nquery: { query: {}, orderby: { age: -1.0 } } nreturned:1 3140ms", "millis" : 3140 }
# 性能优化相关
从上面的profile结果看出ntoreturn = 1,nscanned = 2009998 ,也就是说返回结果是1条,但是扫描了200多W条记录。如果这种查询非常多的话建议在"age"上建立索引。对于update操作同样适用.(当然,还要考虑索引对写入带来的overhead)
如果reslen这个值比较大,表示返回结果非常多,建议调整find()限制输出结果.
调整profileSIZE:
1. 关闭profile
> db.setProfilingLevel(0)
{ "was" : 1, "slowms" : 1, "ok" : 1 }
2. 调整profile size
> db.system.profile.renameCollection("system.oldprofile")
{ "ok" : 1 }
> show collections
system.indexes
system.oldprofile
system.users
userinfo
version
# 调整为100MB
> db.createCollection("system.profile",{"capped" : true, "size" : 102400000})
{ "ok" : 1 }
> db.system.profile.stats()
{
"ns" : "admin.system.profile",
"count" : 0,
"size" : 0,
"avgObjSize" : NaN,
"storageSize" : 102400256,
"numExtents" : 1,
"nindexes" : 0,
"lastExtentSize" : 102400256,
"paddingFactor" : 1,
"flags" : 0,
"totalIndexSize" : 0,
"indexSizes" : {
},
"capped" : 1,
"max" : 2147483647,
"ok" : 1
}
3. 重新开启profile
> db.setProfilingLevel(1,1)
{ "was" : 0, "slowms" : 1, "ok" : 1 }
4. oldprofile中的数据无法插入到profile
测试:
> for (var c = db.system.oldprofile.find(); c.hasNext();) {
... b = c.next();
... db.system.profile.insert({"ts" : c.ts,"info" : c.info,"millis" : c.millis});
... }
> db.system.profile.count()
0
# 没有插进去
# 非常抱歉,插入到test是OK的. system.profile 不可以手工插入记录
> for (var c = db.system.oldprofile.find(); c.hasNext();) {
... b = c.next();
... db.test.insert({"ts" : c.ts,"info" : c.info,"millis" : c.millis});
... }
> db.test.count()
6
5. 删除oldprofile
> db.system.oldprofile.drop()
Sun Jan 9 10:38:57 uncaught exception: drop failed: {
"ns" : "admin.system.oldprofile",
"errmsg" : "exception: can't drop system ns",
"code" : 12502,
"ok" : 0
}
# system下面的collection不允许删除,所以要先rename一下.
> db.system.oldprofile.renameCollection("oldprofile")
{ "ok" : 1 }
> db.oldprofile.drop()
true
注意事项:
1. system.profile 是capped collection,默认配置比较小,所以如果需要放更多的内容需要调整size和max (具体的话可以参考我以前写过的capped collection topic)
如:
> db.system.profile.stats()
{
"ns" : "admin.system.profile",
"count" : 1,
"size" : 68,
"avgObjSize" : 68,
"storageSize" : 131328,
"numExtents" : 1,
"nindexes" : 0,
"lastExtentSize" : 131328,
"paddingFactor" : 1,
"flags" : 0,
"totalIndexSize" : 0,
"indexSizes" : {
},
"capped" : 1,
"max" : 2147483647,
"ok" : 1
}
> db.system.profile.totalSize()
131328
2. 开启profile对性能略有影响.
3. 通过命令行开启profile只对当前数据库生效,并且重启数据库后失效.
# 例如在admin库中使用mongo shell开启了profile,转到test库后是没有开启的.
> use test
switched to db test
> show collections
system.indexes
system.users
userinfo
> db.getProfilingLevel()
0
> db.setProfilingLevel(1,1)
{ "was" : 0, "slowms" : 1, "ok" : 1 }
> db.getProfilingLevel()
1
# 使用mongo shell开启的profile重启后失效,如下
# 重启mongoDB
[root@db5 ~]# /opt/mongodb/bin/mongo 127.0.0.1:5281/admin
MongoDB shell version: 1.6.5
connecting to: 127.0.0.1:5281/admin
> db.auth("digoal","DIGOAL")
1
> db.getProfilingLevel()
0
# 显然system.profile是还在的.
> show collections
system.indexes
system.profile
system.users
userinfo
version
> db.system.profile.find()
{ "ts" : "Sun Jan 09 2011 09:02:56 GMT+0800 (CST)", "info" : "insert admin.userinfo", "millis" : 1 }
{ "ts" : "Sun Jan 09 2011 09:07:21 GMT+0800 (CST)", "info" : "query admin.userinfo reslen:202 nscanned:10000 \nquery: { query: { firstname: \"zhou\" }, $explain: true } nreturned:1 5ms", "millis" : 5 }
4. system.profile 不可以手工插入记录