mongoDB's Query Optimizer [ not CBO ]

2016-03-31

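Summary:

For each query a client submits, MongoDB's optimizer generates all candidate execution plans and runs them in parallel. The plan that finishes first is selected and cached, and the plans that have not finished are discarded. On the next execution, MongoDB matches the query against the cached plan.
What happens when a plan that was selected as the better one stops being good later on (for example, because the data has changed or the query is called with different parameters, so it no longer reaches the expected efficiency)? When MongoDB detects that a plan is no longer performing well, it selects and caches a better plan again.
An optimizer like this, which is not based on cost and statistics, has some advantages, but it also puts some extra load on the database. Which approach is better is hard to say definitively.
For databases that use a CBO, such as PostgreSQL, the usual advice is to re-run analyze on a table (that is, collect statistics) after bulk operations on it. The same applies to Oracle: if the statistics are inaccurate, the database chooses a poor execution plan, which can put a serious load on it.
MongoDB has no statistics, so there is nothing to analyze. If a bulk update makes the execution plan wrong, then by MongoDB's own description the next execution should be back to normal. But could plans that go wrong because of different parameter values lead to plan thrashing? Besides isolating the data at the business layer, another workaround is to avoid bind variables. That, however, puts a heavy load on the CPU.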
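The trade-off in the last paragraph can be made concrete with a small sketch. It is a toy model, not MongoDB's actual plan cache: `shape_key`, `literal_key`, and `count_plannings` are hypothetical helpers. The sketch counts how many times a query would have to be planned when the cache key ignores parameter values (similar to bind variables) versus when every distinct literal value gets its own entry.

```python
import json

def shape_key(query: dict) -> str:
    """Cache key that ignores parameter values (comparable to bind variables)."""
    return json.dumps(sorted(query.keys()))

def literal_key(query: dict) -> str:
    """Cache key that includes the literal values (comparable to no bind variables)."""
    return json.dumps(query, sort_keys=True)

def count_plannings(queries, key_fn) -> int:
    """Count how many queries miss the cache and would need a fresh planning round."""
    seen = set()
    plannings = 0
    for query in queries:
        key = key_fn(query)
        if key not in seen:
            seen.add(key)
            plannings += 1
    return plannings

# 1000 queries of the same shape, with 50 distinct values for "user".
queries = [{"user": i % 50, "status": "open"} for i in range(1000)]

shared = count_plannings(queries, shape_key)       # one plan shared by all values
per_value = count_plannings(queries, literal_key)  # one plan per distinct value
```

With a shared key, a plan chosen for one parameter value is reused for all others, which is where a mismatched plan can come from. With literal keys each value gets its own plan, at the price of many more planning rounds, which is the CPU cost the paragraph above refers to.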

[Original text]

The MongoDB query optimizer generates query plans for each query submitted by a client. These plans are executed to return results. Thus, MongoDB supports ad hoc queries much like, say, MySQL.

The database uses an interesting approach to query optimization though. Traditional approaches (which tend to be cost-based and statistical) are not used, as these approaches have a couple of problems.

First, the optimizer might consistently pick a bad query plan. For example, there might be correlations in the data of which the optimizer is unaware. In a situation like this, the developer might use a query hint.

Also with the traditional approach, query plans can change in production with negative results. No one thinks rolling out new code without testing is a good idea. Yet often in a production system a query plan can change as the statistics in the database change on the underlying data. The query plan in effect may be a plan that never was invoked in QA. If it is slower than it should be, the application could experience an outage.

The Mongo query optimizer is different. It is not cost based -- it does not model the cost of various queries. Instead, the optimizer simply tries different query plans and learns which ones work well. Of course, when the system tries a really bad plan, it may take an extremely long time to run. To solve this, when testing new plans, MongoDB executes multiple query plans in parallel. As soon as one finishes, it terminates the other executions, and the system has learned which plan is good. This works particularly well given the system is non-relational, which makes the space of possible query plans much smaller (as there are no joins).
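The racing idea can be illustrated with a minimal Python sketch. It is a simplified model under stated assumptions, not MongoDB's implementation: each candidate plan is a generator that performs one unit of work per step, the "parallel" execution is simulated by stepping the plans round-robin, and `race_plans`, `full_scan`, and `index_scan` are hypothetical names.

```python
from typing import Dict, Generator, List, Optional, Tuple

Doc = dict
Plan = Generator[Optional[Doc], None, None]

def full_scan(docs: List[Doc], field: str, value) -> Plan:
    """Examine every document; yield a match, or None for a non-matching step."""
    for doc in docs:
        yield doc if doc[field] == value else None

def index_scan(index: Dict[object, List[Doc]], value) -> Plan:
    """Walk only the documents that the (toy) index lists for this value."""
    for doc in index.get(value, []):
        yield doc

def race_plans(plans: Dict[str, Plan]) -> Tuple[str, List[Doc]]:
    """Step all plans in turn; the first plan to finish wins, the rest are stopped."""
    if not plans:
        raise ValueError("at least one candidate plan is required")
    results: Dict[str, List[Doc]] = {name: [] for name in plans}
    while True:
        for name, plan in plans.items():
            try:
                item = next(plan)
            except StopIteration:
                # This plan finished first: terminate the other executions.
                for other_name, other_plan in plans.items():
                    if other_name != name:
                        other_plan.close()
                return name, results[name]
            if item is not None:
                results[name].append(item)

# Toy data: 1000 documents, 10 of which have city "A".
docs = [{"_id": i, "city": "A" if i % 100 == 0 else "B"} for i in range(1000)]
city_index: Dict[object, List[Doc]] = {}
for doc in docs:
    city_index.setdefault(doc["city"], []).append(doc)

winner, matches = race_plans({
    "full_scan": full_scan(docs, "city", "A"),
    "index_scan": index_scan(city_index, "A"),
})
```

Because the losing plans are stopped as soon as a winner emerges, a very bad plan costs only as many steps as the best plan needed, which is the point the paragraph above makes about bounding the cost of trying plans.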

Sometimes a plan which was working well can work poorly -- for example if the data in the database has changed, or if the parameter values to the query are different. In this case, if the query seems to be taking longer than usual, the database will once again run the query in parallel to try different plans.
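The re-planning behaviour might be sketched as follows. This is a toy model with assumptions of its own: the `PlanCache` class, the `slowdown_factor` threshold, and the "work units" measure are hypothetical. For brevity the trial phase runs each candidate to completion in sequence and keeps the cheapest, rather than racing them in parallel.

```python
from typing import Callable, Dict, List, Tuple

Doc = dict
PlanFn = Callable[[dict], Tuple[List[Doc], int]]  # returns (results, work units)

class PlanCache:
    """Caches the winning plan per query shape and re-plans when it degrades."""

    def __init__(self, slowdown_factor: float = 10.0):
        self.slowdown_factor = slowdown_factor  # assumed threshold, not MongoDB's
        self._entries: Dict[str, Tuple[str, int]] = {}  # shape -> (plan, baseline)
        self.replans = 0

    def execute(self, shape: str, plans: Dict[str, PlanFn], query: dict):
        entry = self._entries.get(shape)
        if entry is not None:
            name, baseline = entry
            results, work = plans[name](query)
            if work <= self.slowdown_factor * max(baseline, 1):
                return name, results
            self.replans += 1  # the cached plan took much longer than usual
        # Trial phase: try every candidate and remember the cheapest one.
        trials = {name: plan(query) for name, plan in plans.items()}
        best = min(trials, key=lambda n: trials[n][1])
        self._entries[shape] = (best, trials[best][1])
        return best, trials[best][0]

# Toy data: a single "open" document, and 20 documents per user.
docs = [{"user": i % 50, "status": "open" if i == 0 else "closed"} for i in range(1000)]

def by_status(q: dict):
    examined = [d for d in docs if d["status"] == q["status"]]
    return [d for d in examined if d["user"] == q["user"]], len(examined)

def by_user(q: dict):
    examined = [d for d in docs if d["user"] == q["user"]]
    return [d for d in examined if d["status"] == q["status"]], len(examined)

plans = {"by_status": by_status, "by_user": by_user}
cache = PlanCache()
first = cache.execute("status+user", plans, {"status": "open", "user": 0})
second = cache.execute("status+user", plans, {"status": "closed", "user": 0})
```

In this example the first call caches `by_status`, which is cheap for the rare value "open". The second call uses the common value "closed", the cached plan examines far more documents than its baseline, and the cache falls back to a new trial that selects `by_user`.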

This approach adds a little overhead, but has the advantage of being much better at worst-case performance.
