Exploration | Troubleshooting High CPU Usage in Elasticsearch - Alibaba Cloud Developer Community


2021-11-10

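1. Possible causes of high CPU usage in ES

1.1 Complex queries. For example, I once saw a combination of 200 wildcard queries bring a cluster down.

1.2 A large number of reindex operations.

1.3 An old ES version.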

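2. Troubleshooting approach

2.1 Review the business scenario

Ask yourself a few questions:

- 1) What kinds of data does the cluster hold?

- 2) How much data is in the cluster?

- 3) How many nodes and shards does the cluster have?

- 4) What are the cluster's current indexing and search rates?

- 5) What kinds of queries or other operations are running right now?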
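Question 4 can be answered roughly by sampling `GET _stats` twice and dividing the counter deltas by the elapsed time. The following is a minimal sketch, assuming the responses carry the `_all.total.indexing.index_total` and `_all.total.search.query_total` counters; the function name `rates` and the sample values are hypothetical.

```python
def rates(before, after, seconds):
    """Per-second indexing and search rates between two `GET _stats` responses."""
    def total(stats, group, counter):
        return stats["_all"]["total"][group][counter]

    indexed = total(after, "indexing", "index_total") - total(before, "indexing", "index_total")
    queried = total(after, "search", "query_total") - total(before, "search", "query_total")
    return {"index_per_s": indexed / seconds, "query_per_s": queried / seconds}


# Hypothetical samples taken 30 seconds apart, trimmed to the two counters used here.
t0 = {"_all": {"total": {"indexing": {"index_total": 1000}, "search": {"query_total": 400}}}}
t1 = {"_all": {"total": {"indexing": {"index_total": 1600}, "search": {"query_total": 550}}}}

print(rates(t0, t1, 30))
```

Comparing these rates with the values seen during normal operation helps tell a traffic spike apart from a small number of expensive requests.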

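2.2 Watch the machine with htop, and use ElasticHQ to follow the CPU curve.

2.3 When CPU is high, check the ES node logs to see whether there is a large amount of GC activity.

2.4 Check hot_threads.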
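Besides reading the logs, GC pressure can be estimated from two samples of `GET _nodes/stats/jvm`. This is a minimal sketch, assuming each node entry exposes `jvm.gc.collectors.<name>.collection_time_in_millis`; the helper name `gc_time_share` and the sample numbers are hypothetical.

```python
def gc_time_share(before, after, interval_ms):
    """Fraction of wall-clock time each node spent in GC between two samples."""
    def gc_ms(node):
        collectors = node["jvm"]["gc"]["collectors"].values()
        return sum(c["collection_time_in_millis"] for c in collectors)

    shares = {}
    for node_id, node in after["nodes"].items():
        prev = before["nodes"].get(node_id)
        if prev is None:  # node joined between the two samples
            continue
        shares[node["name"]] = (gc_ms(node) - gc_ms(prev)) / interval_ms
    return shares


def sample(young_ms, old_ms):
    # Hypothetical, trimmed-down response with a single node.
    return {"nodes": {"abc": {"name": "test", "jvm": {"gc": {"collectors": {
        "young": {"collection_time_in_millis": young_ms},
        "old": {"collection_time_in_millis": old_ms},
    }}}}}}


print(gc_time_share(sample(1000, 200), sample(1500, 3700), 10_000))
```

A node that spends a large share of each interval in GC is likely short on heap, which points toward the memory-related fixes rather than query tuning.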

GET _nodes/hot_threads

::: {test}{ikKuXkFvRc-qFCqG99smGg}{VE-uqoiARoONJwomfPwRBw}{127.0.0.1}{127.0.0.1:9300}{ml.machine_memory=8481566720, ml.max_open_jobs=20, ml.enabled=true}

Hot threads at 2018-04-09T15:58:21.117Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

0.0% (0s out of 500ms) cpu usage by thread 'Attach Listener'

unique snapshot
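The hot_threads response is plain text, so a small parser can pull out the per-thread usage lines and sort them. The sketch below assumes the line layout shown in the sample output above; the function name `busiest_threads` is hypothetical.

```python
import re

# Matches lines such as:
#   0.0% (0s out of 500ms) cpu usage by thread 'Attach Listener'
THREAD_LINE = re.compile(
    r"^\s*(\d+(?:\.\d+)?)% \((\S+) out of (\S+)\) (\w+) usage by thread '([^']+)'"
)


def busiest_threads(text):
    """Return (percent, usage type, thread name) tuples, busiest first."""
    found = []
    for line in text.splitlines():
        match = THREAD_LINE.match(line)
        if match:
            found.append((float(match.group(1)), match.group(4), match.group(5)))
    return sorted(found, key=lambda item: item[0], reverse=True)


output = """Hot threads at 2018-04-09T15:58:21.117Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

0.0% (0s out of 500ms) cpu usage by thread 'Attach Listener'

unique snapshot
"""

print(busiest_threads(output))
```

In the sample above the only reported thread is idle; on a node with a real CPU problem, the thread names at the top of the list usually indicate whether search, bulk indexing, or merging is responsible.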

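3. Solutions

3.1 If the cluster load is high, add new nodes to relieve it.

3.2 Increase the heap to half of the system memory, up to a maximum of 31GB (the theoretical upper limit is 32GB).

If the machine does not have enough memory, add more.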

https://github.com/elastic/elasticsearch/issues/10437

https://discuss.elastic.co/t/es-high-cpu-usage-when-idle/87950/4
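The sizing rule in 3.2 can be written as a one-line calculation. This is only an illustration of that rule; the function name `recommended_heap_gb` is hypothetical, and the resulting values would go into the JVM's `-Xms` and `-Xmx` options.

```python
def recommended_heap_gb(system_ram_gb, cap_gb=31):
    """Half of system memory, capped at 31GB, following the rule in 3.2."""
    return min(system_ram_gb // 2, cap_gb)


for ram in (16, 64, 128):
    heap = recommended_heap_gb(ram)
    print(f"{ram}GB RAM: -Xms{heap}g -Xmx{heap}g")
```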

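3.3 Set the number of replicas to 0 while inserting data.

The number of primary shards cannot be changed after an index is created, but the number of replicas can.

Note: too many shards puts heavy pressure on heap memory.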
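Because the replica count is a dynamic index setting, it can be lowered before a bulk load and restored afterwards through the index `_settings` endpoint. The sketch below only builds the requests and does not send them; the index name `logs-demo` and the helper `replica_settings_request` are hypothetical.

```python
import json


def replica_settings_request(index, replicas):
    """Build (method, path, body) for updating number_of_replicas on an index."""
    body = json.dumps({"index": {"number_of_replicas": replicas}})
    return "PUT", f"/{index}/_settings", body


# Before the bulk load: drop replicas. Afterwards: restore them.
print(replica_settings_request("logs-demo", 0))
print(replica_settings_request("logs-demo", 1))
```

Restoring the replicas after the load matters: while the count is 0, losing a node means losing the data on its shards.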

3.4 Configuration tuning

The setting names below are from older (1.x-era) ES releases. Several were renamed in later versions (for example, bootstrap.mlockall became bootstrap.memory_lock and threadpool.* became thread_pool.* in 5.0), so check them against the documentation for your version before applying them.

# Force all memory to be locked, forcing the JVM to never swap
bootstrap.mlockall: true

# Threadpool Settings

# Search pool
threadpool.search.type: fixed
threadpool.search.size: 20
threadpool.search.queue_size: 200

# Bulk pool
threadpool.bulk.type: fixed
threadpool.bulk.size: 60
threadpool.bulk.queue_size: 3000

# Index pool
threadpool.index.type: fixed
threadpool.index.size: 20
threadpool.index.queue_size: 1000

# Indices settings
indices.memory.index_buffer_size: 30%
indices.memory.min_shard_index_buffer_size: 12mb
indices.memory.min_index_buffer_size: 96mb

# Cache Sizes
indices.fielddata.cache.size: 30%
#indices.fielddata.cache.expire: 6h # will be deprecated; the developers recommend not using it
indices.cache.filter.size: 30%
#indices.cache.filter.expire: 6h # will be deprecated; the developers recommend not using it

# Indexing Settings for Writes
index.refresh_interval: 30s
#index.translog.flush_threshold_ops: 50000
#index.translog.flush_threshold_size: 1024mb
index.translog.flush_threshold_period: 5m
index.merge.scheduler.max_thread_count: 1

Reference: https://github.com/elastic/elasticsearch/issues/4288
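Before changing thread pool sizes or queue sizes, it helps to confirm which pools are actually backing up. The following is a minimal sketch, assuming rows shaped like the JSON output of `GET _cat/thread_pool?format=json&h=node_name,name,active,queue,rejected`, where numeric columns arrive as strings; the function name `saturated_pools` and the sample rows are hypothetical.

```python
def saturated_pools(rows):
    """Return (node, pool, queue, rejected) for pools with queued or rejected tasks."""
    flagged = []
    for row in rows:
        queue, rejected = int(row["queue"]), int(row["rejected"])
        if queue > 0 or rejected > 0:
            flagged.append((row["node_name"], row["name"], queue, rejected))
    return flagged


# Hypothetical sample rows.
rows = [
    {"node_name": "test", "name": "search", "active": "20", "queue": "180", "rejected": "12"},
    {"node_name": "test", "name": "bulk", "active": "3", "queue": "0", "rejected": "0"},
]

print(saturated_pools(rows))
```

A pool with a growing rejected count is the one worth tuning; raising queue sizes on pools that never fill only adds memory overhead.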
