How to Handle Unassigned Shards in Elasticsearch (ES)


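Solution: (1) If the cluster is red, you can manually assign the shard to the node that you believe holds the most recent (or most complete) data. See below: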

 

Source: https://discuss.elastic.co/t/how-to-resolve-the-unassigned-shards/87635

Use the reroute command to assign the unassigned shard to a node.

If your cluster is red, then you probably have unassigned primary shards. The command below lets you reassign a shard that has gone "stale". This means that ES is not sure which copy of the shard has the most recent data, and it will not assign one as primary, because if another copy with newer data connects to the cluster later, it will be overwritten. If you're confident that the shard has all of the data you need, you can assign it to a node with the command below. Just be wary of data loss.

curl -XPOST 'localhost:9200/_cluster/reroute?pretty' -H 'Content-Type: application/json' -d'
{
  "commands": [
    {
      "allocate_stale_primary": {
        "index": "test",
        "shard": 1,
        "node": "node3",
        "accept_data_loss": true
      }
    }
  ]
}
'
Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-reroute.html
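
The same request can also be assembled programmatically. The following is a minimal Python sketch, not an official client. The helper names `build_allocate_stale_primary` and `reroute` are hypothetical, the values `test`, `1`, `node3`, and `localhost:9200` simply mirror the curl example above, and the code assumes an unsecured cluster reachable over plain HTTP.

```python
import json
import urllib.request


def build_allocate_stale_primary(index, shard, node):
    """Build a reroute body that promotes a stale shard copy to primary."""
    return {
        "commands": [
            {
                "allocate_stale_primary": {
                    "index": index,
                    "shard": shard,
                    "node": node,
                    # Explicit acknowledgement: writes that exist only on
                    # other copies of this shard may be lost.
                    "accept_data_loss": True,
                }
            }
        ]
    }


def reroute(body, base_url="http://localhost:9200"):
    """POST the body to _cluster/reroute (assumption: no authentication)."""
    request = urllib.request.Request(
        base_url + "/_cluster/reroute?pretty",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the payload; nothing is sent to the cluster here.
    body = build_allocate_stale_primary("test", 1, "node3")
    print(json.dumps(body, indent=2))
```

Keeping payload construction separate from the HTTP call makes it easy to review exactly what will be sent before accepting possible data loss.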

 

(2) If the cluster is yellow, you can wait for it to recover.

The status may be one of three values:

green
All primary and replica shards are allocated. Your cluster is 100% operational.
yellow
All primary shards are allocated, but at least one replica is missing. No data is missing, so search results will still be complete. However, your high availability is compromised to some degree. If more shards disappear, you might lose data. Think of yellow as a warning that should prompt investigation.
red
At least one primary shard (and all of its replicas) is missing. This means that you are missing data: searches will return partial results, and indexing into that shard will return an exception.
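
The three definitions above amount to a simple decision rule. The sketch below is a simplified model of that rule, not Elasticsearch's actual implementation. The function name `cluster_status` and the `(is_primary, is_assigned)` tuple format are assumptions made for illustration.

```python
def cluster_status(shards):
    """Derive green/yellow/red from (is_primary, is_assigned) pairs."""
    # Red: at least one primary shard is not allocated.
    if any(is_primary and not assigned for is_primary, assigned in shards):
        return "red"
    # Yellow: all primaries are allocated, but some replica is not.
    if any(not assigned for _, assigned in shards):
        return "yellow"
    # Green: every primary and replica is allocated.
    return "green"


if __name__ == "__main__":
    all_allocated = [(True, True), (False, True)]
    replica_missing = [(True, True), (False, False)]
    primary_missing = [(True, False), (False, False)]
    for shards in (all_allocated, replica_missing, primary_missing):
        print(cluster_status(shards))
```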

The green/yellow/red status is a great way to glance at your cluster and understand what’s going on. The rest of the metrics give you a general summary of your cluster:

  • number_of_nodes and number_of_data_nodes are fairly self-descriptive.
  • active_primary_shards indicates the number of primary shards in your cluster. This is an aggregate total across all indices.
  • active_shards is an aggregate total of all shards across all indices, which includes replica shards.
  • relocating_shards shows the number of shards that are currently moving from one node to another node. This number is often zero, but can increase when Elasticsearch decides a cluster is not properly balanced, a new node is added, or a node is taken down, for example. (Seen in production: ES was restarted improperly with kill -9, which took a node offline and caused the cluster to reallocate shards.)
  • initializing_shards is a count of shards that are being freshly created. For example, when you first create an index, the shards will all briefly reside in initializing state. This is typically a transient event, and shards shouldn’t linger in initializing too long. You may also see initializing shards when a node is first restarted: as shards are loaded from disk, they start as initializing. (Also seen in production.)
  • unassigned_shards are shards that exist in the cluster state, but cannot be found in the cluster itself. A common source of unassigned shards is unassigned replicas. For example, an index with five shards and one replica will have five unassigned replicas in a single-node cluster. Unassigned shards will also be present if your cluster is red (since primaries are missing).
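
The single-node example in the last bullet can be checked with a little arithmetic. The sketch below assumes the only constraint is that a replica is never placed on the same node as another copy of its shard. It deliberately ignores disk watermarks, allocation filtering, and awareness settings, so treat it as an estimate. The function name `expected_unassigned_replicas` is hypothetical.

```python
def expected_unassigned_replicas(primaries, replicas, data_nodes):
    """Estimate unassigned replica shards for one index.

    Assumes at least one data node, so every primary can be allocated,
    and that each node holds at most one copy of a given shard.
    """
    # Besides the node holding the primary, data_nodes - 1 nodes remain
    # available for replicas of the same shard.
    placeable = min(replicas, max(data_nodes - 1, 0))
    return primaries * (replicas - placeable)


if __name__ == "__main__":
    # Five shards, one replica, single-node cluster (the case in the text).
    print(expected_unassigned_replicas(5, 1, 1))
    # Adding a second data node lets every replica be placed.
    print(expected_unassigned_replicas(5, 1, 2))
```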

Drilling Deeper: Finding Problematic Indices

Imagine something goes wrong one day, and you notice that your cluster health looks like this:

{
   "cluster_name": "elasticsearch_zach",
   "status": "red",
   "timed_out": false,
   "number_of_nodes": 8,
   "number_of_data_nodes": 8,
   "active_primary_shards": 90,
   "active_shards": 180,
   "relocating_shards": 0,
   "initializing_shards": 0,
   "unassigned_shards": 20
}

OK, so what can we deduce from this health status? Well, our cluster is red, which means we are missing data (primary + replicas). We know our cluster has 10 nodes, but see only 8 data nodes listed in the health. Two of our nodes have gone missing. We see that there are 20 unassigned shards.
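
The deductions above can be reproduced mechanically. This is a minimal sketch over the health response shown earlier. The function name `summarize_health` and the `expected_data_nodes` parameter are assumptions, and the total is computed as active + initializing + unassigned shards, which matches this example but should be treated as an approximation.

```python
HEALTH = {
    "cluster_name": "elasticsearch_zach",
    "status": "red",
    "timed_out": False,
    "number_of_nodes": 8,
    "number_of_data_nodes": 8,
    "active_primary_shards": 90,
    "active_shards": 180,
    "relocating_shards": 0,
    "initializing_shards": 0,
    "unassigned_shards": 20,
}


def summarize_health(health, expected_data_nodes):
    """Summarize what a cluster-health response tells us at a glance."""
    total = (
        health["active_shards"]
        + health["initializing_shards"]
        + health["unassigned_shards"]
    )
    unassigned = health["unassigned_shards"]
    return {
        "status": health["status"],
        # Nodes we expected to see but that are absent from the response.
        "missing_data_nodes": expected_data_nodes - health["number_of_data_nodes"],
        "total_shards": total,
        "unassigned_ratio": unassigned / total if total else 0.0,
    }


if __name__ == "__main__":
    # The text says the cluster normally has 10 nodes.
    print(summarize_health(HEALTH, expected_data_nodes=10))
```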

That’s about all the information we can glean. The nature of those missing shards is still a mystery. Are we missing 20 indices with 1 primary shard each? Or 1 index with 20 primary shards? Or 10 indices with 1 primary + 1 replica? Which index?

To answer these questions, we need to ask cluster-health for a little more information by using the level parameter:

GET _cluster/health?level=indices

This parameter will make the cluster-health API add a list of indices in our cluster and details about each of those indices (status, number of shards, unassigned shards, and so forth):

{
   "cluster_name": "elasticsearch_zach",
   "status": "red",
   "timed_out": false,
   "number_of_nodes": 8,
   "number_of_data_nodes": 8,
   "active_primary_shards": 90,
   "active_shards": 180,
   "relocating_shards": 0,
   "initializing_shards": 0,
   "unassigned_shards": 20,
   "indices": {
      "v1": {
         "status": "green",
         "number_of_shards": 10,
         "number_of_replicas": 1,
         "active_primary_shards": 10,
         "active_shards": 20,
         "relocating_shards": 0,
         "initializing_shards": 0,
         "unassigned_shards": 0
      },
      "v2": {
         "status": "red", 
         "number_of_shards": 10,
         "number_of_replicas": 1,
         "active_primary_shards": 0,
         "active_shards": 0,
         "relocating_shards": 0,
         "initializing_shards": 0,
         "unassigned_shards": 20 
      },
      "v3": {
         "status": "green",
         "number_of_shards": 10,
         "number_of_replicas": 1,
         "active_primary_shards": 10,
         "active_shards": 20,
         "relocating_shards": 0,
         "initializing_shards": 0,
         "unassigned_shards": 0
      },
      ....
   }
}

We can now see that the v2 index is the index that has made the cluster red.

And it becomes clear that all 20 missing shards are from this index.

Once we ask for the indices output, it becomes immediately clear which index is having problems: the v2 index. We also see that the index has 10 primary shards and one replica, and that all 20 shards are missing. Presumably these 20 shards were on the two nodes that are missing from our cluster.
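
Scanning a long indices list by eye is error-prone, so the filtering can be scripted. The sketch below is one possible approach, not part of the original guide. The names `fetch_health_by_index` and `problem_indices` are hypothetical, `SAMPLE` is a trimmed copy of the response above, and the fetch helper assumes an unsecured cluster at `localhost:9200`.

```python
import json
import urllib.request

SAMPLE = {
    "status": "red",
    "indices": {
        "v1": {"status": "green", "number_of_shards": 10, "unassigned_shards": 0},
        "v2": {"status": "red", "number_of_shards": 10, "unassigned_shards": 20},
        "v3": {"status": "green", "number_of_shards": 10, "unassigned_shards": 0},
    },
}


def fetch_health_by_index(base_url="http://localhost:9200"):
    """GET _cluster/health?level=indices (assumption: no authentication)."""
    url = base_url + "/_cluster/health?level=indices"
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def problem_indices(health):
    """Return (name, status, unassigned_shards) for every non-green index."""
    report = [
        (name, info["status"], info["unassigned_shards"])
        for name, info in health.get("indices", {}).items()
        if info["status"] != "green"
    ]
    # Red before yellow, then the most unassigned shards first.
    severity = {"red": 0, "yellow": 1}
    report.sort(key=lambda row: (severity.get(row[1], 2), -row[2], row[0]))
    return report


if __name__ == "__main__":
    # Uses the embedded sample; swap in fetch_health_by_index() for a live cluster.
    for name, status, unassigned in problem_indices(SAMPLE):
        print(name, status, unassigned)
```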

Source: https://www.elastic.co/guide/en/elasticsearch/guide/current/_cluster_health.html

This article is reposted from 张昺华-sky's cnblogs blog. Original link: http://www.cnblogs.com/bonelee/p/7458647.html. To republish it, please contact the original author.


