Recovering unassigned shards on elasticsearch 2.x: you can set number_of_replicas to 0 and then set it back


Source: https://z0z0.me/recovering-unassigned-shards-on-elasticsearch/

I came across this problem when I decided to add a node to the elasticsearch cluster and that node was not able to replicate the indexes of the cluster. This issue usually happens when there is not enough disk space available, when the master is unavailable, or when nodes run different elasticsearch versions. My servers had more than enough disk space and the master was available, so with the help of the elasticsearch discuss forum I found out that the new node was running a different version than the old nodes. While installing on Debian jessie I had simply run apt-get install elasticsearch, which ended up installing the latest available version. To install a specific version of elasticsearch you need to append ={version}:

#apt-get install elasticsearch={version}
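
To confirm that every node now runs the same version before going further, the _cat/nodes API can print each node's version (a quick sanity check; it assumes the cluster answers on localhost:9200, as in the examples below):

#curl 'http://localhost:9200/_cat/nodes?v&h=name,ip,version'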

Now that I had identified the reason for the unallocated shards and downgraded elasticsearch to the required version with the command above, I started the node again, but the cluster was still in red state with unassigned shards all over the place:

#curl http://localhost:9200/_cluster/health?pretty
 {
   "cluster_name" : "z0z0",
   "status" : "red",
   "timed_out" : false,
   "number_of_nodes" : 3,
   "number_of_data_nodes" : 3,
   "active_primary_shards" : 6,
   "active_shards" : 12,
   "relocating_shards" : 0,
   "initializing_shards" : 0,
   "unassigned_shards" : 8,
   "delayed_unassigned_shards" : 0,
   "number_of_pending_tasks" : 0,
   "number_of_in_flight_fetch" : 0,
   "task_max_waiting_in_queue_millis" : 0,
   "active_shards_percent_as_number" : 60.0
 }

#curl http://localhost:9200/_cat/shards
site-id      4 p UNASSIGNED                                                 
site-id      4 r UNASSIGNED                                                 
site-id      1 p UNASSIGNED                                                 
site-id      1 r UNASSIGNED                                                 
site-id      3 p STARTED    0 159b 10.0.0.6 node-2 
site-id      3 r STARTED    0 159b 10.0.0.7 node-3 
site-id      2 r STARTED    0 159b 10.0.0.6 node-2 
site-id      2 p STARTED    0 159b 10.0.0.7 node-3 
site-id      0 r STARTED    0 159b 10.0.0.6 node-2 
site-id      0 p STARTED    0 159b 10.0.0.7 node-3 
subscription 4 p UNASSIGNED                                                 
subscription 4 r UNASSIGNED                                                 
subscription 1 p UNASSIGNED                                                 
subscription 1 r UNASSIGNED                                                 
subscription 3 p STARTED    0 159b 10.0.0.6 node-2 
subscription 3 r STARTED    0 159b 10.0.0.7 node-3 
subscription 2 r STARTED    0 159b 10.0.0.6 node-2 
subscription 2 p STARTED    0 159b 10.0.0.7 node-3 
subscription 0 p STARTED    0 159b 10.0.0.6 node-2 
subscription 0 r STARTED    0 159b 10.0.0.7 node-3
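
With long listings it also helps to ask _cat/shards why a shard is unassigned; on 2.x the unassigned.reason column is available (a minimal variant of the same check, filtered down to the problem shards):

#curl 'http://localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason' | grep UNASSIGNED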

At this point I was pretty desperate, and whatever I tried either did nothing or ended up in all kinds of failures. So I set number_of_replicas to 0 by running the following query:

#curl -XPUT http://localhost:9200/_settings?pretty -d '
{
  "index" : {
    "number_of_replicas" : 0
  }
}'
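
With the replicas gone, only the primary copies should remain; listing the indices confirms that the new replica count took effect (an optional check):

#curl 'http://localhost:9200/_cat/indices?v&h=index,pri,rep,health'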

Then I stopped the nodes one by one until only one live node remained.
At this point I decided to try rerouting the unassigned shards, and if that did not work I would just rebuild the cluster. Keep in mind that allocating a primary with allow_primary set to true can bring up an empty primary, losing whatever data was in that shard, so this really is a last resort. I ran the following:

#curl -XPOST -d '
{
  "commands" : [ {
    "allocate" : {
      "index" : "site-id",
      "shard" : 1,
      "node" : "node-3",
      "allow_primary" : true
    }
  } ]
}' http://localhost:9200/_cluster/reroute?pretty

I saw the rerouted shard become initialized and then started, so I ran the same command for the rest of the unassigned shards (this can be scripted, as sketched below).
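
Issuing the allocate command by hand for every shard gets tedious. A small loop over the UNASSIGNED lines from _cat/shards can generate the same reroute calls (a sketch only; it assumes a single surviving node named node-3, as in the example above, and carries the same data-loss caveat for allow_primary):

#for line in $(curl -s http://localhost:9200/_cat/shards | grep UNASSIGNED | awk '{print $1":"$2}'); do
   index=${line%:*}
   shard=${line#*:}
   curl -XPOST http://localhost:9200/_cluster/reroute?pretty -d '{
     "commands" : [ { "allocate" : {
       "index" : "'"$index"'",
       "shard" : '"$shard"',
       "node" : "node-3",
       "allow_primary" : true
     } } ]
   }'
 done
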
Running curl http://localhost:9200/_cluster/health?pretty confirmed that I was on the right track to fixing the cluster.

#curl http://localhost:9200/_cluster/health?pretty
{
  "cluster_name" : "z0z0",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 10,
  "active_shards" : 20,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

So the cluster was green again but was running on only one node, so it was time to bring up the other nodes one by one. When all the nodes were up I set number_of_replicas back to 1 by running the following:

#curl -XPUT http://localhost:9200/_settings -d '
{
  "index" : {
    "number_of_replicas" : 1
  }
}'
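
Raising the replica count makes the cluster copy every primary shard to another node, so it goes yellow while the replicas initialize; the health API can block until it is green again (an optional check using the wait_for_status parameter):

#curl 'http://localhost:9200/_cluster/health?wait_for_status=green&timeout=60s&pretty'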

So my elasticsearch cluster is back to running on 3 nodes and still in green state. After a lot of googling and wasted time I decided to write this article, so that anyone who comes across this issue has a working example of how to fix it.

This article is reposted from 张昺华-sky's cnblogs blog. Original link: http://www.cnblogs.com/bonelee/p/7459391.html. Please contact the original author if you wish to repost it.

