How to handle unassigned shards in Elasticsearch


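Solution: (1) If the cluster is red, you can directly assign the shard to the node that you believe holds the most recent (or the most) data, as follows: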

 

Excerpted from: https://discuss.elastic.co/t/how-to-resolve-the-unassigned-shards/87635

Use the reroute command to assign the unassigned shard to a node.

If your cluster is red then you probably have unassigned primary shards. The command below lets you reassign a shard that has gone "stale" (that is, out of date). This means that ES is not sure which copy of the shard has the most recent data, and it will not assign one as primary, because if another copy with newer data joins the cluster later, that newer data will be overwritten. If you're confident that the shard has all of the data you need, you can assign it to a node with the command below. Just be wary of data loss.

curl -XPOST 'localhost:9200/_cluster/reroute?pretty' -H 'Content-Type: application/json' -d'
{
  "commands" : [
    {
      "allocate_stale_primary" : {
        "index" : "test",
        "shard" : 1,
        "node" : "node3",
        "accept_data_loss" : true
      }
    }
  ]
}
'
https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-reroute.html
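
The same request body can also be built programmatically, which helps when several shards need the same treatment. The following is a minimal Python sketch, assuming the same index `test`, shard `1`, and node `node3` as in the command above. `build_stale_primary_reroute` is a hypothetical helper name, and it only constructs the JSON payload; it does not send it to the cluster.

```python
import json


def build_stale_primary_reroute(index, shard, node):
    """Build the body for POST /_cluster/reroute with one allocate_stale_primary command.

    Hypothetical helper: it only constructs the payload; sending it is left to the caller.
    """
    return {
        "commands": [
            {
                "allocate_stale_primary": {
                    "index": index,
                    "shard": shard,
                    "node": node,
                    # Explicit acknowledgement that the chosen copy may lack recent writes.
                    "accept_data_loss": True,
                }
            }
        ]
    }


body = build_stale_primary_reroute("test", 1, "node3")
print(json.dumps(body, indent=2))
```

Keeping `accept_data_loss` hard-coded inside the helper is a deliberate choice in this sketch: the caller has to pick this function on purpose, so the acknowledgement cannot be left out by accident.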

 

(2) If the cluster is yellow, you can wait for it to recover.

The status may be one of three values:

green
All primary and replica shards are allocated. Your cluster is 100% operational.
yellow
All primary shards are allocated, but at least one replica is missing. No data is missing, so search results will still be complete. However, your high availability is compromised to some degree. If more shards disappear, you might lose data. Think of yellow as a warning that should prompt investigation.
red
At least one primary shard (and all of its replicas) is missing. This means that you are missing data: searches will return partial results, and indexing into that shard will return an exception.
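
The three definitions above can be restated as a small decision rule. The sketch below is illustrative only and is not how Elasticsearch computes the status internally. It assumes each shard copy is described by a dict with two hypothetical keys, `primary` and `assigned`.

```python
def cluster_status(shards):
    """Derive green/yellow/red from a list of shard copies.

    Each entry is a dict with hypothetical keys:
      "primary":  True for a primary copy, False for a replica
      "assigned": True if the copy is allocated to a node
    """
    if any(s["primary"] and not s["assigned"] for s in shards):
        return "red"     # at least one primary is missing
    if any(not s["assigned"] for s in shards):
        return "yellow"  # all primaries allocated, but some replica is missing
    return "green"       # every primary and replica is allocated


all_allocated = [
    {"primary": True, "assigned": True},
    {"primary": False, "assigned": True},
]
replica_missing = [
    {"primary": True, "assigned": True},
    {"primary": False, "assigned": False},
]
primary_missing = [
    {"primary": True, "assigned": False},
    {"primary": False, "assigned": False},
]

for shards in (all_allocated, replica_missing, primary_missing):
    print(cluster_status(shards))
```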

The green/yellow/red status is a great way to glance at your cluster and understand what’s going on. The rest of the metrics give you a general summary of your cluster:

  • number_of_nodes and number_of_data_nodes are fairly self-descriptive.
  • active_primary_shards indicates the number of primary shards in your cluster. This is an aggregate total across all indices.
  • active_shards is an aggregate total of all shards across all indices, which includes replica shards.
  • relocating_shards shows the number of shards that are currently moving from one node to another node (seen in production: restarting ES the wrong way with kill -9 took a node offline, and the cluster reallocated its shards). This number is often zero, but can increase when Elasticsearch decides a cluster is not properly balanced, a new node is added, or a node is taken down, for example (these are the causes).
  • initializing_shards is a count of shards that are being freshly created. For example, when you first create an index, the shards will all briefly reside in initializing state. This is typically a transient event, and shards shouldn’t linger in initializing too long. You may also see initializing shards when a node is first restarted: as shards are loaded from disk, they start as initializing. (Also seen in production.)
  • unassigned_shards are shards that exist in the cluster state, but cannot be found in the cluster itself. A common source of unassigned shards is unassigned replicas. For example, an index with five shards and one replica will have five unassigned replicas in a single-node cluster. Unassigned shards will also be present if your cluster is red (since primaries are missing).
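
The single-node arithmetic in the last bullet can be checked with a short calculation. This is a simplified sketch: it assumes the only allocation constraint is that two copies of the same shard never share a node, and it ignores disk watermarks, allocation filtering, and similar settings. `expected_unassigned` is a hypothetical helper name.

```python
def expected_unassigned(primaries, replicas, data_nodes):
    """Estimate unassigned shard copies for one index.

    Simplifying assumption: a node holds at most one copy of a given shard,
    and nothing else prevents allocation.
    """
    copies_per_shard = 1 + replicas                  # the primary plus its replicas
    assignable = min(copies_per_shard, data_nodes)   # one copy per node at most
    return primaries * (copies_per_shard - assignable)


# Five primaries with one replica each, on a single data node.
print(expected_unassigned(primaries=5, replicas=1, data_nodes=1))
# The same index once a second data node is available.
print(expected_unassigned(primaries=5, replicas=1, data_nodes=2))
```

With one data node the estimate is five unassigned replicas, which matches the example in the bullet; with two data nodes it drops to zero.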

Drilling Deeper: Finding Problematic Indices

Imagine something goes wrong one day, and you notice that your cluster health looks like this:

{
   "cluster_name": "elasticsearch_zach",
   "status": "red",
   "timed_out": false,
   "number_of_nodes": 8,
   "number_of_data_nodes": 8,
   "active_primary_shards": 90,
   "active_shards": 180,
   "relocating_shards": 0,
   "initializing_shards": 0,
   "unassigned_shards": 20
}

OK, so what can we deduce from this health status? Well, our cluster is red, which means we are missing data (primary + replicas). We know our cluster has 10 nodes, but see only 8 data nodes listed in the health. Two of our nodes have gone missing. We see that there are 20 unassigned shards.
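
The deductions above can be written down as a small check. The sketch below is an illustration rather than part of any Elasticsearch client: `summarize_health` is a hypothetical helper, and `expected_nodes` is a value the operator must supply, since the health response does not say how many nodes the cluster is supposed to have.

```python
def summarize_health(health, expected_nodes):
    """Summarize a cluster-health response against a known node count.

    expected_nodes is operator knowledge (10 in the example above), not
    something the health API reports.
    """
    return {
        "status": health["status"],
        "missing_nodes": expected_nodes - health["number_of_nodes"],
        "unassigned_shards": health["unassigned_shards"],
        # Red means at least one primary, together with its replicas, is missing.
        "primaries_missing": health["status"] == "red",
    }


health = {
    "cluster_name": "elasticsearch_zach",
    "status": "red",
    "timed_out": False,
    "number_of_nodes": 8,
    "number_of_data_nodes": 8,
    "active_primary_shards": 90,
    "active_shards": 180,
    "relocating_shards": 0,
    "initializing_shards": 0,
    "unassigned_shards": 20,
}

summary = summarize_health(health, expected_nodes=10)
print(summary)
```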

That’s about all the information we can glean. The nature of those missing shards is still a mystery. Are we missing 20 indices with 1 primary shard each? Or 1 index with 20 primary shards? Or 10 indices with 1 primary + 1 replica? Which index?

To answer these questions, we need to ask cluster-health for a little more information by using the level parameter:

GET _cluster/health?level=indices

This parameter will make the cluster-health API add a list of indices in our cluster and details about each of those indices (status, number of shards, unassigned shards, and so forth):

{
   "cluster_name": "elasticsearch_zach",
   "status": "red",
   "timed_out": false,
   "number_of_nodes": 8,
   "number_of_data_nodes": 8,
   "active_primary_shards": 90,
   "active_shards": 180,
   "relocating_shards": 0,
   "initializing_shards": 0,
   "unassigned_shards": 20,
   "indices": {
      "v1": {
         "status": "green",
         "number_of_shards": 10,
         "number_of_replicas": 1,
         "active_primary_shards": 10,
         "active_shards": 20,
         "relocating_shards": 0,
         "initializing_shards": 0,
         "unassigned_shards": 0
      },
      "v2": {
         "status": "red", 
         "number_of_shards": 10,
         "number_of_replicas": 1,
         "active_primary_shards": 0,
         "active_shards": 0,
         "relocating_shards": 0,
         "initializing_shards": 0,
         "unassigned_shards": 20 
      },
      "v3": {
         "status": "green",
         "number_of_shards": 10,
         "number_of_replicas": 1,
         "active_primary_shards": 10,
         "active_shards": 20,
         "relocating_shards": 0,
         "initializing_shards": 0,
         "unassigned_shards": 0
      },
      ....
   }
}

We can now see that the v2 index is the index that has made the cluster red.

And it becomes clear that all 20 missing shards are from this index.

Once we ask for the indices output, it becomes immediately clear which index is having problems: the v2 index. We also see that the index has 10 primary shards and one replica, and that all 20 shards are missing. Presumably these 20 shards were on the two nodes that are missing from our cluster.
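
Scanning the per-index output by eye becomes tedious on a cluster with many indices. The following Python sketch filters a `level=indices` response down to the indices that are not green. `problem_indices` and `fetch_indices_health` are hypothetical helper names, the `localhost:9200` address is taken from the curl example earlier in this article, and the sample data is a trimmed copy of the response shown above.

```python
import json
import urllib.request


def fetch_indices_health(base_url="http://localhost:9200"):
    """Fetch cluster health with per-index detail (assumes an unsecured local node)."""
    with urllib.request.urlopen(base_url + "/_cluster/health?level=indices") as resp:
        return json.load(resp)


def problem_indices(health):
    """Return {index_name: (status, unassigned_shards)} for every non-green index."""
    return {
        name: (info["status"], info["unassigned_shards"])
        for name, info in health.get("indices", {}).items()
        if info["status"] != "green"
    }


# Trimmed sample of the response above; a live cluster would use fetch_indices_health().
sample = {
    "status": "red",
    "unassigned_shards": 20,
    "indices": {
        "v1": {"status": "green", "unassigned_shards": 0},
        "v2": {"status": "red", "unassigned_shards": 20},
        "v3": {"status": "green", "unassigned_shards": 0},
    },
}

bad = problem_indices(sample)
print(bad)
```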

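Excerpted from: https://www.elastic.co/guide/en/elasticsearch/guide/current/_cluster_health.html

This article is reposted from 张昺华-sky's cnblogs (博客园) blog. Original link: http://www.cnblogs.com/bonelee/p/7458647.html. To republish it, please contact the original author.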


