Learning Elasticsearch Together (Part 9)


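Preface

I'm currently writing a tutorial series on Elasticsearch (ES). It will run to quite a few parts, so if you like it, please follow ❤️

This part covers how to run aggregations in ES. There is a fair amount of material, so take your time reading it.

The article leans toward hands-on practice. Enough preamble, let's get started.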

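What is an aggregation

Aggregations are similar to GROUP BY and SUM(...) in MySQL, which may already ring a bell, but aggregations in ES are considerably more powerful.

Before looking at aggregations themselves, there are two concepts to understand: buckets and metrics. An aggregation is a combination of one or more buckets and zero or more metrics.

Aggregations have many applications. Back-office reports, for example, usually need many complex statistics over large volumes of data. Computing them purely in the database can be very slow and may even bring the database down, whereas ES handles this kind of work more easily.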

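Buckets

A bucket is a set of documents stored in ES that satisfy a particular condition.

When an aggregation runs, the values in each document are evaluated to decide which bucket's criteria the document matches. If there is a match, the document is placed in that bucket and the aggregation continues. Buckets can also be nested inside other buckets, giving a hierarchical or conditional partitioning scheme.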

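Metrics

A metric is a calculation over the documents in a bucket, such as the average, maximum, or minimum of a field's values. The rest of this article shows how to run aggregations in practice.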
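
To make the two concepts concrete, here is a minimal pure-Python sketch of bucketing followed by a metric. It does not call Elasticsearch; the documents and the helper names bucket_by and avg_metric are made up for illustration only.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical documents, invented for this illustration only.
docs = [
    {"method": "GET", "times": 80},
    {"method": "POST", "times": 300},
    {"method": "GET", "times": 120},
]


def bucket_by(documents, field):
    """Bucketing: group documents that share the same value of `field`."""
    buckets = defaultdict(list)
    for doc in documents:
        buckets[doc[field]].append(doc)
    return dict(buckets)


def avg_metric(bucket, field):
    """Metric: reduce the documents in one bucket to a single number."""
    return mean(doc[field] for doc in bucket)


buckets = bucket_by(docs, "method")
result = {
    key: {"doc_count": len(bucket), "avg_times": avg_metric(bucket, "times")}
    for key, bucket in buckets.items()
}
print(result)
```

Each key of result plays the role of one bucket, and avg_times is the metric computed inside it.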

Aggregation syntax

The general syntax is:

GET index_name/_search
{
"aggs": {
 "NAME": {
   "AGG_TYPE": {}
 }
}
}

 

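  • index_name: the name of the index
  • aggs: the keyword that introduces the aggregation definitions
  • NAME: a name of your choosing, used as the key for this aggregation's result in the response
  • AGG_TYPE: the aggregation type

Note that NAME and AGG_TYPE are placeholders to be replaced with real values. Here are some of the available aggregation types: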

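  • terms: buckets documents by the distinct values of a field, one bucket per value, for further processing
  • histogram: buckets numeric values into fixed-width intervals defined by a step (interval)
  • date_histogram: a histogram over dates, bucketed by a given time interval
  • cardinality: approximate distinct count; the result carries a small margin of error
  • percentiles: returns the field values at the given percentiles
  • percentile_ranks: returns the percentile rank of the given values
  • filter: filters the documents that go into the aggregation without affecting the search hits
  • post_filter: filters the search hits without affecting the aggregation results (a top-level search parameter rather than an aggregation type)
  • avg: average
  • sum: sum
  • min: minimum
  • max: maximum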
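
As a rough illustration of how the histogram type assigns values to fixed-width buckets, the sketch below applies the rounding rule described in the Elasticsearch histogram documentation (floor the value to a multiple of the interval). It is plain Python, not client code; the function name histogram_key and the input values are hypothetical, and the sketch only lists non-empty buckets.

```python
import math
from collections import Counter


def histogram_key(value, interval, offset=0):
    """Return the lower bound of the fixed-width bucket that `value` falls into."""
    return math.floor((value - offset) / interval) * interval + offset


# Hypothetical elapsed times, invented for this illustration.
values = [20, 80, 89, 120, 400]

# Count how many values land in each bucket of width 100.
counts = Counter(histogram_key(v, 100) for v in values)
buckets = [{"key": key, "doc_count": counts[key]} for key in sorted(counts)]
print(buckets)
```

With an interval of 100, the values 20, 80 and 89 share the bucket keyed 0, while 120 and 400 fall into the buckets keyed 100 and 400.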

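Let's look at an example.

For this part we create a new index: a simple request-log index with fields for the request method, path, elapsed time, and log creation time.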

PUT req_log
{
"mappings": {
 "properties" : {
   "method" : {
     "type" : "keyword"
   },
   "path" : {
     "type" : "keyword"
   },
   "times" : {
     "type" : "long"
   },
   "created" : {
     "type" : "date"
   }
 }
}
}

Next, load some data into it:

POST /req_log/_bulk
{ "index": {}}
{ "times" : 80, "method" : "GET", "path" : "/api/post/1", "created" : "2023-02-09" }
{ "index": {}}
{ "times" : 30, "method" : "GET", "path" : "/api/post/2", "created" : "2023-02-07" }
{ "index": {}}
{ "times" : 20, "method" : "GET", "path" : "/api/post/3", "created" : "2023-02-08" }
{ "index": {}}
{ "times" : 120, "method" : "GET", "path" : "/api/post/20", "created" : "2023-02-06" }
{ "index": {}}
{ "times" : 150, "method" : "GET", "path" : "/api/post/1", "created" : "2023-02-05" }
{ "index": {}}
{ "times" : 80, "method" : "GET", "path" : "/api/post/3", "created" : "2023-02-04" }
{ "index": {}}
{ "times" : 960, "method" : "GET", "path" : "/api/post/6", "created" : "2023-02-03" }
{ "index": {}}
{ "times" : 9000, "method" : "GET", "path" : "/api/post/8", "created" : "2023-02-02" }
{ "index": {}}
{ "times" : 1300, "method" : "GET", "path" : "/api/post/6", "created" : "2023-02-01" }
{ "index": {}}
{ "times" : 400, "method" : "GET", "path" : "/api/post/4", "created" : "2023-02-10" }
{ "index": {}}
{ "times" : 89, "method" : "GET", "path" : "/api/post/3", "created" : "2023-02-11" }
{ "index": {}}
{ "times" : 380, "method" : "GET", "path" : "/api/post/2", "created" : "2023-02-12" }
{ "index": {}}
{ "times" : 270, "method" : "GET", "path" : "/api/post/10", "created" : "2023-02-13" }
{ "index": {}}
{ "times" : 630, "method" : "GET", "path" : "/api/post/12", "created" : "2023-02-14" }
{ "index": {}}
{ "times" : 210 , "method" : "GET", "path" : "/api/post/4", "created" : "2023-02-15" }
{ "index": {}}
{ "times" : 900, "method" : "GET", "path" : "/api/post/6", "created" : "2023-02-16" }
{ "index": {}}
{ "times" : 870, "method" : "GET", "path" : "/api/post/7", "created" : "2023-02-17" }

Count the number of requests for each request path (path):

GET req_log/_search
{
"aggs": {
  "counts": {
    "terms": {
      "field": "path"
    }
  }
}
}

Response:

{
  "took" : 995,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 17,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "GUK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 80,
          "method" : "GET",
          "path" : "/api/post/1",
          "created" : "2023-02-09"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "GkK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 30,
          "method" : "GET",
          "path" : "/api/post/2",
          "created" : "2023-02-07"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "G0K3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 20,
          "method" : "GET",
          "path" : "/api/post/3",
          "created" : "2023-02-08"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "HEK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 120,
          "method" : "GET",
          "path" : "/api/post/20",
          "created" : "2023-02-06"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "HUK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 150,
          "method" : "GET",
          "path" : "/api/post/1",
          "created" : "2023-02-05"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "HkK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 80,
          "method" : "GET",
          "path" : "/api/post/3",
          "created" : "2023-02-04"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "H0K3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 960,
          "method" : "GET",
          "path" : "/api/post/6",
          "created" : "2023-02-03"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "IEK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 9000,
          "method" : "GET",
          "path" : "/api/post/8",
          "created" : "2023-02-02"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "IUK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 1300,
          "method" : "GET",
          "path" : "/api/post/6",
          "created" : "2023-02-01"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "IkK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 400,
          "method" : "GET",
          "path" : "/api/post/4",
          "created" : "2023-02-10"
        }
      }
    ]
  },
  "aggregations" : {
    "counts" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "/api/post/3",
          "doc_count" : 3
        },
        {
          "key" : "/api/post/6",
          "doc_count" : 3
        },
        {
          "key" : "/api/post/1",
          "doc_count" : 2
        },
        {
          "key" : "/api/post/2",
          "doc_count" : 2
        },
        {
          "key" : "/api/post/4",
          "doc_count" : 2
        },
        {
          "key" : "/api/post/10",
          "doc_count" : 1
        },
        {
          "key" : "/api/post/12",
          "doc_count" : 1
        },
        {
          "key" : "/api/post/20",
          "doc_count" : 1
        },
        {
          "key" : "/api/post/7",
          "doc_count" : 1
        },
        {
          "key" : "/api/post/8",
          "doc_count" : 1
        }
      ]
    }
  }
}

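The response contains the custom result name counts. The aggregation type is terms and the aggregated field is path, so documents are bucketed by path.

The result makes this clear: buckets contains 10 buckets, where key is the value of the aggregated field and doc_count is the number of documents in that bucket.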
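
The same counts can be reproduced outside Elasticsearch. The sketch below is plain Python over the path values of the 17 sample documents; breaking ties by key in ascending order is an assumption chosen to match the response shown above.

```python
from collections import Counter

# The `path` values of the 17 sample documents, in indexing order.
PATHS = [
    "/api/post/1", "/api/post/2", "/api/post/3", "/api/post/20",
    "/api/post/1", "/api/post/3", "/api/post/6", "/api/post/8",
    "/api/post/6", "/api/post/4", "/api/post/3", "/api/post/2",
    "/api/post/10", "/api/post/12", "/api/post/4", "/api/post/6",
    "/api/post/7",
]

counts = Counter(PATHS)

# Sort by doc_count descending; ties are broken by key ascending
# (an assumption chosen to match the response shown above).
buckets = [
    {"key": key, "doc_count": count}
    for key, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
]

for bucket in buckets:
    print(bucket)
```

Note that the keys are compared as strings, which is why /api/post/10 sorts before /api/post/7.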

terms also accepts the following options:

GET req_log/_search
{
"aggs": {
 "counts": {
   "terms": {
     "field": "path",
     "size": 10,
     "collect_mode": "depth_first",
     "order": {
       "_count": "desc"
     }
   }
 }
}
}

Response:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  ......,
  "aggregations" : {
    "counts" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "/api/post/3",
          "doc_count" : 3
        },
        {
          "key" : "/api/post/6",
          "doc_count" : 3
        },
        {
          "key" : "/api/post/1",
          "doc_count" : 2
        },
        {
          "key" : "/api/post/2",
          "doc_count" : 2
        },
        {
          "key" : "/api/post/4",
          "doc_count" : 2
        },
        {
          "key" : "/api/post/10",
          "doc_count" : 1
        },
        {
          "key" : "/api/post/12",
          "doc_count" : 1
        },
        {
          "key" : "/api/post/20",
          "doc_count" : 1
        },
        {
          "key" : "/api/post/7",
          "doc_count" : 1
        },
        {
          "key" : "/api/post/8",
          "doc_count" : 1
        }
      ]
    }
  }
}
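
  • size: the number of buckets to return, usually combined with an ordering; the default is 10
  • collect_mode: the collection mode, either depth-first (depth_first) or breadth-first (breadth_first). With depth-first collection, the full tree of buckets and sub-aggregations is built before the top buckets are selected, so high-cardinality fields (multi-valued array fields, for example) combined with sub-aggregations can use a lot of memory
  • order: the sort order, by default doc_count descending; buckets can be ordered by a built-in key such as _count or _key, or by a sub-aggregation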
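
To show how size and order shape the returned buckets, here is a small pure-Python model of a single-shard terms aggregation over the sample path values. The function terms_agg is hypothetical and only mimics the response fields; collect_mode is not modeled, since it affects how buckets are collected rather than which ones are returned.

```python
from collections import Counter

# The `path` values of the 17 sample documents, in indexing order.
PATHS = [
    "/api/post/1", "/api/post/2", "/api/post/3", "/api/post/20",
    "/api/post/1", "/api/post/3", "/api/post/6", "/api/post/8",
    "/api/post/6", "/api/post/4", "/api/post/3", "/api/post/2",
    "/api/post/10", "/api/post/12", "/api/post/4", "/api/post/6",
    "/api/post/7",
]


def terms_agg(values, size=10, order=("_count", "desc")):
    """Toy single-shard terms aggregation: rank the buckets, then keep `size`."""
    counts = Counter(values)
    field, direction = order
    reverse = direction == "desc"
    # Start from key-ascending order so that ties on count stay sorted by key
    # (Python's sort is stable, also with reverse=True).
    ranked = sorted(counts.items(), key=lambda kv: kv[0])
    if field == "_count":
        ranked.sort(key=lambda kv: kv[1], reverse=reverse)
    elif reverse:  # ordering by "_key" descending
        ranked.reverse()
    top = ranked[:size]
    return {
        # Documents that belong to buckets cut off by `size`.
        "sum_other_doc_count": sum(counts.values()) - sum(c for _, c in top),
        "buckets": [{"key": k, "doc_count": c} for k, c in top],
    }


top3 = terms_agg(PATHS, size=3)
by_key = terms_agg(PATHS, size=2, order=("_key", "asc"))
print(top3)
print(by_key)
```

With size set to 3, only the three largest buckets come back and the documents in the remaining buckets are summed into sum_other_doc_count.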

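Nested aggregations

ES supports nested aggregations out of the box, so a bucket can be partitioned into further buckets. Nesting comes in two forms, sibling-level and child-level. Here is an example:

Sibling-level nesting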

GET req_log/_search
{
"aggs": {
 "path_count": {
   "terms": {
     "field": "path"
   }
 },
 "method_count": {
   "terms": {
     "field": "method"
   }
 }
}
}

Response:

{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 17,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "GUK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 80,
          "method" : "GET",
          "path" : "/api/post/1",
          "created" : "2023-02-09"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "GkK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 30,
          "method" : "GET",
          "path" : "/api/post/2",
          "created" : "2023-02-07"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "G0K3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 20,
          "method" : "GET",
          "path" : "/api/post/3",
          "created" : "2023-02-08"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "HEK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 120,
          "method" : "GET",
          "path" : "/api/post/20",
          "created" : "2023-02-06"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "HUK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 150,
          "method" : "GET",
          "path" : "/api/post/1",
          "created" : "2023-02-05"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "HkK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 80,
          "method" : "GET",
          "path" : "/api/post/3",
          "created" : "2023-02-04"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "H0K3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 960,
          "method" : "GET",
          "path" : "/api/post/6",
          "created" : "2023-02-03"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "IEK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 9000,
          "method" : "GET",
          "path" : "/api/post/8",
          "created" : "2023-02-02"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "IUK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 1300,
          "method" : "GET",
          "path" : "/api/post/6",
          "created" : "2023-02-01"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "IkK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 400,
          "method" : "GET",
          "path" : "/api/post/4",
          "created" : "2023-02-10"
        }
      }
    ]
  },
  "aggregations" : {
    "method_count" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "GET",
          "doc_count" : 17
        }
      ]
    },
    "path_count" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "/api/post/3",
          "doc_count" : 3
        },
        {
          "key" : "/api/post/6",
          "doc_count" : 3
        },
        {
          "key" : "/api/post/1",
          "doc_count" : 2
        },
        {
          "key" : "/api/post/2",
          "doc_count" : 2
        },
        {
          "key" : "/api/post/4",
          "doc_count" : 2
        },
        {
          "key" : "/api/post/10",
          "doc_count" : 1
        },
        {
          "key" : "/api/post/12",
          "doc_count" : 1
        },
        {
          "key" : "/api/post/20",
          "doc_count" : 1
        },
        {
          "key" : "/api/post/7",
          "doc_count" : 1
        },
        {
          "key" : "/api/post/8",
          "doc_count" : 1
        }
      ]
    }
  }
}

The result clearly shows the number of requests per method and per path.

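Child-level nesting

Suppose we have the following requirements:

  • the number of requests for each request method
  • the number of requests for each path under each method
  • the average elapsed time of requests for each path

Example query: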

GET req_log/_search
{
"aggs": {
 "method_count": {
   "terms": {
     "field": "method"
   },
   "aggs": {
     "path_count": {
       "terms": {
         "field": "path"
       },
       "aggs": {
         "avg_times": {
           "avg": {
             "field": "times"
           }
         }
       }
     }
   }
 }
}
}

Response:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 17,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "GUK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 80,
          "method" : "GET",
          "path" : "/api/post/1",
          "created" : "2023-02-09"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "GkK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 30,
          "method" : "GET",
          "path" : "/api/post/2",
          "created" : "2023-02-07"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "G0K3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 20,
          "method" : "GET",
          "path" : "/api/post/3",
          "created" : "2023-02-08"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "HEK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 120,
          "method" : "GET",
          "path" : "/api/post/20",
          "created" : "2023-02-06"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "HUK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 150,
          "method" : "GET",
          "path" : "/api/post/1",
          "created" : "2023-02-05"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "HkK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 80,
          "method" : "GET",
          "path" : "/api/post/3",
          "created" : "2023-02-04"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "H0K3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 960,
          "method" : "GET",
          "path" : "/api/post/6",
          "created" : "2023-02-03"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "IEK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 9000,
          "method" : "GET",
          "path" : "/api/post/8",
          "created" : "2023-02-02"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "IUK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 1300,
          "method" : "GET",
          "path" : "/api/post/6",
          "created" : "2023-02-01"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "IkK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 400,
          "method" : "GET",
          "path" : "/api/post/4",
          "created" : "2023-02-10"
        }
      }
    ]
  },
  "aggregations" : {
    "method_count" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "GET",
          "doc_count" : 17,
          "path_count" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "/api/post/3",
                "doc_count" : 3,
                "avg_times" : {
                  "value" : 63.0
                }
              },
              {
                "key" : "/api/post/6",
                "doc_count" : 3,
                "avg_times" : {
                  "value" : 1053.3333333333333
                }
              },
              {
                "key" : "/api/post/1",
                "doc_count" : 2,
                "avg_times" : {
                  "value" : 115.0
                }
              },
              {
                "key" : "/api/post/2",
                "doc_count" : 2,
                "avg_times" : {
                  "value" : 205.0
                }
              },
              {
                "key" : "/api/post/4",
                "doc_count" : 2,
                "avg_times" : {
                  "value" : 305.0
                }
              },
              {
                "key" : "/api/post/10",
                "doc_count" : 1,
                "avg_times" : {
                  "value" : 270.0
                }
              },
              {
                "key" : "/api/post/12",
                "doc_count" : 1,
                "avg_times" : {
                  "value" : 630.0
                }
              },
              {
                "key" : "/api/post/20",
                "doc_count" : 1,
                "avg_times" : {
                  "value" : 120.0
                }
              },
              {
                "key" : "/api/post/7",
                "doc_count" : 1,
                "avg_times" : {
                  "value" : 870.0
                }
              },
              {
                "key" : "/api/post/8",
                "doc_count" : 1,
                "avg_times" : {
                  "value" : 9000.0
                }
              }
            ]
          }
        }
      ]
    }
  }
}

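As the result shows, the buckets in the response follow the nesting hierarchy of the request: method buckets contain path buckets, and each path bucket carries its avg_times metric.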
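
The nesting can be mirrored with nested dictionaries. The following pure-Python sketch groups the 17 sample documents by method, then by path, and computes the average of times per path; it is an illustration of the bucket hierarchy, not Elasticsearch code, and the variable names are arbitrary.

```python
from collections import defaultdict

# (method, path, times) of the 17 sample documents; every method is GET.
DOCS = [("GET", path, times) for path, times in [
    ("/api/post/1", 80), ("/api/post/2", 30), ("/api/post/3", 20),
    ("/api/post/20", 120), ("/api/post/1", 150), ("/api/post/3", 80),
    ("/api/post/6", 960), ("/api/post/8", 9000), ("/api/post/6", 1300),
    ("/api/post/4", 400), ("/api/post/3", 89), ("/api/post/2", 380),
    ("/api/post/10", 270), ("/api/post/12", 630), ("/api/post/4", 210),
    ("/api/post/6", 900), ("/api/post/7", 870),
]]

# Outer bucket: method. Inner bucket: path. Leaf: the `times` values.
tree = defaultdict(lambda: defaultdict(list))
for method, path, times in DOCS:
    tree[method][path].append(times)

result = {
    method: {
        "doc_count": sum(len(values) for values in paths.values()),
        "path_count": {
            path: {"doc_count": len(values), "avg_times": sum(values) / len(values)}
            for path, values in paths.items()
        },
    }
    for method, paths in tree.items()
}
print(result["GET"]["path_count"]["/api/post/3"])
```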

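Filtering aggregations

Filtering both the query and the aggregation

This is the most common approach: both the search hits and the aggregation results are filtered. Simply add a query at the same level as aggs. Queries were covered in earlier parts of this series. Here is an example: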

GET req_log/_search
{
"query": {
 "constant_score": {
   "filter": {
     "term": {
       "method": "POST"
     }
   }
 }
},
"aggs": {
 "path_count": {
   "terms": {
     "field": "path"
   }
 }
}
}

Response:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "path_count" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [ ]
    }
  }
}

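The result is empty because we filtered on the method, and no document in the index has the method POST.

Sometimes the requirement is to compare one subset of the data against the whole index. How can that be done? The global aggregation (global: {}) makes it easy to aggregate over all documents regardless of the query. Here is an example:

Compute the average elapsed time for path=/api/post/3 alongside the average elapsed time across all requests: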

GET req_log/_search
{
"query": {
 "constant_score": {
   "filter": {
     "term": {
       "path": "/api/post/3"
     }
   }
 }
},
"aggs": {
 "path_avg": {
   "avg": {
     "field": "times"
   }
 },
 "all_order":{
   "global": {},
   "aggs": {
     "all_avg": {
       "avg": {
         "field": "times"
       }
     }
   }
 }
}
}

Response:

{
  "took" : 11,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "G0K3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 20,
          "method" : "GET",
          "path" : "/api/post/3",
          "created" : "2023-02-08"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "HkK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 80,
          "method" : "GET",
          "path" : "/api/post/3",
          "created" : "2023-02-04"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "I0K3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 89,
          "method" : "GET",
          "path" : "/api/post/3",
          "created" : "2023-02-11"
        }
      }
    ]
  },
  "aggregations" : {
    "path_avg" : {
      "value" : 63.0
    },
    "all_order" : {
      "doc_count" : 17,
      "all_avg" : {
        "value" : 911.1176470588235
      }
    }
  }
}

Comparing the two, requests to /api/post/3 are quite fast, averaging 63, while the average across all requests is about 911.
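
The two averages are easy to check by hand. This plain-Python sketch computes the average over the documents matched by the query and the average over every sample document, which is what the global bucket sees; the helper avg is just a convenience for this example.

```python
# (path, times) of the 17 sample documents.
SAMPLE = [
    ("/api/post/1", 80), ("/api/post/2", 30), ("/api/post/3", 20),
    ("/api/post/20", 120), ("/api/post/1", 150), ("/api/post/3", 80),
    ("/api/post/6", 960), ("/api/post/8", 9000), ("/api/post/6", 1300),
    ("/api/post/4", 400), ("/api/post/3", 89), ("/api/post/2", 380),
    ("/api/post/10", 270), ("/api/post/12", 630), ("/api/post/4", 210),
    ("/api/post/6", 900), ("/api/post/7", 870),
]


def avg(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# The query narrows the aggregation scope to a single path ...
path_avg = avg(times for path, times in SAMPLE if path == "/api/post/3")
# ... while the global bucket ignores the query and sees every document.
all_avg = avg(times for _, times in SAMPLE)

print(path_avg, all_avg)
```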

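Filtering only the aggregation

Sometimes we don't want to filter the search hits, only the aggregation results. How can that be done? Here is an example:

Query the requests with method=GET and compute the average elapsed time for requests with path=/api/post/3: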

  • filter
GET req_log/_search
{
 "query":{
     "constant_score":{
         "filter":{
             "term":{
                 "method":"GET"
             }
         }
     }
 },
 "aggs":{
     "req_count":{
         "aggs":{
             "req_path_order":{
                 "terms":{
                     "field":"path"
                 },
                 "aggs":{
                     "avg_times":{
                         "avg":{
                             "field":"times"
                         }
                     }
                 }
             }
         },
         "filter":{
             "term":{
                 "path":"/api/post/3"
             }
         }
     }
 }
}

Here is the response:

{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 17,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "GUK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 80,
          "method" : "GET",
          "path" : "/api/post/1",
          "created" : "2023-02-09"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "GkK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 30,
          "method" : "GET",
          "path" : "/api/post/2",
          "created" : "2023-02-07"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "G0K3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 20,
          "method" : "GET",
          "path" : "/api/post/3",
          "created" : "2023-02-08"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "HEK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 120,
          "method" : "GET",
          "path" : "/api/post/20",
          "created" : "2023-02-06"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "HUK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 150,
          "method" : "GET",
          "path" : "/api/post/1",
          "created" : "2023-02-05"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "HkK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 80,
          "method" : "GET",
          "path" : "/api/post/3",
          "created" : "2023-02-04"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "H0K3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 960,
          "method" : "GET",
          "path" : "/api/post/6",
          "created" : "2023-02-03"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "IEK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 9000,
          "method" : "GET",
          "path" : "/api/post/8",
          "created" : "2023-02-02"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "IUK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 1300,
          "method" : "GET",
          "path" : "/api/post/6",
          "created" : "2023-02-01"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "IkK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 400,
          "method" : "GET",
          "path" : "/api/post/4",
          "created" : "2023-02-10"
        }
      }
    ]
  },
  "aggregations" : {
    "req_count" : {
      "doc_count" : 3,
      "req_path_order" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : "/api/post/3",
            "doc_count" : 3,
            "avg_times" : {
              "value" : 63.0
            }
          }
        ]
      }
    }
  }
}

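As the result shows, the search hits were not filtered; only the aggregation results were.

Filtering the query but not the aggregation

The opposite of the above: filter only the search hits and leave the aggregation results unfiltered. Here is an example:

Query the requests with path=/api/post/3 and method=GET, and aggregate the number of requests for each path: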

  • post_filter
GET req_log/_search
{
"aggs": {
 "path_count": {
   "terms": {
     "field": "path"
   }
 }
},
"post_filter": {
 "bool": {
   "must": [
     {
       "term": {
         "path": {
           "value": "/api/post/3"
         }
       }
     },
     {
       "term": {
         "method": {
           "value": "GET"
         }
       }
     }
   ]
 }
}
}

Response:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "G0K3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 20,
          "method" : "GET",
          "path" : "/api/post/3",
          "created" : "2023-02-08"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "HkK3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 80,
          "method" : "GET",
          "path" : "/api/post/3",
          "created" : "2023-02-04"
        }
      },
      {
        "_index" : "req_log",
        "_type" : "_doc",
        "_id" : "I0K3NIYBdXrpvlCF01bz",
        "_score" : 1.0,
        "_source" : {
          "times" : 89,
          "method" : "GET",
          "path" : "/api/post/3",
          "created" : "2023-02-11"
        }
      }
    ]
  },
  "aggregations" : {
    "path_count" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "/api/post/3",
          "doc_count" : 3
        },
        {
          "key" : "/api/post/6",
          "doc_count" : 3
        },
        {
          "key" : "/api/post/1",
          "doc_count" : 2
        },
        {
          "key" : "/api/post/2",
          "doc_count" : 2
        },
        {
          "key" : "/api/post/4",
          "doc_count" : 2
        },
        {
          "key" : "/api/post/10",
          "doc_count" : 1
        },
        {
          "key" : "/api/post/12",
          "doc_count" : 1
        },
        {
          "key" : "/api/post/20",
          "doc_count" : 1
        },
        {
          "key" : "/api/post/7",
          "doc_count" : 1
        },
        {
          "key" : "/api/post/8",
          "doc_count" : 1
        }
      ]
    }
  }
}

As the result shows, the search hits contain only /api/post/3, while the aggregation results are not filtered by that condition.
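
The three filtering styles in this section differ only in where the filter is applied. The toy model below, written in plain Python with a hypothetical search function, makes that order explicit: the query scopes both hits and aggregations, a filter aggregation narrows only the aggregation, and post_filter narrows only the hits. It is a conceptual sketch, not a description of Elasticsearch internals.

```python
from collections import Counter

# The 17 sample documents; every method is GET.
DOCS = [{"method": "GET", "path": path, "times": times} for path, times in [
    ("/api/post/1", 80), ("/api/post/2", 30), ("/api/post/3", 20),
    ("/api/post/20", 120), ("/api/post/1", 150), ("/api/post/3", 80),
    ("/api/post/6", 960), ("/api/post/8", 9000), ("/api/post/6", 1300),
    ("/api/post/4", 400), ("/api/post/3", 89), ("/api/post/2", 380),
    ("/api/post/10", 270), ("/api/post/12", 630), ("/api/post/4", 210),
    ("/api/post/6", 900), ("/api/post/7", 870),
]]


def keep(predicate, docs):
    """Apply an optional predicate; None means 'no filter'."""
    return [d for d in docs if predicate is None or predicate(d)]


def search(docs, query=None, agg_filter=None, post_filter=None):
    """Toy model of where each kind of filter takes effect."""
    matched = keep(query, docs)           # query: scopes hits AND aggregations
    agg_docs = keep(agg_filter, matched)  # filter aggregation: aggregations only
    hits = keep(post_filter, matched)     # post_filter: hits only
    return {
        "hits": len(hits),
        "path_count": dict(Counter(d["path"] for d in agg_docs)),
    }


def is_post_3(doc):
    return doc["path"] == "/api/post/3"


both = search(DOCS, query=lambda d: d["method"] == "POST")
agg_only = search(DOCS, query=lambda d: d["method"] == "GET", agg_filter=is_post_3)
hits_only = search(DOCS, post_filter=lambda d: is_post_3(d) and d["method"] == "GET")

print(both)
print(agg_only)
print(hits_only["hits"], len(hits_only["path_count"]))
```

The three calls correspond to the three requests in this section: no hits and no buckets for method=POST, 17 hits with a single /api/post/3 bucket for the filter aggregation, and 3 hits with all 10 path buckets for post_filter.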

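Closing

That's all for this part. The remaining aggregation topics are left for the next part, so take some time to digest this one first.

I try to share everything I know. If this article helped you, a like and a follow would be much appreciated.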
