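Preface
This article is part of an ongoing series on Elasticsearch (ES); the series will run to quite a few installments.
This installment covers how to run aggregations in ES. There is a fair amount of material and it leans toward hands-on examples, so let's get started.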
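What is an aggregation
An aggregation is conceptually similar to GROUP BY and SUM(...) in MySQL, which may already sound familiar, but aggregations in ES are considerably more powerful.
Two concepts come first: buckets and metrics. An aggregation is a combination of one or more buckets and zero or more metrics.
Aggregations are used in many scenarios. Back-office reports, for example, often need many complex statistics over large volumes of data. Computing them directly in a relational database can be very slow or even bring the database down, whereas ES usually handles this kind of workload more easily.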
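Buckets
A bucket is a collection of documents stored in ES that satisfy a particular condition.
When an aggregation runs, the values in each document are evaluated to decide which bucket's criteria the document matches. If there is a match, the document is placed in that bucket and the aggregation continues. Buckets can also be nested inside other buckets, which gives a hierarchical or conditional partitioning scheme.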
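Metrics
A metric is a calculation over the documents in a bucket, such as the average, maximum, or minimum of a field's values. The sections below show how to run aggregations.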
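To make the two terms concrete, here is a minimal plain-Python sketch. It is not Elasticsearch code, and the three sample documents and the names `docs`, `buckets`, and `avg_times` are made up for illustration. It first splits documents into buckets and then computes one metric per bucket.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample documents, for illustration only.
docs = [
    {"path": "/a", "times": 10},
    {"path": "/a", "times": 30},
    {"path": "/b", "times": 50},
]

# Bucketing: each document goes into the bucket whose condition it matches
# (here: one bucket per distinct "path" value).
buckets = defaultdict(list)
for doc in docs:
    buckets[doc["path"]].append(doc)

# Metric: a calculation over the documents inside each bucket.
avg_times = {
    key: mean(d["times"] for d in bucket)
    for key, bucket in buckets.items()
}

print(avg_times)
```

The bucket step decides where a document belongs; the metric step only ever looks at the documents already inside one bucket.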
Aggregation syntax
The general syntax is:
GET index_name/_search
{
  "aggs": {
    "NAME": {
      "AGG_TYPE": {}
    }
  }
}
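- index_name: the index name
- aggs: the keyword that introduces the aggregations
- NAME: a user-chosen name, used as the key of this aggregation in the response
- AGG_TYPE: the aggregation type
Note that NAME and AGG_TYPE are placeholders. The commonly used aggregation types are:
- terms: groups documents into buckets by the distinct values of a field, ready for further processing
- histogram: buckets numeric values by a fixed interval (step)
- date_histogram: buckets date values by a time interval
- cardinality: distinct count; the result is approximate
- percentiles: returns the field values at the given percentiles
- percentile_ranks: returns the percentile ranks of the given values
- filter: restricts the documents seen by the aggregation without filtering the search hits
- post_filter: restricts the search hits without filtering the aggregations (it is a top-level search parameter rather than an aggregation type)
- avg: average
- sum: sum
- min: minimum
- max: maximum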
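As a rough illustration of two of the types above, the plain-Python sketch below mimics what histogram and cardinality compute, using the elapsed-time and path values from the sample data indexed in this article. The bucket key follows the rule floor(value / interval) * interval. The interval of 500 and all variable names are chosen only for this example, and the distinct count here is exact, whereas the cardinality aggregation in ES is approximate.

```python
from collections import Counter

# (times, path) pairs copied from the sample data used in this article.
DOCS = [
    (80, "/api/post/1"), (30, "/api/post/2"), (20, "/api/post/3"),
    (120, "/api/post/20"), (150, "/api/post/1"), (80, "/api/post/3"),
    (960, "/api/post/6"), (9000, "/api/post/8"), (1300, "/api/post/6"),
    (400, "/api/post/4"), (89, "/api/post/3"), (380, "/api/post/2"),
    (270, "/api/post/10"), (630, "/api/post/12"), (210, "/api/post/4"),
    (900, "/api/post/6"), (870, "/api/post/7"),
]

INTERVAL = 500  # assumed step, chosen for illustration

# histogram: every value falls into the bucket floor(value / interval) * interval.
histogram = Counter((times // INTERVAL) * INTERVAL for times, _ in DOCS)

# cardinality: number of distinct values (exact here, approximate in ES).
distinct_paths = len({path for _, path in DOCS})

for key in sorted(histogram):
    print(key, histogram[key])
print("distinct paths:", distinct_paths)
```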
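Let's look at an example.
First we create an index. Below is a simple request-log index with four fields: request method, path, elapsed time, and log creation time.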
PUT req_log
{
  "mappings": {
    "properties": {
      "method": { "type": "keyword" },
      "path": { "type": "keyword" },
      "times": { "type": "long" },
      "created": { "type": "date" }
    }
  }
}
Next, index some data:
POST /req_log/_bulk
{ "index": {} }
{ "times": 80, "method": "GET", "path": "/api/post/1", "created": "2023-02-09" }
{ "index": {} }
{ "times": 30, "method": "GET", "path": "/api/post/2", "created": "2023-02-07" }
{ "index": {} }
{ "times": 20, "method": "GET", "path": "/api/post/3", "created": "2023-02-08" }
{ "index": {} }
{ "times": 120, "method": "GET", "path": "/api/post/20", "created": "2023-02-06" }
{ "index": {} }
{ "times": 150, "method": "GET", "path": "/api/post/1", "created": "2023-02-05" }
{ "index": {} }
{ "times": 80, "method": "GET", "path": "/api/post/3", "created": "2023-02-04" }
{ "index": {} }
{ "times": 960, "method": "GET", "path": "/api/post/6", "created": "2023-02-03" }
{ "index": {} }
{ "times": 9000, "method": "GET", "path": "/api/post/8", "created": "2023-02-02" }
{ "index": {} }
{ "times": 1300, "method": "GET", "path": "/api/post/6", "created": "2023-02-01" }
{ "index": {} }
{ "times": 400, "method": "GET", "path": "/api/post/4", "created": "2023-02-10" }
{ "index": {} }
{ "times": 89, "method": "GET", "path": "/api/post/3", "created": "2023-02-11" }
{ "index": {} }
{ "times": 380, "method": "GET", "path": "/api/post/2", "created": "2023-02-12" }
{ "index": {} }
{ "times": 270, "method": "GET", "path": "/api/post/10", "created": "2023-02-13" }
{ "index": {} }
{ "times": 630, "method": "GET", "path": "/api/post/12", "created": "2023-02-14" }
{ "index": {} }
{ "times": 210, "method": "GET", "path": "/api/post/4", "created": "2023-02-15" }
{ "index": {} }
{ "times": 900, "method": "GET", "path": "/api/post/6", "created": "2023-02-16" }
{ "index": {} }
{ "times": 870, "method": "GET", "path": "/api/post/7", "created": "2023-02-17" }
Count the requests for each request path (path):
GET req_log/_search
{
  "aggs": {
    "counts": {
      "terms": { "field": "path" }
    }
  }
}
Response:
{ "took" : 995, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 17, "relation" : "eq" }, "max_score" : 1.0, "hits" : [ { "_index" : "req_log", "_type" : "_doc", "_id" : "GUK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 80, "method" : "GET", "path" : "/api/post/1", "created" : "2023-02-09" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "GkK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 30, "method" : "GET", "path" : "/api/post/2", "created" : "2023-02-07" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "G0K3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 20, "method" : "GET", "path" : "/api/post/3", "created" : "2023-02-08" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "HEK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 120, "method" : "GET", "path" : "/api/post/20", "created" : "2023-02-06" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "HUK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 150, "method" : "GET", "path" : "/api/post/1", "created" : "2023-02-05" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "HkK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 80, "method" : "GET", "path" : "/api/post/3", "created" : "2023-02-04" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "H0K3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 960, "method" : "GET", "path" : "/api/post/6", "created" : "2023-02-03" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "IEK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 9000, "method" : "GET", "path" : "/api/post/8", "created" : "2023-02-02" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "IUK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 1300, "method" : "GET", "path" : "/api/post/6", "created" : "2023-02-01" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "IkK3NIYBdXrpvlCF01bz", 
"_score" : 1.0, "_source" : { "times" : 400, "method" : "GET", "path" : "/api/post/4", "created" : "2023-02-10" } } ] }, "aggregations" : { "counts" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 0, "buckets" : [ { "key" : "/api/post/3", "doc_count" : 3 }, { "key" : "/api/post/6", "doc_count" : 3 }, { "key" : "/api/post/1", "doc_count" : 2 }, { "key" : "/api/post/2", "doc_count" : 2 }, { "key" : "/api/post/4", "doc_count" : 2 }, { "key" : "/api/post/10", "doc_count" : 1 }, { "key" : "/api/post/12", "doc_count" : 1 }, { "key" : "/api/post/20", "doc_count" : 1 }, { "key" : "/api/post/7", "doc_count" : 1 }, { "key" : "/api/post/8", "doc_count" : 1 } ] } } }
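The response contains our custom key counts. The aggregation type is terms and the aggregated field is path, so the documents are bucketed by path.
The buckets array holds 10 buckets, one per distinct path. key is the field value (the path) that defines the bucket, and doc_count is the number of documents in it.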
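The bucket list can be reproduced outside ES with a few lines of plain Python. This is only a sketch of the semantics, not how ES computes it; the names `PATHS` and `buckets` are made up here, and the tie-break by key simply mirrors the order seen in the response.

```python
from collections import Counter

# The "path" value of each of the 17 sample documents, in indexing order.
PATHS = [
    "/api/post/1", "/api/post/2", "/api/post/3", "/api/post/20",
    "/api/post/1", "/api/post/3", "/api/post/6", "/api/post/8",
    "/api/post/6", "/api/post/4", "/api/post/3", "/api/post/2",
    "/api/post/10", "/api/post/12", "/api/post/4", "/api/post/6",
    "/api/post/7",
]

# One bucket per distinct value, with the number of documents in it.
counts = Counter(PATHS)

# Order by doc_count descending, then by key ascending (string comparison).
buckets = [
    {"key": key, "doc_count": count}
    for key, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
]

for bucket in buckets:
    print(bucket)
```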
terms also accepts the following options:
GET req_log/_search
{
  "aggs": {
    "counts": {
      "terms": {
        "field": "path",
        "size": 10,
        "collect_mode": "depth_first",
        "order": { "_count": "desc" }
      }
    }
  }
}
Response:
{ "took" : 1, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, ......, "aggregations" : { "counts" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 0, "buckets" : [ { "key" : "/api/post/3", "doc_count" : 3 }, { "key" : "/api/post/6", "doc_count" : 3 }, { "key" : "/api/post/1", "doc_count" : 2 }, { "key" : "/api/post/2", "doc_count" : 2 }, { "key" : "/api/post/4", "doc_count" : 2 }, { "key" : "/api/post/10", "doc_count" : 1 }, { "key" : "/api/post/12", "doc_count" : 1 }, { "key" : "/api/post/20", "doc_count" : 1 }, { "key" : "/api/post/7", "doc_count" : 1 }, { "key" : "/api/post/8", "doc_count" : 1 } ] } } }
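- size: the number of buckets to return, usually combined with order; the default is 10
- collect_mode: how sub-aggregations are collected, either depth_first or breadth_first. With depth_first the full tree of buckets is built in memory before it is pruned, which can use a lot of memory, for example on array (multi-valued) fields
- order: the sort order of the buckets; the default is doc_count descending. Buckets can also be sorted by a built-in key such as _count or _key, or by a metric from a sub-aggregation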
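To see how size and order interact, the plain-Python sketch below orders the path buckets by their average elapsed time and keeps only the top 3. This is roughly what a terms aggregation with "size": 3, an avg sub-aggregation named avg_times, and "order": { "avg_times": "desc" } would be expected to return on a single shard. The function name `terms_by_avg` and the size of 3 are assumptions made for this example.

```python
from collections import defaultdict

# (times, path) pairs copied from the sample data used in this article.
DOCS = [
    (80, "/api/post/1"), (30, "/api/post/2"), (20, "/api/post/3"),
    (120, "/api/post/20"), (150, "/api/post/1"), (80, "/api/post/3"),
    (960, "/api/post/6"), (9000, "/api/post/8"), (1300, "/api/post/6"),
    (400, "/api/post/4"), (89, "/api/post/3"), (380, "/api/post/2"),
    (270, "/api/post/10"), (630, "/api/post/12"), (210, "/api/post/4"),
    (900, "/api/post/6"), (870, "/api/post/7"),
]


def terms_by_avg(docs, size):
    """Group by path, sort buckets by average times (descending), keep `size` buckets."""
    groups = defaultdict(list)
    for times, path in docs:
        groups[path].append(times)
    buckets = [
        {"key": path, "doc_count": len(values), "avg_times": sum(values) / len(values)}
        for path, values in groups.items()
    ]
    buckets.sort(key=lambda b: -b["avg_times"])
    return {
        "buckets": buckets[:size],
        # Documents that belong to buckets cut off by `size`.
        "sum_other_doc_count": sum(b["doc_count"] for b in buckets[size:]),
    }


result = terms_by_avg(DOCS, size=3)
for bucket in result["buckets"]:
    print(bucket)
print("sum_other_doc_count:", result["sum_other_doc_count"])
```

The buckets that do not fit into size are not lost silently: their documents are counted in sum_other_doc_count.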
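Nested aggregations
ES supports combining aggregations out of the box: a bucket can itself be split into further buckets. There are two forms, sibling (same level) and child (sub-level). Examples of both follow.
Sibling aggregations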
GET req_log/_search
{
  "aggs": {
    "path_count": {
      "terms": { "field": "path" }
    },
    "method_count": {
      "terms": { "field": "method" }
    }
  }
}
Response:
{ "took" : 7, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 17, "relation" : "eq" }, "max_score" : 1.0, "hits" : [ { "_index" : "req_log", "_type" : "_doc", "_id" : "GUK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 80, "method" : "GET", "path" : "/api/post/1", "created" : "2023-02-09" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "GkK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 30, "method" : "GET", "path" : "/api/post/2", "created" : "2023-02-07" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "G0K3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 20, "method" : "GET", "path" : "/api/post/3", "created" : "2023-02-08" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "HEK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 120, "method" : "GET", "path" : "/api/post/20", "created" : "2023-02-06" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "HUK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 150, "method" : "GET", "path" : "/api/post/1", "created" : "2023-02-05" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "HkK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 80, "method" : "GET", "path" : "/api/post/3", "created" : "2023-02-04" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "H0K3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 960, "method" : "GET", "path" : "/api/post/6", "created" : "2023-02-03" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "IEK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 9000, "method" : "GET", "path" : "/api/post/8", "created" : "2023-02-02" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "IUK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 1300, "method" : "GET", "path" : "/api/post/6", "created" : "2023-02-01" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "IkK3NIYBdXrpvlCF01bz", "_score" 
: 1.0, "_source" : { "times" : 400, "method" : "GET", "path" : "/api/post/4", "created" : "2023-02-10" } } ] }, "aggregations" : { "method_count" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 0, "buckets" : [ { "key" : "GET", "doc_count" : 17 } ] }, "path_count" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 0, "buckets" : [ { "key" : "/api/post/3", "doc_count" : 3 }, { "key" : "/api/post/6", "doc_count" : 3 }, { "key" : "/api/post/1", "doc_count" : 2 }, { "key" : "/api/post/2", "doc_count" : 2 }, { "key" : "/api/post/4", "doc_count" : 2 }, { "key" : "/api/post/10", "doc_count" : 1 }, { "key" : "/api/post/12", "doc_count" : 1 }, { "key" : "/api/post/20", "doc_count" : 1 }, { "key" : "/api/post/7", "doc_count" : 1 }, { "key" : "/api/post/8", "doc_count" : 1 } ] } } }
The response shows the request counts per method and per path side by side, each computed independently.
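Child aggregations
Suppose we have the following requirements:
- the number of requests per request method
- the number of requests per path under each method
- the average elapsed time of the requests under each path
Example query: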
GET req_log/_search
{
  "aggs": {
    "method_count": {
      "terms": { "field": "method" },
      "aggs": {
        "path_count": {
          "terms": { "field": "path" },
          "aggs": {
            "avg_times": {
              "avg": { "field": "times" }
            }
          }
        }
      }
    }
  }
}
Response:
{ "took" : 3, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 17, "relation" : "eq" }, "max_score" : 1.0, "hits" : [ { "_index" : "req_log", "_type" : "_doc", "_id" : "GUK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 80, "method" : "GET", "path" : "/api/post/1", "created" : "2023-02-09" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "GkK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 30, "method" : "GET", "path" : "/api/post/2", "created" : "2023-02-07" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "G0K3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 20, "method" : "GET", "path" : "/api/post/3", "created" : "2023-02-08" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "HEK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 120, "method" : "GET", "path" : "/api/post/20", "created" : "2023-02-06" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "HUK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 150, "method" : "GET", "path" : "/api/post/1", "created" : "2023-02-05" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "HkK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 80, "method" : "GET", "path" : "/api/post/3", "created" : "2023-02-04" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "H0K3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 960, "method" : "GET", "path" : "/api/post/6", "created" : "2023-02-03" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "IEK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 9000, "method" : "GET", "path" : "/api/post/8", "created" : "2023-02-02" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "IUK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 1300, "method" : "GET", "path" : "/api/post/6", "created" : "2023-02-01" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "IkK3NIYBdXrpvlCF01bz", "_score" 
: 1.0, "_source" : { "times" : 400, "method" : "GET", "path" : "/api/post/4", "created" : "2023-02-10" } } ] }, "aggregations" : { "method_count" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 0, "buckets" : [ { "key" : "GET", "doc_count" : 17, "path_count" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 0, "buckets" : [ { "key" : "/api/post/3", "doc_count" : 3, "avg_times" : { "value" : 63.0 } }, { "key" : "/api/post/6", "doc_count" : 3, "avg_times" : { "value" : 1053.3333333333333 } }, { "key" : "/api/post/1", "doc_count" : 2, "avg_times" : { "value" : 115.0 } }, { "key" : "/api/post/2", "doc_count" : 2, "avg_times" : { "value" : 205.0 } }, { "key" : "/api/post/4", "doc_count" : 2, "avg_times" : { "value" : 305.0 } }, { "key" : "/api/post/10", "doc_count" : 1, "avg_times" : { "value" : 270.0 } }, { "key" : "/api/post/12", "doc_count" : 1, "avg_times" : { "value" : 630.0 } }, { "key" : "/api/post/20", "doc_count" : 1, "avg_times" : { "value" : 120.0 } }, { "key" : "/api/post/7", "doc_count" : 1, "avg_times" : { "value" : 870.0 } }, { "key" : "/api/post/8", "doc_count" : 1, "avg_times" : { "value" : 9000.0 } } ] } } ] } } }
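The buckets in the response follow the nesting of the request: method buckets contain path buckets, and each path bucket carries its avg_times metric.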
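The same three-level structure can be sketched in plain Python: group by method, then by path inside each method, then compute the average per path. This only mirrors the shape of the response and is not ES code; `DOCS`, `tree`, and `result` are names invented for this sketch.

```python
from collections import defaultdict

# (method, path, times) triples copied from the sample data used in this article.
DOCS = [
    ("GET", "/api/post/1", 80), ("GET", "/api/post/2", 30), ("GET", "/api/post/3", 20),
    ("GET", "/api/post/20", 120), ("GET", "/api/post/1", 150), ("GET", "/api/post/3", 80),
    ("GET", "/api/post/6", 960), ("GET", "/api/post/8", 9000), ("GET", "/api/post/6", 1300),
    ("GET", "/api/post/4", 400), ("GET", "/api/post/3", 89), ("GET", "/api/post/2", 380),
    ("GET", "/api/post/10", 270), ("GET", "/api/post/12", 630), ("GET", "/api/post/4", 210),
    ("GET", "/api/post/6", 900), ("GET", "/api/post/7", 870),
]

# Level 1: bucket by method. Level 2: bucket by path inside each method bucket.
tree = defaultdict(lambda: defaultdict(list))
for method, path, times in DOCS:
    tree[method][path].append(times)

# Level 3: the avg metric, computed inside each path bucket.
result = {
    method: {
        "doc_count": sum(len(values) for values in paths.values()),
        "path_count": {
            path: {"doc_count": len(values), "avg_times": sum(values) / len(values)}
            for path, values in paths.items()
        },
    }
    for method, paths in tree.items()
}

print(result["GET"]["path_count"]["/api/post/3"])
```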
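Filtering with aggregations
Filtering both the query and the aggregation
This is the most common way to filter: both the search hits and the aggregation results are filtered. Just add a query at the same level as aggs. Queries were covered in earlier installments, so here is an example straight away: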
GET req_log/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": { "method": "POST" }
      }
    }
  },
  "aggs": {
    "path_count": {
      "terms": { "field": "path" }
    }
  }
}
Response:
{ "took" : 2, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 0, "relation" : "eq" }, "max_score" : null, "hits" : [ ] }, "aggregations" : { "path_count" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 0, "buckets" : [ ] } } }
Both the hits and the buckets are empty, because the query filters on method and no document in the index has the method POST.
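Sometimes we want to compare one subset of the data against all documents in the index. The global aggregation ("global": {}) makes this easy: it ignores the query and aggregates over every document. An example:
Compute the average elapsed time for path=/api/post/3 and the average elapsed time over all requests: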
GET req_log/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": { "path": "/api/post/3" }
      }
    }
  },
  "aggs": {
    "path_avg": {
      "avg": { "field": "times" }
    },
    "all_order": {
      "global": {},
      "aggs": {
        "all_avg": {
          "avg": { "field": "times" }
        }
      }
    }
  }
}
Response:
{ "took" : 11, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 3, "relation" : "eq" }, "max_score" : 1.0, "hits" : [ { "_index" : "req_log", "_type" : "_doc", "_id" : "G0K3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 20, "method" : "GET", "path" : "/api/post/3", "created" : "2023-02-08" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "HkK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 80, "method" : "GET", "path" : "/api/post/3", "created" : "2023-02-04" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "I0K3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 89, "method" : "GET", "path" : "/api/post/3", "created" : "2023-02-11" } } ] }, "aggregations" : { "path_avg" : { "value" : 63.0 }, "all_order" : { "doc_count" : 17, "all_avg" : { "value" : 911.1176470588235 } } } }
Comparing the two values, requests to /api/post/3 are fast: their average elapsed time is 63, while the average over all requests is about 911.
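The two averages are easy to check by hand. The plain-Python sketch below (not ES code; all names are made up for this example) computes one average over the documents matched by the query and one over all documents, which is the scope the global aggregation works on.

```python
# (times, path) pairs copied from the sample data used in this article.
DOCS = [
    (80, "/api/post/1"), (30, "/api/post/2"), (20, "/api/post/3"),
    (120, "/api/post/20"), (150, "/api/post/1"), (80, "/api/post/3"),
    (960, "/api/post/6"), (9000, "/api/post/8"), (1300, "/api/post/6"),
    (400, "/api/post/4"), (89, "/api/post/3"), (380, "/api/post/2"),
    (270, "/api/post/10"), (630, "/api/post/12"), (210, "/api/post/4"),
    (900, "/api/post/6"), (870, "/api/post/7"),
]


def avg(values):
    """Arithmetic mean as a float."""
    values = list(values)
    return sum(values) / len(values)


# Scope of a normal aggregation: only the documents matched by the query.
query_hits = [doc for doc in DOCS if doc[1] == "/api/post/3"]
path_avg = avg(times for times, _ in query_hits)

# Scope of the global aggregation: every document, regardless of the query.
all_avg = avg(times for times, _ in DOCS)

print(path_avg, all_avg)
```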
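Filtering only the aggregation
Sometimes the search hits should not be filtered and only the aggregation should be. How is that done? Here is an example:
Query the requests with method=GET and compute the average elapsed time of the requests with path=/api/post/3:
- filter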
GET req_log/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": { "method": "GET" }
      }
    }
  },
  "aggs": {
    "req_count": {
      "filter": {
        "term": { "path": "/api/post/3" }
      },
      "aggs": {
        "req_path_order": {
          "terms": { "field": "path" },
          "aggs": {
            "avg_times": {
              "avg": { "field": "times" }
            }
          }
        }
      }
    }
  }
}
The response:
{ "took" : 6, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 17, "relation" : "eq" }, "max_score" : 1.0, "hits" : [ { "_index" : "req_log", "_type" : "_doc", "_id" : "GUK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 80, "method" : "GET", "path" : "/api/post/1", "created" : "2023-02-09" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "GkK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 30, "method" : "GET", "path" : "/api/post/2", "created" : "2023-02-07" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "G0K3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 20, "method" : "GET", "path" : "/api/post/3", "created" : "2023-02-08" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "HEK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 120, "method" : "GET", "path" : "/api/post/20", "created" : "2023-02-06" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "HUK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 150, "method" : "GET", "path" : "/api/post/1", "created" : "2023-02-05" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "HkK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 80, "method" : "GET", "path" : "/api/post/3", "created" : "2023-02-04" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "H0K3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 960, "method" : "GET", "path" : "/api/post/6", "created" : "2023-02-03" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "IEK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 9000, "method" : "GET", "path" : "/api/post/8", "created" : "2023-02-02" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "IUK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 1300, "method" : "GET", "path" : "/api/post/6", "created" : "2023-02-01" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "IkK3NIYBdXrpvlCF01bz", "_score" 
: 1.0, "_source" : { "times" : 400, "method" : "GET", "path" : "/api/post/4", "created" : "2023-02-10" } } ] }, "aggregations" : { "req_count" : { "doc_count" : 3, "req_path_order" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 0, "buckets" : [ { "key" : "/api/post/3", "doc_count" : 3, "avg_times" : { "value" : 63.0 } } ] } } } }
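The search hits are not narrowed by the path condition (all 17 documents with method GET are matched), and only the aggregation is restricted to /api/post/3.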
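In plain Python the two scopes look like this. It is a sketch of the semantics only, with invented names: the query decides the hits, and the filter aggregation narrows the documents just for the metrics below it.

```python
# (method, path, times) triples copied from the sample data used in this article.
DOCS = [
    ("GET", "/api/post/1", 80), ("GET", "/api/post/2", 30), ("GET", "/api/post/3", 20),
    ("GET", "/api/post/20", 120), ("GET", "/api/post/1", 150), ("GET", "/api/post/3", 80),
    ("GET", "/api/post/6", 960), ("GET", "/api/post/8", 9000), ("GET", "/api/post/6", 1300),
    ("GET", "/api/post/4", 400), ("GET", "/api/post/3", 89), ("GET", "/api/post/2", 380),
    ("GET", "/api/post/10", 270), ("GET", "/api/post/12", 630), ("GET", "/api/post/4", 210),
    ("GET", "/api/post/6", 900), ("GET", "/api/post/7", 870),
]

# query: decides which documents are hits (and the input of every aggregation).
hits = [doc for doc in DOCS if doc[0] == "GET"]

# filter aggregation: narrows the documents for this aggregation only.
req_count = [doc for doc in hits if doc[1] == "/api/post/3"]
avg_times = sum(times for _, _, times in req_count) / len(req_count)

print(len(hits), len(req_count), avg_times)
```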
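Filtering only the query results
This is the opposite case: only the search hits are filtered, and the aggregation results are not. An example:
Return only the requests with path=/api/post/3 and method=GET, while still counting the requests for every path:
- post_filter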
GET req_log/_search
{
  "aggs": {
    "path_count": {
      "terms": { "field": "path" }
    }
  },
  "post_filter": {
    "bool": {
      "must": [
        { "term": { "path": { "value": "/api/post/3" } } },
        { "term": { "method": { "value": "GET" } } }
      ]
    }
  }
}
Response:
{ "took" : 4, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 3, "relation" : "eq" }, "max_score" : 1.0, "hits" : [ { "_index" : "req_log", "_type" : "_doc", "_id" : "G0K3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 20, "method" : "GET", "path" : "/api/post/3", "created" : "2023-02-08" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "HkK3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 80, "method" : "GET", "path" : "/api/post/3", "created" : "2023-02-04" } }, { "_index" : "req_log", "_type" : "_doc", "_id" : "I0K3NIYBdXrpvlCF01bz", "_score" : 1.0, "_source" : { "times" : 89, "method" : "GET", "path" : "/api/post/3", "created" : "2023-02-11" } } ] }, "aggregations" : { "path_count" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 0, "buckets" : [ { "key" : "/api/post/3", "doc_count" : 3 }, { "key" : "/api/post/6", "doc_count" : 3 }, { "key" : "/api/post/1", "doc_count" : 2 }, { "key" : "/api/post/2", "doc_count" : 2 }, { "key" : "/api/post/4", "doc_count" : 2 }, { "key" : "/api/post/10", "doc_count" : 1 }, { "key" : "/api/post/12", "doc_count" : 1 }, { "key" : "/api/post/20", "doc_count" : 1 }, { "key" : "/api/post/7", "doc_count" : 1 }, { "key" : "/api/post/8", "doc_count" : 1 } ] } } }
The hits contain only /api/post/3, but the aggregation results were not filtered by that condition.
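The order of operations is what matters here: the aggregation is computed on the documents matched by the query (all of them, since this request has no query), and post_filter is applied to the hits afterwards. A plain-Python sketch with invented names, again only mirroring the semantics:

```python
from collections import Counter

# (method, path) pairs copied from the sample data used in this article.
DOCS = [
    ("GET", "/api/post/1"), ("GET", "/api/post/2"), ("GET", "/api/post/3"),
    ("GET", "/api/post/20"), ("GET", "/api/post/1"), ("GET", "/api/post/3"),
    ("GET", "/api/post/6"), ("GET", "/api/post/8"), ("GET", "/api/post/6"),
    ("GET", "/api/post/4"), ("GET", "/api/post/3"), ("GET", "/api/post/2"),
    ("GET", "/api/post/10"), ("GET", "/api/post/12"), ("GET", "/api/post/4"),
    ("GET", "/api/post/6"), ("GET", "/api/post/7"),
]

# Step 1: no query in the request, so every document matches.
matched = list(DOCS)

# Step 2: aggregations run on the matched documents, before post_filter.
path_count = Counter(path for _, path in matched)

# Step 3: post_filter narrows only the hits that are returned.
hits = [doc for doc in matched if doc == ("GET", "/api/post/3")]

print(len(hits), dict(path_count))
```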
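Conclusion
That's all for this installment. The remaining aggregation topics are left for the next one, so there is time to digest this much first.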