前文参考如下:
9.工具使用:Elasticsearch从入门到放弃(1)-Elasticsearch概念篇
10.工具使用:Elasticsearch从入门到放弃(2)-相关性算法
11.工具使用:Elasticsearch从入门到放弃(3)-权重及打分
13.工具使用:Elasticsearch从入门到放弃(4)-项目实战
14.工具使用:Elasticsearch从入门到放弃(5)-聚合概述
15.工具使用:Elasticsearch从入门到放弃(6)-聚合方式排序及分页
16.工具使用:Elasticsearch从入门到放弃(7)-nested对象
参考文档;
Elasticsearch教程(32) ES 聚合查询后过滤 Distinct Group By Having功能
ElasticSearch实现类SQL的sum,count,group by,having功能
ES分组聚合Agg nested_华生cn的博客-CSDN博客_es nested 聚合
elasticsearch - Elasticsearch在inner_hits上聚合 - Thinbug
ElasticSearch---es之Post Filter,聚合后过滤
1.项目背景
这几天产品提出了一个新需求,提供一个民宿搜索的功能,有点类似于携程飞猪这些平台酒店民宿搜索,设计如图:
具体需求
基于当前平台已有民宿商品,提供民宿搜索。
- 默认定位宁波·余姚,可切换定位,底部商品列表需要更新为对应区域下的酒店民宿类商品
- 默认选中今天和明天搜索,最多搜索范围3个月(单个房间需要当前日期往后推三个月(90)之内的价格和库存,最大查询入住和离店在此范围内间隔两个月的房间)
- 提示文案改成“店名/地点/关键字”(店名商品名) 点击显示输入键盘,可输入搜索关键字,输入键盘有下家显示搜索按钮,点击带日期及关键字跳转酒店民宿搜索结果页
- 显示选中日期有房可订的房间。 (搜索时判断每天库存是否大于0)
- 默认按综合排序, 排序条件 选项包括:综合推荐(默认选中)、离我最近、低价优先
- 增加评分/价格筛选
2. 设计
再确认了需求之后,接下来就是分析难点,做具体设计。
首先,一开始的计划是,根据产品的需求将每个商品最近90天的日期价格和库存,以nested对象形式存储起来。这样在做时间价格的时候,可以将内部价格集合也筛选出来。参考:63.工具使用-Elasticsearch从入门到放弃-nested对象 - 掘金 (juejin.cn)
其次,这一块有一个需求是:用户可以选择入住和离开的时间,并可以筛选出这段时间内每天价格在预算范围,且每天都有库存的民宿。 这样这块就涉及到一个问题,由于90天价格列表是以nested对象列表的子集合存储的,那么判断每天的价格和库存的时候,就有可能遍历每天的库存和价格去做比较。如果选择的天数只有一两天,三四天,那么这样比较最多循环比较几次还是可以的。可是产品的需求偏偏最多可以选择到60天的民宿搜索,那么假设选择90天里面的60天,然后基于价格搜索,那么就得比较60次。这样显然特别消耗性能,也是不太合适的。
于是硬着头皮,找产品同学battle,想改一改需求,间隔时间可以调短一些。交涉的结果就是,就算时间间隔调短了,但是不排除后续用户需求可以搜索到一个月以上间隔的民宿,所以还是得想办法实现掉。
所以,只能硬着头皮往下做了。
想来想去,问题无非就是,如果我选择搜索的时间很长,怎么保证价格预算?以及每天都有房源?而且比较过程相对减少比较次数?
3. 方案
想来想去,最终决定方案如下:
1.每个商品最近90天的日期价格和库存,以nested对象形式存储起来 2.在选择时间范围搜索时,为了减少比较次数,对价格日期地区,搜索框条件进行过滤后,再对均价进行二次筛选,保证均价在搜索范围内,并且保证有库存的天数等于搜索的时间天数 (例如:12.1到12.30号这三十天价格范围100-200的民宿;那么30天均价在100-200即可;** 房间状态的逻辑是库存的判断,根据库存大于0的天数和查询时间范围的天数对比,天数相同则认为每天都有库存;综合条件就是均价在搜索范围内,有库存的天数等于搜索天数的民宿)
上述做的好处是:比较无需遍历每天价格和库存,只需要在汇总时做二次筛选。 上述做的弊端是:分页相对麻烦,二次筛选每页数据不固定。
4. 落地
基于上述方案,其实最终落地效果就类似于先按照时间,价格,搜索名称(不为空)搜索;然后基于商品分组;然后在基于分组后,每个民宿平均价格过滤。
如果用sql实现就类似于 select * from goods_index where time>=? and time<=? and price>=? and price<=? and good_name=? group by good_id having(agv(price)>=? and agv(price)<=?);
也就是如何在es搜索中实现 select where 和group by having?
数据结构
nested嵌套存储民俗价格
{ "goods_index" : { "mappings" : { "properties" : { "ancestryCategoryId" : { "type" : "long" }, "createTime" : { "type" : "long" }, "district" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } }, "goodCommentRate" : { "type" : "long" }, "goodsAttribute" : { "properties" : { "netWeight" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } }, "packaging" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } } } }, "goodsTags" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } }, "hotelPrices" : { "type" : "nested", "properties" : { "goodsId" : { "type" : "long" }, "promPrice" : { "type" : "float" }, "sellPrice" : { "type" : "float" }, "specValue" : { "type" : "date" }, "stockQuantity" : { "type" : "long" } } }, "id" : { "type" : "long" }, "imageMainUrl" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } }, "latitude" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } }, "location" : { "type" : "geo_point" }, "longitude" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } }, "name" : { "type" : "text" }, "price" : { "type" : "float" }, "province" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } }, "city" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } }, "salesSum" : { "type" : "long" }, "storeId" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } }, "storeName" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } }, "supplierId" : { "type" : "long" } } } } }
其中在商品索引中维护了nested类的民俗价格列表
"hotelPrices" : { "type" : "nested", "properties" : { "goodsId" : { "type" : "long" }, "promPrice" : { "type" : "float" }, "sellPrice" : { "type" : "float" }, "specValue" : { "type" : "date" }, "stockQuantity" : { "type" : "long" } } }
大概数据格式
{ "_index" : "goods_index", "_type" : "_doc", "_id" : "10027", "_score" : 1.0, "_source" : { "ancestryCategoryId" : 2, "city" : "3301", "cityName" : "杭州市", "createTime" : 1604371603000, "district" : "330122", "districtName" : "桐庐县", "goodsAttribute" : { }, "goodsTags" : "[打卡地, 可长住, 视野开阔, 环境安静]", "hotelPrices" : [ { "goodsId" : 10027, "promPrice" : 0.0, "sellPrice" : 1299.0, "specValue" : "2022-12-21", "stockQuantity" : 1 }, { "goodsId" : 10027, "promPrice" : 0.0, "sellPrice" : 1299.0, "specValue" : "2022-12-22", "stockQuantity" : 1 }, { "goodsId" : 10027, "promPrice" : 0.0, "sellPrice" : 1299.0, "specValue" : "2022-12-23", "stockQuantity" : 1 }, { "goodsId" : 10027, "promPrice" : 0.0, "sellPrice" : 1299.0, "specValue" : "2022-12-24", "stockQuantity" : 1 }, ...90天价格.... { "goodsId" : 10027, "promPrice" : 0.0, "sellPrice" : 1299.0, "specValue" : "2023-03-19", "stockQuantity" : 1 }, { "goodsId" : 10027, "promPrice" : 0.0, "sellPrice" : 1299.0, "specValue" : "2023-03-20", "stockQuantity" : 1 } ], "id" : 10027, "imageMainUrl" : "https://1111.png", "imageUrls" : "https://111.png", "latitude" : "29.684106", "location" : "29.684106,119.677711", "longitude" : "119.677711", "name" : "aaaa大床房", "price" : 1299.0, "province" : "3300", "provinceName" : "浙江省", "storeId" : "1", "storeName" : "测试店铺", "supplierId" : 1, "totalCommentStar" : 5.0, "trait" : "舒适", "videoUrl" : "https://111.mp4", "visitCount" : 94 } }
实现group by having效果
第一次尝试-不对
select * from goods_index where ancestryCategoryId=2 and hotelPrices.sellPrice>=200 and hotelPrices.sellPrice<=1000 and hotelPrices.stockQuantity>0 group by hotelPrices.goodsId
GET goods_index/_search { "from": 0, "size": 10, "query": { "bool": { "must": [ { "term": { "ancestryCategoryId": { "value": "2" } } }, { "nested": { "path": "hotelPrices", "ignore_unmapped": true, "score_mode": "none", "boost": 1, "inner_hits": { "ignore_unmapped": true, "from": 0, "size": 3, "version": false, "seq_no_primary_term": false, "explain": false, "track_scores": false }, "query": { "bool": { "must": [ { "range": { "hotelPrices.sellPrice": { "gte": 200, "lte": 1000 } } }, { "range": { "hotelPrices.stockQuantity": { "gt":0 } } } ] } } } } ] } }, "aggregations": { "salesNested": { "nested": { "path": "hotelPrices" }, "aggregations": { "group_by": { "terms": { "field": "hotelPrices.goodsId", "size": 10, "order": { "_key": "asc" } } } } } } }
查看结果,好像不太对,因为聚合计算的不是过滤之后的嵌套文档,即不是inner_hits,因为nested类型会将筛选之后的结果单独存在inner_hits,可以看到每个商品都是90天的价格列表,并不是按照时间价格过滤的数据,不合适。结果大致如下:
{ "took" : 4, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 11, "relation" : "eq" }, "max_score" : 1.0, "hits" : [ { "_index" : "goods_test2", "_type" : "_doc", "_id" : "12278", "_score" : 1.0, "_source" : { "ancestryCategoryId" : 2, "city" : "3301", "cityName" : "杭州市", "createTime" : 1607136062000, "district" : "330122", "districtName" : "桐庐县", "goodCommentRate" : 0, "goodsAttribute" : { }, "goodsTags" : "[免费云停车, 可长住]", "hotelPrices" : [ { "goodsId" : 12278, "promPrice" : 0.0, "sellPrice" : 258.0, "specValue" : "2022-12-07", "stockQuantity" : 10 }, //....90天价格... { "goodsId" : 12278, "promPrice" : 0.0, "sellPrice" : 258.0, "specValue" : "2023-03-06", "stockQuantity" : 10 } ], "id" : 12278, "imageMainUrl" : "https://wsnbh-img.hzanchu.com/acimg/93673e34353664f696e1dfadb9bceffc.jpeg", "imageUrls" : "https://wsnbh-img.hzanchu.com/acimg/93673e34353664f696e1dfadb9bceffc.jpeg", "isDistribute" : 0, "name" : "芦茨白云源 双源民宿双床房 落地窗", "onOff" : 1, "originImageUrls" : "https://wsnbh-img.hzanchu.com/acimg/93673e34353664f696e1dfadb9bceffc.jpeg,https://wsnbh-img.hzanchu.com/acimg/52fbc6815688984a6bed22df3373227f.jpeg,https://wsnbh-img.hzanchu.com/acimg/ce2024cc2f221c3f86f162bd970f71ed.jpeg,", "price" : 258.0, "storeId" : "1765", "supplierId" : 2242, "totalCommentStar" : 0.0, "trait" : "老板娘独自经营 舒适温馨", "videoUrl" : "", "visitCount" : 26 }, "inner_hits" : { "hotelPrices" : { "hits" : { "total" : { "value" : 90, "relation" : "eq" }, "max_score" : 2.0, "hits" : [ { "_index" : "goods_test2", "_type" : "_doc", "_id" : "12278", "_nested" : { "field" : "hotelPrices", "offset" : 0 }, "_score" : 2.0, "_source" : { "goodsId" : 12278, "specValue" : "2022-12-07", "stockQuantity" : 10, "sellPrice" : 258.0, "promPrice" : 0.0 } }, { "_index" : "goods_test2", "_type" : "_doc", "_id" : "12278", "_nested" : { "field" : "hotelPrices", "offset" : 1 }, "_score" : 2.0, "_source" : { "goodsId" : 12278, "specValue" : "2022-12-08", "stockQuantity" : 10, "sellPrice" : 258.0, "promPrice" : 0.0 } }, { "_index" : "goods_test2", "_type" : "_doc", "_id" : "12278", "_nested" : { "field" : "hotelPrices", "offset" : 2 }, "_score" : 2.0, "_source" : { "goodsId" : 12278, "specValue" : "2022-12-09", "stockQuantity" : 10, "sellPrice" : 258.0, "promPrice" : 0.0 } } ] } } } } ] }, "aggregations" : { "salesNested" : { "doc_count" : 990, "group_by" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 90, "buckets" : [ { "key" : 12278, "doc_count" : 90 }, { "key" : 12304, "doc_count" : 90 }, { "key" : 13759, "doc_count" : 90 }, { "key" : 13850, "doc_count" : 90 }, { "key" : 13856, "doc_count" : 90 }, { "key" : 17330, "doc_count" : 90 }, { "key" : 17331, "doc_count" : 90 }, { "key" : 17332, "doc_count" : 90 }, { "key" : 17333, "doc_count" : 90 }, { "key" : 17334, "doc_count" : 90 } ] } } } }
第二次尝试-在筛选中加入前置过滤
因为第一次尝试中聚合计算的不是 过滤之后的嵌套文档,即不是inner_hits,所以在聚合时也对nested里面的内容做了层层的过滤) select * from goods_index where ancestryCategoryId=2 and hotelPrices.sellPrice>=200 and hotelPrices.sellPrice<=1000 and hotelPrices.stockQuantity>0 group by hotelPrices.goodsId (但是聚合过滤后每个商品符合条件的天数还需要再过滤,目前没做having(count>天数)
这块涉及到一个概念 前置过滤器:有时需要对聚合条件进一步地过滤,但是又不能影响当前的查询条件,可以加入前置过滤器
后置过滤器: 在有些场景中,需要根据条件进行数据查询,但是聚合的结果集不受影响,可以加入后置过滤器
GET goods_test2/_search { "from": 0, "size": 1, "query": { "bool": { "must": [ { "term": { "ancestryCategoryId": { "value": "2" } } }, { "nested": { "path": "hotelPrices", "ignore_unmapped": true, "score_mode": "none", "boost": 1, "inner_hits": { "ignore_unmapped": true, "from": 0, "size": 90, "version": false, "seq_no_primary_term": false, "explain": false, "track_scores": false }, "query": { "bool": { "must": [ { "range": { "hotelPrices.sellPrice": { "gte": 200, "lte": 1000 } } }, { "range": { "hotelPrices.stockQuantity": { "gt": 0 } } } ] } } } } ] } }, "aggs": { "filtered_nested": { "nested": { "path": "hotelPrices" }, "aggs": { "where": { "range": { "field": "hotelPrices.sellPrice", "ranges": [ { "from": 200, "to": 1000 } ] }, "aggs": { "and_where": { "range": { "field": "hotelPrices.stockQuantity", "ranges": [ { "from": 0.1 } ] }, "aggs": { "group_by": { "terms": { "field": "hotelPrices.goodsId", "size": 10, "order": { "_key": "asc" } } } } } } } } } } }
结果如下:
{ "took" : 31, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 11, "relation" : "eq" }, "max_score" : 1.0, "hits" : [ { "_index" : "goods_test2", "_type" : "_doc", "_id" : "12278", "_score" : 1.0, "_source" : { "ancestryCategoryId" : 2, "atmosphereImage" : "https://wsnbh-img.hzanchu.com/test/168650774865616896.png", "backCatId" : 357, "city" : "3301", "cityName" : "杭州市", "createTime" : 1607136062000, "district" : "330122", "districtName" : "桐庐县", "frontendIds" : "_,2_157", "goodCommentRate" : 0, "goodsAttribute" : { }, "goodsTags" : "[免费云停车, 可长住]", "hotelPrices" : [ { "goodsId" : 12278, "promPrice" : 0.0, "sellPrice" : 258.0, "specValue" : "2022-12-07", "stockQuantity" : 10 }, { "goodsId" : 12278, "promPrice" : 0.0, "sellPrice" : 258.0, "specValue" : "2022-12-08", "stockQuantity" : 10 }, { "goodsId" : 12278, "promPrice" : 0.0, "sellPrice" : 258.0, "specValue" : "2022-12-09", "stockQuantity" : 10 }, //...90天民宿.... { "goodsId" : 12278, "promPrice" : 0.0, "sellPrice" : 258.0, "specValue" : "2023-03-06", "stockQuantity" : 10 } ], "id" : 12278, "imageMainUrl" : "https://wsnbh-img.hzanchu.com/acimg/93673e34353664f696e1dfadb9bceffc.jpeg", "imageUrls" : "https://wsnbh-img.hzanchu.com/acimg/93673e34353664f696e1dfadb9bceffc.jpeg", "isDistribute" : 0, "name" : "芦茨白云源 双源民宿双床房 落地窗", "onOff" : 1, "originImageUrls" : "https://wsnbh-img.hzanchu.com/acimg/93673e34353664f696e1dfadb9bceffc.jpeg,https://wsnbh-img.hzanchu.com/acimg/52fbc6815688984a6bed22df3373227f.jpeg,https://wsnbh-img.hzanchu.com/acimg/ce2024cc2f221c3f86f162bd970f71ed.jpeg,", "price" : 258.0, "projectCode" : 3300, "promType" : 0, "province" : "3300", "provinceName" : "浙江省", "salesSum" : 1, "salesSumTrue" : 1, "sellerCity" : 1, "sellerDistinct" : 17, "sellerProvince" : 43282904292360190, "siteName" : "杭州馆", "storeId" : "1765", "supplierId" : 2242, "totalCommentStar" : 0.0, "trait" : "老板娘独自经营 舒适温馨", "videoUrl" : "", "visitCount" : 26 }, "inner_hits" : { "hotelPrices" : { "hits" : { "total" : { "value" : 90, "relation" : "eq" }, "max_score" : 2.0, "hits" : [ { "_index" : "goods_test2", "_type" : "_doc", "_id" : "12278", "_nested" : { "field" : "hotelPrices", "offset" : 0 }, "_score" : 2.0, "_source" : { "goodsId" : 12278, "specValue" : "2022-12-07", "stockQuantity" : 10, "sellPrice" : 258.0, "promPrice" : 0.0 } }, //....90天民宿... { "_index" : "goods_test2", "_type" : "_doc", "_id" : "12278", "_nested" : { "field" : "hotelPrices", "offset" : 89 }, "_score" : 2.0, "_source" : { "goodsId" : 12278, "specValue" : "2023-03-06", "stockQuantity" : 10, "sellPrice" : 258.0, "promPrice" : 0.0 } } ] } } } } ] }, "aggregations" : { "filtered_nested" : { "doc_count" : 990, "where" : { "buckets" : [ { "key" : "200.0-1000.0", "from" : 200.0, "to" : 1000.0, "doc_count" : 904, "and_where" : { "buckets" : [ { "key" : "0.1-*", "from" : 0.1, "doc_count" : 903, "group_by" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 4, "buckets" : [ { "key" : 12278, "doc_count" : 90 }, { "key" : 12304, "doc_count" : 90 }, { "key" : 13759, "doc_count" : 89 }, { "key" : 13850, "doc_count" : 90 }, { "key" : 13856, "doc_count" : 90 }, { "key" : 17330, "doc_count" : 90 }, { "key" : 17331, "doc_count" : 90 }, { "key" : 17332, "doc_count" : 90 }, { "key" : 17333, "doc_count" : 90 }, { "key" : 17334, "doc_count" : 90 } ] } } ] } } ] } } } }
可以看到 13759 这个商品符合条件的只有89天符合条件
第三次尝试-group by having
select * from goods_index where ancestryCategoryId=2 and hotelPrices.specValue>? and hotelPrices.specValue? and avg(hotelPrices.sellPrice)
聚合过滤后对每个商品库存>0的天数还需要再过滤做having(count>天数);以及计算均价
GET goods_test2/_search { "from": 0, "size": 10, "query": { "bool": { "must": [ { "term": { "ancestryCategoryId": { "value": "2" } } }, { "nested": { "path": "hotelPrices", "ignore_unmapped": true, "score_mode": "none", "boost": 1, "inner_hits": { "ignore_unmapped": true, "from": 0, "size": 90, "version": false, "seq_no_primary_term": false, "explain": false, "track_scores": false }, "query": { "bool": { "must": [ { "range": { "hotelPrices.sellPrice": { "gte": 200, "lte": 1000 } } }, { "range": { "hotelPrices.stockQuantity": { "gt": 0 } } }, { "range": { "hotelPrices.specValue": { "gte": "2022-12-01", "lte": "2022-12-30" } } } ] } } } } ] } }, "aggs": { "filtered_nested": { "nested": { "path": "hotelPrices" }, "aggs": { "where": { "filter": { "bool": { "filter": [ { "range": { "hotelPrices.specValue": { "gte": "2022-12-08", "lte": "2022-12-30" } } }, { "range": { "hotelPrices.stockQuantity": { "gt": 0 } } } ] } }, "aggs": { "group_by": { "terms": { "field": "hotelPrices.goodsId", "size": 90, "order": { "_key": "asc" } }, "aggs": { "avg_price": { "avg": { "field": "hotelPrices.sellPrice" } }, "aggs": { //"having": { //基于筛选后的内容算平均值,且平均值大于600小于900 "bucket_selector": { "buckets_path": { "avgprice": "avg_price", "counts": "_count" }, "script": "params.avgprice>600 && params.avgprice<900 &¶ms.counts==22" } // } } } } } } } } } }
注意对于查询之后的均值做二次筛选,需要用到bucket_selector,avg_price和_count均为上述聚合统计算的结果,aggregations从外到里面会分别存各筛选条件的统计维度
最终尝试,带入排序等规则
2022-12-22入住,2022-12-23号离开,均价300-800的民宿,综合排序
GET goods_index/_search { "from": 0, "size": 10, "timeout": "5s", "query": { "function_score": { "query": { "bool": { "must": [ { "terms": { "ancestryCategoryId": [ "2" ], "boost": 1 } }, { "term": { "projectCode": { "value": 3300, "boost": 1 } } }, { "terms": { "city": [ "3301" ], "boost": 1 } }, { "range": { "totalCommentStar": { "from": 0, "to": null, "include_lower": true, "include_upper": true, "boost": 1 } } }, { "nested": { "query": { "bool": { "must": [ { "range": { "hotelPrices.specValue": { "from": "2022-12-21T16:00:00.000Z", "to": "2022-12-22T16:00:00.000Z", "include_lower": true, "include_upper": true, "boost": 1 } } }, { "range": { "hotelPrices.stockQuantity": { "from": 0, "to": null, "include_lower": false, "include_upper": true, "boost": 1 } } } ], "adjust_pure_negative": true, "boost": 1 } }, "path": "hotelPrices", "ignore_unmapped": true, "score_mode": "none", "boost": 1, "inner_hits": { "ignore_unmapped": true, "from": 0, "size": 90, "version": false, "seq_no_primary_term": false, "explain": false, "track_scores": false } } }, { "match_all": { "boost": 1 } } ], "adjust_pure_negative": true, "boost": 1 } } } }, "_source": { "includes": [ "supplierId", "storeId", "goodsTags", "price", "salesSum", "imageUrls", "name", "trait", "id", "ancestryCategoryId", "province", "city", "district" "imageMainUrl", "goodsAttributes", "hotelPrices", "storeName", "latitude", "longitude" ], "excludes": [ "backCatId" ] }, "sort": [ { "salesSum": { "order": "asc" } }, { "goodCommentRate": { "order": "desc" } } ], "aggregations": { "filtered_nested": { "nested": { "path": "hotelPrices" }, "aggregations": { "where": { "filter": { "bool": { "filter": [ { "range": { "hotelPrices.specValue": { "from": "2022-12-21T16:00:00.000Z", "to": "2022-12-22T16:00:00.000Z", "include_lower": true, "include_upper": true, "boost": 1 } } }, { "range": { "hotelPrices.stockQuantity": { "from": 0, "to": null, "include_lower": false, "include_upper": true, "boost": 1 } } } ], "adjust_pure_negative": true, "boost": 1 } }, "aggregations": { "group_by": { "terms": { "field": "hotelPrices.goodsId", "size": 100, "min_doc_count": 1, "shard_min_doc_count": 0, "show_term_doc_count_error": false, "order": [ { "_key": "asc" }, { "_key": "asc" } ] }, "aggregations": { "avg_price": { "avg": { "field": "hotelPrices.sellPrice" } }, "having": { "bucket_selector": { "buckets_path": { "counts": "_count", "avgprice": "avg_price" }, "script": { "source": "params.avgprice>=300 && params.avgprice<=800&¶ms.counts==1", "lang": "painless" }, "gap_policy": "skip" } } } } } } } } } }
结果如下:
{ "took" : 15, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 1192, "relation" : "eq" }, "max_score" : null, "hits" : [ { "_index" : "goods_index", "_type" : "_doc", "_id" : "41378", "_score" : null, "_source" : { "supplierId" : 103149696022265856, "city" : "3301", "goodsAttribute" : { }, "latitude" : "30.409022", "siteName" : "杭州馆", "goodsTags" : "[老友相聚, 有KTV, 卫生优秀, 环境安静]", "cityName" : "杭州市", "province" : "3300", "projectCode" : 3300, "price" : 580, "trait" : "阳光房 棋牌室 K歌 烧烤 投影仪 花园", "storeName" : "xxx民宿", "ancestryCategoryId" : 2, "id" : 41378, "longitude" : "119.805612", "promType" : 0, "districtName" : "余杭区", "hotelPrices" : [ { "goodsId" : 41378, "specValue" : "2022-12-21", "stockQuantity" : 1, "sellPrice" : 580.0, "promPrice" : 0.0 }, //90天价格 { "goodsId" : 41378, "specValue" : "2023-03-20", "stockQuantity" : 1, "sellPrice" : 580.0, "promPrice" : 0.0 } ], "imageMainUrl" : "https://111.jpeg", "salesSum" : 0, "storeId" : "103887395157442560", "totalCommentStar" : 5.0, "district" : "330110", "imageUrls" : "https://111.jpeg", "name" : "xxx双床房", "provinceName" : "浙江省" }, "sort" : [ 0, 1 ], "inner_hits" : { "hotelPrices" : { "hits" : { "total" : { "value" : 1, "relation" : "eq" }, "max_score" : 2.0, "hits" : [ { "_index" : "goods_pre", "_type" : "_doc", "_id" : "41378", "_nested" : { "field" : "hotelPrices", "offset" : 1 }, "_score" : 2.0, "_source" : { "goodsId" : 41378, "specValue" : "2022-12-22", "stockQuantity" : 1, "sellPrice" : 580.0, "promPrice" : 0.0 } } ] } } } }] } } } } ] }, "aggregations" : { "filtered_nested" : { "doc_count" : 107280, "where" : { "doc_count" : 1192, "group_by" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 1092, "buckets" : [ { "key" : 12262, "doc_count" : 1, "avg_price" : { "value" : 320.0 } }, //.......... { "key" : 19282, "doc_count" : 1, "avg_price" : { "value" : 780.0 } }, { "key" : 19283, "doc_count" : 1, "avg_price" : { "value" : 780.0 } } ] } } } } }
结果来看,aggregations从外到里面会分别存各筛选条件的统计维度;先嵌套filtered_nested前置过滤,再group_by分组,avg_price计算均值;bucket_selector基于均值和库存天数二次筛选。然后取bucket_selector结果集作为搜索结果列表
部分代码逻辑
// 设置查询条件 buildHotelSearchQuery(searchHotelRequest, sourceBuilder, boolQueryBuilder); // 设置排序规则 buildHotelSearchSort(searchHotelRequest, sourceBuilder); // 设置聚合 buildHotelSearchAggregation(searchHotelRequest,sourceBuilder); // 设置结果 buildHotelSearchResponse(searchHotelRequest,response);
1. 设置查询条件
private void buildHotelSearchQuery(SearchHotelRequest searchHotelRequest, SearchSourceBuilder sourceBuilder, BoolQueryBuilder boolQueryBuilder) { //顶级类目 if (searchHotelRequest.getAncesCategory() != null && searchHotelRequest.getAncesCategory() > 0) { Long ancesCateGory = searchHotelRequest.getAncesCategory(); TermsQueryBuilder supplierIdQueryBuilder = QueryBuilders.termsQuery("ancestryCategoryId", ancesCateGory.toString()); boolQueryBuilder.must(supplierIdQueryBuilder); } //省级行政码 if (searchHotelRequest.getProvince() != null && searchHotelRequest.getProvince() != 0L) { Long province = searchHotelRequest.getProvince(); TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("projectCode", province); boolQueryBuilder.must(termQueryBuilder); } //市行政码 if (StringUtils.isNotBlank(searchHotelRequest.getCityCode())) { TermsQueryBuilder cityCodeQueryBuilder = QueryBuilders.termsQuery("city", searchHotelRequest.getCityCode()); boolQueryBuilder.must(cityCodeQueryBuilder); } //县行政码 if (StringUtils.isNotBlank(searchHotelRequest.getDistrictCode())) { TermsQueryBuilder districtCodeQueryBuilder = QueryBuilders.termsQuery("district", searchHotelRequest.getDistrictCode()); boolQueryBuilder.must(districtCodeQueryBuilder); } //评分 if (ObjectUtil.isNotEmpty(searchHotelRequest.getTotalCommentStarRange())) { RangeQueryBuilder rangeQueryBuilder=QueryBuilders.rangeQuery("totalCommentStar"); //4.8分以上; 2:4.5分以上; 3:4分以上; 4:3.5分以上 switch (searchHotelRequest.getTotalCommentStarRange()){ case 1: rangeQueryBuilder.from(4.8); break; case 2: rangeQueryBuilder.from(4.5); break; case 3: rangeQueryBuilder.from(4); break; case 4: rangeQueryBuilder.from(3.5); break; default: rangeQueryBuilder.from(0); break; } boolQueryBuilder.must(rangeQueryBuilder); } //距离 if(ObjectUtil.isEmpty(searchHotelRequest.getDistance())) { searchHotelRequest.setDistance(50000L);//默认50000米 } if (ObjectUtil.isNotEmpty(searchHotelRequest.getLatitude()) && ObjectUtil.isNotEmpty(searchHotelRequest.getLongitude())) { double latitude = searchHotelRequest.getLatitude(); double longitude = searchHotelRequest.getLongitude(); GeoDistanceQueryBuilder geoDistanceQueryBuilder = QueryBuilders.geoDistanceQuery("location").point(latitude, longitude) .distance(searchHotelRequest.getDistance(), DistanceUnit.METERS); boolQueryBuilder.filter(geoDistanceQueryBuilder); } //嵌套文档条件:起止时间,价格,库存 BoolQueryBuilder boolQueryBuilder2 = QueryBuilders.boolQuery(); //入住起止时间 if (ObjectUtil.isNotEmpty(searchHotelRequest.getStartTime())&&ObjectUtil.isNotEmpty(searchHotelRequest.getEndTime())) { RangeQueryBuilder rangeQueryBuilder=QueryBuilders.rangeQuery("hotelPrices.specValue"); rangeQueryBuilder.from(searchHotelRequest.getStartTime()); rangeQueryBuilder.to(searchHotelRequest.getEndTime()); boolQueryBuilder2.must(rangeQueryBuilder); } //价格区间 if (ObjectUtil.isNotEmpty(searchHotelRequest.getPriceRange())) { RangeQueryBuilder rangeQueryBuilder=QueryBuilders.rangeQuery("hotelPrices.sellPrice"); //1:0-300; 2:300-800; 3:800-1500; 4:1500以上 switch (searchHotelRequest.getPriceRange()){ case 1: rangeQueryBuilder.from(0); rangeQueryBuilder.to(300); break; case 2: rangeQueryBuilder.from(300); rangeQueryBuilder.to(800); break; case 3: rangeQueryBuilder.from(800); rangeQueryBuilder.to(1500); break; case 4: rangeQueryBuilder.from(1500); break; default: rangeQueryBuilder.from(0); break; } boolQueryBuilder2.must(rangeQueryBuilder); } //库存大于0 RangeQueryBuilder rangeQueryBuilder=QueryBuilders.rangeQuery("hotelPrices.stockQuantity"); rangeQueryBuilder.gt(0); boolQueryBuilder2.must(rangeQueryBuilder); NestedQueryBuilder nestedQueryBuilder = QueryBuilders.nestedQuery("hotelPrices", boolQueryBuilder2, ScoreMode.None); InnerHitBuilder innerHitBuilder = new InnerHitBuilder(); innerHitBuilder.setIgnoreUnmapped(true).setSize(90); nestedQueryBuilder.ignoreUnmapped(true).innerHit(innerHitBuilder); boolQueryBuilder.must(nestedQueryBuilder); if (ObjectUtil.isNotEmpty(searchHotelRequest.getGoodsTags())) { MatchPhraseQueryBuilder matchPhraseQuery = QueryBuilders.matchPhraseQuery("goodsTags", searchHotelRequest.getGoodsTags()); boolQueryBuilder.must(matchPhraseQuery); } //搜索条件:商品名,店铺名 Map<String, Float> fields = new HashMap<>(); fields.put("name", 3f); fields.put("storeName", 2f); fields.put("trait", 1f); MultiMatchQueryBuilder multiMatchQueryBuilder = new MultiMatchQueryBuilder(searchHotelRequest.getSearchKey(), "name", "trait", "storeName").fields(fields); boolQueryBuilder.must(multiMatchQueryBuilder); }
2. 设置排序规则
private void buildHotelSearchSort(SearchHotelRequest searchHotelRequest, SearchSourceBuilder sourceBuilder) { GeoDistanceSortBuilder geoDistanceSortBuilder = null; if (ObjectUtil.isNotEmpty(searchHotelRequest.getLatitude()) && ObjectUtil.isNotEmpty(searchHotelRequest.getLongitude())) { double latitude = searchHotelRequest.getLatitude(); double longitude = searchHotelRequest.getLongitude(); geoDistanceSortBuilder = SortBuilders.geoDistanceSort("location", latitude, longitude) .point(latitude, longitude).unit(DistanceUnit.METERS); } if ("priceDESC".equals(searchHotelRequest.getSortType())) { sourceBuilder.sort("price", SortOrder.DESC); sourceBuilder.sort("goodCommentRate", SortOrder.DESC); // 第二排序规则 } else if ("priceASC".equals(searchHotelRequest.getSortType())) { sourceBuilder.sort("price", SortOrder.ASC); sourceBuilder.sort("goodCommentRate", SortOrder.DESC); // 第二排序规则 } else if ("soldDESC".equals(searchHotelRequest.getSortType())) { sourceBuilder.sort("salesSum", SortOrder.DESC); sourceBuilder.sort("goodCommentRate", SortOrder.DESC); // 第二排序规则 } else if ("soldASC".equals(searchHotelRequest.getSortType())) { sourceBuilder.sort("salesSum", SortOrder.ASC); sourceBuilder.sort("goodCommentRate", SortOrder.DESC); // 第二排序规则 } else if ("distinctASC".equals(searchHotelRequest.getSortType())) { if (ObjectUtil.isNotEmpty(geoDistanceSortBuilder)) { sourceBuilder.sort(geoDistanceSortBuilder.order(SortOrder.ASC)); } sourceBuilder.sort("goodCommentRate", SortOrder.DESC); // 第二排序规则 } else if (" distinctDESC".equals(searchHotelRequest.getSortType())) { if (ObjectUtil.isNotEmpty(geoDistanceSortBuilder)) { sourceBuilder.sort(geoDistanceSortBuilder.order(SortOrder.DESC)); } sourceBuilder.sort("goodCommentRate", SortOrder.DESC); // 第二排序规则 } }
3. 设置聚合 group by having
private void buildHotelSearchAggregation(SearchHotelRequest searchHotelRequest,SearchSourceBuilder sourceBuilder) { List<AggregationBuilder> aggregationBuilders=new ArrayList<>(); String nestedAggregationName = "filtered_nested"; String whereAggregationName = "where"; String groupByAggregationName = "group_by"; String avgPriceAggregationName = "avg_price"; String havingBucketSelectorName = "having"; String countName = "_count"; //filtered_nested NestedAggregationBuilder nestedAggregation = AggregationBuilders.nested(nestedAggregationName,"hotelPrices"); //where BoolQueryBuilder boolQueryBuilder =QueryBuilders.boolQuery(); boolQueryBuilder.filter(); //入住起止时间 Long days= null; if (ObjectUtil.isNotEmpty(searchHotelRequest.getStartTime())&&ObjectUtil.isNotEmpty(searchHotelRequest.getEndTime())) { RangeQueryBuilder rangeQueryBuilder=QueryBuilders.rangeQuery("hotelPrices.specValue"); rangeQueryBuilder.from(searchHotelRequest.getStartTime()); rangeQueryBuilder.to(searchHotelRequest.getEndTime()); boolQueryBuilder.filter(rangeQueryBuilder); days = DateUtil.betweenDay(searchHotelRequest.getStartTime(),searchHotelRequest.getEndTime(),true); } //库存大于0 RangeQueryBuilder rangeQueryBuilder=QueryBuilders.rangeQuery("hotelPrices.stockQuantity"); rangeQueryBuilder.gt(0); boolQueryBuilder.filter(rangeQueryBuilder); FilterAggregationBuilder whereAggregation = AggregationBuilders.filter(whereAggregationName,boolQueryBuilder); //group_by true asc;false desc BucketOrder groupByAggregationOrder = BucketOrder.aggregation("_key",true); TermsAggregationBuilder groupByAggregation = AggregationBuilders.terms(groupByAggregationName). field("hotelPrices.goodsId"). size(searchHotelRequest.getPageSize()).order(groupByAggregationOrder); //avg_price AvgAggregationBuilder avgPriceAggregation = AggregationBuilders.avg(avgPriceAggregationName) .field("hotelPrices.sellPrice"); //having avg_price bucket_selector Map<String, String> bucketsPathsMap = new HashMap<>(2); bucketsPathsMap.put("avgprice", avgPriceAggregationName); bucketsPathsMap.put("counts", countName); //计算起始天数差 &¶ms.counts==22 String scriptCmd = "params.avgprice>=0"; //平均价格在价格区间 及天数差 if (ObjectUtil.isNotEmpty(searchHotelRequest.getPriceRange())) { //1:0-300; 2:300-800; 3:800-1500; 4:1500以上 switch (searchHotelRequest.getPriceRange()){ case 1: scriptCmd ="params.avgprice>=0 && params.avgprice<=300"; break; case 2: scriptCmd ="params.avgprice>=300 && params.avgprice<=800"; break; case 3: scriptCmd ="params.avgprice>=800 && params.avgprice<=1500"; break; case 4: scriptCmd ="params.avgprice>=1500"; break; default: break; } } if(ObjectUtil.isNotEmpty(days)) { scriptCmd+="&¶ms.counts==" + days; } Script script = new Script(scriptCmd); BucketSelectorPipelineAggregationBuilder havingBucketSelector = PipelineAggregatorBuilders.bucketSelector(havingBucketSelectorName, bucketsPathsMap, script); //avgPriceAggregation嵌套在groupByAggregation中 groupByAggregation .subAggregation(avgPriceAggregation) .subAggregation(havingBucketSelector) //注意这里的havingBucketSelector是紧跟在avgPriceAggregation的后面,非嵌套 //.order() .size(searchHotelRequest.getPageSize()); //groupByAggregation嵌套在whereAggregation中 whereAggregation.subAggregation(groupByAggregation); //whereAggregation嵌套在nestedAggregation中 nestedAggregation.subAggregation(whereAggregation); sourceBuilder.aggregation(nestedAggregation); }
4. 设置结果集,反过来解析第三步的聚合桶
private SearchGoodsResponse buildHotelSearchResponse(SearchHotelRequest searchHotelRequest,SearchResponse searchResponse) { String nestedAggregationName = "filtered_nested"; String whereAggregationName = "where"; String groupByAggregationName = "group_by"; String avgPriceAggregationName = "avg_price"; String havingBucketSelectorName = "having"; String countName = "_count"; SearchGoodsResponse searchGoodsResponse = new SearchGoodsResponse(); if (ObjectUtil.isEmpty(searchResponse)) { return null; } Aggregations aggregations = searchResponse.getAggregations(); ParsedNested nestedAggregation = aggregations.get(nestedAggregationName); ParsedFilter whereAggregation = nestedAggregation.getAggregations().get(whereAggregationName); Terms groupByAggregation=whereAggregation.getAggregations().get(groupByAggregationName); List<? extends Terms.Bucket> avgBucketList = groupByAggregation.getBuckets();groupByAggregation.getBuckets(); //获取符合条件的商品id列表 List <String> goodsIds =new ArrayList<>(); avgBucketList.forEach(v -> { if (ObjectUtil.isNotEmpty(v.getKey())) { goodsIds.add(v.getKey().toString()); } }); Map res = new HashMap(); SearchHits searchHits = searchResponse.getHits(); ArrayList<Map> searchResult = new ArrayList<>(); if (searchHits != null) { for (SearchHit hit : searchHits) { String id= hit.getId(); if(ObjectUtil.isNotEmpty(searchHotelRequest.getStartTime()) &&ObjectUtil.isNotEmpty(searchHotelRequest.getEndTime())) { if (goodsIds.contains(id)) { //计算活动价 //距离优先,距离指标在排序的第一位;商品评分优先,距离指标在排序的第二位 //@TODO 距离指标在排序的第一位 if ((ObjectUtil.isNotEmpty(searchHotelRequest.getLatitude()) && ObjectUtil.isNotEmpty(searchHotelRequest.getLongitude())) && ("distinctASC".equals(searchHotelRequest.getSortType()) || "distinctDESC".equals(searchHotelRequest.getSortType()))) { result.put("distance", new BigDecimal(String.valueOf(hit.getSortValues()[0])).setScale(6, RoundingMode.HALF_UP).doubleValue()); } else { result.put("distance", Double.valueOf(0.00)); } result.put("hotelPrices", null); Object goodsTags = hit.getSourceAsMap().get("goodsTags"); if (ObjectUtil.isNotEmpty(goodsTags)) { result.put("goodsTags", goodsTags.toString().replace("[", "").replace("]", "")); } searchResult.add(result); } } //起止时间首页不传需要取所有结果,不做筛选 if(ObjectUtil.isEmpty(searchHotelRequest.getStartTime()) &&ObjectUtil.isEmpty(searchHotelRequest.getEndTime())){ //计算活动价 //距离优先,距离指标在排序的第一位;商品评分优先,距离指标在排序的第二位 //@TODO 距离指标在排序的第一位 if ((ObjectUtil.isNotEmpty(searchHotelRequest.getLatitude()) && ObjectUtil.isNotEmpty(searchHotelRequest.getLongitude())) && ("distinctASC".equals(searchHotelRequest.getSortType()) || "distinctDESC".equals(searchHotelRequest.getSortType()))) { result.put("distance", new BigDecimal(String.valueOf(hit.getSortValues()[0])).setScale(6, RoundingMode.HALF_UP).doubleValue()); } else { result.put("distance", Double.valueOf(0.00)); } result.put("hotelPrices", null); Object goodsTags = hit.getSourceAsMap().get("goodsTags"); if (ObjectUtil.isNotEmpty(goodsTags)) { result.put("goodsTags", goodsTags.toString().replace("[", "").replace("]", "")); } searchResult.add(result); } } } res.put("searchResult", searchResult); searchGoodsResponse.setData(res); return searchGoodsResponse; }