Overview
Continuing my study of Elasticsearch (ES) with instructor 中华石杉; this is part 8.
Course link: https://www.roncoo.com/view/55
boost
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-boost.html
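Key points:
Setting boost to 2 on a field means that a match on that field carries twice the weight of a match on a field left at the default. The default boost is 1.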
The boost is applied only for term queries (prefix, range and fuzzy queries are not boosted).
Example
The data is as follows:
{
  "_index": "forum", "_type": "article", "_id": "5", "_score": 1,
  "_source": {
    "articleID": "DHJK-B-1395-#Ky5", "userID": 3, "hidden": false,
    "postDate": "2019-05-01", "tag": [ "elasticsearch" ], "tag_cnt": 1,
    "view_cnt": 10, "title": "this is spark blog"
  }
},
{
  "_index": "forum", "_type": "article", "_id": "2", "_score": 1,
  "_source": {
    "articleID": "KDKE-B-9947-#kL5", "userID": 1, "hidden": false,
    "postDate": "2017-01-02", "tag": [ "java" ], "tag_cnt": 1,
    "view_cnt": 50, "title": "this is java blog"
  }
},
{
  "_index": "forum", "_type": "article", "_id": "4", "_score": 1,
  "_source": {
    "articleID": "QQPX-R-3956-#aD8", "userID": 2, "hidden": true,
    "postDate": "2017-01-02", "tag": [ "java", "elasticsearch" ], "tag_cnt": 2,
    "view_cnt": 80, "title": "this is java, elasticsearch, hadoop blog"
  }
},
{
  "_index": "forum", "_type": "article", "_id": "1", "_score": 1,
  "_source": {
    "articleID": "XHDK-A-1293-#fJ3", "userID": 1, "hidden": false,
    "postDate": "2017-01-01", "tag": [ "java", "hadoop" ], "tag_cnt": 2,
    "view_cnt": 30, "title": "this is java and elasticsearch blog"
  }
},
{
  "_index": "forum", "_type": "article", "_id": "3", "_score": 1,
  "_source": {
    "articleID": "JODL-X-1937-#pV7", "userID": 2, "hidden": false,
    "postDate": "2017-01-01", "tag": [ "hadoop" ], "tag_cnt": 1,
    "view_cnt": 100, "title": "this is elasticsearch blog"
  }
}
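Requirement: find posts whose title must contain blog. Titles that also contain java, elasticsearch, hadoop, or spark are matched by optional should clauses, which raise their score, and posts containing spark should rank ahead of all other posts.
The DSL that implements this requirement: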
GET /forum/article/_search
{
  "query": {
    "bool": {
      "must": {
        "match": { "title": "blog" }
      },
      "should": [
        { "match": { "title": { "query": "java" } } },
        { "match": { "title": { "query": "elasticsearch" } } },
        { "match": { "title": { "query": "hadoop" } } },
        { "match": { "title": { "query": "spark", "boost": 5 } } }
      ]
    }
  }
}
Response:
{
  "took": 5,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 5,
    "max_score": 1.7260925,
    "hits": [
      {
        "_index": "forum", "_type": "article", "_id": "5", "_score": 1.7260925,
        "_source": {
          "articleID": "DHJK-B-1395-#Ky5", "userID": 3, "hidden": false,
          "postDate": "2019-05-01", "tag": [ "elasticsearch" ], "tag_cnt": 1,
          "view_cnt": 10, "title": "this is spark blog"
        }
      },
      {
        "_index": "forum", "_type": "article", "_id": "4", "_score": 1.6185135,
        "_source": {
          "articleID": "QQPX-R-3956-#aD8", "userID": 2, "hidden": true,
          "postDate": "2017-01-02", "tag": [ "java", "elasticsearch" ], "tag_cnt": 2,
          "view_cnt": 80, "title": "this is java, elasticsearch, hadoop blog"
        }
      },
      {
        "_index": "forum", "_type": "article", "_id": "1", "_score": 0.8630463,
        "_source": {
          "articleID": "XHDK-A-1293-#fJ3", "userID": 1, "hidden": false,
          "postDate": "2017-01-01", "tag": [ "java", "hadoop" ], "tag_cnt": 2,
          "view_cnt": 30, "title": "this is java and elasticsearch blog"
        }
      },
      {
        "_index": "forum", "_type": "article", "_id": "3", "_score": 0.5753642,
        "_source": {
          "articleID": "JODL-X-1937-#pV7", "userID": 2, "hidden": false,
          "postDate": "2017-01-01", "tag": [ "hadoop" ], "tag_cnt": 1,
          "view_cnt": 100, "title": "this is elasticsearch blog"
        }
      },
      {
        "_index": "forum", "_type": "article", "_id": "2", "_score": 0.3971361,
        "_source": {
          "articleID": "KDKE-B-9947-#kL5", "userID": 1, "hidden": false,
          "postDate": "2017-01-02", "tag": [ "java" ], "tag_cnt": 1,
          "view_cnt": 50, "title": "this is java blog"
        }
      }
    ]
  }
}
As you can see, the spark post has the highest relevance score and ranks first.
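If you want to switch the boost on and off without hand-editing JSON, the request body can be generated from a small helper. This is only a minimal sketch in plain Python: the function name build_boosted_query and its parameters are my own (hypothetical), it uses no Elasticsearch client library, and it only builds the body that would be sent to /forum/article/_search.

```python
import json


def build_boosted_query(must_term, should_terms, field="title"):
    """Build a bool query: one required match plus optional, individually boosted matches.

    should_terms maps each optional term to its boost (hypothetical helper, not an ES API).
    """
    should = []
    for term, boost in should_terms.items():
        clause = {"query": term}
        if boost != 1:  # 1 is the default boost, so it can be left out
            clause["boost"] = boost
        should.append({"match": {field: clause}})
    return {
        "query": {
            "bool": {
                "must": {"match": {field: must_term}},
                "should": should,
            }
        }
    }


body = build_boosted_query(
    "blog", {"java": 1, "elasticsearch": 1, "hadoop": 1, "spark": 5}
)
print(json.dumps(body, indent=2))
```

Passing 1 for spark instead of 5 produces the same request with no boost on any clause.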
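boost raises the weight of an individual search clause. When one document matches the boosted clause and another document matches only an unboosted clause, then, other things being equal, the document matching the boosted clause gets the higher relevance score and is therefore returned first.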
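To make the idea concrete, here is a deliberately simplified toy model, not Elasticsearch's real scoring: every matching clause adds its boost (1.0 by default) to the document's score. The names TITLES, toy_score, and rank are hypothetical, and the real relevance score also depends on term statistics and field length, so only the ordering of the top hit is meant to be illustrative.

```python
# Toy model only: each matching clause adds its boost to the score.
# Real Elasticsearch scoring also depends on term statistics and field length.
TITLES = {
    "5": "this is spark blog",
    "2": "this is java blog",
    "4": "this is java, elasticsearch, hadoop blog",
    "1": "this is java and elasticsearch blog",
    "3": "this is elasticsearch blog",
}


def toy_score(title, must_term, should_boosts):
    """Return a toy score, or None when the required term is missing."""
    tokens = set(title.lower().replace(",", " ").split())
    if must_term not in tokens:
        return None  # a failed must clause excludes the document
    score = 1.0  # contribution of the must clause
    for term, boost in should_boosts.items():
        if term in tokens:
            score += boost  # each matching should clause adds its boost
    return score


def rank(should_boosts):
    """Score every title and return (doc_id, score) pairs, best first."""
    hits = []
    for doc_id, title in TITLES.items():
        score = toy_score(title, "blog", should_boosts)
        if score is not None:
            hits.append((doc_id, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


boosted = rank({"java": 1.0, "elasticsearch": 1.0, "hadoop": 1.0, "spark": 5.0})
plain = rank({"java": 1.0, "elasticsearch": 1.0, "hadoop": 1.0, "spark": 1.0})
print("with boost:   ", boosted[0])
print("without boost:", plain[0])
```

Even in this crude model, the spark post comes first only when its clause is boosted; with equal weights, the post that matches the most clauses wins.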
What happens if we remove boost? Let's take a look.
GET /forum/article/_search
{
  "query": {
    "bool": {
      "must": {
        "match": { "title": "blog" }
      },
      "should": [
        { "match": { "title": { "query": "java" } } },
        { "match": { "title": { "query": "elasticsearch" } } },
        { "match": { "title": { "query": "hadoop" } } },
        { "match": { "title": { "query": "spark" } } }
      ]
    }
  }
}
Response:
{
  "took": 11,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 5,
    "max_score": 1.6185135,
    "hits": [
      {
        "_index": "forum", "_type": "article", "_id": "4", "_score": 1.6185135,
        "_source": {
          "articleID": "QQPX-R-3956-#aD8", "userID": 2, "hidden": true,
          "postDate": "2017-01-02", "tag": [ "java", "elasticsearch" ], "tag_cnt": 2,
          "view_cnt": 80, "title": "this is java, elasticsearch, hadoop blog"
        }
      },
      {
        "_index": "forum", "_type": "article", "_id": "1", "_score": 0.8630463,
        "_source": {
          "articleID": "XHDK-A-1293-#fJ3", "userID": 1, "hidden": false,
          "postDate": "2017-01-01", "tag": [ "java", "hadoop" ], "tag_cnt": 2,
          "view_cnt": 30, "title": "this is java and elasticsearch blog"
        }
      },
      {
        "_index": "forum", "_type": "article", "_id": "5", "_score": 0.5753642,
        "_source": {
          "articleID": "DHJK-B-1395-#Ky5", "userID": 3, "hidden": false,
          "postDate": "2019-05-01", "tag": [ "elasticsearch" ], "tag_cnt": 1,
          "view_cnt": 10, "title": "this is spark blog"
        }
      },
      {
        "_index": "forum", "_type": "article", "_id": "3", "_score": 0.5753642,
        "_source": {
          "articleID": "JODL-X-1937-#pV7", "userID": 2, "hidden": false,
          "postDate": "2017-01-01", "tag": [ "hadoop" ], "tag_cnt": 1,
          "view_cnt": 100, "title": "this is elasticsearch blog"
        }
      },
      {
        "_index": "forum", "_type": "article", "_id": "2", "_score": 0.3971361,
        "_source": {
          "articleID": "KDKE-B-9947-#kL5", "userID": 1, "hidden": false,
          "postDate": "2017-01-02", "tag": [ "java" ], "tag_cnt": 1,
          "view_cnt": 50, "title": "this is java blog"
        }
      }
    ]
  }
}
The spark post is no longer shown first (it drops to third place, tied with document 3), so the boost weight clearly did take effect.
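The two scores reported for the spark post (document 5) can be cross-checked with a little arithmetic. The sketch below rests on one assumption of mine that the source does not confirm: in the unboosted query, the blog clause and the spark clause each contribute half of the 0.5753642 score. That would be consistent with document 3, which also matches exactly two clauses and has the same score. Under that assumption, multiplying the spark contribution by 5 reproduces the boosted score to within rounding.

```python
import math

unboosted_total = 0.5753642  # _score of document 5 without boost
boosted_total = 1.7260925    # _score of document 5 with boost 5 on "spark"
boost = 5

# Assumption (not confirmed by the source): the "blog" and "spark" clauses
# contribute equally to the unboosted score.
per_clause = unboosted_total / 2

# Bool scoring adds up the clause scores; the boost scales only the spark clause.
predicted = per_clause + boost * per_clause

print("per-clause score:", round(per_clause, 7))
print("predicted boosted score:", round(predicted, 7))
print("matches reported score:", math.isclose(predicted, boosted_total, abs_tol=1e-6))
```

So in this data set a boost of 5 behaves like a plain multiplier on the spark clause's score, which is then added to the score of the blog clause.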