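Elastic Stack Practical Handbook (《Elastic Stack 实战手册》), Part 3: Product Capabilities, 3.4 Getting Started, 3.4.2 Elasticsearch Basic Applications, 3.4.2.17 Text analysis, settings, and mappings, 3.4.2.17.3 Full-Text Search / Exact Search (11) https://developer.aliyun.com/article/1229930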
Now run a match_phrase query for "this is test".
GET my-index-000001/_search
{
  "query": {
    "match_phrase": {
      "message": {
        "query": "this is test",
        "slop": 0
      }
    }
  }
}

# The response contains only document 1
{
  ······
    "hits" : [
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.26582208,
        "_source" : {
          "message" : "this is test"
        }
      }
    ]
  }
}
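In this query, document 1 is returned because it satisfies all of the following conditions:
1. The three terms this, is, and test all appear in the document.
2. The position of is is exactly 1 greater than the position of this, so the slop between them is 0.
3. The position of test is exactly 1 greater than the position of is.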
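The three conditions above can be sketched in a few lines of Python. This is only an illustrative model of what "slop 0" means, not Elasticsearch's implementation: it assumes the text has already been split into lowercase tokens, and the helper name `exact_phrase_match` is hypothetical.

```python
def exact_phrase_match(doc_tokens, query_tokens):
    """Return True if query_tokens occur in doc_tokens as one consecutive run.

    Simplified model of match_phrase with slop 0: every query term must be
    present, and each term must sit exactly one position after the previous one.
    """
    n = len(query_tokens)
    return any(
        doc_tokens[start:start + n] == query_tokens
        for start in range(len(doc_tokens) - n + 1)
    )


query = "this is test".split()
print(exact_phrase_match("this is test".split(), query))    # True
print(exact_phrase_match("this is a test".split(), query))  # False
```

The second call fails because "a" pushes test two positions after is, which violates condition 3.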
# With slop set to 1, adjacent query terms may be up to 2 positions apart and still match.
GET my-index-000001/_search
{
  "query": {
    "match_phrase": {
      "message": {
        "query": "this is test",
        "slop": 1
      }
    }
  }
}

# The response contains documents 1 and 2
{
  ······
    "hits" : [
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.26582208,
        "_source" : {
          "message" : "this is test"
        }
      },
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.16089231,
        "_source" : {
          "message" : "this is a test"
        }
      }
    ]
  }
}

# Set slop to 4
GET my-index-000001/_search
{
  "query": {
    "match_phrase": {
      "message": {
        "query": "this is test",
        "slop": 4
      }
    }
  }
}

# The response contains all 5 documents
{
  ······
    "hits" : [
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.26582208,
        "_source" : {
          "message" : "this is test"
        }
      },
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.16089231,
        "_source" : {
          "message" : "this is a test"
        }
      },
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.106328845,
        "_source" : {
          "message" : "this is not a test"
        }
      },
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 0.07501727,
        "_source" : {
          "message" : "this a is not a test"
        }
      },
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 0.055580944,
        "_source" : {
          "message" : "this a or is not a test"
        }
      }
    ]
  }
}
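To see why slop 1 returns two documents and slop 4 returns all five, the following Python sketch estimates the smallest slop each document needs. It uses a simplified model (an assumption, not Elasticsearch's actual scorer): for each query term, take its position in the document minus its position in the query, and treat the spread between the largest and smallest of these shifts as the required slop. The names `min_slop` and `matching_ids` are hypothetical, and tokenization is a plain whitespace split.

```python
from itertools import product


def min_slop(doc_tokens, query_tokens):
    """Smallest slop at which the phrase matches under the simplified model.

    Returns None if any query term is missing from the document.
    """
    shifts_per_term = []
    for q_pos, term in enumerate(query_tokens):
        # Shift = position in document minus position in query, per occurrence.
        shifts = [d_pos - q_pos for d_pos, tok in enumerate(doc_tokens) if tok == term]
        if not shifts:
            return None
        shifts_per_term.append(shifts)
    # Try every combination of occurrences and keep the tightest spread.
    return min(max(combo) - min(combo) for combo in product(*shifts_per_term))


# The five documents that appear in the responses above.
docs = {
    1: "this is test",
    2: "this is a test",
    3: "this is not a test",
    4: "this a or is not a test",
    5: "this a is not a test",
}
query = "this is test".split()


def matching_ids(slop):
    """IDs of documents whose required slop is within the given slop."""
    result = []
    for doc_id, text in docs.items():
        needed = min_slop(text.split(), query)
        if needed is not None and needed <= slop:
            result.append(doc_id)
    return sorted(result)


print({doc_id: min_slop(text.split(), query) for doc_id, text in docs.items()})
# {1: 0, 2: 1, 3: 2, 4: 4, 5: 3}
print(matching_ids(1))  # [1, 2]
print(matching_ids(4))  # [1, 2, 3, 4, 5]
```

Under this model, document 4 is the only one that needs the full slop of 4, which agrees with it appearing only in the last response.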
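When the terms appear in swapped order (transposed terms), the required slop is the slop the same terms would need in their original order, plus 2.
For example, create the following index and documents: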
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "message": {
        "type": "text"
      }
    }
  }
}

POST my-index-000001/_bulk
{ "index": { "_id": 1 }}
{ "message": "test this" }
{ "index": { "_id": 2 }}
{ "message": "this test" }
{ "index": { "_id": 3 }}
{ "message": "test in this" }
To match document 1 with the query "this test", slop must be set to 2; to match document 3, slop must be set to 3.
GET my-index-000001/_search
{
  "query": {
    "match_phrase": {
      "message": {
        "query": "this test",
        "slop": 2
      }
    }
  }
}

# The response contains documents 1 and 2
{
  ......
    "hits" : [
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.28363907,
        "_source" : {
          "message" : "this test"
        }
      },
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.13941583,
        "_source" : {
          "message" : "test this"
        }
      }
    ]
  }
}

GET my-index-000001/_search
{
  "query": {
    "match_phrase": {
      "message": {
        "query": "this test",
        "slop": 3
      }
    }
  }
}

# The response now includes document 3
{
  ......
    "hits" : [
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.08604115,
        "_source" : {
          "message" : "test in this"
        }
      }
    ]
  }
}
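One way to see where the extra 2 for transposed terms comes from is to look at how far each term has moved relative to its place in the query. The Python sketch below is a simplified model rather than Elasticsearch's scorer: it assumes whitespace tokenization and that each query term occurs exactly once in the document. The helper names `term_shifts` and `required_slop` are hypothetical.

```python
def term_shifts(doc_tokens, query_tokens):
    """Map each query term to (position in document - position in query).

    Assumes every query term occurs exactly once in the document.
    """
    return {
        term: doc_tokens.index(term) - q_pos
        for q_pos, term in enumerate(query_tokens)
    }


def required_slop(doc_tokens, query_tokens):
    """Spread between the largest and smallest shift (simplified slop model)."""
    shifts = term_shifts(doc_tokens, query_tokens).values()
    return max(shifts) - min(shifts)


query = "this test".split()
for doc_id, text in [(1, "test this"), (2, "this test"), (3, "test in this")]:
    tokens = text.split()
    print(doc_id, term_shifts(tokens, query), required_slop(tokens, query))
# 1 {'this': 1, 'test': -1} 2
# 2 {'this': 0, 'test': 0} 0
# 3 {'this': 2, 'test': -1} 3
```

In document 1, this moves one position forward and test moves one position back, so the two shifts are 2 apart. The intervening "in" in document 3 adds one more, giving 3.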
The results of a match_phrase query also vary with the analyzer that is used.
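As a rough illustration of why the analyzer matters, the Python sketch below compares two toy tokenizers on the same text. They are assumptions made for this example and only loosely imitate a whitespace-only analyzer and a lowercasing, punctuation-stripping analyzer; they are not Elasticsearch's analyzers. The sample document "This is, test" and all function names are hypothetical.

```python
import re


def whitespace_like(text):
    """Toy tokenizer: split on whitespace only, keeping case and punctuation."""
    return text.split()


def standard_like(text):
    """Toy tokenizer: lowercase the text and keep runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def exact_phrase_match(doc_tokens, query_tokens):
    """True if query_tokens occur in doc_tokens as one consecutive run."""
    n = len(query_tokens)
    return any(
        doc_tokens[start:start + n] == query_tokens
        for start in range(len(doc_tokens) - n + 1)
    )


doc_text = "This is, test"       # hypothetical document
query_text = "this is test"

for tokenize in (whitespace_like, standard_like):
    doc_tokens = tokenize(doc_text)
    matched = exact_phrase_match(doc_tokens, tokenize(query_text))
    print(tokenize.__name__, doc_tokens, matched)
# whitespace_like ['This', 'is,', 'test'] False
# standard_like ['this', 'is', 'test'] True
```

The same phrase and the same document give different outcomes purely because the tokens, and therefore the terms being compared, differ.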