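Elastic Stack Hands-On Handbook (《Elastic Stack 实战手册》) > 3. Product Capabilities > 3.4 Getting Started > 3.4.2 Elasticsearch Basic Applications > 3.4.2.17 Text analysis, settings, and mappings > 3.4.2.17.3 Full-text search / exact search (3) https://developer.aliyun.com/article/1229941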
3. Term-level queries
Term-level queries comprise ten main query types: term, terms, terms_set, wildcard, range, fuzzy, prefix, regexp, ids, and exists.
3.1 term
Typical usage:
PUT term-query

POST term-query/_mapping
{"properties":{"user.id":{"type":"text"}}}

POST term-query/_bulk
{ "index": { "_id": 1 }}
{ "user.id": "kimchy" }
{ "index": { "_id": 2 }}
{ "user.id": "elkbee" }

GET term-query/_search
{
  "query": {
    "term": {
      "user.id": {
        "value": "kimchy",
        "boost": 1.0
      }
    }
  }
}
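One point the request above relies on: a term query does not analyze the search value, whereas a text field is analyzed at index time. The sketch below imitates that behavior in plain Python. It is only an approximation: `analyze` is a simplified stand-in for the standard analyzer (lowercase, split on non-alphanumeric characters), and both `analyze` and `term_match` are hypothetical helpers, not Elasticsearch APIs.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]

def term_match(docs, field, value):
    """Return ids of docs whose indexed tokens contain `value` exactly.

    The search value is deliberately NOT analyzed, mirroring a term query.
    """
    return [doc_id for doc_id, src in docs.items() if value in analyze(src[field])]

# The two documents indexed into term-query above.
docs = {1: {"user.id": "kimchy"}, 2: {"user.id": "elkbee"}}

print(term_match(docs, "user.id", "kimchy"))  # matches the indexed token
print(term_match(docs, "user.id", "Kimchy"))  # capitalized value finds nothing
```

Under these assumptions the first call returns document 1 and the second returns an empty list, which is why term queries are usually paired with keyword fields or with values that already match the indexed tokens.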
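Main parameters:
boost: a floating-point number used to decrease or increase the relevance score of a query. Defaults to 1.0. You can use boost to adjust relevance scores in searches that contain two or more queries. Boost values are relative to the default of 1.0: a value between 0 and 1.0 lowers the relevance score, and a value greater than 1.0 raises it.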
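To make the effect of boost in a multi-clause search concrete, here is a minimal sketch that builds a bool query with two term clauses carrying different boosts. The helper `term_clause` is a hypothetical convenience function; the field name and values come from the term-query example, and the boost values 2.0 and 0.5 are arbitrary illustrations.

```python
import json

def term_clause(field, value, boost=1.0):
    """Build a term query clause; boost is relative to the default of 1.0."""
    return {"term": {field: {"value": value, "boost": boost}}}

query = {
    "query": {
        "bool": {
            "should": [
                term_clause("user.id", "kimchy", boost=2.0),  # weighted up
                term_clause("user.id", "elkbee", boost=0.5),  # weighted down
            ]
        }
    }
}

# The printed JSON can be used as the body of GET term-query/_search.
print(json.dumps(query, indent=2))
```

With such a body, documents matching kimchy would be expected to rank above documents matching elkbee, all else being equal.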
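Using the filter cache
Recommendation: wrap the query in constant_score to turn it into a filter. This skips relevance scoring and lets Elasticsearch cache the filter, which improves performance.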
# Original query
GET my-index-000001/_search
{
  "explain": true,
  "query": {
    "term": {
      "full_text": "foxes"
    }
  }
}

# Response (excerpt)
{
  ......
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : 0.18232156,
    "hits" : [
      {
        "_shard" : "[my-index-000001][0]",
        "_node" : "egEhfXrCTrycdbAwEy9z3Q",
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.18232156,
        "_source" : { "full_text" : "quick brown foxes!" },
        "_explanation" : {
          "value" : 0.18232156,
          "description" : "weight(full_text:foxes in 0) [PerFieldSimilarity], result of:",
          "details" : [
            {
              "value" : 0.18232156,
              "description" : "score(freq=1.0), computed as boost * idf * tf from:",
              "details" : [
                { "value" : 2.2, "description" : "boost", "details" : [ ] },
                {
                  "value" : 0.18232156,
                  "description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                  "details" : [
                    { "value" : 2, "description" : "n, number of documents containing term", "details" : [ ] },
                    { "value" : 2, "description" : "N, total number of documents with field", "details" : [ ] }
                  ]
                },
                {
                  "value" : 0.45454544,
                  "description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                  "details" : [
                    { "value" : 1.0, "description" : "freq, occurrences of term within document", "details" : [ ] },
                    { "value" : 1.2, "description" : "k1, term saturation parameter", "details" : [ ] },
                    { "value" : 0.75, "description" : "b, length normalization parameter", "details" : [ ] },
                    { "value" : 3.0, "description" : "dl, length of field", "details" : [ ] },
                    { "value" : 3.0, "description" : "avgdl, average length of field", "details" : [ ] }
                  ]
                }
              ]
            }
          ]
        }
      },
      {
        "_shard" : "[my-index-000001][0]",
        "_node" : "egEhfXrCTrycdbAwEy9z3Q",
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.18232156,
        "_source" : { "full_text" : "Quick Foxes Brown !" },
        "_explanation" : {
          ......
                }
              ]
            }
          ]
        }
      }
    ]
  }
}

# Using a constant_score filter
POST my-index-000001/_search
{
  "explain": true,
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "full_text": "foxes"
        }
      }
    }
  }
}

# Response (excerpt)
{
  ......
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : 1.0,
    "hits" : [
      {
        "_shard" : "[my-index-000001][0]",
        "_node" : "egEhfXrCTrycdbAwEy9z3Q",
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : { "full_text" : "quick brown foxes!" },
        "_explanation" : {
          "value" : 1.0,
          "description" : "ConstantScore(full_text:foxes)",
          "details" : [ ]
        }
      },
      {
        "_shard" : "[my-index-000001][0]",
        "_node" : "egEhfXrCTrycdbAwEy9z3Q",
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : { "full_text" : "Quick Foxes Brown !" },
        "_explanation" : {
          "value" : 1.0,
          "description" : "ConstantScore(full_text:foxes)",
          "details" : [ ]
        }
      }
    ]
  }
}
The _explanation output makes the difference clear: with a constant_score filter the scoring logic is much simpler, and every hit simply receives a score of 1.0.
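For comparison, the work that the scored term query performs can be retraced by hand. The following sketch plugs the values reported in the first _explanation (n, N, freq, k1, b, dl, avgdl, and the boost of 2.2) into the idf and tf formulas quoted in that output. The function names `bm25_idf` and `bm25_tf` are hypothetical; the reported boost of 2.2 appears to be the query boost 1.0 multiplied by (k1 + 1), which is an interpretation rather than something the output states.

```python
import math

def bm25_idf(n, N):
    """idf = log(1 + (N - n + 0.5) / (n + 0.5)), as quoted in the explain output."""
    return math.log(1 + (N - n + 0.5) / (n + 0.5))

def bm25_tf(freq, k1, b, dl, avgdl):
    """tf = freq / (freq + k1 * (1 - b + b * dl / avgdl)), as quoted in the explain output."""
    return freq / (freq + k1 * (1 - b + b * dl / avgdl))

# Values copied from the _explanation of document 1.
idf = bm25_idf(n=2, N=2)
tf = bm25_tf(freq=1.0, k1=1.2, b=0.75, dl=3.0, avgdl=3.0)
score = 2.2 * idf * tf  # score = boost * idf * tf

print(round(idf, 8), round(tf, 8), round(score, 8))
```

The printed values should agree with the explain output (idf and score near 0.18232156, tf near 0.4545454) up to small rounding differences in the last digits, likely because the search engine scores with 32-bit floats while Python uses 64-bit floats. The constant_score variant skips this computation entirely.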
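3.2 terms
A terms query works the same way as a term query, except that it can search a single field for several terms at once; a document matches if it contains at least one of them.
Usage: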
GET term-query/_search
{
  "query": {
    "terms": {
      "user.id": [ "kimchy", "elkbee" ],
      "boost": 1.0
    }
  }
}
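Terms lookup
Terms lookup searches for documents that share values with a field of an existing document. It fetches the field values of that document, and Elasticsearch then uses those values as the search terms of a terms query.
It has two restrictions:
1. The _source field must be enabled (it is enabled by default).
2. Terms lookup cannot be run against a remote index through cross-cluster search.
Usage: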
GET _search?pretty
{
  "query": {
    "terms": {
      "color" : {
        "index" : "my-index-000001",
        "id" : "2",
        "path" : "color"
      }
    }
  }
}
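Here path is the field of the referenced document from which the values are fetched. For object fields or nested fields, use dot notation in the form "field.subfield".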
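The two steps of a terms lookup (read the values at path from the referenced document, then match any document containing one of them) can be mimicked in memory. This is only an illustrative sketch: `resolve_path` and `terms_lookup` are hypothetical helpers, the dict named `index` stands in for an Elasticsearch index, and the three documents are the ones used in the hands-on example of this section.

```python
def resolve_path(source, path):
    """Follow a dotted path such as 'field.subfield' and return the values found as a list."""
    current = [source]
    for key in path.split("."):
        found = []
        for item in current:
            if isinstance(item, dict) and key in item:
                value = item[key]
                # A field may hold a single value or an array of values.
                found.extend(value if isinstance(value, list) else [value])
        current = found
    return current

def terms_lookup(index, doc_id, path, field):
    """Step 1: read the terms from the referenced document. Step 2: match any of them."""
    terms = set(resolve_path(index[doc_id], path))
    return [i for i, src in index.items() if terms & set(resolve_path(src, field))]

index = {
    "1": {"color": ["blue", "green"]},
    "2": {"color": "blue"},
    "3": {"color": "red"},
}

# Look up the color of document 2 and find all documents sharing it.
print(terms_lookup(index, "2", "color", "color"))
```

Under these assumptions the call returns documents 1 and 2, since document 3 shares no color with document 2. A dotted path works the same way: `resolve_path({"a": {"b": ["x", "y"]}}, "a.b")` walks into the object field before collecting the values.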
Hands-on example:
# Create my-index-000001 and map the color field as keyword.
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "color": { "type": "keyword" }
    }
  }
}

# Index test documents; note that document 3 shares no values with the other two.
PUT my-index-000001/_doc/1
{
  "color": ["blue", "green"]
}

PUT my-index-000001/_doc/2
{
  "color": "blue"
}

PUT my-index-000001/_doc/3
{
  "color": "red"
}

# Test: run a terms lookup against document 2; documents 1 and 2 should be returned.
GET my-index-000001/_search?pretty
{
  "query": {
    "terms": {
      "color" : {
        "index" : "my-index-000001",
        "id" : "2",
        "path" : "color"
      }
    }
  }
}

# Response
{
  "took" : 24,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "color" : [ "blue", "green" ]
        }
      },
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "color" : "blue"
        }
      }
    ]
  }
}