[Elastic Engineering] Elasticsearch: Index boost

Author: Liu Xiaoguo (刘晓国)


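When searching across multiple indices, you can use the indices_boost parameter to boost results from one or more specified indices. This is useful when hits from some indices matter more than hits from others.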


Note: you cannot use indices_boost with data streams.
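
As a minimal sketch of what such a request body looks like, the Python snippet below builds one with the standard json module. The helper name build_boosted_search and the index names index_a and index_b are hypothetical, chosen only for illustration; the structure (a list of single-key objects, each mapping an index name to its boost) follows the requests shown in this article.

```python
import json


def build_boosted_search(boosts, query=None):
    """Build a search request body with an indices_boost section.

    `boosts` is an ordered list of (index_name, boost) pairs. Each pair becomes
    one single-key object, so the order of the entries is preserved.
    """
    body = {"indices_boost": [{name: boost} for name, boost in boosts]}
    if query is not None:
        body["query"] = query
    return body


# Hypothetical index names, used only for illustration.
body = build_boosted_search([("index_a", 10.0), ("index_b", 2.0)])
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of a _search request once the placeholder names are replaced with real indices.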

Below, I walk through an example that shows how to use indices_boost to boost specific indices.

Example


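In today's example, we use an index named twitter. Because its documents contain location information, we first need to define a mapping for the twitter index, so that location is indexed as the geo_point data type we want when the data is imported: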

PUT twitter
{
  "mappings": {
    "properties": {
      "location": {
        "type": "geo_point"
      }
    }
  }
}

The command above creates an index called twitter. Next, we use the bulk API to import our data:

POST _bulk
{ "index" : { "_index" : "twitter", "_id": 1} }
{"user":"双榆树-张三","message":"今儿天气不错啊,出去转转去","uid":2,"age":20,"city":"北京","province":"北京","country":"中国","address":"中国北京市海淀区","location":{"lat":"39.970718","lon":"116.325747"}}
{ "index" : { "_index" : "twitter", "_id": 2} }
{"user":"虹桥-老吴","message":"好友来了都今天我生日,好友来了,什么 birthday happy 就成!","uid":2,"age":90,"city":"上海","province":"上海","country":"中国","address":"中国上海市闵行区","location":{"lat":"31.175927","lon":"121.383328"}}
{ "index" : { "_index" : "twitter", "_id": 3} }
{"user":"东城区-李四","message":"happy birthday!","uid":4,"age":30,"city":"北京","province":"北京","country":"中国","address":"中国北京市东城区","location":{"lat":"39.893801","lon":"116.408986"}}
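
If you would rather generate the bulk body from a script than type it by hand, the following Python sketch shows one way to do it. The helper name to_bulk_body is hypothetical, and the documents are abbreviated to a few fields. The one requirement it encodes is that the bulk API expects newline-delimited JSON (one action line followed by one source line per document) that ends with a newline.

```python
import json


def to_bulk_body(index_name, docs):
    """Serialize {doc_id: source} into a newline-delimited bulk request body."""
    lines = []
    for doc_id, source in docs.items():
        # Action line: which index and ID the following source line is written to.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Source line: the document itself; ensure_ascii=False keeps Chinese text readable.
        lines.append(json.dumps(source, ensure_ascii=False))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Abbreviated versions of the three documents above (only a few fields kept).
docs = {
    1: {"user": "双榆树-张三", "city": "北京", "country": "中国"},
    2: {"user": "虹桥-老吴", "city": "上海", "country": "中国"},
    3: {"user": "东城区-李四", "city": "北京", "country": "中国"},
}
bulk_body = to_bulk_body("twitter", docs)
print(len(bulk_body.splitlines()), "lines in the bulk body")
```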

Above, I indexed three documents. For convenience, we use the reindex API to copy the twitter index into another index called twitter1.

PUT twitter1
{
  "mappings": {
    "properties": {
      "location": {
        "type": "geo_point"
      }
    }
  }
}

POST _reindex
{
  "source": {
    "index": "twitter"
  },
  "dest": {
    "index": "twitter1"
  }
}

twitter1 now contains exactly the same three documents as twitter.

Next, we run the following search:

GET twitter*/_search
{
  "indices_boost": [
    {
      "twitter": 10.0
    },
    {
      "twitter": 2.0
    }
  ]
}

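In the request above, the twitter index is boosted by 10.0. Note that the second entry names twitter again rather than twitter1; to boost twitter1 by 2.0, that key would have to be twitter1. Because only the first matching entry for an index is applied, twitter1 receives no boost here, and since no query is specified, its documents keep the default score of 1.0. The search results are: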

    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 10.0,
        "_source" : {
          "user" : "双榆树-张三",
          "message" : "今儿天气不错啊,出去转转去",
          "uid" : 2,
          "age" : 20,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市海淀区",
          "location" : {
            "lat" : "39.970718",
            "lon" : "116.325747"
          }
        }
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 10.0,
        "_source" : {
          "user" : "虹桥-老吴",
          "message" : "好友来了都今天我生日,好友来了,什么 birthday happy 就成!",
          "uid" : 2,
          "age" : 90,
          "city" : "上海",
          "province" : "上海",
          "country" : "中国",
          "address" : "中国上海市闵行区",
          "location" : {
            "lat" : "31.175927",
            "lon" : "121.383328"
          }
        }
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 10.0,
        "_source" : {
          "user" : "东城区-李四",
          "message" : "happy birthday!",
          "uid" : 4,
          "age" : 30,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市东城区",
          "location" : {
            "lat" : "39.893801",
            "lon" : "116.408986"
          }
        }
      },
      {
        "_index" : "twitter1",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "user" : "双榆树-张三",
          "message" : "今儿天气不错啊,出去转转去",
          "uid" : 2,
          "age" : 20,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市海淀区",
          "location" : {
            "lat" : "39.970718",
            "lon" : "116.325747"
          }
        }
      },
      {
        "_index" : "twitter1",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "user" : "虹桥-老吴",
          "message" : "好友来了都今天我生日,好友来了,什么 birthday happy 就成!",
          "uid" : 2,
          "age" : 90,
          "city" : "上海",
          "province" : "上海",
          "country" : "中国",
          "address" : "中国上海市闵行区",
          "location" : {
            "lat" : "31.175927",
            "lon" : "121.383328"
          }
        }
      },
      {
        "_index" : "twitter1",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "user" : "东城区-李四",
          "message" : "happy birthday!",
          "uid" : 4,
          "age" : 30,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市东城区",
          "location" : {
            "lat" : "39.893801",
            "lon" : "116.408986"
          }
        }
      }
    ]

From the results above, we can see that all documents from twitter are ranked ahead of the documents from twitter1.
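
As a rough model of where these scores come from (not Elasticsearch's actual implementation), the Python sketch below multiplies each hit's base score by the boost of its index and then sorts by score. It assumes a base score of 1.0 for every document, which is what a search without a query produces, and it applies only the first entry given for an index. The function names are hypothetical. Fed with the indices_boost entries exactly as they appear in the request, it reproduces the 10.0 and 1.0 scores in the response.

```python
def effective_boosts(entries):
    """Collapse an ordered indices_boost list into {index: boost}; first entry wins."""
    boosts = {}
    for entry in entries:
        for index_name, boost in entry.items():
            boosts.setdefault(index_name, boost)
    return boosts


def rank(hits, entries):
    """Multiply each base score by its index boost (default 1.0) and sort descending."""
    boosts = effective_boosts(entries)
    boosted = [
        {**hit, "_score": hit["_score"] * boosts.get(hit["_index"], 1.0)}
        for hit in hits
    ]
    # sorted() is stable, so hits with equal scores keep their original order.
    return sorted(boosted, key=lambda hit: hit["_score"], reverse=True)


# Three documents per index, each with the base score 1.0 of a query-less search.
hits = [
    {"_index": index_name, "_id": str(doc_id), "_score": 1.0}
    for index_name in ("twitter1", "twitter")
    for doc_id in (1, 2, 3)
]
# The indices_boost entries exactly as written in the request.
entries = [{"twitter": 10.0}, {"twitter": 2.0}]

for hit in rank(hits, entries):
    print(hit["_index"], hit["_id"], hit["_score"])
```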


You can also use aliases and index patterns. Let's create the following alias:

PUT twitter/_alias/city_shanghai
{
  "filter": {
    "term": {
      "city.keyword": "上海"
    }
  }
}

This defines an alias called city_shanghai. Next, we run the following search:

GET twitter*/_search
{
  "indices_boost": [
    {
      "city_shanghai": 10.0
    },
    {
      "twitter1": 2.0
    }
  ],
  "query": {
    "match": {
      "country": "中国"
    }
  }
}

The search results are:

    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 2.6706278,
        "_source" : {
          "user" : "双榆树-张三",
          "message" : "今儿天气不错啊,出去转转去",
          "uid" : 2,
          "age" : 20,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市海淀区",
          "location" : {
            "lat" : "39.970718",
            "lon" : "116.325747"
          }
        }
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 2.6706278,
        "_source" : {
          "user" : "虹桥-老吴",
          "message" : "好友来了都今天我生日,好友来了,什么 birthday happy 就成!",
          "uid" : 2,
          "age" : 90,
          "city" : "上海",
          "province" : "上海",
          "country" : "中国",
          "address" : "中国上海市闵行区",
          "location" : {
            "lat" : "31.175927",
            "lon" : "121.383328"
          }
        }
      },
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 2.6706278,
        "_source" : {
          "user" : "东城区-李四",
          "message" : "happy birthday!",
          "uid" : 4,
          "age" : 30,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市东城区",
          "location" : {
            "lat" : "39.893801",
            "lon" : "116.408986"
          }
        }
      },
      {
        "_index" : "twitter1",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.53412557,
        "_source" : {
          "user" : "双榆树-张三",
          "message" : "今儿天气不错啊,出去转转去",
          "uid" : 2,
          "age" : 20,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市海淀区",
          "location" : {
            "lat" : "39.970718",
            "lon" : "116.325747"
          }
        }
      },
      {
        "_index" : "twitter1",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.53412557,
        "_source" : {
          "user" : "虹桥-老吴",
          "message" : "好友来了都今天我生日,好友来了,什么 birthday happy 就成!",
          "uid" : 2,
          "age" : 90,
          "city" : "上海",
          "province" : "上海",
          "country" : "中国",
          "address" : "中国上海市闵行区",
          "location" : {
            "lat" : "31.175927",
            "lon" : "121.383328"
          }
        }
      },
      {
        "_index" : "twitter1",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.53412557,
        "_source" : {
          "user" : "东城区-李四",
          "message" : "happy birthday!",
          "uid" : 4,
          "age" : 30,
          "city" : "北京",
          "province" : "北京",
          "country" : "中国",
          "address" : "中国北京市东城区",
          "location" : {
            "lat" : "39.893801",
            "lon" : "116.408986"
          }
        }
      }
    ]
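
A quick arithmetic check, sketched in Python, suggests that these scores are consistent with the boosts acting as multipliers on one shared base score. It assumes that the two indices produce the same base score for this query, which is plausible because they hold identical documents; the variable names are illustrative. Dividing the twitter score by its boost of 10.0 and multiplying by the twitter1 boost of 2.0 lands on the twitter1 score, up to floating-point rounding.

```python
import math

twitter_score = 2.6706278    # _score reported for the twitter hits
twitter1_score = 0.53412557  # _score reported for the twitter1 hits

# Undo the 10.0 boost (applied via the city_shanghai alias) to get the base score.
base_score = twitter_score / 10.0

# Apply the 2.0 boost configured for twitter1.
predicted_twitter1 = base_score * 2.0

print(f"base score         ~ {base_score:.8f}")
print(f"predicted twitter1 ~ {predicted_twitter1:.8f}")
print("matches response:", math.isclose(predicted_twitter1, twitter1_score, rel_tol=1e-6))
```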

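If an index matches more than one entry, the first match is used. For example, if an index is included in the alias and also matches the twitter* pattern, the boost value of 10.0 is applied.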
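
The first-match rule can be illustrated with a small Python model. This is a sketch of the rule as described here, not Elasticsearch's implementation: the function resolve_boosts, the alias table, and the second entry using a twitter* pattern are assumptions made for the example. Each entry is expanded to concrete indices (through the alias table, or by wildcard matching with fnmatch), and an index keeps the boost of the first entry that reaches it.

```python
from fnmatch import fnmatchcase


def resolve_boosts(entries, indices, aliases):
    """Map each concrete index to the boost of the first entry that matches it."""
    resolved = {}
    for name, boost in entries:
        if name in aliases:
            # An alias expands to the indices it points to.
            targets = aliases[name]
        else:
            # Otherwise treat the name as an index name or wildcard pattern.
            targets = [index for index in indices if fnmatchcase(index, name)]
        for index in targets:
            # setdefault keeps an earlier boost, so the first match wins.
            resolved.setdefault(index, boost)
    return resolved


indices = ["twitter", "twitter1"]
aliases = {"city_shanghai": ["twitter"]}
# Hypothetical entries: the alias first, then a pattern that also covers twitter.
entries = [("city_shanghai", 10.0), ("twitter*", 2.0)]

print(resolve_boosts(entries, indices, aliases))
# twitter keeps 10.0 from the alias entry; twitter1 gets 2.0 from the pattern.
```

Swapping the two entries would give both indices 2.0 in this model, which is why the order of the entries in indices_boost matters.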

