白日梦's Elasticsearch Notes Series (1), Basics: Getting Started with ES Quickly (Part 3)

2022-05-14


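4. Document API

4.1 search

Search all documents in all indices

/_search

Search all documents in a given index

/index/_search

More patterns (multiple index or type names are separated by commas)

/index1,index2/_search
/*1,*2/_search
/index1,index2/type1,type2/_search
/_all/type1,type2/_search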
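
The path patterns above can be generated programmatically. The following is only a minimal sketch: `search_path` is a hypothetical helper name, not part of any Elasticsearch client, and it assumes that multiple index or type names are joined with commas and that `_all` stands in for the index when only types are given.

```python
def search_path(indices=None, types=None):
    """Build a _search path from optional lists of index and type names."""
    parts = []
    # When only types are given, _all is used as the index placeholder.
    if types and not indices:
        indices = ["_all"]
    if indices:
        parts.append(",".join(indices))
    if types:
        parts.append(",".join(types))
    parts.append("_search")
    return "/" + "/".join(parts)


print(search_path())
print(search_path(["index1", "index2"]))
print(search_path(["index1", "index2"], ["type1", "type2"]))
print(search_path(None, ["type1", "type2"]))
```

Wildcard names such as `*1` can be passed in the same way as ordinary index names.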

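4.2 _mget API: batch retrieval

_mget is the API ES provides for fetching multiple documents in one request. We only need to specify the index, type, and id of each document, and ES returns all the matching documents together.

Specify _index, _type, and _id in docs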

GET /_mget
{
    "docs" : [
        {
            "_index" : "test",
            "_type" : "_doc",
            "_id" : "1"
        },
        {
            "_index" : "test",
            "_type" : "_doc",
            "_id" : "2"
        }
    ]
}

Specify the index in the URL

GET /test/_mget
{
    "docs" : [
        {
            "_type" : "_doc",
            "_id" : "1"
        },
        {
            "_type" : "_doc",
            "_id" : "2"
        }
    ]
}

Specify the index and type in the URL

GET /test/type/_mget
{
    "docs" : [
        {
            "_id" : "1"
        },
        {
            "_id" : "2"
        }
    ]
}

Specify the index and type in the URL, and list the document ids with ids

GET /test/type/_mget
{
    "ids" : ["1", "2"]
}

Apply a different _source filtering rule to each doc

GET /_mget
{
    "docs" : [
        {
            "_index" : "test",
            "_type" : "_doc",
            "_id" : "1",
            "_source" : false
        },
        {
            "_index" : "test",
            "_type" : "_doc",
            "_id" : "2",
            "_source" : ["field3", "field4"]
        },
        {
            "_index" : "test",
            "_type" : "_doc",
            "_id" : "3",
            "_source" : {
                "include": ["user"],
                "exclude": ["user.location"]
            }
        }
    ]
}
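
The request bodies above are plain JSON, so they are easy to assemble in code. Here is a rough sketch using only the Python standard library; `mget_body` and the input dictionary keys (`index`, `type`, `id`, `source`) are names made up for this illustration, and actually sending the body to `/_mget` is left to whatever HTTP client you use.

```python
import json


def mget_body(docs):
    """Build an _mget request body from simple document descriptions.

    Each item needs 'index' and 'id'; 'type' and 'source' are optional.
    """
    entries = []
    for d in docs:
        entry = {"_index": d["index"], "_id": d["id"]}
        if "type" in d:
            entry["_type"] = d["type"]
        # 'source' may be False, a list of fields, or an include/exclude dict.
        if "source" in d:
            entry["_source"] = d["source"]
        entries.append(entry)
    return {"docs": entries}


body = mget_body([
    {"index": "test", "type": "_doc", "id": "1", "source": False},
    {"index": "test", "type": "_doc", "id": "2", "source": ["field3", "field4"]},
    {"index": "test", "type": "_doc", "id": "3",
     "source": {"include": ["user"], "exclude": ["user.location"]}},
])
print(json.dumps(body, indent=4))
```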

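4.3 _bulk API: batch create, update, and delete

4.3.1 Basic syntax

{"action":{"metadata"}}\n
{"data"}\n

Which action types can be performed?

delete: deletes a document.
create: forces creation (like _create); it fails if the document already exists.
index: an ordinary PUT; it either creates the document or fully replaces it.
update: a partial update.

The syntax above is not the JSON layout people are used to reading, but this one-object-per-line form is more efficient to process.

If the body were an ordinary JSON array, ES would have to handle it like this:

Convert the JSON array into a JSONArray object, which means memory holds two copies of the same data: the JSON text and the JSONArray object.

With the single-line JSON above, ES can simply split the body on newlines and use each line directly, without building a second copy of the whole payload in memory.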
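
To make the line-by-line idea concrete, here is a small sketch that walks a bulk body one line at a time and parses each line on its own. It only illustrates the principle and is not how ES is implemented internally; `iter_bulk` and `ACTIONS_WITH_SOURCE` are names invented for this example.

```python
import io
import json

# Actions that are followed by a second line carrying the document data.
ACTIONS_WITH_SOURCE = {"create", "index", "update"}


def iter_bulk(body):
    """Yield (action, metadata, source) for each operation in a bulk body.

    Each line is parsed independently, so the full request is never
    turned into one large in-memory structure.
    """
    stream = io.StringIO(body)
    for line in stream:
        if not line.strip():
            continue
        # The action line is a single-key object: {"<action>": {<metadata>}}.
        (action, metadata), = json.loads(line).items()
        source = None
        if action in ACTIONS_WITH_SOURCE:
            source = json.loads(next(stream))
        yield action, metadata, source


sample = (
    '{"delete":{"_index":"test","_type":"_doc","_id":"2"}}\n'
    '{"create":{"_index":"test","_type":"_doc","_id":"3"}}\n'
    '{"field1":"value3"}\n'
)
ops = list(iter_bulk(sample))
for op in ops:
    print(op)
```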

4.3.2 delete

delete is the simplest action: a single line of JSON is enough.

{ "delete" : { "_index" : "test", "_type" : "_doc", "_id" : "2" } }

4.3.3 create

Two lines of JSON. The first line specifies the index, type, and id of the document to create.

The second line contains the data of the document to create.

{ "create" : { "_index" : "test", "_type" : "_doc", "_id" : "3" } }
{ "field1" : "value3" }

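4.3.4 index

Equivalent to a PUT: it creates a new document or fully replaces an existing one. It also takes two lines of JSON.

The first line specifies the index, type, and id of the document to create or fully replace.

The second line is the actual data.

{ "index" : { "_index" : "test", "_type" : "_doc", "_id" : "1" } }
{ "field1" : "value1" }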

4.3.5 update

Performs a partial update, replacing only part of the document.

It accepts a retry_on_conflict parameter; the value 3 used below means the update is retried up to 3 times when a version conflict occurs.

POST _bulk
{ "update" : {"_id" : "1", "_type" : "_doc", "_index" : "index1", "retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"} }
{ "update" : { "_id" : "0", "_type" : "_doc", "_index" : "index1", "retry_on_conflict" : 3} }
{ "script" : { "source": "ctx._source.counter += params.param1", "lang" : "painless", "params" : {"param1" : 1}}, "upsert" : {"counter" : 1}}
{ "update" : {"_id" : "2", "_type" : "_doc", "_index" : "index1", "retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"}, "doc_as_upsert" : true }
{ "update" : {"_id" : "3", "_type" : "_doc", "_index" : "index1", "_source" : true} }
{ "doc" : {"field" : "value"} }
{ "update" : {"_id" : "4", "_type" : "_doc", "_index" : "index1"} }
{ "doc" : {"field" : "value"}, "_source": true}
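
The four action types can be combined in one request body. Below is a hedged sketch of a builder for such a body; `bulk_body` is a hypothetical helper written for this note, and it assumes only what the sections above describe: delete takes one line, the other actions take an action line plus a data line, and every line ends with a newline.

```python
import json


def bulk_body(operations):
    """Serialize (action, metadata, payload) tuples into a _bulk body.

    delete takes no payload line; create, index, and update need one.
    """
    lines = []
    for action, metadata, payload in operations:
        lines.append(json.dumps({action: metadata}))
        if action != "delete":
            if payload is None:
                raise ValueError(f"{action} needs a payload line")
            lines.append(json.dumps(payload))
    # Each JSON object sits on its own line, and the body ends with a newline.
    return "\n".join(lines) + "\n"


body = bulk_body([
    ("delete", {"_index": "test", "_type": "_doc", "_id": "2"}, None),
    ("create", {"_index": "test", "_type": "_doc", "_id": "3"}, {"field1": "value3"}),
    ("index", {"_index": "test", "_type": "_doc", "_id": "1"}, {"field1": "value1"}),
    ("update",
     {"_index": "index1", "_type": "_doc", "_id": "1", "retry_on_conflict": 3},
     {"doc": {"field": "value"}}),
])
print(body, end="")
```

Keeping the payload as a separate tuple element mirrors the two-line layout, which makes it harder to forget the data line for create, index, and update.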

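4.4 Scroll queries

If you try to fetch tens of thousands of documents in a single request, ES performance will inevitably suffer under that much data. In that case you can use a scroll query, which retrieves the data batch by batch until everything has been fetched.

An example follows. Every scroll request must also carry the scroll parameter, a time window: ES keeps the search context alive for that long, so each batch only has to be fetched within the window.

GET /index/type/_search?scroll=1m
{
    "query":{
        "match_all":{}
    },
    "sort":["_doc"],
    "size":3
}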

Response

{
  "_scroll_id": "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAAACNFlJmWHZLTkFhU0plbzlHX01LU2VzUXcAAAAAAAAAkRZSZlh2S05BYVNKZW85R19NS1Nlc1F3AAAAAAAAAI8WUmZYdktOQWFTSmVvOUdfTUtTZXNRdwAAAAAAAACQFlJmWHZLTkFhU0plbzlHX01LU2VzUXcAAAAAAAAAjhZSZlh2S05BYVNKZW85R19NS1Nlc1F3",
  "took": 9,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": null,
    "hits": [
      {
        "_index": "my_index",
        "_type": "_doc",
        "_id": "2",
        "_score": null,
        "_source": {
          "title": "This is another document",
          "body": "This document has a body"
        },
        "sort": [
          0
        ]
      },
      {
        "_index": "my_index",
        "_type": "_doc",
        "_id": "1",
        "_score": null,
        "_source": {
          "title": "This is a document"
        },
        "sort": [
          0
        ]
      }
    ]
  }
}

To fetch the next batch, send another scroll request carrying the _scroll_id returned by the previous one. In the request body the field is named scroll_id.

GET /_search/scroll
{
    "scroll":"1m",
    "scroll_id": "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAAACNFlJmWHZLTkFhU0plbzlHX01LU2VzUXcAAAAAAAAAkRZSZlh2S05BYVNKZW85R19NS1Nlc1F3AAAAAAAAAI8WUmZYdktOQWFTSmVvOUdfTUtTZXNRdwAAAAAAAACQFlJmWHZLTkFhU0plbzlHX01LU2VzUXcAAAAAAAAAjhZSZlh2S05BYVNKZW85R19NS1Nlc1F3"
}

When scrolling, sorting by _doc gives the best performance.
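
The scroll loop described above can be sketched as follows. This is only an illustration under stated assumptions: `send(method, path, body)` is a hypothetical transport callable that you would back with a real HTTP client, `make_fake_send` is an in-memory stand-in used here so the example runs without a cluster, and the follow-up request passes the id in the `scroll_id` body field.

```python
def scroll_all(send, index, size=3, keep_alive="1m"):
    """Yield every hit, following scroll ids until a batch comes back empty."""
    page = send(
        "GET",
        f"/{index}/_search?scroll={keep_alive}",
        {"query": {"match_all": {}}, "sort": ["_doc"], "size": size},
    )
    while page["hits"]["hits"]:
        yield from page["hits"]["hits"]
        # Each follow-up request carries the id returned by the previous one.
        page = send(
            "GET",
            "/_search/scroll",
            {"scroll": keep_alive, "scroll_id": page["_scroll_id"]},
        )


def make_fake_send(docs, size):
    """Hypothetical in-memory stand-in for an HTTP client; not a real ES call."""
    state = {"offset": 0}

    def send(method, path, body):
        start = state["offset"]
        batch = docs[start:start + size]
        state["offset"] = start + len(batch)
        return {"_scroll_id": "fake-scroll-id", "hits": {"hits": batch}}

    return send


docs = [{"_id": str(i)} for i in range(1, 8)]
hits = list(scroll_all(make_fake_send(docs, 3), "my_index", size=3))
print(len(hits))
```

Writing the loop as a generator lets the caller process each batch as it arrives instead of holding the whole result set in memory.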

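5. Contents of the next post

1. _search API

1.1 query string search

1.2 query DSL: 20 query examples

1.3 Other auxiliary APIs

1.4 Aggregations

1.4.1 filter aggregation

1.4.2 Nested aggregations: breadth-first

1.4.3 global aggregation

1.4.4 Cardinality aggregation

1.4.5 Controlling the sort order of aggregations

1.4.6 Percentiles aggregation

2. Optimizing relevance scores and query techniques

2.1 Optimization tip 1

2.2 Optimization tip 2

2.3 Optimization tip 3

2.4 Optimization tip 4

2.5 Optimization tip 5

2.6 Optimization tip 6

2.7 Optimization tip 7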
